
Enhanced multi-tuple extraction for materials: integrating pointer networks and augmented attention

Figure 1. Workflow and framework of the proposed model for extracting and allocating entities. The upper section of the figure presents the workflow, beginning with the retrieval of full-text research articles from Elsevier, followed by the construction of a specialized corpus. Sentences are then extracted and annotated to obtain a JSON-formatted dataset, and the process concludes with model training and inference. The model is primarily composed of two components: entity extraction and entity allocation. (A) Entity Extraction: This component integrates MatSciBERT and a pointer network. MatSciBERT first tokenizes the input sentence and generates vector representations for each token. The pointer network then computes the probability of each token serving as the head or tail of a specific entity, thereby identifying entities based on these probabilities; (B) Entity Allocation: This component assesses whether entities of different types belong to the same tuple using an entity matching score matrix. During model inference, the matching likelihood of entities in corresponding order can be enhanced by multiplying the diagonal elements of the matrix by a parameter; (C) Entity Matching Score Matrix: Each element of the matrix represents a combination of six vectors. The first two vectors correspond to the vector representations of the two entities, while the remaining four vectors are derived from two attention mechanisms: intra-entity and inter-entity attention. Intra-entity attention focuses on the attention distribution among different types of entities, whereas inter-entity attention concentrates on the attention distribution within the same type of entity; (D) Inference: An example of four-tuple extraction is illustrated.
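As a rough illustration of panels (A) and (B), the sketch below decodes entity spans from per-token head and tail probabilities, then scales the diagonal of an entity matching score matrix before thresholding. It is a minimal sketch, not the authors' implementation. The function names, the 0.5 thresholds, the nearest-tail pairing rule, the boost factor of 1.5, and all numeric values are assumptions made for illustration; the caption states only that heads and tails are identified from probabilities and that diagonal elements are multiplied by a parameter during inference.

```python
from typing import List, Tuple


def decode_spans(head_probs: List[float], tail_probs: List[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Turn head/tail probabilities into (start, end) token spans.

    Assumed pairing rule: each token whose head probability reaches the
    threshold is paired with the nearest token at or after it whose tail
    probability also reaches the threshold.
    """
    spans = []
    for i, head_p in enumerate(head_probs):
        if head_p < threshold:
            continue
        for j in range(i, len(tail_probs)):
            if tail_probs[j] >= threshold:
                spans.append((i, j))
                break
    return spans


def boost_diagonal(scores: List[List[float]], factor: float) -> List[List[float]]:
    """Multiply the diagonal of a matching score matrix by `factor`.

    Row r and column c index the r-th entity of one type and the c-th
    entity of another type, so the diagonal holds pairs that appear in
    corresponding order in the sentence.
    """
    return [[s * factor if r == c else s for c, s in enumerate(row)]
            for r, row in enumerate(scores)]


def match_entities(scores: List[List[float]],
                   threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (row, col) index pairs whose score reaches the threshold."""
    return [(r, c) for r, row in enumerate(scores)
            for c, s in enumerate(row) if s >= threshold]


# Hypothetical probabilities for a six-token sentence.
head_probs = [0.1, 0.9, 0.2, 0.1, 0.8, 0.1]
tail_probs = [0.0, 0.1, 0.95, 0.1, 0.85, 0.2]
spans = decode_spans(head_probs, tail_probs)

# Hypothetical 2x2 matching scores between two entity types.
scores = [[0.4, 0.3],
          [0.2, 0.45]]
pairs_plain = match_entities(scores)
pairs_boosted = match_entities(boost_diagonal(scores, factor=1.5))
```

With these made-up numbers, no pair passes the threshold on the raw scores, while both diagonal pairs pass after boosting. This is the effect the caption describes for entities that occur in corresponding order.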

Journal of Materials Informatics
ISSN 2770-372X (Online)

Portico

All published articles are preserved here permanently:

https://www.portico.org/publishers/oae/
