Medical knowledge graphs are a cornerstone of smart healthcare and are expected to make medical services more efficient and accurate. However, existing knowledge graph construction techniques tend to perform poorly in the medical domain: they are inefficient, subject to many restrictions, and hard to scale.
Given that medical data is cross-lingual, highly specialized, and structurally complex, this paper presents a bottom-up analysis of the key technologies for building medical knowledge graphs, organized into five parts: medical knowledge representation, knowledge extraction, knowledge fusion, knowledge reasoning, and knowledge quality assessment.
1. Knowledge modeling
Knowledge modeling establishes the data model of the knowledge graph. In an industry knowledge graph, the data model defines the structure of the entire graph, so its reliability must be ensured.
1. Common methods
Convert existing industry standards into the data model.
Map from existing high-quality industry data sources, such as the database tables of business systems.
2. Use knowledge graphs to abstractly model data
Entities are the main modeling target; data from different sources is mapped to them and merged. (entity extraction and merging)
Attributes represent how an entity is described in different data sources; together they form a comprehensive description of the entity. (attribute mapping and merging)
Relationships describe the associations among the entities into which the various types of data have been abstracted, thereby supporting association analysis. (relationship extraction)
Entity linking stores the various types of data in association with the entities they concern. (entity linking)
An event mechanism describes dynamic developments in the real world and reflects the relationships between events and entities; time sequences describe how events unfold. (dynamic event description)
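The five modeling ideas above can be sketched as a small in-memory data model. This is a minimal illustration only: the class names (Entity, Relation, Event, KnowledgeGraph), the identifiers such as "drug:aspirin", and the source names "formulary" and "ehr" are all hypothetical and do not come from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An entity whose attributes are merged from several data sources."""
    entity_id: str
    entity_type: str
    # attribute name -> {source name -> value}, so no source's description is lost
    attributes: dict = field(default_factory=dict)

    def merge_attributes(self, source, values):
        for name, value in values.items():
            self.attributes.setdefault(name, {})[source] = value


@dataclass(frozen=True)
class Relation:
    """A directed association between two entities."""
    subject_id: str
    predicate: str
    object_id: str


@dataclass
class Event:
    """Something that happened at a point in time and involved some entities."""
    event_id: str
    event_type: str
    timestamp: str  # ISO 8601 date, so string order matches time order
    participants: list = field(default_factory=list)


class KnowledgeGraph:
    def __init__(self):
        self.entities = {}
        self.relations = set()
        self.events = []

    def upsert_entity(self, entity_id, entity_type, source, values):
        # Records from different sources with the same id land on one entity.
        entity = self.entities.setdefault(entity_id, Entity(entity_id, entity_type))
        entity.merge_attributes(source, values)
        return entity

    def add_relation(self, subject_id, predicate, object_id):
        self.relations.add(Relation(subject_id, predicate, object_id))

    def add_event(self, event):
        self.events.append(event)

    def timeline(self, entity_id):
        # Events linked to an entity, ordered by time.
        related = [e for e in self.events if entity_id in e.participants]
        return sorted(related, key=lambda e: e.timestamp)


kg = KnowledgeGraph()
kg.upsert_entity("drug:aspirin", "Drug", "formulary", {"name": "Aspirin"})
kg.upsert_entity("drug:aspirin", "Drug", "ehr",
                 {"name": "acetylsalicylic acid", "form": "tablet"})
kg.upsert_entity("patient:001", "Patient", "ehr", {"age": 54})
kg.add_relation("patient:001", "takes", "drug:aspirin")
kg.add_event(Event("ev2", "Prescription", "2021-03-02", ["patient:001", "drug:aspirin"]))
kg.add_event(Event("ev1", "Admission", "2021-03-01", ["patient:001"]))
```

Keeping each attribute value keyed by its source is one possible design choice; it defers conflict resolution to a later knowledge fusion step instead of overwriting values at load time.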
3. Modeling tool Protégé
Ontology editor.
Based on RDF(S), OWL, and other Semantic Web specifications.
Graphical interface.
An online version, WebProtégé, is available.
Suitable for prototyping scenarios.
2. Knowledge acquisition
Extract knowledge from data of different sources and structures and store it in the knowledge graph.
1. D2R tools for obtaining structured data
D2RQ: a platform that exposes relational databases as virtual RDF graphs. Its main components are:
D2R Server: an HTTP server that provides query access to the RDF data for RDF browsers, SPARQL query clients, and traditional HTML browsers;
D2RQ Engine: uses a customizable D2RQ mapping file to convert data in relational databases into RDF format;
D2RQ Mapping Language: defines the mapping rules for converting relational data into RDF format.
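To make the idea of a relational-to-RDF mapping concrete, here is a rough Python sketch of what such a mapping does. It is not the D2RQ Mapping Language or the D2RQ Engine: the MAPPING dictionary, the rows_to_triples function, the drug table, and the http://example.org/ namespace are all assumptions made for illustration, using only the standard sqlite3 module.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping rules: each table maps to a class, its key column
# builds the subject URI, and each listed column maps to a predicate.
MAPPING = {
    "drug": {
        "class": BASE + "Drug",
        "key": "id",
        "columns": {"name": BASE + "name", "dose_mg": BASE + "doseMg"},
    }
}


def rows_to_triples(conn, mapping):
    """Turn every mapped row into (subject, predicate, object) triples."""
    triples = []
    for table, rule in mapping.items():
        columns = [rule["key"], *rule["columns"]]
        # Table and column names come from the trusted mapping, not user input.
        query = f"SELECT {', '.join(columns)} FROM {table} ORDER BY {rule['key']}"
        for row in conn.execute(query):
            subject = f"{BASE}{table}/{row[0]}"
            triples.append((subject, RDF_TYPE, rule["class"]))
            for column, value in zip(rule["columns"], row[1:]):
                if value is not None:  # NULL cells produce no triple
                    triples.append((subject, rule["columns"][column], value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drug (id INTEGER PRIMARY KEY, name TEXT, dose_mg INTEGER)")
conn.executemany("INSERT INTO drug VALUES (?, ?, ?)",
                 [(1, "Aspirin", 100), (2, "Ibuprofen", None)])
triples = rows_to_triples(conn, MAPPING)
for triple in triples:
    print(triple)
```

A real D2RQ deployment would express the same rules declaratively in a mapping file and answer queries against the live database, whereas this sketch materializes the triples in one pass.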
2. Parsing semi-structured industry data sources
Configure a corresponding wrapper for each data structure.
Wrapper configuration tool:
input source settings;
preprocessing configuration;
extraction target configuration;
extraction process configuration;
post-processing of results.
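The five configuration steps above might look like the following in a very small wrapper for an HTML table. This is a sketch under stated assumptions, not the configuration format of any real wrapper tool: the CONFIG keys, the CellExtractor class, the run_wrapper function, and the sample HTML are all hypothetical, and only the standard html.parser and re modules are used.

```python
import re
from html.parser import HTMLParser

# Hypothetical wrapper configuration; every key name is illustrative.
CONFIG = {
    # 1. input source (inline here; a real wrapper would fetch a page or file)
    "input": {"html": "<table><tr><td class='name'> Aspirin </td><td class='dose'>100 mg</td></tr>"
                      "<tr><td class='name'>Ibuprofen</td><td class='dose'>200 mg</td></tr></table>"},
    # 2. preprocessing: collapse runs of whitespace
    "preprocess": lambda html: re.sub(r"\s+", " ", html),
    # 3. extraction targets: output field -> class attribute of the cell
    "targets": {"name": "name", "dose": "dose"},
    # 4. extraction process: which tag delimits one record
    "record_tag": "tr",
    # 5. post-processing: turn "100 mg" into the integer 100
    "postprocess": lambda record: {**record, "dose": int(record["dose"].split()[0])},
}


class CellExtractor(HTMLParser):
    """Collects the text of targeted cells, one dict per record tag."""

    def __init__(self, targets, record_tag):
        super().__init__()
        self.class_to_field = {css: name for name, css in targets.items()}
        self.record_tag = record_tag
        self.records = []
        self.current = None  # record being built, or None outside a record
        self.field = None    # field being captured, or None

    def handle_starttag(self, tag, attrs):
        if tag == self.record_tag:
            self.current = {}
        elif self.current is not None:
            self.field = self.class_to_field.get(dict(attrs).get("class"))

    def handle_data(self, data):
        if self.current is not None and self.field is not None:
            self.current[self.field] = self.current.get(self.field, "") + data

    def handle_endtag(self, tag):
        if tag == self.record_tag and self.current is not None:
            self.records.append(self.current)
            self.current = None
        self.field = None


def run_wrapper(config):
    html = config["preprocess"](config["input"]["html"])
    parser = CellExtractor(config["targets"], config["record_tag"])
    parser.feed(html)
    parser.close()
    cleaned = [{k: v.strip() for k, v in r.items()} for r in parser.records]
    return [config["postprocess"](record) for record in cleaned]


records = run_wrapper(CONFIG)
print(records)
```

Keeping each step as a separate entry in the configuration means a new data source with a different layout needs only a new configuration, not new extraction code.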