Knowledge Graphs

All knowledge graphs in the SemantEx system share the same basic structure. In common with most graph databases, they consist of 'nodes' and 'links' (or 'edges'): the nodes are 'things' and the links are the relationships between them. The SemantEx documentation calls the nodes 'concepts', although 'entities' would be another appropriate term. So if the system reads "The cat sat on the mat", it creates two concepts, 'cat' and 'mat', and links them with a relationship labelled 'sat'. After a whole document has been read, the resulting graph can be quite complex.
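As a rough illustration of the node-link idea, the sketch below stores the example sentence as two concepts and one labelled link. The names used here (`Graph`, `Link`, `add_link`, `links_from`) are hypothetical and are not taken from the SemantEx code; this is only one simple way such a structure could be represented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    """A labelled, directed relationship between two concepts."""
    source: str
    label: str
    target: str


@dataclass
class Graph:
    """Hypothetical minimal knowledge graph: a set of concepts plus a list of links."""
    concepts: set = field(default_factory=set)
    links: list = field(default_factory=list)

    def add_link(self, source, label, target):
        # Adding a link also registers both of its end points as concepts.
        self.concepts.update((source, target))
        link = Link(source, label, target)
        self.links.append(link)
        return link

    def links_from(self, concept):
        # All links whose source is the given concept.
        return [link for link in self.links if link.source == concept]


# "The cat sat on the mat" becomes two concepts joined by one link.
graph = Graph()
graph.add_link("cat", "sat", "mat")
```

Calling `graph.links_from("cat")` then returns the single 'sat' link pointing at 'mat'.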

The diagram below shows the graph for a dictionary definition of 'heart', compiled from perhaps 20 sentences. (The different coloured boxes highlight different grammar structures within the sentences.) Clearly, filtering, searching and link extraction techniques are required to make sense of the information in a large graph; examples of these in action can be found on the project pages.

heartDefn.png

Built on top of the basic node-link architecture of the knowledge graph are other structures that reflect the SemantEx system's focus on language processing. One of the most important is the use-mention distinction: the word 'cat' can denote the concept 'cat' as well as a particular cat. Beyond this, of course, a dialog or text may contain many instances of 'cat'. The SemantEx system therefore always creates an 'instance' of a concept, linked to the concept itself. The use-mention distinction is further extended by the particular-general distinction: some things that we might put in a knowledge graph are about 'cats in general', as opposed to a particular instance of a cat. This is handled by the multiple graph structure of the SemantEx system.
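The concept-instance link can be sketched as follows. This is a minimal illustration only: it assumes that every mention creates a fresh instance, and the class `InstanceGraph`, its `mention` method and the `word#n` identifier scheme are all hypothetical rather than the system's actual design.

```python
import itertools
from collections import defaultdict


class InstanceGraph:
    """Hypothetical store that links every instance back to its concept."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instance_of = {}               # instance id -> concept name
        self.instances = defaultdict(list)  # concept name -> instance ids

    def mention(self, word):
        # Assumption: each mention yields a new instance of the concept.
        instance_id = f"{word}#{next(self._ids)}"
        self.instance_of[instance_id] = word
        self.instances[word].append(instance_id)
        return instance_id


g = InstanceGraph()
first_cat = g.mention("cat")
second_cat = g.mention("cat")
```

Here `first_cat` and `second_cat` are distinct instances, yet both resolve to the single concept 'cat' through `g.instance_of`.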

Document structure is also recorded automatically in the SemantEx system, as far as it is available and at least at the level of sentences and paragraphs. Concepts in a text are related to each other directly through language, as described above, but they can also be relevant to one another because of their proximity in a document or their co-occurrence under the same 'topic'.
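One simple way to capture proximity is to count how often two concepts appear in the same paragraph. The sketch below assumes a document is already split into paragraphs, sentences and concept names; the function `paragraph_cooccurrence` and the sample data are illustrative only and do not describe how SemantEx itself measures proximity.

```python
from collections import defaultdict
from itertools import combinations


def paragraph_cooccurrence(document):
    """Count the paragraphs in which each pair of concepts appears together.

    `document` is a list of paragraphs; each paragraph is a list of
    sentences; each sentence is a list of concept names.
    """
    counts = defaultdict(int)
    for paragraph in document:
        concepts = set()
        for sentence in paragraph:
            concepts.update(sentence)
        # Sorting gives each unordered pair a single canonical key.
        for pair in combinations(sorted(concepts), 2):
            counts[pair] += 1
    return dict(counts)


document = [
    [["cat", "mat"], ["dog", "cat"]],  # paragraph 0, two sentences
    [["heart", "blood"]],              # paragraph 1, one sentence
]
pairs = paragraph_cooccurrence(document)
```

In this sample, 'dog' and 'mat' never share a sentence but are still paired because they share a paragraph, whereas 'cat' and 'heart' are not paired at all.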

All individual text entries are recorded in an 'audit' section of the graph, and all concept and link entries are timestamped. This assists with system management and traceability, as well as with temporal filtering and sequencing.
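Temporal filtering and sequencing over timestamped entries might look like the sketch below. The `Entry` record, the `created_between` helper and the sample dates are all invented for illustration; the actual audit format is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Entry:
    """Hypothetical timestamped graph entry."""
    kind: str          # "concept" or "link"
    value: str
    created: datetime


def created_between(entries, start, end):
    """Return entries with start <= created < end, oldest first."""
    selected = [e for e in entries if start <= e.created < end]
    return sorted(selected, key=lambda e: e.created)


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample data, deliberately out of chronological order.
entries = [
    Entry("concept", "mat", utc(2024, 1, 2)),
    Entry("concept", "cat", utc(2024, 1, 1)),
    Entry("link", "cat -sat-> mat", utc(2024, 2, 1)),
]
january = created_between(entries, utc(2024, 1, 1), utc(2024, 2, 1))
```

The filter keeps only the two January entries and the sort puts them in the order in which they were created.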

Other information is graphed specifically to assist language processing. For instance, the most recently and most often mentioned concepts are used for pronoun resolution (working out who is meant by 'they', for example).
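A toy version of this bookkeeping is sketched below. How SemantEx actually combines recency and frequency is not stated, so the ranking used here (most often mentioned first, with ties broken by most recent mention) is purely an assumption, and `MentionTracker` is a hypothetical name. A real resolver would also need checks such as number agreement.

```python
from collections import Counter


class MentionTracker:
    """Hypothetical tracker of how often and how recently concepts are mentioned."""

    def __init__(self):
        self.counts = Counter()
        self.last_seen = {}
        self._clock = 0

    def mention(self, concept):
        # A simple counter stands in for a position in the text.
        self._clock += 1
        self.counts[concept] += 1
        self.last_seen[concept] = self._clock

    def candidates(self):
        # Assumed ranking: frequency first, then recency as a tie-breaker.
        return sorted(
            self.counts,
            key=lambda c: (self.counts[c], self.last_seen[c]),
            reverse=True,
        )


tracker = MentionTracker()
for concept in ["engineers", "bridge", "engineers", "river"]:
    tracker.mention(concept)

ranked = tracker.candidates()
```

With these mentions, 'engineers' ranks first because it is mentioned twice, so it would be the leading candidate for a following 'they'.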
