Introduction to the CB-NL

Last modified by Arjan Loeffen on 2015/05/29 14:19

Background

In 2012, the BIR decided to start the development of the CB-NL: the Dutch concept library. This decision was based on the pilot phase in which the need and objective, but also the configuration, requirements and workplan for the CB-NL were determined. This was reported in the document “Nederlandse conceptenbibliotheek (CB-NL) Hoofdrapportage Pilotfase”.

Technical approach

The Nederlandse Conceptenbibliotheek (CB-NL) is a library or collection of concepts for the building knowledge domain. This library is in fact an ontology, which defines concepts and semantic relations between these concepts.

The starting point for CB-NL is OWL2. CB-NL conforms with ISO 12006-3 and ISO 16354.

One aspect of the CB-NL ontology is its application as a regular controlled vocabulary: it contains terms and definitions in Dutch and English that users of the CB-NL agree to use. These terms function as labels or symbols for the concepts in the library. But this is only a part of the ontology. It also contains a taxonomy: the concepts are structured in a hierarchy. In this taxonomy concepts can be a subclass of more than one concept. The ontology also contains non-hierarchical relations between concepts, making it possible to capture all kinds of knowledge about the world.

The terms vocabulary and ontology are often confused and/or used to mean the same or similar things. Usually, the term controlled vocabulary is used for simple, usually informal collections of terms, while the term ontology is most often used when referring to more complex and formal (i.e. with a strictly applied structure) collections of terms and their interrelations (see W3C Vocabularies).

In this document, we use the terms to mean distinct things. The controlled vocabulary is the simplest form of a collection of terms; a dictionary, a taxonomy and an ontology each add a further layer of information and complexity.

A vocabulary is a collection of terms [ISO 16354] that users from a domain agree to use, or in other words, a set of domain-dependent predicates and functions that provide the basis for the statement of facts [Brachman et al 2004, p32, 209]. All the things that play a role in some domain are named in the vocabulary. A vocabulary is just that: a list of names (labels) of things. Part of the CB-NL ontology is a vocabulary, which contains terms in Dutch and/or English.

A dictionary enriches the terms from a vocabulary with descriptions of their meanings.

A taxonomy is an ordering of concepts in a tree-like structure. The concepts have a name (vocabulary) and a meaning (dictionary). A taxonomy is organized hierarchically, with the most general concepts at the top and the more specialized ones further down: a specialization hierarchy of classes [Brachman et al 2004, p172]. The top concepts of the hierarchy are predefined in CB-NL and described later on in this guideline (see Top-level CB-NL).

An ontology, as the term is used in the field of Knowledge Representation, is a catalogue of the kinds of objects (constants, functions, relations) that are important in a domain (i.e. a sphere of knowledge, influence, or activity, following the Merriam Webster dictionary), the properties those objects will be thought to have, and the relationships among them. The kinds of objects in an ontology are also called concepts. The relationships in an ontology are not necessarily hierarchical. In CB-NL, relationships between concepts, such as behaviour (function), are used to define those concepts.

Summarizing: 

  • Vocabulary: an agreed list of names of things
  • Dictionary: vocabulary + gives definitions to named things
  • Taxonomy: vocabulary + dictionary + orders things in a classification hierarchy
  • Ontology: vocabulary + dictionary + taxonomy + adds relationships and properties to things

The CB-NL contains names of things, definitions, a taxonomy, and relationships between concepts, making it an ontology.
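The layering summarized above can be pictured as one record that gains a field per layer. The following is only an illustrative sketch: the `Concept` class, its field names and the `layer` helper are assumptions made for this example and are not part of the CB-NL or of OWL.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical record; each field corresponds to one layer from the summary."""
    name: str                                       # vocabulary: the agreed label
    definition: str = ""                            # dictionary: meaning of the label
    supertypes: list = field(default_factory=list)  # taxonomy: place in the hierarchy
    relations: dict = field(default_factory=dict)   # ontology: other relations and properties


def layer(concept):
    """Return the richest layer for which this record carries information."""
    if concept.relations:
        return "ontology"
    if concept.supertypes:
        return "taxonomy"
    if concept.definition:
        return "dictionary"
    return "vocabulary"


# A bare label is only a vocabulary entry; adding a relation makes it ontology-level.
bare = Concept("Brug")
rich = Concept("Verkeersbrug", "placeholder definition", ["Brug"],
               {"application": "Wegverkeer"})
print(layer(bare), layer(rich))
```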

Within the CB-NL ontology, many local decisions have been made. A Modeling Guideline is available which clarifies how the specification is constructed. The Modeling Guide is an elaboration of the modeling principles as described in the report of the pilot phase. Since then it has been updated based on new insights from the work being done to fill the CB-NL with concepts.

Quick intro to CB-NL terms

For a good understanding of CB-NL and this wiki it is important to explain a few key concepts. 

The first terms that need explanation are "Concept" and "Concept library". In the CB-NL, concepts are common things, together with their definitions, that exist within the Building and Infrastructure domains. Take, for example, the concept 'Bridge'. In the CB-NL this is not a description of a physically existing bridge, but a generally valid definition, which can in principle hold for all possible bridges in the world. In the CB-NL such a thing is called a 'concept' Bridge and not an 'object' Bridge, because the term 'object' is commonly associated with tangible things. For the same reason the CB-NL is a 'concept library' and not an 'object library'. Many people would understand the latter as something like a product catalogue.

In the CB-NL, these concepts are ordered in a structure: a taxonomy. In this structure each concept has one or more higher, more abstract parent concepts (supertypes) and one or more lower, more concrete child concepts (subtypes). The supertype of 'Brug', for example, is 'Overbrugging' and subtypes of 'Brug' are, for example, 'Beweegbare brug', 'Vaste brug', 'Verkeersbrug' and 'Spoorbrug'.

An important aspect of the CB-NL is that a subtype inherits all characteristics from its supertype. Besides these inherited characteristics, a subtype always has a characteristic that distinguishes it from its supertype, i.e. that makes it different. For example, 'Verkeersbrug' differs from its supertype 'Brug' in that it has the application 'Wegverkeer'. This distinguishing characteristic is called a 'discriminator' in CB-NL.
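Inheritance and discriminators can be sketched with two small lookup tables. This is a minimal illustration, not how the CB-NL is stored: the dictionary layout and the property keys are assumptions, and the concept 'Beweegbare verkeersbrug' and the discriminator given for 'Beweegbare brug' are hypothetical, added only to show a concept with more than one supertype.

```python
# Supertypes per concept. Names come from the text above, except the last
# entry, which is hypothetical and shows a concept with two supertypes.
SUPERTYPES = {
    "Overbrugging": [],
    "Brug": ["Overbrugging"],
    "Beweegbare brug": ["Brug"],
    "Verkeersbrug": ["Brug"],
    "Beweegbare verkeersbrug": ["Beweegbare brug", "Verkeersbrug"],
}

# Discriminator per concept: the characteristic that sets it apart from its supertype.
DISCRIMINATORS = {
    "Verkeersbrug": {"application": "Wegverkeer"},   # from the text
    "Beweegbare brug": {"mobility": "Beweegbaar"},   # hypothetical value
}


def characteristics(concept):
    """Collect everything inherited from all supertypes, then add the own discriminator."""
    result = {}
    for parent in SUPERTYPES[concept]:
        result.update(characteristics(parent))
    result.update(DISCRIMINATORS.get(concept, {}))
    return result


print(characteristics("Beweegbare verkeersbrug"))
```

The subtype ends up with the discriminators of both supertype chains, which is the accumulation the text describes.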

Of course, there is an end to the levels of supertypes that exist. The highest, most abstract levels are together called the 'Top level'. 

Concepts in the top level hold definitions in natural language. This is because such concepts are axiomatic: they cannot be explained in a systematic way, but must be understood to understand all lower level concepts. For example:

  • Activity : An activity is something happening or changing.

Lower level concepts do not have natural language definitions within the CB-NL. Rather, their definition is the accumulation of their relations and discriminators. For example:

  • Pump is a Flow moving device and has the discriminating property Function: Pumping. Flow moving device is a Generic product and has the discriminating property Function: Strengthen (aanwakkeren). Pumping is an Action.

These definitions may be created automatically from the concept properties; they are not themselves part of the CB-NL.
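How such a definition might be generated can be sketched as follows. The `CONCEPTS` table and the `describe` function are assumptions for illustration; only the Pump example itself is taken from the text, and the sentence template simply mirrors its wording.

```python
# Hypothetical store: supertype and discriminator per concept (values from the Pump example).
CONCEPTS = {
    "Pump": {"is_a": "Flow moving device", "discriminator": ("Function", "Pumping")},
    "Flow moving device": {"is_a": "Generic product", "discriminator": ("Function", "Strengthen")},
}


def describe(name):
    """Walk up the supertype chain and emit one sentence per known concept."""
    sentences = []
    while name in CONCEPTS:
        entry = CONCEPTS[name]
        prop, value = entry["discriminator"]
        sentences.append(
            f"{name} is a {entry['is_a']} and has the discriminating property {prop}: {value}."
        )
        name = entry["is_a"]  # continue with the supertype
    return " ".join(sentences)


print(describe("Pump"))
```

The walk stops at the first concept that has no entry in the table, which in this sketch stands in for reaching the top level with its natural language definitions.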

Why OWL?

OWL is a complex specification, which may be hard to grasp at first. There are two reasons for the decision to adhere to OWL instead of, for example, SKOS.

  1. OWL allows concepts to be declared (as classes) and instances of these concepts to be created. Thus, properties of a particular bridge may be recorded (instance level), and then linked to the Bridge concept supplied by the CB-NL (class level). Assuming both are specified in a related technical manner, systems may certify that our bridge properties conform to those specified by the CB-NL.
  2. OWL supports inferencing, which is an interesting basis for validation of all statements that concern the concepts of the CB-NL. Using standard tools and techniques, founded on a solid standard for ontology construction, we can verify that the CB-NL is internally coherent and correct.
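The two points above, class level versus instance level and inferencing, can be illustrated with a hand-rolled sketch. This is not an OWL reasoner and does not use any OWL tooling: it only computes the subclass closure for one instance, which is the simplest kind of inference a reasoner performs. The instance name `my_bridge` is hypothetical.

```python
# Class level: subtype/supertype pairs taken from the bridge example in the text.
SUBCLASS_OF = {("Verkeersbrug", "Brug"), ("Brug", "Overbrugging")}

# Instance level: a particular (hypothetical) bridge, typed with a CB-NL concept.
TYPE_OF = {("my_bridge", "Verkeersbrug")}


def inferred_types(instance):
    """Return the asserted types of an instance plus all their supertypes."""
    types = {cls for inst, cls in TYPE_OF if inst == instance}
    frontier = list(types)
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS_OF:
            if sub == current and sup not in types:
                types.add(sup)
                frontier.append(sup)
    return types


print(sorted(inferred_types("my_bridge")))
```

Only 'Verkeersbrug' was asserted for the instance; 'Brug' and 'Overbrugging' follow from the class hierarchy.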

Core and contexts

The CB-NL Core is filled with concepts from a number of pre-existing sources. Semantic Concepts (SC), Cheobs, OTL Rijkswaterstaat, IMGeo and ETIM are primary sources for the information that is found in CB-NL. We call these libraries contexts.

Contexts are external concept libraries that are mapped to the CB-NL. These mappings are relatively simple. A concept in a context is either the same as a concept in the CB-NL, or a subtype of a CB-NL concept. It may also be said to be "related" to a CB-NL concept; however, such relations are weak and should be used sparingly.

The mapping between a context and the core allows users and systems to associate things that are modeled in particular systems (CAD systems, building inventories, parts catalogs) with things that are common to all these systems. In this way, a link between two or more systems may be created even though these systems do not explicitly link to each other. This is exactly the role of a translator, and the CB-NL Core may in this sense be seen as a translator in a communication. It is a servant in the background, allowing separate systems and models to co-operate.
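The translator role can be sketched as a lookup through the core. Everything in this example is hypothetical: the context names, the context concept names and the tuple layout are invented for illustration, and only the mapping kinds "same" and "subtype" come from the description of contexts above.

```python
# (context, concept in that context, mapping kind, core concept); all rows are hypothetical.
MAPPINGS = [
    ("CAD system", "BRIDGE_TYPE_7", "same", "Brug"),
    ("Parts catalog", "bridge", "same", "Brug"),
    ("Parts catalog", "road-bridge", "subtype", "Brug"),
]


def translate(source_context, source_concept, target_context):
    """Find concepts in the target context that map to the same core concept."""
    core = {
        c for ctx, name, kind, c in MAPPINGS
        if ctx == source_context and name == source_concept and kind == "same"
    }
    return sorted(
        name for ctx, name, kind, c in MAPPINGS
        if ctx == target_context and kind == "same" and c in core
    )


print(translate("CAD system", "BRIDGE_TYPE_7", "Parts catalog"))
```

The two contexts never reference each other; the only shared element is the core concept. The sketch treats only "same" mappings as translations, because a "subtype" mapping says the context concept is narrower than the core concept rather than equivalent to it.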

While the CB-NL evolves into a mature and fully defined ontology, it already allows different users to bridge the gap between models and systems, a gap which currently exists and poses a problem. The technical bridge is made by an end-point for the OWL ontology. The "human" bridge is available as a web-based viewer that allows users to navigate through contexts and core.

Created by Linda van den Brink on 2014/09/18 09:32

This wiki is licensed under a Creative Commons 2.0 license