
Modeling Approach

Last modified by Linda van den Brink on 2015/04/29 09:09

4.1 Introduction

The modeling approach has three main components: (1) the modeling language chosen, (2) a set of modeling guidelines for using that language, and (3) a top-level part of CB-NL, with extra guidelines for using that top level in the elaboration of CB-NL.

The choice of language comes first, but many modeling guidelines have to be specified on top of it. For example, the fact that we have ‘concepts’ and give them names follows from the chosen modeling language, but the fact that concept names start with a capital letter is a modeling guideline. The top level is part of the CB-NL core. It consists of a small number of very generic concepts that belong to the CB-NL contents rather than to the language, and it is meant to kick-start and further guide the expert modelers.

The proposed modeling approach, as a combination of language, guidelines and CB-NL specific top-level, covers all prerequisites and principles we specified in the Requirements chapter. The Dutch contractors who work together in CB-NL have defined this modeling approach as a basis for information exchange.

This chapter describes the chosen modeling language and contains the rules for modeling concepts in CB-NL. The definition of CB-NL will be fully compliant with these rules. Deviations will lead to mutually incompatible parts of CB-NL.

The CB-NL top level is described in chapter Top level CB-NL.

4.2 Modeling Language

In CB-NL the Web Ontology Language (OWL) is used as the base modeling language. To be more precise, the language used is OWL Version 2 [OWL2]. This language is a good fit for the requirements and principles described in the chapter Requirements. Also, OWL is an emerging technology that is being adopted by more and more organizations.

This section describes the use of OWL as the CB-NL modeling language. OWL is developed by the World Wide Web Consortium (W3C) and is a mathematically and logically well-defined, international and fully generic open standard for modeling ontologies (and hence ‘concept libraries’). It covers about 90% of our CB-NL requirements, and not even all the available OWL2 language constructs are needed. Only a subset of the OWL semantics is used in CB-NL; this subset is elaborated below. OWL has various, mutually equivalent, syntax forms or ‘serializations’. We will use the most elegant and user-friendly of them: Turtle.

The 10% of our language requirements not covered by OWL can be accommodated by defining a so-called upper ontology that covers the missing items and is imported in any further CB-NL compliant end-user ontology. Before defining our own constructs, however, we reuse elements from existing upper ontologies as much as possible, such as Dublin Core (DC) and SKOS (for recording definitions, notes, alternative labels, etc.).

CB-NL tightly connects to two other, secondary, worlds of library modeling from the construction domain. Both use other ways of modeling:

  • bSI’s IFD/bSDD
  • The Dutch COINS BIM (CBIM) initiative

Links to these worlds are in principle bi-directional. In the future, we want to make CB-NL OWL ontologies available as IFD and/or COINS libraries and, the other way round, we want to be able to reuse existing information structures from these worlds as content in CB-NL. Links to these and other ontologies are specified in separate LinkSets. This ensures that CB-NL is not polluted with properties pointing to numerous external resources.

4.3 Introduction Web Ontology Language (OWL)

OWL is the key result of the so-called ‘Semantic Web Activity’ within the World Wide Web Consortium (W3C), the organization also responsible for the open web standards that drive the World Wide Web on top of the Internet (HTTP, URI, HTML, XML, XSD, XPath, XQuery, etc.).

As the name suggests, OWL is a fully open, international and generic standard for modeling ontologies, defined as abstract, simplified views of a part of the world that we wish to represent for some purpose. What the construction domain has historically called “libraries” (knowledge libraries, object libraries, product libraries, concept libraries) or even “dictionaries” (like bSI’s bSDD) are, in OWL-speak, “ontologies”.

The basic principle of the semantic web according to its inventor Tim Berners-Lee is three-fold:

  • Anyone can say anything about anything
  • No one knows everything about anything
  • My system is most valuable because of its interconnection to its peers

These principles fit very nicely with the positioning of CB-NL in relation to other ontologies, as described in chapter 2, although we do not want to apply them to the extreme: we want to incorporate many views in CB-NL, but in a controlled way. We also know these views will never be complete but will grow over time. Finally, the concepts in CB-NL become more meaningful when related to other views on the same concept.

The principle of the semantic web and its use of ontologies result in some differences with traditional modeling in EXPRESS, NIAM, UML etc. Users of these traditional modeling languages need to adjust their way of thinking when using OWL. We will therefore briefly explain the differences of this new school of modeling below.

4.3.1 Open World Assumption (OWA)

The basic idea of the OWA is that things not said/stated are “unknown”, instead of “not true/false”. In traditional modeling the Closed World Assumption (CWA) is used where things are assumed “false” if not stated “true”. If a CWA database does not state that I have children it is assumed I have no children, while in OWA I could have children but it is unknown. Other information could be provided at a later date or by someone else, adding to the knowledge about me. This approach, which is well suited for dynamic concept libraries/ontologies such as the CB-NL, results in an ever-growing and flexible knowledge base.
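The difference can be sketched in Turtle. This is a minimal illustration, not CB-NL content: the ex: namespace and the names ex:Person, ex:Anna, ex:Ben and ex:hasChild are hypothetical.

```turtle
@prefix ex:  <http://example.org/owa#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

ex:Person   a owl:Class .
ex:hasChild a owl:ObjectProperty .

# Stated by a first source: Anna is a person.
# Nothing is said about children, so under OWA it is unknown
# whether Anna has children. It is not assumed to be false.
ex:Anna a owl:NamedIndividual , ex:Person .

# Added later by another source: new knowledge about Anna.
# The earlier triples stay valid; the knowledge base only grows.
ex:Ben  a owl:NamedIndividual , ex:Person .
ex:Anna ex:hasChild ex:Ben .
```

A closed-world system reading only the first statement would answer "no children"; an open-world reasoner leaves the question open until the later triple arrives.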

4.3.2 Classes/Sets instead of Types

Some object-oriented modeling approaches allow you to subtype in such a way that some super type properties are not valid anymore. In the semantic web approach, types are replaced by more rigorous classes or sets with a very clear semantic meaning. One consequence is that a subclass always inherits ALL properties of its superclass (since the subclass stands for a subset of the individuals of the superclass).

4.3.3 Properties are “first class citizens” (just like Classes)

Traditionally many modeling languages start with classes (or entities, or types, or..) as main modeling construct. Properties are assigned to these classes/entities/types as secondary concepts. In OWL, Classes and Properties are on equal footing and can be flexibly combined. Combined with OWA this means that any property can be in principle a property of any class unless constrained explicitly. 

4.3.4 Properties also cover Relationships

Traditionally, modeling languages distinguish between attribute types and relationship types. In OWL both are properties: datatype properties when the range of a property is a data type, e.g. text, a number or a boolean (comparable to the traditional attribute types), and object properties when the range of a property is not a data type but another class, referencing individuals of that class.
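The two kinds of property can be sketched as follows. This is a generic OWL illustration under assumed names (the ex: namespace, ex:Bridge, ex:Road, ex:lengthInMetres and ex:carries are hypothetical); it is not taken from CB-NL, whose core does not use datatype properties.

```turtle
@prefix ex:   <http://example.org/props#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

ex:Bridge a owl:Class .
ex:Road   a owl:Class .

# Datatype property: the range is a data type
# (comparable to a traditional attribute type).
ex:lengthInMetres a owl:DatatypeProperty ;
    rdfs:domain ex:Bridge ;
    rdfs:range  xsd:decimal .

# Object property: the range is another class
# (comparable to a traditional relationship type).
ex:carries a owl:ObjectProperty ;
    rdfs:domain ex:Bridge ;
    rdfs:range  ex:Road .
```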

4.3.5 Individuals/instances first

All traditional modeling languages start by making information structures and then instantiating them with instances/individuals, which should be valid against that structure: classes come first. In the Semantic Web this is often not the typical way. Ontologies are developed, but the link between instances and the classes in the ontologies is often derived instead of explicitly stated: instances are “classified” based on their properties. However, the focus of CB-NL is the modeling of an ontology and not the individuals complying with this ontology.

Besides these principles we have to discuss some special characteristics:

4.3.6 Triples as information atoms

All ontologies and data sets are basically sets of triples, and this is made explicit in RDF/OWL. Every triple functions as a kind of information atom that is, in principle, independent of any other triple. This makes it easier to split and merge ontologies and to have alternative views on the same domain. You could say that triples never change; they are just added or deleted. Merging ontologies is done by adding all triples together, removing the duplicates and making sure the result is consistent in itself.
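As a small sketch of merging, the fragment below puts the triples of two sources in one document. All names in the ex: namespace are hypothetical.

```turtle
@prefix ex:   <http://example.org/merge#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Triples contributed by source A
ex:Bridge  a owl:Class .
ex:Viaduct a owl:Class .
ex:Viaduct rdfs:subClassOf ex:Bridge .

# Triples contributed by source B
ex:Bridge   a owl:Class .    # duplicate of a triple from source A
ex:Aqueduct a owl:Class .
ex:Aqueduct rdfs:subClassOf ex:Bridge .
```

Because an RDF graph is a set of triples, the repeated statement about ex:Bridge collapses into one, and the merged graph holds five distinct triples. Checking that the merged result is consistent remains a separate step.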

4.3.7 Identification via globally unique web-based Uniform Resource Identifiers (URIs)

Identifiers are globally unique but not globally decided. Every party/authority can have its own namespace where their knowledge about concepts and individuals resides, using their own URI scheme. Explicit modeling constructs are available to say that concepts or individuals from different namespaces are actually the same or not the same (if this is not stated explicitly, the OWA applies and it is “unknown”).
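The standard OWL constructs for such statements are owl:equivalentClass for classes and owl:sameAs and owl:differentFrom for individuals. The sketch below uses two hypothetical namespaces (pa: and pb:) and invented names; whether and how CB-NL LinkSets use these particular constructs is not specified here.

```turtle
@prefix pa:  <http://example.org/party-a/def/> .
@prefix pb:  <http://example.org/party-b/def/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

pa:Bridge a owl:Class .
pb:Brug   a owl:Class .

# Two parties, two URIs, one concept.
pa:Bridge owl:equivalentClass pb:Brug .

pa:OldBridge a owl:NamedIndividual , pa:Bridge .
pa:NewBridge a owl:NamedIndividual , pa:Bridge .
pb:OudeBrug  a owl:NamedIndividual , pb:Brug .

# Explicitly the same individual under two identifiers.
pa:OldBridge owl:sameAs pb:OudeBrug .

# Explicitly not the same individual. Without this triple,
# OWA leaves it unknown whether the two URIs denote one thing.
pa:OldBridge owl:differentFrom pa:NewBridge .
```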

4.4 OWL2 Subset used

OWL2 is itself a layered language: OWL builds on RDFS, and RDFS in turn builds on RDF. Defining a subset of OWL2 therefore means that we actually define a subset of RDF, RDFS and OWL2 constructs.

Figure: OWL as a layered language

Important: In order to be able to verify the consistency of CB-NL, we use only those constructs that can be verified by an open source reasoner. Currently we support the reasoners that are shipped with Protégé, i.e. FaCT++ and HermiT, as well as Pellet. OWL constructs whose consistency cannot be verified are not used in CB-NL.

Below are the constructs we use in the CB-NL:

  • owl:Class, reflecting our need to describe container definitions for individuals of ‘objects’.
  • owl:AnnotationProperty, reflecting our need to add metadata and labels.
  • owl:ObjectProperty, reflecting our need for ‘interrelationships’. Currently, in CB-NL this is only used to relate classes to their discriminating characteristics. (In future versions of CB-NL this can also be used for other purposes, e.g. for predefining decomposition.)
  • owl:Restriction, reflecting our need to define ‘constraints’ on properties, making use of constructs like owl:onProperty and owl:hasValue.
  • owl:versionInfo, reflecting our need to specify the version of the construct.
  • rdfs:subClassOf, reflecting our need for ‘specialization’.
  • rdfs:label, reflecting our need to ‘name’ things. We combine this with skos:altLabel to be able to discriminate alternative names from preferred names.
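The following sketch shows how these constructs could fit together for one class. It is an illustration under assumptions, not the actual CB-NL encoding: the namespace, the discriminator property :hasTechnology, the discriminator value :NonMovable, the superclass :Brug, the labels and the version string are all hypothetical.

```turtle
@prefix :     <http://example.org/cbnl-sketch#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Hypothetical discriminator property and discriminator value.
# The value is itself declared as a class and is referenced as
# the filler of owl:hasValue.
:hasTechnology a owl:ObjectProperty .
:NonMovable    a owl:Class .

:Brug a owl:Class .

:VasteBrug a owl:Class ;
    rdfs:label      "Vaste brug"@nl , "Fixed bridge"@en ;  # preferred names
    skos:altLabel   "Non-movable bridge"@en ;              # alternative name
    owl:versionInfo "1.0" ;
    rdfs:subClassOf :Brug ,                                # specialization
        [ a owl:Restriction ;                              # constraint
          owl:onProperty :hasTechnology ;
          owl:hasValue   :NonMovable ] .
```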

4.5 OWL2 extension for concept modeling

In addition to RDFS and OWL, we use a small number of additional upper ontologies. All of these are maintained by standards organizations or come from big, stable organizations. The upper ontologies used are:

  • dcterms: Dublin Core Terms, by DCMI.
    • dcterms:created to record date of creation
    • dcterms:creator to record the identity that created the resource
    • dcterms:source for the identifier that links the concept to the source, if the concept was extracted from a pre-existing source
    • dcterms:subject, to refer to the name of a collection
    • dcterms:modified to record the date the concept was created or modified
    • dcterms:Agent to record owners of concepts as instances
    • dcterms:rightsHolder to record the owner of a concept. The dcterms:rightsHolder is a dcterms:Agent; the rights holder for the core and for each context is made available as an instance in advance.
  • skos: we use a subset of the SKOS annotation properties:
    • skos:definition for the definition of the resource
    • skos:altLabel for alternative labels of the resource. The preferred name is recorded in rdfs:label. See: http://www.w3.org/TR/skos-reference/#labels
    • skos:note for internal comments relating to use cases and sources of concepts. These notes are not published in public releases.

For some constructs we have not yet found an ontology that could be reused, so we have created our own:

  • annotation property :status, for the status of the concept. Statuses are instances of the EnumerationClass "Status"
  • Enumeration Class 'Collection'
  • properties that define the type of discriminator value. These are owl:ObjectProperties. The property filler is the discriminator value which is of type owl:Class

4.6 Modeling Guidelines

4.6.1 Introduction

This section elaborates on the previous sections with specific rules for the CB-NL ontology regarding its vocabulary, taxonomy, and modeling of discriminating properties.

Together with the language defined earlier in this chapter, modeling guidelines will enable/support the right ‘quality’ of CB-NL in terms of consistency and completeness. Besides using the same language, all collaborating expert modelers ‘filling’ CB-NL (under the top-level) should follow these guidelines for the best, compatible results.

The section on Design patterns shows how OWL is used in the CB-NL Core.

Sources for these rules are several existing standards:

  • ISO 16354: Guidelines for knowledge libraries and object libraries
  • ISO 19150-2 (draft): Geographic information - Ontology - Part 2: Rules for developing ontologies in the Web Ontology Language (OWL)

4.6.2 Guidelines

4.6.2.1 Classes

CB-NL consists of a collection of interrelated classes organised as a taxonomy.

All newly defined classes should be directly or indirectly a subclass of the predefined archetype classes of the top level (see chapter: Top level CB-NL).

Classes should be as generic as possible, i.e. independent of a certain domain, context or world view on the higher specialization levels. This gives room to alternative contexts/views, where subclasses of these generic levels can be defined.

Classes cover not only tangible things like bridges and buildings but also intangible, nonmaterial things like types of information, and spatial things like planning zones. Characteristics, like 'width' and 'fire resistance', are also modeled as classes in CB-NL.

Each class is encoded as an OWL Class:

:VasteBrug
      a       owl:Class .
    

4.6.2.2 Class names

Rules for naming classes are part of the Practical guidelines for modeling CB-NL content. They are given in the section on Concept names.

4.6.2.3 Metadata and versioning

CB-NL is released at the ontology level. A new version of the ontology replaces the old version.

Version metadata and identifiers

Both concepts and properties are described with metadata in the CB-NL Core. 

Concepts and properties may have a lexical description: a free-text account of the concept. In addition, each class and property has attributes for version, version date, and owner. Owner information is encoded using Dublin Core (DC) elements. Concepts also have an attribute for status; properties do not.

Definition

Every concept in CB-NL has a definition, which is generated automatically based on the concept's place in the taxonomy and its discriminating properties. 

In addition, a natural-language definition can be added to concepts and properties in several languages. Each definition must have a language tag and is stored in the property skos:definition.

For example, the definition of a fly-over:

skos:definition "The upper road (above ground) of a grade-separated junction"@en-gb , "De bovenste weg (boven maaiveld) van een vrije kruising."@nl-nl ;

Status

Status: indicates the stage of acceptance of the concept in the CB-NL ontology. Possible values (in typical workflow order) are:

  1. “draft”: the concept is entered in CB-NL but not yet checked and approved by the CB-NL team. This status can occur in the CB-NL 1.0, with concepts that are required by use cases. For concepts with status "draft" it is unknown whether they will remain part of the CB-NL. They could be removed at any time. 
  2. “approved”: the concept is an accepted/approved CB-NL member. This is the normal status for concepts.
  3. “dismissed”: the concept is not (no longer) valid to be used. In CB-NL 1.0 this status does not occur. 
  4. “deprecated”: the concept will be removed from CB-NL in time, but is maintained for compatibility reasons. Concepts in final releases may have this status. In CB-NL 1.0 this status does not occur.
  5. “issue”: the concept poses a problem, which is currently under investigation. No concept in any final release will have this status. In CB-NL 1.0 this status does not occur.

At bootstrap (after conversion), the status of every concept is set to “approved”. The ontology is therefore in transition only at those locations where a status has been changed to some other value. The ontology is published when all statuses are set to “approved”.

There is no Dublin Core term for status metadata, so CB-NL defines its own:

:Bridge cb-nl:status "approved" ;

Owner

Owner: the party responsible for making the concept or property available. dcterms:Agent is used to record the owners of concepts as instances. dcterms:rightsHolder is used to record the owner of a concept. 

dcterms:rightsHolder <http://ont.cbnl.org/cb/id/agent/CBNL> ;
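Put together, the metadata of one concept could look like the sketch below. Only the agent URI is taken from the text above. The default namespace, the namespace bound to cb-nl:, the definition text, the version string and both dates are placeholders chosen for illustration. The status is written as a plain string, following the status example above.

```turtle
@prefix :        <http://example.org/cbnl-sketch#> .
@prefix cb-nl:   <http://example.org/cbnl-meta#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# CB-NL's own annotation property for status.
cb-nl:status a owl:AnnotationProperty .

# The owner is made available as an instance in advance.
<http://ont.cbnl.org/cb/id/agent/CBNL> a dcterms:Agent .

:Bridge a owl:Class ;
    rdfs:label           "Bridge"@en ;
    skos:definition      "Placeholder definition text."@en ;  # always language-tagged
    owl:versionInfo      "1.0" ;                              # placeholder version
    dcterms:created      "2000-01-01"^^xsd:date ;             # placeholder date
    dcterms:modified     "2000-01-02"^^xsd:date ;             # placeholder date
    dcterms:rightsHolder <http://ont.cbnl.org/cb/id/agent/CBNL> ;
    cb-nl:status         "approved" .
```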

4.6.2.4 Identification: URI-Strategy

The concepts defined in CB-NL will be re-used by many organizations for communication within the construction domain. While CB-NL is a central ontology, it is part of a distributed web of more specific ontologies. For these reasons, it is essential that each CB-NL concept (including its versions) is uniquely and reliably identifiable.

W3C recommends HTTP URIs as identifiers. CB-NL adopts the draft URI strategy for the Netherlands as a pattern to create identifiers for concepts. The URI pattern prescribed by the draft NL URI strategy, and its application in CB-NL, is shown below:

For individual objects (individuals/instances used as enumeration values):

http://{domain}/{type}/{concept}/{reference}

which translates to CB-NL URI format:

http://ont.cbnl.org/cb/id/{classname}/{reference}

Note: at the moment CB-NL does not contain /doc/ URIs.

For definitions of ontology terms:

http://{domain}/def/{concept}

which translates to CB-NL URI format, in which the concept name constitutes the last part of the URI:

http://ont.cbnl.org/cb/def/{concept}

The {domain} of the URI is: 

  • For CB-NL concepts http://ont.cbnl.org/cb/.
  • For CB-NL top level concepts http://ont.cbnl.org/top/.

More on {concept} and concept names

The concept name indicates the thing the URI identifies. Concepts in CB-NL are identified with an abbreviated form of the concept label, in Dutch.

An advantage of using concept names as identifiers is that it results in more readable, user-friendly URIs. A possible disadvantage is that different concepts could have the same name. If they are from a different context, this is not a problem because the context is part of the URI. If they are from the same context, care must be taken to distinguish the different concepts by giving them different names; e.g. not ‘bank’, but ‘river bank’ and ‘savings bank’. This “disambiguation” step can be taken when and where the issue arises. New concepts with an identical name can be disambiguated by, preferably, adding the name of the immediate superclass to the name. For example, ‘bank’ becomes RiverPart_Bank, as the concept name in the superclass URI is RiverPart.
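In Turtle, the two URI patterns map naturally onto prefixes. The sketch below assumes the prefix names cb: and agent: (they are illustrative choices) and reuses the RiverPart_Bank example and the agent URI from this chapter.

```turtle
@prefix cb:      <http://ont.cbnl.org/cb/def/> .
@prefix agent:   <http://ont.cbnl.org/cb/id/agent/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .

# Ontology terms follow http://ont.cbnl.org/cb/def/{concept}.
cb:RiverPart a owl:Class .

# Expands to <http://ont.cbnl.org/cb/def/RiverPart_Bank>.
# The superclass name disambiguates this 'bank' from other banks.
cb:RiverPart_Bank a owl:Class ;
    rdfs:subClassOf cb:RiverPart .

# Individuals follow http://ont.cbnl.org/cb/id/{classname}/{reference}.
# Expands to <http://ont.cbnl.org/cb/id/agent/CBNL>.
agent:CBNL a dcterms:Agent .
```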

4.6.2.5 Property guidelines

The OWL language has three types of properties: owl:AnnotationProperty, owl:ObjectProperty and owl:DatatypeProperty.

AnnotationProperties record things that can be recognised as metadata. In the CB-NL Core these are used to record metadata of a concept, like a description and its status. 

DatatypeProperties give information about a thing in the form of attribute values. Currently, these are not part of the CB-NL Core; characteristics of things, like 'width' or 'fire resistance', are modeled as classes.

ObjectProperties are relations between two Classes. In the CB-NL Core ObjectProperties are used to record the relationship between classes and their discriminators. In addition there is an ObjectProperty 'is a role for' (see 5.9 Roles).

However, in the CB-NL Core the properties (characteristics) of things are modeled as OWL classes. The generic top-level ontology (the class taxonomy) of CB-NL is divided into Objects and Properties. This means that CB-NL has “object-classes” and “property-classes”.

OWL properties are only used in CB-NL core to record metadata about concepts (see 4.6.2.3 Metadata and versioning), and to connect concepts to discriminator values (see Discriminators).

4.6.2.6 Specialization guidelines

In order to give concepts a place in the taxonomy, specialization relationships are used to connect more specific concepts to more general (increasingly abstract) ones. In this relationship, we call the general class superclass, and the specific class subclass. 

To distinguish a subclass from its superclass, a single property (function, material, or some other criterion) is used. This property is called the 'discriminator'. The CB-NL Core currently distinguishes four discriminator types, in order of importance: purpose / function, technology, application, and appearance. Discriminating properties of a superclass also hold for its subclasses.

The following guidelines must be taken into account when modeling specialization:

  • Every class must have at least one superclass (more general class).
  • Every user-defined class is directly or indirectly a subclass of one of the CB-NL top-level classes.
  • Superclasses can have zero or more subclasses, and subclasses can have one or more superclasses.
  • No loops: it is not allowed to model a superclass A with a subclass B, and a class C as subclass of B, when A is a subclass of C.
  • Exclusivity among classes (when members of a class A cannot also be members of class B, i.e. classes A and B are disjoint) is modeled explicitly where applicable.
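These guidelines can be sketched in Turtle. All class names below are hypothetical, and :PhysicalObject merely stands in for a CB-NL top-level class. Disjointness is expressed here with the standard owl:disjointWith construct, which is one possible choice and is not prescribed by this chapter.

```turtle
@prefix :     <http://example.org/cbnl-sketch#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:PhysicalObject a owl:Class .    # stand-in for a top-level class

# Every class has at least one superclass and reaches the top level.
:Bridge a owl:Class ;
    rdfs:subClassOf :PhysicalObject .
:RailwayStructure a owl:Class ;
    rdfs:subClassOf :PhysicalObject .

# Exclusivity: nothing is both a fixed and a movable bridge.
:FixedBridge a owl:Class ;
    rdfs:subClassOf :Bridge .
:MovableBridge a owl:Class ;
    rdfs:subClassOf :Bridge ;
    owl:disjointWith :FixedBridge .

# A subclass may have more than one superclass.
:RailwayBridge a owl:Class ;
    rdfs:subClassOf :Bridge , :RailwayStructure .

# Not allowed (loop), shown only as a comment:
#   :PhysicalObject rdfs:subClassOf :FixedBridge .
```

Note that a reasoner does not report such a loop as an inconsistency; it infers that all classes on the loop are equivalent, which is why the rule has to be checked separately.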

For the more practical guidelines and rules on creating subclasses, see Practical guidelines for modeling CB-NL content.

4.6.2.7 Collections

In CB-NL, concepts can be grouped conceptually using collections. Collections are named using a noun in plural form whenever possible. The collections are instances of the class Collection, which is part of the CB-NL Metamodel.

To indicate that a concept is member of a particular collection, the following construction is used:

cbnlcore:Landingsbaan dcterms:subject id:Luchtvaart ;

4.6.3 Design patterns

Design patterns are fixed or preferred methods for doing things. The following design patterns apply; the list may grow in time.

  1. Design pattern: natural-language constructions are always given a language code. 
  2. Design pattern: We use broadly accepted forms of RDF that have a meaning as close as possible to the intended construction, and for which reasoning (axioms) is available to enforce the correct implementation.
  3. Design pattern: For every intended construction, we apply only one corresponding form of RDF. This avoids double maintenance. 
  4. Design pattern: Whenever necessary, we add axioms which do not alter the original intention of the RDF form but strengthen reasoning and enable integrity checks. We do this also for external constructs, but accompanied by a clear explanation and in such a way that de-activation of these axioms does not alter the correctness and completeness of the ontology.
  5. Design pattern: Whenever possible, RDF forms are chosen that can be validated during editing by active reasoners. When impossible, validation is done later using SPARQL queries and reports. 

How to implement design patterns in OWL is described here.

4.7 CBNL QA Validation

How (parts of) CB-NL are qualitatively validated is described on this page.

Created by Linda van den Brink on 2014/09/18 10:17

This wiki is licensed under a Creative Commons 2.0 license