Ontology

- Information Gathering & More

Useful Links

An OWL online tutorial, XML online tutorial, RDF online tutorial

A simple presentation.

Paper Reading (Summary)

1. Difference Graphs; Harry S. Delugach and Aldo de Moor; 13th International Conference on Conceptual Structures, ICCS 2005, Kassel, Germany, July 2005. Kassel University Press, pp. 41-53. [Summary Notes]

2. An XML Schema Integration and Query Mechanism System; Sanjay Madria, Kalpdrum Passi, Sourav Bhowmick;


Ontology

A formal vocabulary that describes the basic categories of being by defining entities, types of entities, and the relationships among them. (from UMR CS311 Course Slides)

An ontology must be expressed in a formal language.
An ontology language needs to allow the specification of terms, relationships, and properties.

Elements of an Ontology:

Problems with Partonomy (classification based on part-of relation): inheritance may not hold; can lead to paradoxes


Other ways to represent knowledge

Controlled Vocabulary: classes only, no attributes, no relationships, no hierarchy
Anatomical Dictionary: classes and attributes, no hierarchy, no relationships
Taxonomy: classes, hierarchy, and relationships, no attributes


OWL
(Web Ontology Language)

To learn OWL, we need to know XML (EXtensible Markup Language) and RDF (Resource Description Framework).

XML (EXtensible Markup Language)
There is a quick and easy XML tutorial. (can finish in 1 day)

A commercial XML editor, XMLSpy (http://www.altova.com), makes editing faster, although Notepad will do.

XML Examples / Notes:
0. XML is a metamarkup language for text documents. Data is included in XML documents as strings of text. It doesn't have a fixed set of tags and elements that are supposed to work for everybody in all areas of interest. Instead, XML allows developers to define the elements they need.
XML is NOT a presentation language.
XML is NOT a programming language.
XML is NOT a network transport protocol.
XML is NOT a database.
To store XML in a database, client software will send the XML data to the server using a network protocol such as TCP/IP. Server software will receive the XML data, parse it, and store it in the database. To retrieve an XML document from a database, you'll generally pass through some middleware product like Enhydra that makes SQL queries against the database and formats the result set as XML before returning it to the client. Indeed, some databases may integrate this software code into their core server or provide plug-ins to do it such as the Oracle XSQL servlet. XML serves very well as a ubiquitous, platform-independent transport format in these scenarios.

1. Format XML with CSS. 001_cd_catalog.xml file only => combine 001_cd_catalog.css file with 001_cd_catalog_with_css.xml file. Formatting XML with CSS is NOT the future of how to style XML documents. XML documents should be styled by using the W3C's XSL standard!

2. Displaying XML with XSL (the eXtensible Stylesheet Language). 002_simple.xml file only => combine 002_simple.xsl file with 002_simplexsl.xml file.

3. XML Data Embedded in HTML. 003_cd_catalog.xml file only => use 003_cd_catalog.htm file to embed and format the XML file. (The <span> tag allows the datafld attribute to refer to the XML element to be displayed.) 003_more.htm (demonstrating <thead>, <tbody>, and <tfoot>.)

3.5. XML in real life: Using such a standard makes it easier for everyone to produce, receive, and archive any kind of information across different hardware, software, and programming languages.

4. XML Parser: To manipulate an XML document, you need an XML parser. The parser loads the document into your computer's memory. Once the document is loaded, its data can be manipulated using the DOM. The DOM treats the XML document as a tree.
Microsoft's XML parser is a COM component that comes with Internet Explorer 5 and higher.
Mozilla's XML parser supports all the necessary functions.
Parsing an XML File - Cross browser. 004_note.htm to parse 004_note.xml.
Parsing an XML String - Cross browser. 004_note2.htm.
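
The load-then-walk-the-tree idea can be sketched in Python with the standard library's xml.etree.ElementTree, used here in place of the browser DOM from the 004_note examples. The note document and the helper name child_texts are made up for illustration; this is not the content of 004_note.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a small note document.
NOTE_XML = """<note>
  <to>Alice</to>
  <from>Bob</from>
  <heading>Reminder</heading>
  <body>Meeting at noon.</body>
</note>"""

def child_texts(xml_string):
    """Parse an XML string and return {child tag: text} for the root's children."""
    root = ET.fromstring(xml_string)  # the parser builds the tree in memory
    return {child.tag: child.text for child in root}

note = child_texts(NOTE_XML)
print(note["to"], "<-", note["from"])
```

The root element is the top of the tree, and iterating over it visits its direct children in document order.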

5. Name conflicts: Since element names in XML are not predefined, a name conflict will occur when two different documents use the same element name for different things.
Solving Name Conflicts Using a Prefix and Namespaces (any value will do): 005_table1.xml and 005_table2.xml. When we start using XSL, we will soon see namespaces in real use.
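
A minimal Python sketch of how a namespace-aware parser keeps two same-named elements apart. The prefixes h and f and the example.org URIs are invented here and are not taken from 005_table1.xml or 005_table2.xml.

```python
import xml.etree.ElementTree as ET

# Two <table> elements with different meanings, told apart by namespace.
# The prefixes and namespace URIs are invented for this example.
DOC = """<root xmlns:h="http://example.org/html"
      xmlns:f="http://example.org/furniture">
  <h:table><h:td>Apples</h:td></h:table>
  <f:table><f:name>Coffee Table</f:name></f:table>
</root>"""

root = ET.fromstring(DOC)

# ElementTree replaces each prefix with its URI, written as {uri}localname.
tags = [child.tag for child in root]
print(tags)

# Lookups use a prefix-to-URI mapping; only the URI has to match the document.
ns = {"f": "http://example.org/furniture"}
furniture_name = root.find("f:table/f:name", ns).text
print(furniture_name)
```

The prefix itself carries no meaning. What distinguishes the two table elements is the URI the prefix is bound to.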

6. All text in an XML document will normally be parsed by the parser.
Only text inside a CDATA section is not parsed as markup; it is passed through as literal character data.

Escape Characters:

&lt; < less than
&gt; > greater than
&amp; & ampersand 
&apos; ' apostrophe
&quot; " quotation mark

A CDATA section starts with "<![CDATA[" and ends with "]]>":
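
A small Python sketch (standard-library xml.etree.ElementTree; the message text is invented) suggesting that escaped text and a CDATA section deliver the same characters to the application:

```python
import xml.etree.ElementTree as ET

# The same text written twice: once with escape sequences, once as CDATA.
ESCAPED = "<msg>if salary &lt; 1000 &amp;&amp; x &gt; 0</msg>"
CDATA = "<msg><![CDATA[if salary < 1000 && x > 0]]></msg>"

escaped_text = ET.fromstring(ESCAPED).text
cdata_text = ET.fromstring(CDATA).text

print(escaped_text)
print(escaped_text == cdata_text)
```

CDATA is mostly a convenience for text that contains many < or & characters, such as embedded script code.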

Beware of encoding: <?xml version="1.0" encoding="ISO-8859-1"?>

7. XML can be generated on a server without installing any XML controls. (e.g. generate XML with ASP)
XML can be generated from a database without any installed XML software. (e.g. with ASP)

8. Examples
Add a navigation script. 008_navigation.htm. Get data from 001_cd_catalog.xml.

9. With an HTTP request, a web page can make a request to, and get a response from, a web server without reloading the page. Google Suggest uses the XMLHttpRequest object to create a very dynamic web interface: when you start typing in Google's search box, a script sends the letters to a server and the server returns a list of suggestions.

10. Storing data in XML files is useful if the data is to be sent to applications on other OS platforms.

11. Some XML technologies. XML Editor.

12. XML and PHP (my interests). 2 methods of transforming XML in PHP: PEAR's XML_Transformer package and the W3C XML transformation language XSLT.

PEAR's main goal is to become a repository for PHP extensions and libraries. PEAR offers a wide variety of packages ready to use by PHP developers. One of these packages is the XML_Transformer. This package was created to help you transform existing XML files with the help of PHP code.

XSLT is a transformation language for converting XML into XML, HTML, or plain text. PHP offers XSLT functionality at its core, making it easy to incorporate transformation features into existing code.

XML_Transformer lets you map PHP functionality to specified XML tags. It offers many possibilities of mapping XML tags.

XSLT is a stylesheet language that transforms XML documents by using a "transformation specification". This specification is a set of rules that match elements. These rules describe the output of each element, based on its contents. To start using XSLT directly from PHP, you will need an XSLT file and the XML document that you wish to transform.

While PEAR::XML_Transformer gives you greater flexibility through the use of PHP, XSLT is easier to use by non-programmers. XML_Transformer's approach lets you associate an XML element's opening and closing tags with specific functions. XSLT's transformation is tightly coupled with the XML tree.

RDF (Resource Description Framework)

0. Quick and easy RDF tutorial (can be finished in 1 day). A link to a commercial RDF and OWL editor, SemanticWorks (http://www.altova.com).

1. RDF identifies things using Web identifiers (URIs), and describes resources with properties and property values. Example

2. Another example. W3C's RDF Validation Service allows us to experiment with RDF files.

The example looks like this in the RDF parser (note how the element names are appended to the namespace declared as xmlns:cd to form the predicate URIs; the links are fake):

Number Subject Predicate Object
1 http://www.recshop.fake/cd/Empire Burlesque http://www.recshop.fake/cd#artist "Bob Dylan"
2 http://www.recshop.fake/cd/Empire Burlesque http://www.recshop.fake/cd#country "USA"
3 http://www.recshop.fake/cd/Empire Burlesque http://www.recshop.fake/cd#company "Columbia"
4 http://www.recshop.fake/cd/Empire Burlesque http://www.recshop.fake/cd#price "10.90"
5 http://www.recshop.fake/cd/Empire Burlesque http://www.recshop.fake/cd#year "1985"
6 http://www.recshop.fake/cd/Hide your heart http://www.recshop.fake/cd#artist "Bonnie Tyler"
7 http://www.recshop.fake/cd/Hide your heart http://www.recshop.fake/cd#country "UK"
8 http://www.recshop.fake/cd/Hide your heart http://www.recshop.fake/cd#company "CBS Records"
9 http://www.recshop.fake/cd/Hide your heart http://www.recshop.fake/cd#price "9.90"
10 http://www.recshop.fake/cd/Hide your heart http://www.recshop.fake/cd#year "1988"

A graph (a tree, for this example) was also generated.
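
To make the table concrete, here is a rough Python sketch that derives subject/predicate/object rows from RDF/XML. The RDF/XML below is reconstructed from the first three table rows and is an assumption about what the linked example contains. The triples function is a hypothetical helper that handles only this simple shape (rdf:Description with literal-valued property elements), not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Reconstructed from the first three rows of the table; the URIs are fake.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
  </rdf:Description>
</rdf:RDF>"""

def triples(rdf_xml):
    """Yield (subject, predicate, object) for literal-valued property elements."""
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; the predicate
            # URI is the namespace with the local name appended.
            namespace, local = prop.tag[1:].split("}")
            yield subject, namespace + local, prop.text

for s, p, o in triples(RDF_XML):
    print(s, p, repr(o))
```

The predicate column comes out as the cd namespace URI with the element name appended, which is the pattern visible in the table.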

3. <rdf:RDF> is the root element of an RDF document. It defines the XML document to be an RDF document. It also contains a reference to the RDF namespace:

4. The <rdf:Description> element identifies a resource with the about attribute.

5. The elements, artist, country, company, price, and year, are defined in the http://www.recshop.fake/cd# namespace. This namespace is outside RDF (and not a part of RDF). RDF defines only the framework.

The property elements can also be defined as attributes (instead of elements): example with same parsing output.

The property elements can also be defined as resources: example with similar parsing output.

6. RDF containers are used to describe groups of things: <rdf:Bag>, <rdf:Seq>, and <rdf:Alt>.

The <rdf:Bag> element is used to describe a list of values that is intended to be unordered. example looks like this graph.

The <rdf:Seq> element is used to describe a list of values that is intended to be ordered (For example, in alphabetical order). example looks like this graph.

The <rdf:Alt> element is used to describe a list of alternative values (the user can select only one of the values). example looks like this graph.

The contained things are called members
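
A minimal Python sketch of reading container members. The CD data is hypothetical, and members is a made-up helper name. It relies on the RDF/XML convention that rdf:li elements stand for the numbered membership properties rdf:_1, rdf:_2, and so on, in document order.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical data: a CD whose artists are listed in an unordered rdf:Bag.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Beatles">
    <cd:artist>
      <rdf:Bag>
        <rdf:li>John</rdf:li>
        <rdf:li>Paul</rdf:li>
        <rdf:li>George</rdf:li>
        <rdf:li>Ringo</rdf:li>
      </rdf:Bag>
    </cd:artist>
  </rdf:Description>
</rdf:RDF>"""

def members(rdf_xml, container):
    """Return (membership property, value) pairs for the first such container."""
    root = ET.fromstring(rdf_xml)
    node = root.find(f".//{{{RDF_NS}}}{container}")
    items = node.findall(f"{{{RDF_NS}}}li")
    # rdf:li is shorthand: the n-th item corresponds to the property rdf:_n.
    return [(f"rdf:_{n}", li.text) for n, li in enumerate(items, start=1)]

for prop, value in members(RDF_XML, "Bag"):
    print(prop, value)
```

The numbering exists for Bag, Seq, and Alt alike; whether the order is meaningful depends on which container type is used.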

7. RDF collections are used to describe a group that contains ONLY the specified members. A container says that the contained resources are members; it does not say that other members are not allowed. example. graph.

8. RDF Schema (RDFS) is an extension to RDF.

RDF describes resources with classes, properties, and values.
In addition, RDF also needs a way to define application-specific classes and properties. Application-specific classes and properties must be defined using extensions to RDF.
RDF Schema provides the framework to describe application-specific classes and properties. example. graph.

Number Subject Predicate Object
1 http://www.animals.fake/animals#animal http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2000/01/rdf-schema#Class
2 http://www.animals.fake/animals#horse http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2000/01/rdf-schema#Class
3 http://www.animals.fake/animals#horse http://www.w3.org/2000/01/rdf-schema#subClassOf http://www.animals.fake/animals#animal

example abbreviated. same table, same graph.
Since an RDFS class is an RDF resource we can abbreviate the example above by using rdfs:Class instead of rdf:Description, and drop the rdf:type information:
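
The equivalence of the two forms can be checked with a small Python sketch. Both RDF/XML strings are reconstructions based on the table above, and the BASE constant stands in for the document's base URI; these are assumptions, as are the helper names uri, resolve, and triples. The code handles only the constructs used here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
BASE = "http://www.animals.fake/animals"  # assumed base URI of the document

HEADER = '<rdf:RDF xmlns:rdf="%s" xmlns:rdfs="%s">' % (RDF, RDFS)

FULL = HEADER + """
  <rdf:Description rdf:ID="animal">
    <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
  </rdf:Description>
  <rdf:Description rdf:ID="horse">
    <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdf:Description>
</rdf:RDF>"""

ABBREVIATED = HEADER + """
  <rdfs:Class rdf:ID="animal"/>
  <rdfs:Class rdf:ID="horse">
    <rdfs:subClassOf rdf:resource="#animal"/>
  </rdfs:Class>
</rdf:RDF>"""

def uri(tag):
    """Turn ElementTree's '{namespace}local' into namespace + local."""
    namespace, local = tag[1:].split("}")
    return namespace + local

def resolve(ref):
    """Resolve a '#name' reference against BASE; leave absolute URIs alone."""
    return BASE + ref if ref.startswith("#") else ref

def triples(rdf_xml):
    """Collect the triples of a document that uses only rdf:ID and rdf:resource."""
    result = set()
    for node in ET.fromstring(rdf_xml):
        subject = BASE + "#" + node.get("{%s}ID" % RDF)
        if node.tag != "{%s}Description" % RDF:
            # A typed node element such as rdfs:Class implies an rdf:type triple.
            result.add((subject, RDF + "type", uri(node.tag)))
        for prop in node:
            target = resolve(prop.get("{%s}resource" % RDF))
            result.add((subject, uri(prop.tag), target))
    return result

print(triples(FULL) == triples(ABBREVIATED))
```

Using the class name as the element name is only a shorter way of writing the rdf:type statement, so both documents produce the three rows in the table.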

9. The Dublin Core Metadata Initiative (DCMI) has created some predefined properties for describing documents.
RDF is metadata (data about data). RDF is used to describe information resources. The Dublin Core is a set of predefined properties for describing documents.
The first Dublin Core properties were defined at the Metadata Workshop in Dublin, Ohio in 1995 and are currently maintained by the Dublin Core Metadata Initiative. example. graph.

Property Definition
Contributor An entity responsible for making contributions to the content of the resource
Coverage The extent or scope of the content of the resource
Creator An entity primarily responsible for making the content of the resource
Format The physical or digital manifestation of the resource
Date A date of an event in the lifecycle of the resource
Description An account of the content of the resource
Identifier An unambiguous reference to the resource within a given context
Language A language of the intellectual content of the resource
Publisher An entity responsible for making the resource available
Relation A reference to a related resource
Rights Information about rights held in and over the resource
Source A Reference to a resource from which the present resource is derived
Subject A topic of the content of the resource
Title A name given to the resource
Type The nature or genre of the content of the resource

 

10. The RDF Namespaces
The RDF namespace (xmlns:rdf) is: http://www.w3.org/1999/02/22-rdf-syntax-ns#
The RDFS namespace (xmlns:rdfs ) is: http://www.w3.org/2000/01/rdf-schema#

11. The RDF Extension and Mime Type
The recommended extension for RDF files is *.rdf. However, the extension *.xml is often used to provide compatibility with older xml parsers.
The registered mime type should be "application/rdf+xml".

RDFS / RDF Classes

Element Class of Subclass of
rdfs:Class All classes  
     
rdfs:Datatype Data types Class
rdfs:Resource All resources (every other class is a subclass of rdfs:Resource)
     
rdfs:Container Containers Resource
rdfs:Literal Literal values (text and numbers) Resource
     
rdf:List Lists Resource
rdf:Property Properties Resource
rdf:Statement Statements Resource
     
rdf:Alt Containers of alternatives Container
rdf:Bag Unordered containers Container
rdf:Seq Ordered containers Container
     
rdfs:ContainerMembershipProperty Container membership properties Property
rdf:XMLLiteral XML literal values Literal


RDFS / RDF Properties

Element Domain Range Description
rdfs:domain Property Class The domain of the resource
rdfs:range Property Class The range of the resource
rdfs:subPropertyOf Property Property The property is a sub property of a property
       
rdfs:subClassOf Class Class The resource is a subclass of a class
rdfs:comment Resource Literal The human readable description of the resource
rdfs:label Resource Literal The human readable label (name)  of the resource
rdfs:isDefinedBy Resource Resource The definition of the resource
rdfs:seeAlso Resource Resource The additional information about the resource
rdfs:member Resource Resource The member of the resource
       
rdf:first List Resource  
rdf:rest List List  
rdf:subject Statement Resource The subject of the resource in an RDF Statement
rdf:predicate Statement Resource The predicate of the resource in an RDF Statement
rdf:object Statement Resource The object of the resource in an RDF Statement
rdf:value Resource Resource The property used for values
rdf:type Resource Class The resource is an instance of a class


RDF Attributes

Element Domain Range Description
       
rdf:about     Defines the resource being described
rdf:Description     Container for the description of a resource
rdf:resource     Defines a resource to identify a property
rdf:datatype     Defines the data type of an element
rdf:ID     Defines the ID of an element
rdf:li     Defines a list item (a member of a container)
rdf:_n     Defines the n-th member of a container
rdf:nodeID     Defines the ID of an element node
rdf:parseType     Defines how an element should be parsed
rdf:RDF     The root of an RDF document
xml:base     Defines the XML base
xml:lang     Defines the language of the element content
       
rdf:aboutEach     (removed)
rdf:aboutEachPrefix     (removed)
rdf:bagID     (removed)

 

OWL (Web Ontology Language)

OWL is a part of the "Semantic Web Vision" - a future where:

The OWL namespace: http://www.w3.org/2002/07/owl#

Website: http://www.w3.org/2004/OWL/

OWL Guide

"Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauternes."
To support this sort of computation, it is necessary to go beyond keywords and specify the meaning of the resources described on the Web. This additional layer of interpretation captures the semantics of the data.

An OWL Ontology may include descriptions of classes, properties and their instances.

OWL makes an open world assumption. That is, descriptions of resources are not confined to a single file or scope. While class C1 may be defined originally in ontology O1, it can be extended in other ontologies. The consequences of these additional propositions about C1 are monotonic. New information cannot retract previous information. New information can be contradictory, but facts and entailments can only be added, never deleted.


1. Namespaces (a group of identifiers)

A standard initial component of an ontology includes a set of XML namespace declarations enclosed in an opening rdf:RDF tag.

<rdf:RDF 
xmlns ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xmlns:vin ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xml:base ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xmlns:food="http://www.w3.org/TR/2004/REC-owl-guide-20040210/food#"
xmlns:owl ="http://www.w3.org/2002/07/owl#"
xmlns:rdf ="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd ="http://www.w3.org/2001/XMLSchema#">
The first two declarations identify the namespace associated with this ontology.
1st: xmlns = default namespace, unprefixed qualified names refer to the current ontology
2nd: xmlns:vin = the namespace of the current ontology with the prefix vin:.
3rd: xml:base = identifies the base URI for this document.
4th: xmlns:food = identifies the namespace of the supporting food ontology with the prefix food:.
5th: xmlns:owl = This is a conventional OWL declaration, used to introduce the OWL vocabulary.
OWL depends on constructs defined by RDF, RDFS, and XML Schema datatypes. The last three namespace declarations (xmlns:rdf, xmlns:rdfs, xmlns:xsd) take care of that.


2. Ontology Headers

Once namespaces are established we normally include a collection of assertions about the ontology grouped under an owl:Ontology tag.

<owl:Ontology rdf:about=""> 
<rdfs:comment>An example OWL ontology</rdfs:comment>
<owl:priorVersion rdf:resource="http://www.w3.org/TR/2003/PR-owl-guide-20031215/wine"/>
<owl:imports rdf:resource="http://www.w3.org/TR/2004/REC-owl-guide-20040210/food"/>
<rdfs:label>Wine Ontology</rdfs:label>
...


The owl:Ontology element is a place to collect much of the OWL meta-data for the document.

owl:imports provides an include-style mechanism. owl:imports takes a single argument, identified by the rdf:resource attribute.
Importing another ontology brings the entire set of assertions provided by that ontology into the current ontology.

One common set of additional tags that could reasonably be included here are some of the standard Dublin Core metadata tags.
Examples include Title, Creator, Description, Publisher, and Date (see RDF declarations).


3. Basic Elements of OWL

Most of the elements of an OWL ontology concern classes, properties, instances of classes, and relationships between these instances.

3.1 Simple Classes and Individuals

Sometimes we want to emphasize the distinction between a class as an object and a class as a set containing elements. We call the set of individuals that are members of a class the extension of the class.

3.1.1. Simple Named Classes Class, rdfs:subClassOf

Every individual in the OWL world is a member of the class owl:Thing. Thus each user-defined class is implicitly a subclass of owl:Thing. Domain specific root classes are defined by simply declaring a named class. OWL also defines the empty class, owl:Nothing.

For our sample wines domain, we create three root classes: Winery, Region, and ConsumableThing.

<owl:Class rdf:ID="Winery"/>
<owl:Class rdf:ID="Region"/>
<owl:Class rdf:ID="ConsumableThing"/>

Note that we have only said that there exist classes that have been given these names, indicated by the 'rdf:ID=' syntax.
The syntax rdf:ID="Region" is used to introduce a name as part of its definition.

Within this document, the Region class can now be referred to using #Region, e.g. rdf:resource="#Region". Other ontologies may reference this name using its complete form, "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#Region".
Another form of reference uses the syntax rdf:about="#Region" to extend the definition of a resource.

This use of the rdf:about="&ont;#x" syntax is a critical element in the creation of a distributed ontology. It permits the extension of the imported definition of x without modifying the original document and supports the incremental construction of a larger ontology.

For the first class, within this document, we can use the relative identifier, #Winery. Other documents may need to reference this class as well. The most reasonable way to do so is to provide namespace and entity definitions that include the defining document as a source:

...
<!ENTITY vin "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#" >
<!ENTITY food "http://www.w3.org/TR/2004/REC-owl-guide-20040210/food#" >
...
<rdf:RDF xmlns:vin ="http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"
xmlns:food="http://www.w3.org/TR/2004/REC-owl-guide-20040210/food#" ... >
...

Given these definitions we can refer to the winery class either using the XML tag vin:Winery or the attribute value &vin;Winery. More literally, it is always possible to reference a resource using its full URI, here http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#Winery.
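
The reference styles above all denote the same URI, which a short Python sketch can illustrate. The helper names expand_qname and resolve_fragment are made up for illustration; the namespace URIs are the ones declared above.

```python
# Prefix-to-namespace bindings, as in the xmlns declarations above.
NAMESPACES = {
    "vin": "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#",
    "food": "http://www.w3.org/TR/2004/REC-owl-guide-20040210/food#",
}

# Base URI of the wine document (its xml:base value).
BASE = "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#"

def expand_qname(qname):
    """Expand a prefixed name such as 'vin:Winery' to its full URI."""
    prefix, local = qname.split(":", 1)
    return NAMESPACES[prefix] + local

def resolve_fragment(ref, base):
    """Resolve a '#name' reference against the document's base URI."""
    # Any fragment already on the base is replaced by the new one.
    return base.split("#", 1)[0] + ref

full = "http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#Winery"
print(expand_qname("vin:Winery") == full)
print(resolve_fragment("#Winery", BASE) == full)
```

The relative form works only inside the defining document, while the prefixed and full forms work from any document that declares the namespace.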

The fundamental taxonomic constructor for classes is rdfs:subClassOf.

<owl:Class rdf:ID="PotableLiquid"> 
<rdfs:subClassOf rdf:resource="#ConsumableThing" />
...
</owl:Class>

This defines PotableLiquid (liquids suitable for drinking) to be a subclass of ConsumableThing.

A class definition has two parts: a name introduction or reference and a list of restrictions.
So far we have only seen examples that include a single restriction, forcing the new class to be a subclass of some other named class.

<owl:Class rdf:ID="Wine"> 
<rdfs:subClassOf rdf:resource="&food;PotableLiquid"/>
<rdfs:label xml:lang="en">wine</rdfs:label>
<rdfs:label xml:lang="fr">vin</rdfs:label>
...
</owl:Class>

<owl:Class rdf:ID="Pasta">
<rdfs:subClassOf rdf:resource="#EdibleThing" />
...
</owl:Class>

The rdfs:label entry provides an optional human readable name for this class. Presentation tools can make use of it. The "lang" attribute provides support for multiple languages. A label is like a comment and contributes nothing to the logical interpretation of an ontology.


3.1.2. Individuals

In addition to classes, we want to be able to describe their members. We normally think of these as individuals in our universe of things. An individual is minimally introduced by declaring it to be a member of a class.

<owl:Thing rdf:ID="CentralCoastRegion" /> 

<owl:Thing rdf:about="#CentralCoastRegion"> 
<rdf:type rdf:resource="#Region"/>
</owl:Thing>

rdf:type is an RDF property that ties an individual to a class of which it is a member.


3.1.3. Design for Use

There are important issues regarding the distinction between a class and an individual in OWL. A class is simply a name and collection of properties that describe a set of individuals. Individuals are the members of those sets. Thus classes should correspond to naturally occurring sets of things in a domain of discourse, and individuals should correspond to actual entities that can be grouped into these classes.

 

3.2. Simple Properties

Properties let us assert general facts about the members of classes and specific facts about individuals.

3.2.1. Defining Properties
ObjectProperty, DatatypeProperty, rdfs:subPropertyOf,
rdfs:domain, rdfs:range

When we define a property there are a number of ways to restrict the relation.
The domain and range can be specified.
The property can be defined to be a specialization (subproperty) of an existing property. More elaborate restrictions are possible.

<owl:ObjectProperty rdf:ID="madeFromGrape">
     <rdfs:domain rdf:resource="#Wine"/>
     <rdfs:range rdf:resource="#WineGrape"/>
</owl:ObjectProperty> 

The property madeFromGrape has a domain of Wine and a range of WineGrape.

In OWL, the domain and range of a property may be used to infer types.

<owl:Thing rdf:ID="LindemansBin65Chardonnay">    
     <madeFromGrape rdf:resource="#ChardonnayGrape" />  
</owl:Thing>

From this we can infer that LindemansBin65Chardonnay is a wine, because the domain of madeFromGrape is Wine.
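
The inference step can be sketched with plain Python dictionaries standing in for an OWL reasoner. This is an illustration only: the data structures and the inferred_types helper are assumptions, and a real reasoner does far more than this.

```python
# property -> (domain, range), mirroring the madeFromGrape declaration above
PROPERTIES = {"madeFromGrape": ("Wine", "WineGrape")}

# asserted statements as (subject, property, object)
STATEMENTS = [("LindemansBin65Chardonnay", "madeFromGrape", "ChardonnayGrape")]

def inferred_types(statements, properties):
    """Infer class membership from the domain and range of each property used."""
    types = {}
    for subject, prop, obj in statements:
        domain, range_ = properties[prop]
        # The subject of the property must belong to its domain ...
        types.setdefault(subject, set()).add(domain)
        # ... and the value must belong to its range.
        types.setdefault(obj, set()).add(range_)
    return types

for individual, classes in inferred_types(STATEMENTS, PROPERTIES).items():
    print(individual, "is a", sorted(classes))
```

Note that domain and range act as sources of new facts here, not as constraints that reject data.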


Properties, like classes, can be arranged in a hierarchy.

<owl:Class rdf:ID="WineDescriptor" />

<owl:Class rdf:ID="WineColor">
<rdfs:subClassOf rdf:resource="#WineDescriptor" />
...
</owl:Class>

<owl:ObjectProperty rdf:ID="hasWineDescriptor">
<rdfs:domain rdf:resource="#Wine" />
<rdfs:range rdf:resource="#WineDescriptor" />
</owl:ObjectProperty>

<owl:ObjectProperty rdf:ID="hasColor">
<rdfs:subPropertyOf rdf:resource="#hasWineDescriptor" />
<rdfs:range rdf:resource="#WineColor" />
...
</owl:ObjectProperty>

The rdfs:subPropertyOf relation in this case means that anything with a hasColor property with value X also has a hasWineDescriptor property with value X.
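
A rough Python sketch of this entailment, assuming each property has at most one direct superproperty (a simplification). The individual SomeWine, the value White, and the entail helper are hypothetical.

```python
# direct superproperty of each property, as declared above
SUBPROPERTY_OF = {"hasColor": "hasWineDescriptor"}

# one asserted statement; the individual and value are invented
STATEMENTS = [("SomeWine", "hasColor", "White")]

def entail(statements, subproperty_of):
    """Add (s, superproperty, o) for every (s, p, o), following chains upward."""
    result = set(statements)
    frontier = list(statements)
    while frontier:
        s, p, o = frontier.pop()
        parent = subproperty_of.get(p)
        if parent and (s, parent, o) not in result:
            result.add((s, parent, o))
            frontier.append((s, parent, o))  # the parent may have a parent too
    return result

for statement in sorted(entail(STATEMENTS, SUBPROPERTY_OF)):
    print(statement)
```

The original statement is kept and the more general one is added alongside it, in line with the monotonic behaviour described earlier.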

<owl:ObjectProperty rdf:ID="locatedIn">    
     ...    
     <rdfs:domain rdf:resource="http://www.w3.org/2002/07/owl#Thing" />    
     <rdfs:range rdf:resource="#Region" />  
</owl:ObjectProperty> 

This domain declaration permits anything to be located in a region, including regions themselves.

It is now possible to expand the definition of Wine to include the notion that a wine is made from at least one WineGrape.

<owl:Class rdf:ID="Wine"> 
<rdfs:subClassOf rdf:resource="&food;PotableLiquid"/>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="#madeFromGrape"/>
<owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
</owl:Restriction>
</rdfs:subClassOf>
...
</owl:Class>

The owl:Restriction inside rdfs:subClassOf defines an unnamed class: the set of things with at least one madeFromGrape property. We call these anonymous classes.

<owl:Class rdf:ID="Vintage"> 
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="#vintageOf"/>
<owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>

<owl:ObjectProperty rdf:ID="vintageOf">
<rdfs:domain rdf:resource="#Vintage"/>
<rdfs:range rdf:resource="#Wine"/>
</owl:ObjectProperty>

The property vintageOf ties a Vintage to a Wine.

 

3.2.2. Properties and Datatypes