The Open Biological and Biomedical Ontology Foundry

OBO Flat File Format 1.4 Syntax and Semantics [WORKING DRAFT]

Editor's Draft: May 2012

This version:
http://purl.obolibrary.org/obo/oboformat/2011-11-29/doc/obo-syntax.html
Editor's version:
https://github.com/owlcollab/oboformat/blob/gh-pages/doc/obo-syntax.html
Latest version:
http://purl.obolibrary.org/obo/oboformat/spec.html
Previous version:
http://purl.obolibrary.org/obo/oboformat/2011-08-28/doc/obo-syntax.html
Editors:
Chris Mungall, Lawrence Berkeley National Laboratory
Alan Ruttenberg, University at Buffalo
Ian Horrocks, University of Oxford
David Osumi-Sutherland, Cambridge University
Contributors:
Erick Antezana, Bayer CropScience
James Balhoff, National Evolutionary Synthesis Center
Melanie Courtot, British Columbia Cancer Research Center
Heiko Dietze, Lawrence Berkeley National Laboratory
John Day-Richter, Google
Mathew Horridge, University of Manchester
Amelia Ireland, The Jackson Laboratory
Martin Larralde, ENS Paris-Saclay
Suzanna Lewis, Lawrence Berkeley National Laboratory
Shahid Manzoor, Lawrence Berkeley National Laboratory
Syed Hamid Tirmizi, University of Texas

Abstract

This document provides a BNF specification and mapping to OWL2 of the OBO Flat File Format, version 1.4.

Status of this Document

This is a working draft, for comment by the community.

Comments should be sent to obo-format@lists.sourceforge.net (archive) to ensure wide visibility.

Most of this document can be regarded as stable. The parts that are likely to be unstable are marked red.

There is a working reference implementation available from http://oboformat.googlecode.com/.


1 Introduction

This document specifies the syntax and semantics of OBO Format (OBOF).

The first part of this specification describes the syntax and structure of OBO Format using BNF. This states how strings of characters in OBOF Files are parsed into abstract OBO documents, and describes constraints on the structure of these documents. The second part of this specification describes the semantics of an OBO document via a mapping to OWL2-DL and community metadata vocabularies.

1.1 Relationship to previous specifications and translations

This specification builds on previous attempts to specify the syntax and/or semantics of OBOF.

1.1.1 Golbreich and Horrocks translation

This document is derived from and indebted to a previous document by Ian Horrocks [OBO-OWL Horrocks], later described in an accompanying paper [OBO-OWL Golbreich]. This document provided a semantics for OBO Format in terms of OWL-DL and what was then called OWL1.1 (later to become OWL2). The syntax was underspecified and the mapping was incomplete as there was no treatment of annotation properties.

implementation: This mapping was implemented in the OWLAPI, and is consequently used by current versions of Protege4.

1.1.2 OboInOwl

The OboInOwl mapping attempted to fill in the gaps in the Horrocks translation by providing a full translation of all of OBOF 1.2, including synonyms and definitions [OBO-OWL NCBO]. This mapping requires use of the n-ary relations pattern to represent OBO metadata in its entirety. This has undesirable side-effects, such as introducing modeling classes and individuals into an ontology, affecting both reasoning and usability.

implementation: This mapping was implemented via an XSLT translation from OBO-XML. All class and ontology URIs are of the form http://purl.org/obo/owl.

1.1.3 OBO Format 1.3 mapping

OBO Format 1.3 was accompanied by a grammar and mapping to FOL [Obolog]. This mapping was based on the 2005 version of the OBO Relation Ontology and OBOF1.3/Obolog attempted to do justice to dual instance/type level relations and ternary temporally-qualified instance-level relations. However, this introduced too large an impedance mismatch between OBOF and OWL, and as a consequence OBOF1.3 and Obolog have been deprecated.

implementation: Deprecated.

1.1.4 Tirmizi et al

More recent work by Tirmizi provides an implementation of a fully roundtrippable mapping from OBO to OWL and back [OBO-OWL SWAT4LS] [OBO-OWL JBMS]. This did not attempt to provide a full grammar, and implemented the same specification as 1.1.2.

implementation: OBO-Edit 2.0 (removed from OBO-Edit-2.1)

1.1.5 OBO Format Guides

The OBO Format 1.2 guide is too loose to be regarded as a full specification. It served as a basis for (1.1.1).

The [OBO Format 1.4 guide] accompanies this document. It is intended as a guide, not a normative specification.

1.1.6 Normative mapping

This document supersedes previous efforts. The goal is to provide a complete, normative specification of both the syntax and semantics of OBOF1.4. The syntax and semantics specified here also hold for any OBOF1.2 document.

The mapping provided here is consistent with the original Golbreich/Horrocks translation (1.1.1). It improves on this translation by specifying the grammar in full, and by providing a full mapping of all OBOF constructs to OWL2, through a combined use of the Information Artefact Ontology (IAO) and the OboInOwl vocabulary.

This mapping uses a different means of translating OBOF terminological elements (synonyms, definitions, etc) to OWL2 than the one provided in 1.1.2. The n-ary relations pattern has been abandoned, and replaced by the use of OWL2 AxiomAnnotations. Many of the OboInOwl vocabulary elements have been replaced by IAO annotation properties. This mapping also uses a different URI scheme. In place of the deprecated http://purl.org/obo/owl URIs, the OBO Foundry standard http://purl.obolibrary.org/obo scheme is now used.

The semantics specified in this document are inconsistent with the abandoned OBOF1.3 specification 1.1.3, which is strongly deprecated.

This specification also introduces new constructs absent from previous versions of OBOF, and clarifies many syntactic and semantic elements that were previously unclear.

implementation: the reference implementation of this specification can be found at the oboformat site.

1.2 Additional introductory material

More material can be found at the oboformat site.

2 Preliminary Definitions

2.1 BNF Notation

OBO-Format 1.4 is defined using a standard BNF notation, which is summarized in the table below.

Table 1. The BNF Notation Used in this Document
Construct             Syntax           Example
non-terminal symbols  boldface         QuotedString
terminal symbols      single quoted    'Term'
zero or more          curly braces     { entity-frame }
zero or one           square brackets  [ ws SynonymType-ID ]
alternative           vertical bar     Class-ID | Relation-ID
grouping              parentheses      ( )
complementation       minus symbol     ( character - NewLineChar )

Because generic tag-value pairs occur in very many places in the syntax, to save space the grammar has meta-productions for basic Tag-Value Pairs (TVPs), Boolean Tags (BTs) and for the tags themselves (T):

<T>-TVP ::= '<T>:' [ws] UnquotedString
<T>-BT  ::= '<T>:' [ws] ( 'true' | 'false' )
<T>-Tag ::= '<T>:' [ws]

We provide an additional List meta-production, for comma-delimited lists:

<NT>List ::= [<NT> {',' ws <NT> }]

For example, QualifierList should be interpreted as: [ Qualifier { ',' ws Qualifier} ]

Documents in the obo-format consist of sequences of Unicode characters [UNICODE] and are encoded in UTF-8 [RFC 3629].

2.2 Characters

2.2.0 Basic Characters

Alpha-Char ::= a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
             | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
Digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

2.2.1 Spacing Characters

WhiteSpaceChar ::= ' ' | \t | U+0020 | U+0009
ws ::= WhiteSpaceChar { WhiteSpaceChar }
NewLineChar ::= \r | \n | U+000A | U+000C | U+000D
nl ::= [ws] NewLineChar
nl* ::= { nl }

2.2.2 Special Characters

We use the nonterminal UniCodeChar to represent any Unicode character.

An OBOChar is either any non-newline unicode character, or a unicode character escaped by a backslash. This can be used to encode newline characters and other special characters in different contexts.

UniCodeChar ::= any Unicode character
OBOChar ::= '\:' | '\"' | '\!' | '\' Alpha-Char | ( UniCodeChar - (NewLineChar | '\') )
NonWsChar ::= ( OBOChar - WhiteSpaceChar - NewLineChar )

2.3 Line Termination

Each tag-value clause in OBOF is line separated. The line can optionally be ended by a HiddenComment, indicated by the '!' character - this is semantically silent and can be ignored by the parser. Each clause can also have zero or more comma-separated tag-value trailing qualifiers between a '{' and a '}' - this is called a QualifierBlock.

EOL ::= {WhiteSpaceChar} [ QualifierBlock ] {WhiteSpaceChar} [ HiddenComment ] {WhiteSpaceChar} NewLineChar
HiddenComment ::= '!' { ( UniCodeChar - NewLineChar ) }
QualifierBlock ::= '{' QualifierList '}'
QualifierList ::= Qualifier {',' ws Qualifier }
Qualifier ::= Rel-ID '=' QuotedString
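The line-termination rules can be illustrated with a minimal Python sketch. This is not the reference implementation: the function name split_clause_line and the regular expression are hypothetical, and the sketch assumes that a '!' or '{' inside a quoted string or after a backslash does not start a comment or qualifier block.

```python
import re

# One qualifier of the form  Rel-ID="quoted value"  (backslash escapes allowed).
_QUALIFIER = re.compile(r'([^\s=,{}]+)="((?:[^"\\]|\\.)*)"')


def split_clause_line(line):
    """Return (body, qualifiers, comment) for one clause line without its newline."""
    in_quotes = False
    escaped = False
    brace_start = None
    comment = None
    for i, ch in enumerate(line):
        if escaped:
            escaped = False            # this character was escaped by a backslash
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue                   # ignore '{' and '!' inside quoted strings
        elif ch == "{":
            brace_start = i            # remember the last unquoted '{'
        elif ch == "!":
            comment = line[i + 1:].strip()
            line = line[:i]            # drop the hidden comment
            break
    body = line.rstrip()
    qualifiers = {}
    if brace_start is not None and body.endswith("}"):
        qualifiers = dict(_QUALIFIER.findall(body[brace_start + 1:-1]))
        body = body[:brace_start].rstrip()
    return body, qualifiers, comment
```

Applied to the relationship clause used later in section 5.2.2, the sketch separates the clause body from its two qualifiers and from the trailing "nucleus" comment.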

2.4 Clause Values

Depending on the tag, the values for clauses may be written as UnquotedStrings (in which case the entire line after the tag and before any qualifier block is used as the clause value), or as a QuotedString, in which case it is enclosed within double quotes.

QuotedString ::= DblQuote { ( OBOChar - DblQuote ) } DblQuote
UnquotedString ::= { OBOChar }
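As an illustration of how escaped OBOChars in a clause value might be decoded, here is a small Python sketch. The function name unescape is hypothetical, and the mapping of '\n' to a newline and '\t' to a tab is an assumption carried over from common OBO practice; this section only states that a backslash may precede ':', '"', '!' or an alphabetic character.

```python
# Assumed meanings for alphabetic escapes; not defined in this section.
_ESCAPES = {"n": "\n", "t": "\t"}


def unescape(value):
    """Replace backslash escapes in a clause value with the characters they stand for."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")               # character following the backslash
            out.append(_ESCAPES.get(nxt, nxt))  # unknown escapes map to themselves
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, an escaped colon or double quote is returned as the bare character, so a value such as GO\:0005634 decodes to GO:0005634.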

2.5 Identifiers

We introduce 3 rules for class IDs, relation IDs and instance IDs - these all have the same structure.

IDs fall into one of 3 categories: unprefixed, prefixed or URLs. The prefixed category is divided into two: canonical and non-canonical prefixed IDs. Whilst canonical prefixed IDs are preferred, many OBO documents make use of non-canonical prefixed IDs or unprefixed IDs in a variety of contexts - relation, subsetdef and synonymtypedef IDs have traditionally been unprefixed; xrefs frequently do not conform to the canonical prefixed ID pattern.

Class-ID ::= ID
Rel-ID ::= ID
Instance-ID ::= ID
Person-ID ::= ID
Subset-ID ::= ID
SynonymType-ID ::= ID - ( '[' { UniCodeChar } )
ID ::= Prefixed-ID | Unprefixed-ID | URL-as-ID
URL-as-ID ::= ( 'http:' | 'https:' ) { NonWsChar }
Unprefixed-ID ::= { ( NonWsChar - ':' ) }
Prefixed-ID ::= Canonical-Prefixed-ID | NonCanonical-Prefixed-ID

Canonical-IDPrefix ::= Alpha-Char { ( '_' | Alpha-Char ) }
Canonical-LocalID ::= { Digit }
Canonical-Prefixed-ID ::= Canonical-IDPrefix ':' Canonical-LocalID
NonCanonical-Prefixed-ID ::= ( Any-IDPrefix ':' Any-LocalID ) - Canonical-Prefixed-ID
Any-IDPrefix ::= { ( NonWsChar - ':' ) }
Any-LocalID ::= { NonWsChar }
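The ID categories can be sketched in Python as follows. The function name classify_id and the category labels are hypothetical. The sketch assumes that the URL test takes precedence over the prefixed-ID test (the grammar does not state an order, and an 'http:' ID would otherwise also match the non-canonical prefixed pattern), and it ignores escaped colons for brevity.

```python
import re

# Canonical-Prefixed-ID: alphabetic/underscore prefix, ':', then digits only.
_CANONICAL = re.compile(r"[A-Za-z][A-Za-z_]*:[0-9]*")
# URL-as-ID: 'http:' or 'https:' followed by non-whitespace characters.
_URL = re.compile(r"https?:\S*")


def classify_id(identifier):
    """Return the ID category of an identifier that contains no whitespace."""
    if any(ch.isspace() for ch in identifier):
        raise ValueError("IDs may not contain whitespace")
    if _URL.fullmatch(identifier):
        return "url"
    if ":" not in identifier:
        return "unprefixed"
    if _CANONICAL.fullmatch(identifier):
        return "canonical-prefixed"
    return "non-canonical-prefixed"
```

For example, GO:0005634 is classified as canonical, part_of as unprefixed, and an xref such as GOC:mah as non-canonical because its local part is not numeric.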

2.6 Xref Lists

An XrefList is a comma-separated list of zero or more Xrefs inside square brackets (these are mandatory). Each Xref is a string of characters, optionally followed by a quoted description. Note that commas and square brackets in the xref need to be escaped. We include an extra production rule for when solitary Xrefs are provided (i.e. with the 'xref' tag) - in which case escaping is not needed.

Xref ::= ID [ ws QuotedString ]
XrefList ::= '[' [ XrefListItem { ',' ws XrefListItem } ] [ ws ] ']'
XrefListItem ::= XrefChar { XrefChar } [ ws QuotedString ]
XrefChar ::= (NonWsChar - ',' - ']')

3 OBO Grammar

3.1 OBO Document Structure

An OBO document consists of a header frame followed by zero or more entity frames. Each frame has a set of clauses, or <tag,value> pairs. A tag is a token which may be drawn from the set of defined OBOF tags. The value can be atomic or multi-valued. In the OBO abstract syntax a clause is written <Tag>(<V1>...<Vn>).

Each frame should by convention be separated by an empty line, but this is not enforced in the syntax.

Note that the term "frame" replaces the previously used "stanza" in order to be more consistent with Manchester Syntax.

OBO-Doc ::= header-frame { entity-frame } nl*
entity-frame ::= term-frame | typedef-frame | instance-frame

3.2 OBO Headers

header-frame ::= { header-clause nl* }
header-clause ::= format-version-TVP
  | data-version-TVP
  | date-Tag DD:MM:YYYY ws hh:mm
  | saved-by-TVP
  | auto-generated-by-TVP
  | import-Tag IRI | filepath
  | subsetdef-Tag Subset-ID ws QuotedString
  | synonymtypedef-Tag SynonymType-ID ws QuotedString [ SynonymScope ]
  | default-namespace-Tag OBONamespace
  | idspace-Tag IDPrefix ws IRI [ ws QuotedString ]
  | treat-xrefs-as-equivalent-Tag IDPrefix
  | treat-xrefs-as-genus-differentia-Tag IDPrefix ws Rel-ID ws Class-ID
  | treat-xrefs-as-reverse-genus-differentia-Tag IDPrefix ws Rel-ID ws Class-ID
  | treat-xrefs-as-relationship-Tag IDPrefix Rel-ID
  | treat-xrefs-as-is_a-Tag IDPrefix
  | treat-xrefs-as-has-subclass-Tag IDPrefix
  | property_value-Tag Relation-ID ( QuotedString XSD-Type | ID ) {WhiteSpaceChar} [ QualifierBlock ] {WhiteSpaceChar} [ HiddenComment ]
  | remark-TVP
  | ontology-TVP
  | owl-axioms-TVP
  | UnreservedToken ':' [ ws ] UnquotedString

3.3 Term Frames

Term frames introduce and define the meaning of terms (AKA classes).

term-frame ::= nl* '[Term]' nl id-Tag Class-ID EOL { term-frame-clause EOL }
term-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] XrefList
  | xref-Tag Xref
  | builtin-BT
  | property_value-Tag Relation-ID ws ( QuotedString ws XSD-Type | ID )
  | is_a-Tag Class-ID
  | intersection_of-Tag Class-ID
  | intersection_of-Tag Relation-ID ws Class-ID
  | union_of-Tag Class-ID
  | equivalent_to-Tag Class-ID
  | disjoint_from-Tag Class-ID
  | relationship-Tag Relation-ID ws Class-ID
  | is_obsolete-BT
  | replaced_by-Tag Class-ID
  | consider-Tag ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime

3.4 Typedef Frames

Typedef frames introduce and define the meaning of relations (AKA properties).

typedef-frame ::= [ nl ] '[Typedef]' nl id-Tag Relation-ID EOL { typedef-frame-clause EOL }
typedef-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] ws XrefList
  | xref-Tag Xref
  | property_value-Tag Relation-ID ws ( QuotedString ws XSD-Type | ID )
  | domain-Tag Class-ID
  | range-Tag Class-ID
  | builtin-BT
  | holds_over_chain-Tag Relation-ID ws Relation-ID
  | is_anti_symmetric-BT
  | is_cyclic-BT
  | is_reflexive-BT
  | is_symmetric-BT
  | is_transitive-BT
  | is_functional-BT
  | is_inverse_functional-BT
  | is_a-Tag Rel-ID
  | intersection_of-Tag Rel-ID
  | union_of-Tag Rel-ID
  | equivalent_to-Tag Rel-ID
  | disjoint_from-Tag Rel-ID
  | inverse_of-Tag Rel-ID
  | transitive_over-Tag Relation-ID
  | equivalent_to_chain-Tag Relation-ID ws Relation-ID
  | disjoint_over-Tag Rel-ID
  | relationship-Tag Rel-ID Rel-ID
  | is_obsolete-BT
  | replaced_by-Tag Rel-ID
  | consider-Tag ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
  | expand_assertion_to-Tag QuotedString ws XrefList
  | expand_expression_to-Tag QuotedString ws XrefList
  | is_metadata_tag-BT
  | is_class_level-BT

3.5 Instance Frames

Instance frames introduce and define the meaning of instances (AKA individuals).

instance-frame ::= [ nl ] '[Instance]' nl id-Tag Instance-ID EOL { instance-frame-clause EOL }
instance-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] ws XrefList
  | xref-Tag Xref
  | property_value-Tag Relation-ID ID
  | instance_of-Tag Class-ID
  | PropertyValueTagValue
  | relationship-Tag Rel-ID ws ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
  | is_obsolete-BT
  | replaced_by-Tag Instance-ID
  | consider-Tag ID

3.6 Synonym Scopes

SynonymScope ::= 'EXACT' | 'BROAD' | 'NARROW' | 'RELATED'

4 OBO Document Structure

This section defines structural constraints on an OBO Document. These structural constraints hold on an abstract OBO Document - the result of parsing a physical OBO document file.

4.1 Frame Identifiers

Each frame in an abstract OBO document can be accessed by its identifier - the value of the id tag. A frame can only be of one type. A document containing two frames of different types with the same id is illegal.

4.1.1 Merging frames during parsing

Although it is strongly recommended that an OBOF document does not contain two frames with the same identifier, this is syntactically valid. If two frames have the same identifier, then they are combined. Only frames of the same type can be combined - if a document uses the same ID for two frames of different types, the document is structurally invalid.

When two frames F1 and F2 are combined, the new frame F3 has a set of tag-values consisting of the union of the set of tag-values from F1 and the set of tag-values from F2. If two tag-values have identical tags and identical values, they are considered a single tag-value.
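The merging rule can be sketched in Python as below. The representation of a frame as a dictionary with "type", "id" and "clauses" keys, and the function name merge_frames, are assumptions made for illustration only.

```python
def merge_frames(frames):
    """Combine frames that share an id; clauses are (tag, value) tuples."""
    merged = {}
    for frame in frames:
        existing = merged.get(frame["id"])
        if existing is None:
            merged[frame["id"]] = {
                "type": frame["type"],
                "id": frame["id"],
                # dict.fromkeys drops duplicate clauses while keeping their order
                "clauses": list(dict.fromkeys(frame["clauses"])),
            }
            continue
        if existing["type"] != frame["type"]:
            # same id used for frames of different types: structurally invalid
            raise ValueError(f"id {frame['id']} used for two frame types")
        for clause in frame["clauses"]:
            if clause not in existing["clauses"]:
                existing["clauses"].append(clause)
    return list(merged.values())
```

Two Term frames with the same id therefore collapse into one frame holding the union of their clauses, while a Term frame and a Typedef frame sharing an id cause an error.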

4.2 Typedef constraints

If a Typedef frame has a clause is_metadata_tag(true), then that Typedef id may never be used as a Rel-ID in an intersection_of clause.

4.3 OBO namespaces and ontology name

Note that OBO namespaces are not the same as OWL namespaces - the analog of OWL namespaces are OBO ID spaces. OBO namespaces are semantics-free properties of a frame that allow partitioning of an ontology into sub-ontologies. For example, the GO is partitioned into 3 ontologies (3 OBO namespaces, 1 OWL namespace).

Every frame must have exactly one namespace. However, these do not need to be explicitly assigned. After parsing an OBO Document, any frame without a namespace is assigned the default-namespace, from the OBO Document header. If this is not specified, the Parser assigns a namespace arbitrarily. It is recommended this is equivalent to the URL or file path from which the document was retrieved.

Every OBODoc should have an "ontology" tag specified in the header. If this is not specified, then the parser should supply a default value. This value should be derived from the URL of the source of the ontology (typically using http or file schemes).

4.4 Processing Header Macros

4.4.1 Implicit Header Macros

Every OBO-Doc is automatically populated with two header clauses: It is illegal to override these. These declarations are taken as implicit, and an OBO format generator need not export these.

4.4.2 Header Macro Translation

The following table shows how header macros are used to insert new clauses into the ontology document. If the pre-condition is satisfied (middle column) and the header contains the specified macro (first column), then clauses are added to the ontology such that the post-condition is satisfied (right hand column). Note that the original xref tags are not removed from the source ontology.

These clauses are expanded into a fresh ontology, which can be optionally imported.

In addition, any Typedef frames for relations used in a header macro are also copied into the corresponding bridge ontology.

Header Macros
Header Precondition Add clauses
treat-xrefs-as-equivalent(Prefix) xref(Prefix:B) ∈ Frame[A] equivalent_to(Prefix:B) ∈ Frame[A]
treat-xrefs-as-is_a(Prefix) xref(Prefix:B) ∈ Frame[A] is_a(Prefix:B) ∈ Frame[A]
treat-xrefs-as-has-subclass(Prefix) xref(Prefix:B) ∈ Frame[A] is_a(Prefix:A) ∈ Frame[B]
treat-xrefs-as-genus-differentia(Prefix Rel-ID Class-ID) xref(Prefix:B) ∈ Frame[A] and intersection_of(...) ∉ Frame[A] intersection_of(X:B) ∈ Frame[A] and intersection_of(Rel-ID Class-ID) ∈ Frame[A]
treat-xrefs-as-reverse-genus-differentia(Prefix Rel-ID Class-ID) xref(Prefix:B) ∈ Frame[A] and intersection_of(...) ∉ Frame[B] intersection_of(X:A) ∈ Frame[B] and intersection_of(Rel-ID Class-ID) ∈ Frame[B]
treat-xrefs-as-relationship(Prefix Rel-ID) xref(Prefix:B) ∈ Frame[A] relationship(Rel-ID X:B) ∈ Frame[A]
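A minimal Python sketch of three of the macros in the table (equivalent, is_a and relationship) is given here. The data layout (a dictionary from frame id to a list of (tag, value) clauses), the tuple form of a macro, and the function name expand_xref_macros are all hypothetical, and xref values are assumed to be bare IDs without a quoted description. The new clauses are returned separately rather than added in place, reflecting the statement that they are expanded into a fresh ontology.

```python
def expand_xref_macros(macros, frames):
    """Return {frame_id: [new clauses]} produced by the header macros.

    macros: tuples such as ("treat-xrefs-as-equivalent", "CL") or
            ("treat-xrefs-as-relationship", "TAO", "homologous_to").
    frames: dict mapping frame id -> list of (tag, value) clauses.
    """
    added = {}
    for macro, prefix, *args in macros:
        for frame_id, clauses in frames.items():
            for tag, value in clauses:
                # precondition: xref(Prefix:B) is in Frame[A]
                if tag != "xref" or not value.startswith(prefix + ":"):
                    continue
                if macro == "treat-xrefs-as-equivalent":
                    new = ("equivalent_to", value)
                elif macro == "treat-xrefs-as-is_a":
                    new = ("is_a", value)
                elif macro == "treat-xrefs-as-relationship":
                    new = ("relationship", f"{args[0]} {value}")
                else:
                    continue  # other macros are not covered by this sketch
                added.setdefault(frame_id, []).append(new)
    return added
```

The source frames are left untouched, so the original xref clauses remain in place as the text above requires.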

4.5 Tag Cardinality Constraints

Note that the id tag is not specified in this table. Each frame has exactly one id tag, and the id tag uniquely identifies the frame. These constraints must hold after frames with identical ids have been combined (see above).

Any tag not mentioned has free cardinality (zero, one or many).

Cardinality Constraints
Tag                     Frame             Cardinality (Normative)   Informative Notes
ontology                Header            zero or one               recommended for all docs
format-version          Header            zero or one               -
date                    Header            zero or one               -
default-namespace       Header            zero or one               -
saved-by                Header            zero or one               -
auto-generated-by       Header            zero or one               -
is_anonymous            *                 zero or one               -
name                    *                 zero or one               names are recommended for all frames
namespace               *                 one                       -
def                     *                 zero or one               -
comment                 *                 zero or one               -
domain                  Typedef           zero or one               use union constructs in place of multiple values
range                   Typedef           zero or one               use union constructs in place of multiple values
is_anti_symmetric       *                 zero or one               -
is_cyclic               *                 zero or one               -
is_reflexive            *                 zero or one               -
is_symmetric            *                 zero or one               -
is_transitive           *                 zero or one               -
is_functional           *                 zero or one               -
is_inverse_functional   *                 zero or one               -
is_obsolete             *                 zero or one               -
instance_of             Instance          zero or one               use intersection constructs in place of multiple values
intersection_of         Term or Typedef   zero or (two or more)     -
union_of                Term or Typedef   zero or (two or more)     -
created_by              *                 zero or one               -
creation_date           *                 zero or one               -
is_metadata_tag         Typedef           zero or one               -
is_class_level          Typedef           zero or one               -
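A check of these constraints might look like the following Python sketch. Only a subset of the table is encoded, the names AT_MOST_ONE, ZERO_OR_MANY and cardinality_violations are hypothetical, and the check is assumed to run on a frame after frames with identical ids have been combined.

```python
from collections import Counter

# Tags limited to "zero or one" (a subset of the table, for brevity).
AT_MOST_ONE = {"name", "def", "comment", "is_obsolete", "created_by", "creation_date"}
# Tags that must occur zero times or at least twice.
ZERO_OR_MANY = {"intersection_of", "union_of"}


def cardinality_violations(clauses):
    """Return (tag, count, expected) tuples for tags that break the constraints."""
    counts = Counter(tag for tag, _value in clauses)
    violations = []
    for tag in sorted(counts):
        n = counts[tag]
        if tag in AT_MOST_ONE and n > 1:
            violations.append((tag, n, "zero or one"))
        elif tag in ZERO_OR_MANY and n == 1:
            violations.append((tag, n, "zero or (two or more)"))
    return violations
```

Tags absent from both sets, such as is_a, are treated as having free cardinality.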

5 OBO Semantics : Mapping to OWL2-DL

On completion, this section will define the semantics of the entirety of OBO via mappings to OWL2. The mappings could also be used to specify a translation procedure and/or an interface to OWL tools (such as OWL reasoners).

The translation is defined using a translation function T which translates (a fragment of) OBO into OWL DL. The definition of T is often recursive, but it will eventually "ground out" in (a fragment of) OWL DL.

5.0 Ontologies

An OBO-Doc translates to an OWL Ontology. The Ontology IRI is generated by extracting the value of the "ontology" tag in the header frame.

5.0.1 Ontology Documents

Mapping of Ontology Document
OBO Translation - T(S)
OBO-Doc(Header1...HeaderN Frame1..FrameM) Ontology(TOntID(Header[@ontology],Header[@data-version]) TH(Header1)...TH(HeaderN) TF(Frame1)...TF(FrameM) )

5.0.2 Ontology IRIs

An OBO ontology ID can be either an entire IRI or an abbreviated form such as 'go'. The abbreviated form is translated by prefixing the standard obolibrary PURL scheme and appending ".owl".

TOntID(IRI) = IRI

TOntID(abbreviated-ID) = 'http://purl.obolibrary.org/obo/' abbreviated-ID '.owl'

An ontology ID is in abbreviated form if it contains only alphanumeric characters, plus '_', '-' and '.'.
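The TOntID rule is small enough to sketch directly in Python. The function name ontology_iri is hypothetical, and "alphanumeric" is assumed here to mean ASCII letters and digits.

```python
import re

# Abbreviated form: only alphanumerics plus '_', '-' and '.'.
_ABBREVIATED = re.compile(r"[A-Za-z0-9_.\-]+")


def ontology_iri(ontology_id):
    """Translate the value of the 'ontology' header tag to an ontology IRI."""
    if _ABBREVIATED.fullmatch(ontology_id):
        return "http://purl.obolibrary.org/obo/" + ontology_id + ".owl"
    return ontology_id  # already a full IRI
```

Under this sketch 'go' becomes http://purl.obolibrary.org/obo/go.owl, while a value containing ':' or '/' is passed through unchanged.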

5.0.3 Ontology Headers

Mapping of Ontology Headers
OBO Translation - TH(S)
data-version(String) -
import(IRI) Import(IRI)
remark(String) Annotation(T(remark) T(String))
subsetdef(ID String) AnnotationProperty(T(ID)) SubAnnotationPropertyOf( T(ID) T(subsetdef) ) AnnotationAssertion(T(name) T(ID) ID) AnnotationAssertion(T(comment) T(ID) String)
synonymtypedef(ID String [Scope]) AnnotationProperty(T(ID)) SubAnnotationPropertyOf( T(ID) T(synonymtypedef) ) AnnotationAssertion(T(name) T(ID) T(String)) [AnnotationAssertion(T(hasScope) T(ID) T(Scope))]
<Tag>(String Qualifiers) AnnotationAssertion(T(Qualifiers) T(<Tag>) T(ID) T(String))

5.0.4 Untranslatable OWL axioms

An ontology can contain any number of arbitrary OWL axioms encoded in OWL functional syntax in the ontology header using the owl-axioms tag. The values of all owl-axioms clauses are concatenated together (with newline characters appended) in order of appearance in the document and interpreted as if they were enclosed within an "Ontology( ... )" production in functional syntax. This allows any OWL ontology to be roundtripped through OBO-format, albeit in such a way that expressive OWL axioms are opaque to many OBO-format applications.

Prefixes

The following prefixes are assumed to be pre-declared - other prefixes cannot be declared and thus full IRIs must be used.

Prefix: xsd: <http://www.w3.org/2001/XMLSchema#>
Prefix: owl: <http://www.w3.org/2002/07/owl#>
Prefix: : <http://purl.obolibrary.org/obo/>
Prefix: oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
Prefix: xml: <http://www.w3.org/XML/1998/namespace>
Prefix: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
Prefix: dc: <http://purl.org/dc/elements/1.1/>
Prefix: rdfs: <http://www.w3.org/2000/01/rdf-schema#>

It is recommended that writers use this set of prefixes to write CURIEs (shortened URIs), but this is not required.

Example

format-version: 1.2
ontology: go
date: 17:04:2012 15:38
...
remark: this is an example of what a header in the GO might look like
remark: We have two axioms, the first declares that nothing is part of
remark: both a nucleus and a cytoplasm, the second declares that
remark: nothing is part of both a cytoplasm and a plasma membrane.
remark: (Note we could have chosen to use a single axiom with 3
remark: arguments, which is better as it also gives us the spatial
remark: disjointness between nucleus and PM)
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634))
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))

[Term]
...

Behavior for OBO-writers (normative)

When translating an OWL ontology to an OBO-document, any axiom that is not translated according to the standard obo2owl rules must be added as an owl-axioms clause.

Recommendations for OBO-writers (informative)

Note that individual axioms may be broken over multiple lines; alternatively, multiple axioms may be concatenated as one line. It is recommended that writers write each axiom on its own line/clause.
Recommended
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634))
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))
Allowed
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737)
owl-axioms: ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634))
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737)
owl-axioms: ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))
Allowed
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634)) DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))

It is strongly recommended that writers only generate owl-axioms clauses for axioms that fall outside of the OBO-subset of OWL2-DL. Remember that most applications which consume obo-format will ignore the owl-axioms headers.
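The concatenation rule for owl-axioms clauses described in 5.0.4 can be sketched in Python as follows. The function name owl_axioms_document and the (tag, value) representation of header clauses are assumptions, and the pre-declared prefixes are left out of the output for brevity.

```python
def owl_axioms_document(header_clauses):
    """Join all owl-axioms values, in document order, inside Ontology( ... )."""
    values = [value for tag, value in header_clauses if tag == "owl-axioms"]
    # each value is followed by a newline, as stated in the concatenation rule
    body = "".join(value + "\n" for value in values)
    return "Ontology(\n" + body + ")"
```

Because values are simply concatenated, an axiom split over two owl-axioms clauses and the same axiom written on one line yield the same functional-syntax content up to whitespace.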

5.1 Declarations

Mapping of frames/stanzas to OWL
OBO Conditions Translation - TF(S)
Term(Class-id Clause-1..Clause-n) Class(T(Class-id)) TC(Clause-1)..TC(Clause-n)
Typedef(Rel-ID) is_metadata_tag(true) ∈ TypedefFrames[Rel-ID] AnnotationProperty(T(Rel-ID)) TP(Clause-1)..TP(Clause-n)
Typedef(Rel-ID) is_metadata_tag(true) ∉ TypedefFrames[Rel-ID] ObjectProperty(T(Rel-ID)) TP(Clause-1)..TP(Clause-n)
Instance(Instance-id) NamedIndividual(T(Instance-id)) TI(Clause-1)..TI(Clause-n)

5.2 Mapping Term Frames - Class Axioms

5.2.1 Class Axioms, Basic Table

Term frames are mapped to Class declarations and Class-level axioms. We list the translations that are specific to Term frames here - generic translations making annotation assertions are applied later on.

Translation of Term frames to OWL
Term frame with id:Class-ID Translation - TC(S)
is_a(SubClass-ID Qualifiers) SubClassOf(T(Qualifiers) T(Class-ID) T(SubClass-ID))
relationship(Rel-ID TargetClass-ID Qualifiers)
on condition: is_metadata_tag(true) ∈ TypedefFrames[Rel-ID]
AnnotationAssertion(T(Qualifiers) T(Rel-ID) T(Class-ID) T(TargetClass-ID) )
relationship(Rel-ID TargetClass-ID Qualifiers)
on condition: is_metadata_tag(true) ∉ TypedefFrames[Rel-ID]
SubClassOf(T(Qualifiers) T(Class-ID) T(rel(Rel-ID TargetClass-ID Qualifiers)) )
intersection_of(GenusClass1 Qualifiers1) ... intersection_of(GenusClassN QualifiersN) ... intersection_of(Rel-ID1 TargetClass-ID1 QualifiersN+1) ... intersection_of(Rel-IDM TargetClass-IDM QualifiersN+M) EquivalentClasses(T(Qualifiers1 .. QualifiersN+M) T(Class-ID) ObjectIntersectionOf( T(GenusClass1) ... T(GenusClassN) T(rel(Rel-ID1 TargetClass-ID1 Qualifiers)) ... T(rel(Rel-IDM TargetClass-IDM Qualifiers))))
union_of(UnionClass1 Qualifiers1) ... union_of(UnionClassN QualifiersN) EquivalentClasses(T(Qualifiers1 .. QualifiersN) T(Class-ID) ObjectUnionOf( T(UnionClass1) ... T(UnionClassN) ))
disjoint_from(TargetClass-ID Qualifiers) DisjointClasses(T(Qualifiers) T(Class-ID) T(TargetClass-ID))
equivalent_to(TargetClass-ID Qualifiers) EquivalentClasses(T(Qualifiers) T(Class-ID) T(TargetClass-ID))

5.2.2 Treatment of gci_relation qualifier

If Qualifiers contains the tags gci_relation and gci_filler then T(Class-ID) in 5.2.1 is translated as a class expression:

ObjectIntersectionOf(T(Class-ID) ObjectSomeValuesFrom(T(GCI-Relation) T(GCI-Filler)))

This effectively allows a limited form of General Class Inclusion (GCI) axioms in OBO Documents. For example, in:

id: CL:0000232
name: erythrocyte
relationship: has_part GO:0005634 {gci_relation="part_of", gci_filler="NCBITaxon:7955"} ! nucleus

the relationship is translated as:

SubClassOf( ObjectIntersectionOf( ObjectSomeValuesFrom(T('part_of') T('NCBITaxon:7955')) T('CL:0000232')) ObjectSomeValuesFrom(T('has_part') T('GO:0005634')) )

The resulting axiom would be (in Manchester-like syntax, informative):

(erythrocyte and part_of some 'Danio rerio') SubClassOf has_part some nucleus
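The effect of the gci_relation and gci_filler qualifiers on a relationship clause can be sketched in Python as below. The function name translate_relationship is hypothetical; IDs are emitted as given rather than passed through the translation function T, and only the default ObjectSomeValuesFrom form of the relationship is handled.

```python
def translate_relationship(class_id, rel_id, target_id, qualifiers):
    """Build a SubClassOf axiom string, applying the GCI qualifiers if both are present."""
    subject = class_id
    if "gci_relation" in qualifiers and "gci_filler" in qualifiers:
        # the subject becomes a class expression, turning the axiom into a GCI
        subject = (
            f"ObjectIntersectionOf({class_id} "
            f"ObjectSomeValuesFrom({qualifiers['gci_relation']} {qualifiers['gci_filler']}))"
        )
    return f"SubClassOf({subject} ObjectSomeValuesFrom({rel_id} {target_id}))"
```

Without the two qualifiers the same function produces the ordinary SubClassOf axiom with the named class as subject.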

5.3 Class Expressions

The relationship and intersection_of tags can take pairs of values; these pairs map to OWL Class Expressions according to the following table.

Mappings are listed in order of precedence - an expression cannot match more than one production rule, only the first match counts.

Translation of class expressions to OWL
Expression Condition Translation - T(S)
rel(Rel-ID Class-ID Qualifiers) cardinality(X) ∈ Qualifiers ObjectExactCardinality(X T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers) cardinality(0) ∈ Qualifiers or maxCardinality(0) ∈ Qualifiers ObjectAllValuesFrom(T(Rel-ID) ObjectComplementOf(T(Class-ID)))
rel(Rel-ID Class-ID Qualifiers) minCardinality(A) ∈ Qualifiers and maxCardinality(B) ∈ Qualifiers ObjectIntersectionOf( ObjectMinCardinality(A T(Rel-ID) T(Class-ID)) ObjectMaxCardinality(B T(Rel-ID) T(Class-ID)))
rel(Rel-ID Class-ID Qualifiers) minCardinality(A) ∈ Qualifiers ObjectMinCardinality(A T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers) maxCardinality(A) ∈ Qualifiers ObjectMaxCardinality(A T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers) all_only(true) ∈ Qualifiers and all_some(true) ∈ Qualifiers ObjectIntersectionOf( ObjectSomeValuesFrom(T(Rel-ID) T(Class-ID)) ObjectAllValuesFrom(T(Rel-ID) T(Class-ID)) )
rel(Rel-ID Class-ID Qualifiers) all_only(true) ∈ Qualifiers ObjectAllValuesFrom(T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers) is_class_level(true) ∈ TypedefFrames[Rel-ID] ObjectHasValue(T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers) ObjectSomeValuesFrom(T(Rel-ID) T(Class-ID))
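The first-match precedence of the table can be sketched in Python as follows. The function name translate_rel is hypothetical, qualifiers are assumed to be a dictionary of strings, and IDs are emitted as given rather than passed through T. The rows are tested in table order, so under a literal reading a cardinality qualifier of any value, including 0, is caught by the first row.

```python
def translate_rel(rel, cls, qualifiers, is_class_level=False):
    """Translate rel(Rel-ID Class-ID Qualifiers) to an OWL class expression string."""
    q = qualifiers
    some = f"ObjectSomeValuesFrom({rel} {cls})"
    only = f"ObjectAllValuesFrom({rel} {cls})"
    if "cardinality" in q:
        return f"ObjectExactCardinality({q['cardinality']} {rel} {cls})"
    if q.get("maxCardinality") == "0":
        return f"ObjectAllValuesFrom({rel} ObjectComplementOf({cls}))"
    if "minCardinality" in q and "maxCardinality" in q:
        return (
            f"ObjectIntersectionOf(ObjectMinCardinality({q['minCardinality']} {rel} {cls}) "
            f"ObjectMaxCardinality({q['maxCardinality']} {rel} {cls}))"
        )
    if "minCardinality" in q:
        return f"ObjectMinCardinality({q['minCardinality']} {rel} {cls})"
    if "maxCardinality" in q:
        return f"ObjectMaxCardinality({q['maxCardinality']} {rel} {cls})"
    if q.get("all_only") == "true" and q.get("all_some") == "true":
        return f"ObjectIntersectionOf({some} {only})"
    if q.get("all_only") == "true":
        return only
    if is_class_level:  # is_class_level(true) in the Typedef frame of rel
        return f"ObjectHasValue({rel} {cls})"
    return some  # default: existential restriction
```

A relationship with no qualifiers therefore falls through every test and is translated as an existential restriction.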

5.4 Property Axioms

Note that OWL2 does not have the ability to declare EquivalentProperties between a named object property and either an intersection or a union. We translate these OBO constructs to a weaker axiom and add an AnnotationAssertion - we will later provide an appendix for handling these in FOL. This is also true for equivalent_to_chain.

Translation of Typedef frames to OWL
Typedef frame with id:Rel-ID Translation - TP(S)
is_a(SubRel-ID Qualifiers) SubObjectPropertyOf(T(Qualifiers) T(Rel-ID) T(SubRel-ID))
domain(TargetClass-ID Qualifiers) ObjectPropertyDomain(T(Qualifiers) T(Rel-ID) T(TargetClass-ID))
range(TargetClass-ID Qualifiers) ObjectPropertyRange(T(Qualifiers) T(Rel-ID) T(TargetClass-ID))
disjoint_from(TargetRel-ID Qualifiers) DisjointObjectProperties(T(Qualifiers) T(Rel-ID) T(TargetRel-ID))
inverse_of(TargetRel-ID Qualifiers) InverseProperties(T(Qualifiers) T(Rel-ID) T(TargetRel-ID))
equivalent_to(TargetRel-ID Qualifiers) EquivalentObjectProperties(T(Qualifiers) T(Rel-ID) T(TargetRel-ID))
relationship(MetaRel-ID TargetRel-ID Qualifiers) AnnotationAssertion(T(Qualifiers) T(MetaRel-ID) T(Rel-ID) T(TargetRel-ID) )
intersection_of(TargetRel-ID Qualifiers) SubObjectPropertyOf(T(Qualifiers) T(Rel-ID) T(TargetRel-ID)) AnnotationAssertion(T(Qualifiers) T(relation_intersection_of) T(Rel-ID) T(TargetRel-ID))
union_of(TargetRel-ID Qualifiers) SubObjectPropertyOf(T(Qualifiers) T(TargetRel-ID) T(Rel-ID)) AnnotationAssertion(T(Qualifiers) T(union_intersection_of) T(Rel-ID) T(TargetRel-ID))
transitive_over(Rel2-ID Qualifiers) SubObjectPropertyOf(T(Qualifiers) ObjectPropertyChain( T(Rel-ID) T(Rel2-ID) ) T(Rel-ID) )
holds_over_chain(Rel1-ID Rel2-ID Qualifiers) SubObjectPropertyOf(T(Qualifiers) ObjectPropertyChain( T(Rel1-ID) T(Rel2-ID) ) T(Rel-ID) )
equivalent_to_chain(Rel1-ID Rel2-ID Qualifiers) SubObjectPropertyOf(T(Qualifiers) ObjectPropertyChain( T(Rel1-ID) T(Rel2-ID) ) T(Rel-ID) ) AnnotationAssertion( TODO )
disjoint_over(OverRel-ID Qualifiers) AnnotationAssertion(T(Qualifiers) T(disjoint_over) T(Rel-ID) T(OverRel-ID))
is_cyclic(true Qualifiers) AnnotationAssertion(T(Qualifiers) T(is_cyclic) T(Rel-ID) T(true))
is_transitive(true Qualifiers) TransitiveObjectProperty(T(Qualifiers) T(Rel-ID))
is_anti_symmetric(true Qualifiers) AnnotationAssertion(T(Qualifiers) T(is_anti_symmetric) T(Rel-ID) T(true))
is_reflexive(true Qualifiers) ReflexiveObjectProperty(T(Qualifiers) T(Rel-ID))
is_symmetric(true Qualifiers) SymmetricObjectProperty(T(Qualifiers) T(Rel-ID))
is_asymmetric(true Qualifiers) AsymmetricObjectProperty(T(Qualifiers) T(Rel-ID))
is_functional(true Qualifiers) FunctionalObjectProperty(T(Qualifiers) T(Rel-ID))
is_inverse_functional(true Qualifiers) InverseFunctionalObjectProperty(T(Qualifiers) T(Rel-ID))
expand_expression_to(String Xrefs Qualifiers) AnnotationAssertion(T(Qualifiers) T(expand_expression_to) T(Rel-ID) T(String))
expand_assertion_to(String Xrefs Qualifiers) AnnotationAssertion(T(Qualifiers) T(expand_assertion_to) T(Rel-ID) T(String))

Inferred Axioms

The OBO Format equivalent_to_chain clause can only be partially expressed in OWL2. We can do some inference as part of translation - see this prover9 file for a proof.

Additional axioms generated by equivalent_to_chain clauses
Typedef frame with id:Rel-ID Translation - TP(S)
equivalent_to_chain(Rel1-ID Rel2-ID Qualifiers) is_transitive(Rel2-ID) SubObjectPropertyOf(ObjectPropertyChain( T(Rel-ID) T(Rel2-ID) ) T(Rel-ID) )
equivalent_to_chain(Rel1-ID Rel2-ID Qualifiers) is_transitive(Rel1-ID) SubObjectPropertyOf(ObjectPropertyChain( T(Rel-ID) T(Rel1-ID) ) T(Rel-ID) )

5.5 Individual Axioms

Translation of Instance frames to OWL
Instance frame with id:Instance-ID Translation - T(S)
instance_of(Class-ID Qualifiers) ClassAssertion(T(Qualifiers) T(Class-ID) T(Instance-ID))
relationship(Rel-ID ID Qualifiers) ObjectPropertyAssertion(T(Qualifiers) T(Rel-ID) T(Instance-ID) T(ID))

5.6 Common Elements to Annotation Properties

Translation of common frame elements
Frame with id:ID Translation - T(S)
name(String Qualifiers) AnnotationAssertion(T(Qualifiers) T(name) T(ID) T(String))
def(String Xrefs Qualifiers) AnnotationAssertion(T(ann(Xrefs Qualifiers)) T(def) T(ID) T(String))
synonym(String 'EXACT' [Type] Xrefs Qualifiers) AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(exactSynonym) T(ID) T(String))
synonym(String 'NARROW' [Type] Xrefs Qualifiers) AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(narrowSynonym) T(ID) T(String))
synonym(String 'RELATED' [Type] Xrefs Qualifiers) AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(relatedSynonym) T(ID) T(String))
synonym(String 'BROAD' [Type] Xrefs Qualifiers) AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(broadSynonym) T(ID) T(String))
creation_date(ISO-Date Qualifiers) AnnotationAssertion(T(Qualifiers) T(creation_date) T(ID) T(ISO-Date))
xref(X-ID Qualifiers) AnnotationAssertion(T(Qualifiers) T(xref) T(ID) T(X-ID))
property_value(Rel-ID Entity-ID Qualifiers) AnnotationAssertion(T(Qualifiers) T(Rel-ID) T(ID) T(Entity-ID))
property_value(Rel-ID Value XSD-Type Qualifiers) AnnotationAssertion(T(Qualifiers) T(Rel-ID) T(ID) Tliteral(Value XSD-Type))
subset(Subsetdef-ID) AnnotationAssertion(T(subset) T(ID) T(Subsetdef-ID))
<Tag>(String Qualifiers) AnnotationAssertion(T(Qualifiers) T(<Tag>) T(ID) T(String))

5.7 Translation of OBO Qualifiers to Annotations

Translation of qualifier expressions
Expression Translation - T(S)
Qualifier1 ... QualifierN T(Qualifier1) ... T(QualifierN)
Qualifier(Name,Value) Annotation( T(Name) T(Value))
ann(Xref1...XrefN Qualifiers) Annotation( T(xref) T(Xref1)) ... Annotation( T(xref) T(XrefN)) T(Qualifiers)
ann( Type Xrefs Qualifiers) Annotation( T(synonym-type) T(Type)) T(ann(Xrefs Qualifiers))
The following qualifiers are not translated to annotations:

5.8 Translation of Annotation Vocabulary

Annotation properties are drawn from the following vocabularies:
Translation of identifiers - hardcoded values
ID Translation - T(S)
is_obsolete owl:deprecated
name rdfs:label
comment rdfs:comment
remark rdfs:comment
date dc:date
expand_expression_to obo:IAO_0000424
expand_assertion_to obo:IAO_0000425
definition obo:IAO_0000115
is_anti_symmetric obo:IAO_0000427
replaced_by obo:IAO_0100001
consider oboInOwl:consider
hasScope oboInOwl:hasScope
synonym-type oboInOwl:hasSynonymType
exactSynonym oboInOwl:hasExactSynonym
narrowSynonym oboInOwl:hasNarrowSynonym
broadSynonym oboInOwl:hasBroadSynonym
relatedSynonym oboInOwl:hasRelatedSynonym
synonymtypedef oboInOwl:SynonymTypeProperty
xref oboInOwl:hasDbXref
subsetdef oboInOwl:SubsetProperty
subset oboInOwl:inSubset
alt_id oboInOwl:hasAlternativeId
shorthand oboInOwl:shorthand
created_by dc:creator
creation_date dc:date
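
As an illustration of how the synonym rows in 5.6, the ann() rule in 5.7 and the hardcoded values above fit together, here is a minimal Python sketch. The function name and argument layout are hypothetical. Rendering xrefs as quoted string literals and emitting the synonym type identifier unchanged are simplifying assumptions, and quote characters inside the synonym text are not escaped.

```python
# Scope keyword -> annotation property, taken from the table above.
SCOPE_TO_PROPERTY = {
    "EXACT": "oboInOwl:hasExactSynonym",
    "NARROW": "oboInOwl:hasNarrowSynonym",
    "BROAD": "oboInOwl:hasBroadSynonym",
    "RELATED": "oboInOwl:hasRelatedSynonym",
}

def synonym_assertion(subject, text, scope, syn_type=None, xrefs=()):
    """Build the AnnotationAssertion for one synonym clause as a string."""
    annotations = []
    if syn_type is not None:
        annotations.append(f"Annotation(oboInOwl:hasSynonymType {syn_type})")
    # Assumption: each xref is rendered as a plain string literal.
    annotations += [f'Annotation(oboInOwl:hasDbXref "{x}")' for x in xrefs]
    parts = annotations + [SCOPE_TO_PROPERTY[scope], subject, f'"{text}"']
    return "AnnotationAssertion(" + " ".join(parts) + ")"
```

An unknown scope keyword raises a KeyError, which matches the requirement that every synonym falls into one of the four scopes.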

Note that there is no cognate of IAO:definition_source; instead, a generic xref annotation is placed on the definition annotation assertion.

5.8.1 Semantics of synonyms

Mapping of OBO Format synonym clauses is described in 5.6. These map to annotation properties without any semantics in OWL. This section provides semantics external to OWL.

Informal Description (Informative)

In OBO Format, the term "synonym" is used loosely for any kind of alternative label for a class (the "name" tag is used for the community preferred label). A label is a synonym for a class if there exists some user or user community (existing or historic) for which this label unambiguously denotes the class.

Synonyms are always scoped into one of four disjoint categories: EXACT, BROAD, NARROW, RELATED. A synonym is EXACT if it is a "true" synonym, that is, most members of the community served by the ontology regard the exact synonym as substitutable for the primary label. If an ontology contains two classes which share either a primary label (name) or an exact synonym, then these classes are equivalent. A synonym for a class C is BROAD if it denotes a broader class than C. Here, broader than is an informal notion that encompasses both subsumption and possibly mereological and temporal containment. For example, "skull" could conceivably be a BROAD synonym for the class cranium. Conversely, a synonym for a class C is NARROW if C denotes a broader class than the synonym. If a synonym is neither EXACT, NARROW nor BROAD, then it is RELATED.

Scoping of synonyms has proven extremely useful for OBO format ontologies - most ontologies use this feature. The ability to designate a synonym as EXACT, in particular, is very useful for introducing additional terminological precision and reducing ambiguity in ontologies. These have also proven useful for NLP purposes.

5.9 Translation of Identifiers

5.9.1. Pre-processing

Every OBO-Document is automatically populated with two clauses: (see 4.4.1)

5.9.2. Translation of identifiers

The following table is used when translating an OBO identifier to an IRI

Translation of identifiers
ID Translation - T(S)
Canonical-Prefixed-ID Tprefix(Canonical-IDPrefix) Canonical-LocalID
NonCanonical-Prefixed-ID Tprefix(Any-IDPrefix) '#' Any-LocalID
Unprefixed-ID OntologyIRI '#' Unprefixed-ID
URL-as-ID URL-as-ID
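
The four rows above can be read as a dispatch on the shape of the identifier. The Python sketch below is illustrative only. The prefix mapping Tprefix is not reproduced in this section, so it is passed in as a caller-supplied function `tprefix`. The regular expression used to recognise a canonical ID (alphabetic or underscore prefix, numeric local ID) is an assumption about the grammar in section 3, and the special rules for relations in 5.9.3 are not handled.

```python
import re

# Assumed shape of a Canonical-Prefixed-ID: alphabetic prefix, numeric local ID.
CANONICAL = re.compile(r"^([A-Za-z_]+):(\d+)$")
# Any other prefixed ID: everything before the first colon is the prefix.
PREFIXED = re.compile(r"^([^:\s]+):(.+)$")

def id_to_iri(obo_id, ontology_iri, tprefix):
    """Translate an OBO identifier to an IRI string, one branch per table row."""
    if obo_id.startswith(("http:", "https:")):
        return obo_id                             # URL-as-ID
    m = CANONICAL.match(obo_id)
    if m:
        prefix, local = m.groups()
        return tprefix(prefix) + local            # Canonical-Prefixed-ID
    m = PREFIXED.match(obo_id)
    if m:
        prefix, local = m.groups()
        return tprefix(prefix) + "#" + local      # NonCanonical-Prefixed-ID
    return ontology_iri + "#" + obo_id            # Unprefixed-ID
```

The URL test comes first because a URL would otherwise match the generic prefixed pattern with "http" as its prefix.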

5.9.3. Special Rules for Relations

An exception is made to 5.9.2 for translating relation IDs:

A side effect of this substitution is the generation of an annotation assertion that preserves the original shorthand relation: AnnotationAssertion(T(shorthand) T(Formal-ID) "Original-ID"^^xsd:string)

These rules are normative. An informative summary is provided in section 8.2.2.

Mapping ID Prefixes

Examples (informative)

5.10 Post-Processing OWL DL

Post-processing OWL2 DL
Remove Axiom If Ontology Contains Replace With
ObjectExactCardinality(n P CE), n>0 TransitiveObjectProperty(P) ObjectSomeValuesFrom(P CE)
ObjectMinCardinality(n P CE), n>0 TransitiveObjectProperty(P) ObjectSomeValuesFrom(P CE)
ObjectMaxCardinality(n P CE) TransitiveObjectProperty(P) -
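
A minimal Python sketch of the three rows above follows. The tuple representation of class expressions and the function name are assumptions made for the example; a real implementation would rewrite expressions nested inside axioms. The motivation is that OWL2 DL does not permit cardinality restrictions on transitive (non-simple) properties.

```python
def post_process(expressions, transitive_props):
    """Apply the post-processing table to a list of restrictions.

    Each expression is a tuple (kind, n, prop, filler); n is None for
    restrictions that carry no cardinality.
    """
    result = []
    for kind, n, prop, filler in expressions:
        if prop in transitive_props:
            if kind in ("ObjectExactCardinality", "ObjectMinCardinality") and n > 0:
                # Weaken to an existential restriction.
                result.append(("ObjectSomeValuesFrom", None, prop, filler))
                continue
            if kind == "ObjectMaxCardinality":
                continue  # removed with no replacement
        result.append((kind, n, prop, filler))
    return result
```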

5.11 OWL to OBO

Not every OWL ontology can be converted to OBO Format. This may be because the OWL ontology uses constructs that cannot be expressed directly in OBO, or it may be because the corresponding OBO ontology would violate OBO document structure constraints (see section 4.4).

Note that all classes in an OBO Document are assumed to be satisfiable. Any axioms referring to owl:Thing should be removed. An ontology with unsatisfiable classes is not translatable.

6 OBO Sublanguages

We introduce the concept of an OBO sublanguage. This is similar to an OWL profile, but the goal is not to provide sublanguages for reasoning purposes. Sublanguages are intended for legacy applications that make unwarranted assumptions about OBO format. The assumptions do not hold about OBO documents as a whole, but they may hold for particular sublanguages.

At this time, only one sublanguage is specified - OBO basic. To define this sublanguage, we first introduce a number of sublanguage characteristics.

6.1 OBO Sublanguage Characteristics

6.1.1 DAG

An ontology is a DAG if the graph formed by its logical relationships does not contain any cycles. Formally: let V be a set of nodes corresponding to all Term frames in the ontology (OWL Classes). Let E be a set of edge pairs A,B where both A and B are in V and either Note that the relationship type (i.e. ObjectProperty) is irrelevant here.

The ontology is a DAG if the transitive closure of E contains no cycles.
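
The DAG test can be sketched as a standard topological sort (Kahn's algorithm). The function name and the representation of E as a list of (A, B) pairs are assumptions made for the example; relationship types are deliberately ignored, as stated above.

```python
from collections import defaultdict, deque

def is_dag(nodes, edges):
    """Return True if the directed graph (nodes, edges) contains no cycle."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for a, b in edges:
        # Only edges whose endpoints are both in V are considered.
        if a in indegree and b in indegree:
            successors[a].append(b)
            indegree[b] += 1
    queue = deque(n for n, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Nodes on a cycle never reach in-degree zero, so they are never visited.
    return visited == len(indegree)
```

A reciprocal pair of edges, such as the one implied by an equivalence between two named classes, makes the function return False.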

Note that any ontology that contains an EquivalenceAxiom between two named classes is not a DAG, as an equivalence axiom is the same as a reciprocal SubClassOf pair.

Rationale: the original Gene Ontology paper described the GO as a DAG. Consequently, a large body of legacy software exists that assumes the ontology is a DAG. This software breaks or loops indefinitely if there is a cycle involving any set of is_a or relationship tags.

6.1.2 Dangling Clauses

A dangling clause is an is_a, relationship, intersection_of, union_of, transitive_over, equivalent_to_chain, holds_over_chain, disjoint_from, domain or range clause in which any of the values is not declared in the ontology document. Here declaration means there exists a frame with the corresponding identifier.
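
A check for dangling clauses can be sketched as follows. The in-memory representation, a dict from frame ID to a list of (tag, referenced IDs) pairs, is an assumption made for the example, as is the function name.

```python
# Tags whose values must be declared, as listed in the definition above.
REFERENCING_TAGS = {
    "is_a", "relationship", "intersection_of", "union_of", "transitive_over",
    "equivalent_to_chain", "holds_over_chain", "disjoint_from", "domain", "range",
}

def dangling_clauses(frames):
    """Return (frame_id, tag, values) for every clause with an undeclared value.

    `frames` maps each frame ID to a list of (tag, [referenced IDs]) pairs.
    A value is declared if a frame with that ID exists in `frames`.
    """
    declared = set(frames)
    return [
        (frame_id, tag, values)
        for frame_id, clauses in frames.items()
        for tag, values in clauses
        if tag in REFERENCING_TAGS and any(v not in declared for v in values)
    ]
```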

We can define 3 sub-characteristics:

6.1.3 Unidirectional

An ontology is said to be multidirectional if it contains a class assertion involving property P and a class assertion involving property P', where P is the inverse of P'. An ontology is unidirectional if it is not multidirectional.

For example, if an ontology contains statements using both part_of and has_part properties, then the ontology is not unidirectional.
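
As a small sketch, the test reduces to checking whether both members of any inverse pair are in use. The function name and the inputs (the set of properties used in the ontology and a list of declared inverse pairs) are assumptions made for the example.

```python
def is_unidirectional(used_properties, inverse_pairs):
    """Return True if no property is used together with its declared inverse."""
    used = set(used_properties)
    return not any(p in used and q in used for p, q in inverse_pairs)
```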

Note that multidirectional ontologies will frequently be non-DAGs, according to the definition above, but this need not hold.

6.1.4 Fully asserted

An ontology is fully asserted if the graph formed by is_a and relationship clauses (i.e. SubClassOf axioms) is not lacking any relationships that can be inferred from the full set of axioms in the ontology (e.g. EquivalentClasses axioms, or DisjointFrom Axioms).

More formally: O is an ontology, and O' is an ontology consisting only of is_a and relationship tags asserted in O. O is fully-asserted if there is no SubClassOf axiom that is entailed by O but not by O'.

If an ontology is fully asserted, then basic applications can use solely is_a and relationship tags for graph traversal. intersection_of tags can be removed with no loss of functionality in these basic applications. Ontology publishers may wish to keep two versions - an edit version, in which only minimal assertions are made, and a release version, which is fully asserted, with the assertions added by a reasoner.

6.1.5 Fully Labeled

An OBO ontology is fully labeled if and only if every frame has a "name" tag.

6.1.6 Uniquely Labeled

An OBO ontology is uniquely labeled if and only if no two entities share the same name, considering both the source ontology and its full import closure.

6.1.7 Locally Uniquely Labeled

An OBO ontology is locally uniquely labeled if and only if no two entities share the same name, considering only the source ontology but not its import closure.
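
Both labeling checks can share one helper that reports names used by more than one entity. This is a sketch: the function name and the (ID, name) pair representation are assumptions. Passing only the frames of the source ontology gives the local check, while passing the frames of the full import closure gives the global one.

```python
from collections import defaultdict

def duplicate_names(labeled_frames):
    """Map each name shared by several entities to the list of their IDs.

    `labeled_frames` is an iterable of (frame_id, name) pairs.
    An empty result means the input is uniquely labeled.
    """
    ids_by_name = defaultdict(list)
    for frame_id, name in labeled_frames:
        ids_by_name[name].append(frame_id)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}
```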

6.1.8 No equivalence axioms

An OBO document has no equivalence axioms if the OWL translation contains no EquivalentClasses axioms.

6.1.9 No obo-namespace crossing

An OBO document has no obo-namespace crossing if there are no is_a, relationship, intersection_of (genus portion) or union_of clauses that span two obo-namespaces. Note that an obo-namespace is not the same as an OWL namespace. It is specified by the namespace tag in OBOF. An example is found in the GO, which is a single ontology partitioned into 3 obo-namespaces. Some legacy software assumes no namespace crossings, so the basic version of the GO released by the GOC has no obo-namespace crossings.

6.1.10 Singly labeled edges

An OBO document has singly labeled edges if there are no classes A and B such that there are two asserted relationships connecting A and B.
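
One possible reading of this characteristic is sketched below. The function name and the (A, relation, B) triple representation are assumptions, as is the choice to treat the pair (A, B) as ordered and to count is_a as one more edge label.

```python
from collections import defaultdict

def multiply_labeled_pairs(edges):
    """Return the (A, B) pairs connected by more than one relationship type.

    `edges` is an iterable of (A, relation, B) triples.
    """
    labels = defaultdict(set)
    for a, relation, b in edges:
        labels[(a, b)].add(relation)
    return {pair for pair, relations in labels.items() if len(relations) > 1}
```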

6.1.11 No disjointness axioms

An ontology contains no disjointness axioms if it has no DisjointClasses, DisjointUnion or DisjointObjectProperties axioms. On the obo-format level this manifests as no "disjoint_from:" tags.

Rationale: Some OBO-format based tools do not handle disjointness axioms well.

6.1.11 No qualifier lists

An OBO document has no qualifier blocks if the OBO syntax concretization has no '{...}'s appearing in the qualifier list position. I.e. the following grammar rule (from section 3.2):

QualifierBlock ::= '{' QualifierList '}'

Can be replaced by: QualifierBlock ::=

On the OWL document level, this is equivalent to an OWL document with no Axiom Annotations except those axiom annotations required to support OBO metadata constructs, such as definitions and synonyms.

6.1.12 No owl-axioms header

See 5.0.4. An ontology has no owl-axioms header if the header does not contain an "owl-axioms:" tag.

6.1.13 No imports

An ontology has no imports if the header does not contain any "import:" tags.

6.2 OBO Basic

OBO Basic is a sub-language of OBO format (i.e. every OBO Basic document is automatically an OBO Document). It allows basic applications to continue to make certain simplifying assumptions; many of these simplifying assumptions were based on the initial version of the Gene Ontology, and have become enshrined in many popular and useful tools such as term enrichment tools.

Examples of such assumptions include: traversing the ontology graph ignoring relationship types using a naive algorithm will not lead to cycles (i.e. the ontology is a DAG); every referenced term is declared in the ontology (i.e. there are no dangling clauses).

An ontology is OBO Basic if and only if it has the following characteristics:

6.3 Conversion to sub-languages

In general, translation to a sublanguage is lossy, but this is not always the case. An OBO-Dangling-Clause ontology can be converted to a semantically equivalent ontology with declarations for all undeclared referenced frames (note that this typically happens as a side-effect of obo-to-owl-to-obo roundtripping). Note that this means the target ontology is no longer OBO-Fully-Labeled.

7 OWL Macros

This section can be considered an independent component that can be used for any OWL ontology. It may eventually form its own separate specification. It should be seen as informative for now.

Macros are specified as the range of an AnnotationAssertion on a property. They take the form of a string encoding either an OWL axiom or OWL expression in an extended Manchester Syntax format [Manchester OWL DL Syntax]. This extension allows the use of variable markers (written ?X or ?Y) in place of classes or class expressions. When these variable markers are replaced by actual OWL classes or expressions (again serialized in Manchester Syntax) as part of a Template Substitution Operation Subst, the resulting string must be valid Manchester Syntax.

Examples are provided in [OWL Macros].

7.1 Macro Annotation Assertions

Two IRIs are used to specify the two different types of macro expansion.

Macros
Macro Annotation Property Variables Generates
expand_expression_to obo:IAO_0000424 ?Y OWL2 Class Expression
expand_assertion_to obo:IAO_0000425 ?X,?Y OWL2 Axiom

7.2 Macro Expansion and Replacement

OWL Macro Expansion Rules
Expression or Axiom to be Replaced Condition Replace With
ObjectSomeValuesFrom(P CE) AnnotationAssertion(expand_expression_to P Template) Subst(Template)[CE]
ObjectAllValuesFrom(P CE) AnnotationAssertion(expand_expression_to P Template) Subst(Template)[CE]
AnnotationAssertion(P A B) AnnotationAssertion(expand_assertion_to P Template) Subst(Template)[A,B]
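
The template substitution operation Subst can be sketched as plain string replacement. The function name, the keyword-argument interface and the templates used in the test are hypothetical. Wrapping substituted expressions that contain spaces in parentheses is an assumption intended to keep the result parseable; the sketch does not validate the output as Manchester Syntax.

```python
import re

def subst(template, **bindings):
    """Replace variable markers such as ?X and ?Y in a macro template.

    Each binding is a class or class expression already serialized in
    Manchester Syntax; compound expressions are parenthesized.
    """
    def fill(match):
        value = bindings[match.group(1)]
        return f"({value})" if " " in value else value
    return re.sub(r"\?([A-Za-z]\w*)", fill, template)
```

For an expand_expression_to template only Y is bound, whereas an expand_assertion_to template binds both X and Y.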

7.3 Macro-based GCI Enhancement

An alternative strategy is to use the occurrence of macro relations to add General Class Inclusion (GCI) axioms to the ontology

GCI Addition Rules
Matched Expression or Axiom Condition GCI to Add
ObjectSomeValuesFrom(P CE) AnnotationAssertion(expand_expression_to P Template) EquivalentClasses(ObjectSomeValuesFrom(P CE) Subst(Template)[CE])
ObjectAllValuesFrom(P CE) AnnotationAssertion(expand_expression_to P Template) EquivalentClasses(ObjectAllValuesFrom(P CE) Subst(Template)[CE])
AnnotationAssertion(P A B) AnnotationAssertion(expand_assertion_to P Template) use previous table

7.4 Placement of Generated Axioms in External Ontologies

Whether a macro-expansion engine replaces existing axioms or adds new GCIs is implementation-dependent, as is the decision of whether to place generated axioms in a new ontology or in the source ontology.

The recommended behavior is:

8 Recommendations (informative)

8.1 Generating OBO Format Documents

These guidelines are provided to minimize character-level differences between OBO Documents generated by different procedures.

8.1.1 Whitespace

8.1.2 File Comments

8.1.3 Tag Ordering

See the serializer conventions in the OBO format guide

8.1.4 Escaping characters

When writing solitary 'xref' tags, the comma character should be escaped.

8.2 Identifiers

8.2.1 Class (Term) Identifiers

All class identifiers should follow the OBO Foundry ID Policy: all ID-spaces must be registered and approved (email obo-admin at obofoundry.org to register), and the entire ID should consist of the ID space followed by ':' followed by a zero-padded numeric local identifier (a minimum of 7 digits is recommended).
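
The recommended shape can be checked with a regular expression, as in the sketch below. The pattern for the ID space (a letter followed by letters, digits or underscores) is an assumption, and registration of the ID space cannot be checked this way.

```python
import re

# ID space, colon, then at least seven digits (the recommended minimum).
RECOMMENDED_ID = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:\d{7,}$")

def has_recommended_shape(identifier):
    """Return True if the identifier looks like IDSPACE:0000000."""
    return RECOMMENDED_ID.match(identifier) is not None
```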

8.2.2 Relation (Property) Identifiers

Relation identifiers should follow the same guidelines as class identifiers. Note however that the use of symbolic identifiers such as 'part_of' is common in almost all OBO format ontologies, and has a precedent stretching back over ten years. A large body of software now expects symbolic identifiers for relations, and ontology maintainers are understandably reluctant to change these to numeric identifiers.

This specification provides a means of using numeric identifiers globally whilst retaining symbolic identifiers within the context of a single file. Refer to section 5.9.3 for details.

8.3 Individuals

Individuals are retained in OBOF1.4 for compatibility with previous versions of the format. OBOF is not a good choice for modeling using individuals. Ontologies should not contain individuals, and OBOF is not a good choice for knowledge bases that use individuals.

8.4 Imports

When translating between OBO and OWL, owl:imports in an OWL ontology header are translated directly to an 'import:' header in an OBO document. The contents of the field MUST NOT be modified during translation (normative).

As a consequence of this, URIs that resolve to OWL documents must be used if the resulting OWL is to be consumed by an OWL parser. Here is an example of a valid OBO document header (taken from the GO editors file):

import: http://purl.obolibrary.org/obo/go/extensions/chebi_import.owl
import: http://purl.obolibrary.org/obo/go/extensions/cl_import.owl
import: http://purl.obolibrary.org/obo/go/extensions/po_import.owl
ontology: go
The onus is on the OBO parser to be able to handle these OWL documents, either by translating them or by mapping the URIs to OBO documents that have been translated in advance.

Most OBO tools are not equipped to do this, so imports should be used with caution in OBO format. We strongly recommend using OWL whenever imports are required, and providing OBO versions with the imports closure merged. Note that this can be done with owltools:

owltools myont.owl --merge-imports-closure -o -f obo myont.obo
Note that all versions of OBO-Edit prior to 2.3b7 should be considered broken for imports. Version 2.3b7 introduces the use of oboedit.catalog files, which allow OWL URIs to be used in import directives. These are not part of the OBOFormat spec, but details are included here for completeness.

8.4.1 OBO-Edit catalog files

An OBO-Edit catalog file contains newline-separated entries, each mapping an OWL URI to a local file containing an OBO document, such that the OBO document can be substituted for the OWL URI with no loss of content. Each mapping occupies a single line, with the URI and the file path separated by whitespace. The file can optionally include lines starting with "#", which are ignored by parsers and intended for human readers. An example:
# GO imports mappings
http://purl.obolibrary.org/obo/go/extensions/chebi_import.owl ../extensions/chebi_import.obo
http://purl.obolibrary.org/obo/go/extensions/cl_import.owl ../extensions/cl_import.obo
http://purl.obolibrary.org/obo/go/extensions/po_import.owl ../extensions/po_import.obo
When an OBO parser encounters an import directive, it may choose to map the URI on the left to the file on the right and load the OBO document found there. The oboedit.catalog file should live in the same directory as the source OBO ontology.
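
Reading such a catalog takes only a few lines. The sketch below is not taken from OBO-Edit; the function name is hypothetical, and skipping blank lines is an assumption beyond the description above.

```python
def parse_catalog(text):
    """Parse oboedit.catalog content into a dict of OWL URI -> local path."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or human-readable comment
        uri, path = line.split(None, 1)  # split on the first run of whitespace
        mapping[uri] = path.strip()
    return mapping
```

Relative paths in the result would be resolved against the directory that holds the catalog file, which is also where the source ontology lives.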

9 References

9.1 Normative References

[BCP 47]
BCP 47 - Tags for Identifying Languages. A. Phillips and M. Davis, eds. IETF, September 2006. http://www.rfc-editor.org/rfc/bcp/bcp47.txt
[OWL 2 Specification]
OWL 2 Web Ontology Language: Structural Specification and Functional-Style Syntax Boris Motik, Peter F. Patel-Schneider, Bijan Parsia, eds. W3C Recommendation, 27 October 2009, http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/. Latest version available at http://www.w3.org/TR/owl2-syntax/.
[RDF Test Cases]
RDF Test Cases. Jan Grant and Dave Beckett, eds. W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-testcases-20040210/. Latest version available as http://www.w3.org/TR/rdf-testcases/.
[RFC 3629]
RFC 3629: UTF-8, a transformation format of ISO 10646. F. Yergeau. IETF, November 2003, http://www.ietf.org/rfc/rfc3629.txt
[RFC 3987]
RFC 3987: Internationalized Resource Identifiers (IRIs). M. Duerst and M. Suignard. IETF, January 2005, http://www.ietf.org/rfc/rfc3987.txt
[SPARQL]
SPARQL Query Language for RDF. Eric Prud'hommeaux and Andy Seaborne, eds. W3C Recommendation, 15 January 2008, http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/. Latest version available as http://www.w3.org/TR/rdf-sparql-query/.
[UNICODE]
The Unicode Standard. The Unicode Consortium, Version 5.1.0, ISBN 0-321-48091-0, as updated from time to time by the publication of new versions. (See http://www.unicode.org/unicode/standard/versions/ for the latest version and additional information on versions of the standard and of the Unicode Character Database).

9.2 Non-normative References

[OBO-OWL Horrocks]
OBO Flat File Format Syntax and Semantics and Mapping to OWL Web Ontology Language. Ian Horrocks, March 2007. http://www.cs.man.ac.uk/~horrocks/obo/
[OBO-OWL Golbreich]
The OBO to OWL mapping, GO to OWL 1.1. Christine Golbreich, Ian Horrocks, 2007. http://en.scientificcommons.org/43278282
[OBO-OWL NCBO]
NCBO Mapping OBO to OWL. Dilvan Moreira, John Day-Richter, Chris Mungall, Nigam H. Shah 2007, http://www.bioontology.org/wiki/index.php/OboInOwl:Main_Page
[Obolog]
Obolog specification. Chris Mungall. Aug 2008, http://oboedit.org/obolog/spec/obolog-spec.pdf
[OBO-OWL WG]
Working Group on OBO-OWL interconversion. http://www.obofoundry.org/wiki/index.php/Working_Group_on_OBO-OWL_interconversion
[OBO OWL JBMS]
Mapping between the OBO and OWL ontology languages. Syed Hamid Tirmizi, Stuart Aitken, Dilvan Moreira, Chris Mungall, Juan Sequeda, Nigam H. Shah and Daniel P. Miranker. In: Proceedings of Semantic Web Applications and Tools for Life Sciences Workshop, Amsterdam, The Netherlands,
[OBO OWL SWAT4LS]
OBO & OWL: Roundtrip Ontology Transformations. Syed Hamid Tirmizi, Stuart Aitken, Dilvan Moreira, Chris Mungall, Juan Sequeda, Nigam H. Shah and Daniel P. Miranker. In: Proceedings of Semantic Web Applications and Tools for Life Sciences Workshop, Amsterdam, The Netherlands,
[OBO Format 1.4 Guide]
OBO Format 1.4 Guide. Chris Mungall and John Day-Richter, August 2010, http://oboformat.googlecode.com/svn/trunk/doc/GO.format.obo-1_4.html
[Manchester OWL DL Syntax]
OWL 2 Web Ontology Language Manchester Syntax. Matthew Horridge, Peter F. Patel-Schneider, October 2009, http://www.w3.org/TR/owl2-manchester-syntax/
[OWL Macros]
OWL 2 Macros. Chris Mungall, David Osumi-Sutherland, Alan Ruttenberg, June 2010, http://precedings.nature.com/documents/5292/version/1
[OWLDEF]
OWLDEF: Integrating OBO and OWL. R Hoehndorf, A Oellrich, M Dumontier, J Kelso, H Herre, http://leechuck.de/pub/or-sig.pdf
[Relational Patterns]
Relational patterns in OWL and their application to OBO. R Hoehndorf, A Oellrich, M Dumontier, J Kelso, H Herre, http://www.webont.org/owled/2010/papers/owled2010_submission_3.pdf