The Open Biological and Biomedical Ontology Foundry

OBO Flat File Format 1.4 Syntax and Semantics [WORKING DRAFT]

Editor's Draft: May 2012

This version:: http://purl.obolibrary.org/obo/oboformat/2011-11-29/doc/obo-syntax.html
Editor's version:: https://github.com/owlcollab/oboformat/blob/gh-pages/doc/obo-syntax.html
Latest version:: http://purl.obolibrary.org/obo/oboformat/spec.html
Previous version:: http://purl.obolibrary.org/obo/oboformat/2011-08-28/doc/obo-syntax.html
Editors:: Chris Mungall, Lawrence Berkeley National Laboratory; Alan Ruttenberg, University at Buffalo; Ian Horrocks, University of Oxford; David Osumi-Sutherland, Cambridge University
Contributors:: Erick Antezana, Bayer CropScience; James Balhoff, National Evolutionary Synthesis Center; Melanie Courtot, British Columbia Cancer Research Center; Heiko Dietze, Lawrence Berkeley National Laboratory; John Day-Richter, Google; Mathew Horridge, University of Manchester; Amelia Ireland, The Jackson Laboratory; Martin Larralde, ENS Paris-Saclay; Suzanna Lewis, Lawrence Berkeley National Laboratory; Shahid Manzoor, Lawrence Berkeley National Laboratory; Syed Hamid Tirmizi, University of Texas

Abstract

This document provides a BNF specification of the OBO Flat File Format, version 1.4, and a mapping of that format to OWL2.

Status of this Document

This is a working draft, for comment by the community.

Comments should be sent to obo-format@lists.sourceforge.net (archive) to ensure wide visibility.

Most of this document can be regarded as stable. The parts that are likely to be unstable are marked red.

There is a working reference implementation available from http://oboformat.googlecode.com/.

1 Introduction
2 Preliminary Definitions
3 OBO Grammar
4 OBO Document Structure
5 OBO Semantics: Mapping to OWL2-DL
6 OBO Sublanguages
7 OWL Macros
8 Recommendations (informative)
9 References

1 Introduction

This document specifies the syntax and semantics of OBO Format (OBOF).

The first part of this specification describes the syntax and structure of OBO Format using BNF. This states how strings of characters in OBOF Files are parsed into abstract OBO documents, and describes constraints on the structure of these documents. The second part of this specification describes the semantics of an OBO document via a mapping to OWL2-DL and community metadata vocabularies.

1.1 Relationship to previous specifications and translations

This specification builds on previous attempts to specify the syntax and/or semantics of OBOF.

1.1.1 Golbreich and Horrocks translation

This document is derived from and indebted to a previous document by Ian Horrocks [OBO-OWL Horrocks], later described in an accompanying paper [OBO-OWL Golbreich]. This document provided a semantics for OBO Format in terms of OWL-DL and what was then called OWL1.1 (later to become OWL2). The syntax was underspecified and the mapping was incomplete as there was no treatment of annotation properties.

implementation: This mapping was implemented in the OWLAPI, and is consequently used by current versions of Protege4.

1.1.2 OboInOwl

The OboInOwl mapping attempted to fill in the gaps in the Horrocks translation by providing a full translation of all of OBOF 1.2, including synonyms and definitions[OBO-OWL NCBO]. This mapping requires use of the n-ary relations pattern to represent OBO metadata in its entirety. This has undesirable side-effects, such as introducing modeling classes and individuals into an ontology, affecting both reasoning and usability.

implementation: This mapping was implemented via an XSLT translation from OBO-XML. All class and ontology URIs are of the form http://purl.org/obo/owl.

1.1.3 OBO Format 1.3 mapping

OBO Format 1.3 was accompanied by a grammar and mapping to FOL [Obolog]. This mapping was based on the 2005 version of the OBO Relation Ontology and OBOF1.3/Obolog attempted to do justice to dual instance/type level relations and ternary temporally-qualified instance-level relations. However, this introduced too large an impedance mismatch between OBOF and OWL, and as a consequence OBOF1.3 and Obolog have been deprecated.

implementation: Deprecated.

1.1.4 Tirmizi et al

More recent work by Tirmizi provides an implementation of a fully roundtrippable mapping between OBO and OWL [OBO-OWL SWAT4LS] [OBO-OWL JBMS]. This did not attempt to provide a full grammar, and implemented the same specification as 1.1.2.

implementation: OBO-Edit 2.0 (removed from OBO-Edit-2.1)

1.1.5 OBO Format Guides

The OBO Format 1.2 guide is too loose to be regarded as a full specification. It served as a basis for (1.1.1).

The [OBO Format 1.4 guide] accompanies this document. It is intended as a guide, not a normative specification.

1.1.6 Normative mapping

This document supersedes previous efforts. The goal is to provide a complete, normative specification of both the syntax and semantics of OBOF1.4. The syntax and semantics specified here also hold for any OBOF1.2 document.

The mapping provided here is consistent with the original Golbreich/Horrocks translation (1.1.1). It improves on this translation by specifying the grammar in full, and by providing a full mapping of all OBOF constructs to OWL2, through a combined use of the Information Artefact Ontology (IAO) and the OboInOwl vocabulary.

This mapping uses a different means of translating OBOF terminological elements (synonyms, definitions, etc) to OWL2 than the one provided in 1.1.2. The n-ary relations pattern has been abandoned, and replaced by the use of OWL2 AxiomAnnotations. Many of the OboInOwl vocabulary elements have been replaced by IAO annotation properties. This mapping also uses a different URI scheme. In place of the deprecated http://purl.org/obo/owl URIs, the OBO Foundry standard http://purl.obolibrary.org/obo scheme is now used.

The semantics specified in this document are inconsistent with the abandoned OBOF1.3 specification 1.1.3, which is strongly deprecated.

This specification also introduces new constructs absent from previous versions of OBOF, and clarifies many syntactic and semantic elements that were previously unclear.

implementation: The reference implementation of this specification can be found at the oboformat site.

1.2 Additional introductory material

More material can be found at the oboformat site.

2 Preliminary Definitions

2.1 BNF Notation

OBO-Format 1.4 is defined using a standard BNF notation, which is summarized in the table below.

Table 1. The BNF Notation Used in this Document
Construct	Syntax	Example
non-terminal symbols	boldface	QuotedString
terminal symbols	single quoted	'Term'
zero or more	curly braces	{ entity-frame }
zero or one	square brackets	[ ws SynonymType-ID ]
alternative	vertical bar	Class-ID | Relation-ID
grouping	parentheses	( )
complementation	minus symbol	( character - NewLineChar )

Because generic tag-value pairs occur in very many places in the syntax, to save space the grammar has meta-productions for basic Tag-Values Pairs (TVPs), Boolean Tags (BTs) and for the tags themselves (T):

<T>-TVP ::= '<T>:' [ws] UnquotedString
<T>-BT ::= '<T>:' [ws] ( 'true' | 'false' )
<T>-Tag ::= '<T>:' [ws]

We provide an additional List meta-production for comma-delimited lists:

<NT>List ::= [<NT> {',' ws <NT> }]

For example, QualifierList should be interpreted as: [ Qualifier { ',' ws Qualifier} ]

Documents in the obo-format consist of sequences of Unicode characters [UNICODE] and are encoded in UTF-8 [RFC 3629].

2.2 Characters

2.2.0 Basic Characters

Alpha-Char ::= a |b |c |d |e |f |g |h |i |j |k |l |m |n |o |p |q |r |s |t |u |v |w |x |y |z | A |B |C |D |E |F |G |H |I |J |K |L |M |N |O |P |Q |R |S |T |U |V |W |X |Y |Z
Digit ::= 0 |1 |2 |3 |4 |5 |6 |7 |8 |9

2.2.1 Spacing Characters

WhiteSpaceChar ::= ' ' | \t | U+0020 | U+0009
ws ::= WhiteSpaceChar { WhiteSpaceChar }
NewLineChar ::= \r | \n | U+000A | U+000C | U+000D
nl ::= [ws] NewLineChar
nl* ::= { nl }

2.2.2 Special Characters

We use the nonterminal UniCodeChar to represent any Unicode character.

An OBOChar is either any non-newline Unicode character, or a Unicode character escaped by a backslash. Escaping can be used to encode newline characters and other special characters in different contexts.

2.3 Line Termination

Each tag-value clause in OBOF is line separated. The line can optionally be ended by a HiddenComment, indicated by the '!' character - this is semantically silent and can be ignored by the parser. Each clause can also have zero or more comma-separated tag-value trailing qualifiers between a '{' and a '}' - this is called a QualifierBlock.

EOL ::= {WhiteSpaceChar} [ QualifierBlock ] {WhiteSpaceChar} [ HiddenComment ] {WhiteSpaceChar} NewLineChar
HiddenComment ::= '!' { ( UniCodeChar - NewLineChar ) }
QualifierBlock ::= '{' QualifierList '}'
QualifierList ::= Qualifier {',' ws Qualifier }
Qualifier ::= Rel-ID '=' QuotedString
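
Informative illustration: the following Python sketch shows one way a parser might strip the trailing hidden comment and qualifier block from a clause line before reading the tag and value. The helper name split_clause is hypothetical, and the sketch makes simplifying assumptions that are not part of this specification: backslash escapes are ignored, and '!' and '{' are assumed not to occur inside the clause value.

```python
import re

# Hypothetical helper, not part of the specification.
# Simplification: ignores backslash escapes and assumes that '!' and '{'
# do not occur inside the clause value itself.
QUALIFIER = re.compile(r'([^\s=,]+)="([^"]*)"')


def split_clause(line):
    """Split one clause line into (tag, value, qualifiers, comment)."""
    # Everything after the first '!' is treated as the hidden comment.
    body, bang, comment = line.rstrip("\r\n").partition("!")
    comment = comment.strip() if bang else None
    qualifiers = {}
    body = body.rstrip()
    # A trailing '{ ... }' is treated as the qualifier block.
    if body.endswith("}") and "{" in body:
        body, _, block = body[:-1].rpartition("{")
        qualifiers = dict(QUALIFIER.findall(block))
    tag, _, value = body.partition(":")
    return tag.strip(), value.strip(), qualifiers, comment
```

A full parser would need to honour escaping and quoted strings before splitting on these characters.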

2.4 Clause Values

Depending on the tag, the values for clauses may be written as UnquotedStrings (in which case the entire line after the tag and before any qualifier block is used as the clause value), or as a QuotedString, in which case it is enclosed within double quotes.

QuotedString ::= DblQuote { ( OBOChar - DblQuote ) } DblQuote
UnquotedString ::= { OBOChar }

2.5 Identifiers

We introduce 3 rules for class IDs, relation IDs and instance IDs - these all have the same structure.

IDs fall into one of 3 categories: unprefixed, prefixed or URLs. The prefixed category is divided into two: canonical and non-canonical prefixed IDs. Whilst canonical prefixed IDs are preferred, many OBO documents make use of non-canonical prefixed IDs or unprefixed IDs in a variety of contexts - relation, subsetdef and synonymtypedef IDs have traditionally been unprefixed; xrefs frequently do not conform to the canonical prefixed ID pattern.

Class-ID ::= ID
Rel-ID ::= ID
Instance-ID ::= ID
Person-ID ::= ID
Subset-ID ::= ID
SynonymType-ID ::= ID - ( '[' { UniCodeChar } )
ID ::= Prefixed-ID | Unprefixed-ID | URL-as-ID
URL-as-ID ::= ( 'http:' | 'https:' ) { NonWsChar }
Unprefixed-ID ::= { ( NonWsChar - ':' ) }
Prefixed-ID ::= Canonical-Prefixed-ID | NonCanonical-Prefixed-ID

Canonical-IDPrefix ::= Alpha-Char { ( '_' | Alpha-Char ) }
Canonical-LocalID ::= { Digit }
Canonical-Prefixed-ID ::= Canonical-IDPrefix ':' Canonical-LocalID
NonCanonical-Prefixed-ID ::= ( Any-IDPrefix ':' Any-LocalID ) - Canonical-Prefixed-ID
Any-IDPrefix ::= { ( NonWsChar - ':' ) }
Any-LocalID ::= { NonWsChar }
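
Informative illustration: the ID categories above can be approximated with regular expressions, as in the following Python sketch. The function name classify_id and the returned labels are hypothetical. Testing for URLs first is an assumption of the sketch, since a string such as 'http://...' also fits the Prefixed-ID pattern.

```python
import re

# Patterns transcribed from the productions above; checking URLs first is
# an assumption of this sketch, since 'http:...' also fits Prefixed-ID.
URL_AS_ID = re.compile(r"https?:\S*")
CANONICAL = re.compile(r"[A-Za-z][A-Za-z_]*:[0-9]*")
PREFIXED = re.compile(r"[^\s:]*:\S*")
UNPREFIXED = re.compile(r"[^\s:]*")


def classify_id(identifier):
    """Return the ID category that the string falls into, or None."""
    if URL_AS_ID.fullmatch(identifier):
        return "url"
    if CANONICAL.fullmatch(identifier):
        return "canonical-prefixed"
    if PREFIXED.fullmatch(identifier):
        return "non-canonical-prefixed"
    if UNPREFIXED.fullmatch(identifier):
        return "unprefixed"
    return None  # contains whitespace, so it is not an ID at all
```

For example, GO:0005634 is classified as canonical, whereas an xref such as Wikipedia:Cell_nucleus is a non-canonical prefixed ID because its local part is not purely numeric.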

2.6 Xref Lists

An XrefList is a comma-separated list of zero or more Xrefs inside square brackets (these are mandatory). Each Xref is a string of characters, optionally followed by a quoted description. Note that commas and square brackets in the xref need to be escaped. We include an extra production rule for when solitary Xrefs are provided (i.e. with the 'xref' tag) - in which case escaping is not needed.

Xref ::= ID [ ws QuotedString ]
XrefList ::= '[' [ XrefListItem { ',' ws XrefListItem } ] [ ws ] ']'
XrefListItem ::= XrefChar { XrefChar } [ ws QuotedString ]
XrefChar ::= (NonWsChar - ',' - ']')

3 OBO Grammar

3.1 OBO Document Structure

An OBO document consists of a header frame followed by zero or more entity frames. Each frame has a set of clauses, or <tag,value> pairs. A tag is a token which may be drawn from the set of defined OBOF tags. The value can be atomic or multi-valued. In the OBO Abstract syntax a clause is written <Tag>(<V1>...<Vn>).

Each frame should by convention be separated by an empty line, but this is not enforced in the syntax.

Note that the term "frame" replaces the previously used "stanza" in order to be more consistent with Manchester Syntax.

OBO-Doc ::= header-frame { entity-frame } nl*
entity-frame ::= term-frame | typedef-frame | instance-frame

3.2 OBO Headers

header-frame ::= { header-clause nl* }
header-clause ::= format-version-TVP
  | data-version-TVP
  | date-Tag DD:MM:YYYY ws hh:mm
  | saved-by-TVP
  | auto-generated-by-TVP
  | import-Tag IRI | filepath
  | subsetdef-Tag Subset-ID ws QuotedString
  | synonymtypedef-Tag SynonymType-ID ws QuotedString [ SynonymScope ]
  | default-namespace-Tag OBONamespace
  | idspace-Tag IDPrefix ws IRI [ ws QuotedString ]
  | treat-xrefs-as-equivalent-Tag IDPrefix
  | treat-xrefs-as-genus-differentia-Tag IDPrefix ws Rel-ID ws Class-ID
  | treat-xrefs-as-reverse-genus-differentia-Tag IDPrefix ws Rel-ID ws Class-ID
  | treat-xrefs-as-relationship-Tag IDPrefix Rel-ID
  | treat-xrefs-as-is_a-Tag IDPrefix
  | treat-xrefs-as-has-subclass-Tag IDPrefix
  | property_value-Tag Relation-ID ( QuotedString XSD-Type | ID ) {WhiteSpaceChar} [ QualifierBlock ] {WhiteSpaceChar} [ HiddenComment ]
  | remark-TVP
  | ontology-TVP
  | owl-axioms-TVP
  | UnreservedToken ':' [ ws ] UnquotedString

3.3 Term Frames

Term frames introduce and define the meaning of terms (AKA classes).

term-frame ::= nl* '[Term]' nl id-Tag Class-ID EOL { term-frame-clause EOL }
term-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] XrefList
  | xref-Tag Xref
  | builtin-BT
  | property_value-Tag Relation-ID ws ( QuotedString ws XSD-Type | ID )
  | is_a-Tag Class-ID
  | intersection_of-Tag Class-ID
  | intersection_of-Tag Relation-ID ws Class-ID
  | union_of-Tag Class-ID
  | equivalent_to-Tag Class-ID
  | disjoint_from-Tag Class-ID
  | relationship-Tag Relation-ID ws Class-ID
  | is_obsolete-BT
  | replaced_by-Tag Class-ID
  | consider-Tag ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime

3.4 Typedef Frames

Typedef frames introduce and define the meaning of relations (AKA properties).

typedef-frame ::= [ nl ] '[Typedef]' nl id-Tag Relation-ID EOL { typedef-frame-clause EOL }
typedef-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] ws XrefList
  | xref-Tag Xref
  | property_value-Tag Relation-ID ws ( QuotedString ws XSD-Type | ID )
  | domain-Tag Class-ID
  | range-Tag Class-ID
  | builtin-BT
  | holds_over_chain-Tag Relation-ID ws Relation-ID
  | is_anti_symmetric-BT
  | is_cyclic-BT
  | is_reflexive-BT
  | is_symmetric-BT
  | is_transitive-BT
  | is_functional-BT
  | is_inverse_functional-BT
  | is_a-Tag Rel-ID
  | intersection_of-Tag Rel-ID
  | union_of-Tag Rel-ID
  | equivalent_to-Tag Rel-ID
  | disjoint_from-Tag Rel-ID
  | inverse_of-Tag Rel-ID
  | transitive_over-Tag Relation-ID
  | equivalent_to_chain-Tag Relation-ID ws Relation-ID
  | disjoint_over-Tag Rel-ID
  | relationship-Tag Rel-ID Rel-ID
  | is_obsolete-BT
  | replaced_by-Tag Rel-ID
  | consider-Tag ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
  | expand_assertion_to-Tag QuotedString ws XrefList
  | expand_expression_to-Tag QuotedString ws XrefList
  | is_metadata_tag-BT
  | is_class_level-BT

3.5 Instance Frames

Instance frames introduce and define the meaning of instances (AKA individuals).

instance-frame ::= [ nl ] '[Instance]' nl id-Tag Instance-ID EOL { instance-frame-clause EOL }
instance-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] ws XrefList
  | xref-Tag Xref
  | property_value-Tag Relation-ID ID
  | instance_of-Tag Class-ID
  | PropertyValueTagValue
  | relationship-Tag Rel-ID ws ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
  | is_obsolete-BT
  | replaced_by-Tag Instance-ID
  | consider-Tag ID

3.6 Synonym Scopes

SynonymScope ::= 'EXACT' | 'BROAD' | 'NARROW' | 'RELATED'

4 OBO Document Structure

This section defines structural constraints on an OBO Document. These structural constraints hold on an abstract OBO Document - the result of parsing a physical OBO document file.

4.1 Frame Identifiers

Each frame in an abstract OBO document can be accessed by its identifier - the value of the id tag. A frame can only be of one type. A document containing two frames of different types with the same id is illegal.

4.1.1 Merging frames during parsing

Although it is strongly recommended that an OBOF document does not contain two frames with the same identifier, this is syntactically valid. If two frames have the same identifier, then they are combined. Only frames of the same type can be combined - if a document uses the same ID for two frames of different types, the document is structurally invalid.

When two frames F1 and F2 are combined, the new frame F3 has a set of tag-values consisting of the union of the set of tag-values from F1 and the set of tag-values from F2. If two tag-values have identical tags and identical values, they are considered a single tag-value.
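
Informative illustration: the merging rule can be sketched in Python as a set union keyed on the frame identifier. The in-memory model used here (a frame as a tuple of frame type, id and a set of tag-value pairs) and the function name merge_frames are assumptions made for the example, not structures defined by this specification.

```python
# Hypothetical in-memory model: a frame is a (frame_type, frame_id, clauses)
# tuple, where clauses is a set of (tag, value) pairs.

def merge_frames(frames):
    """Combine frames sharing an id; reject ids reused across frame types."""
    merged = {}
    for frame_type, frame_id, clauses in frames:
        if frame_id in merged:
            existing_type, existing_clauses = merged[frame_id]
            if existing_type != frame_type:
                raise ValueError(f"{frame_id} is used for both a "
                                 f"{existing_type} and a {frame_type} frame")
            # Set union collapses identical tag-value pairs into one.
            merged[frame_id] = (frame_type, existing_clauses | set(clauses))
        else:
            merged[frame_id] = (frame_type, set(clauses))
    return merged
```

Using a set for the clauses means that a tag-value pair appearing in both F1 and F2 is stored once in F3, as required above.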

4.2 Typedef constraints

If a Typedef frame has a clause is_metadata_tag(true), then that Typedef id may never be used as a Rel-ID in an intersection_of clause.

4.3 OBO namespaces and ontology name

Note that OBO namespaces are not the same as OWL namespaces - the analog of OWL namespaces are OBO ID spaces. OBO namespaces are semantics-free properties of a frame that allow partitioning of an ontology into sub-ontologies. For example, the GO is partitioned into 3 ontologies (3 OBO namespaces, 1 OWL namespace).

Every frame must have exactly one namespace. However, these do not need to be explicitly assigned. After parsing an OBO Document, any frame without a namespace is assigned the default-namespace, from the OBO Document header. If this is not specified, the Parser assigns a namespace arbitrarily. It is recommended this is equivalent to the URL or file path from which the document was retrieved.

Every OBODoc should have an "ontology" tag specified in the header. If this is not specified, then the parser should supply a default value. This value should be derived from the URL of the source of the ontology (typically using http or file schemes).

4.4 Processing Header Macros

4.4.1 Implicit Header Macros

Every OBO-Doc is automatically populated with two header clauses:

treat-xrefs-as-equivalent(RO)
treat-xrefs-as-equivalent(BFO)

It is illegal to override these. These declarations are taken as implicit, and an OBO format generator need not export them.

4.4.2 Header Macro Translation

The following table shows how header macros are used to insert new clauses into the ontology document. If the pre-condition is satisfied (middle column) and the header contains the specified macro (first column), then clauses are added to the ontology such that the post-condition is satisfied (right hand column). Note that the original xref tags are not removed from the source ontology.

These clauses are expanded into a fresh ontology, which can be optionally imported.

In addition, any Typedef frames for relations used in a header macro are also copied into the corresponding bridge ontology.

Header Macros
Header	Precondition	Add clauses
treat-xrefs-as-equivalent(Prefix)	xref(Prefix:B) ∈ Frame[A]	equivalent_to(Prefix:B) ∈ Frame[A]
treat-xrefs-as-is_a(Prefix)	xref(Prefix:B) ∈ Frame[A]	is_a(Prefix:B) ∈ Frame[A]
treat-xrefs-as-has-subclass(Prefix)	xref(Prefix:B) ∈ Frame[A]	is_a(Prefix:A) ∈ Frame[B]
treat-xrefs-as-genus-differentia(Prefix Rel-ID Class-ID)	xref(Prefix:B) ∈ Frame[A] and intersection_of(...) ∉ Frame[A]	intersection_of(X:B) ∈ Frame[A] and intersection_of(Rel-ID Class-ID) ∈ Frame[A]
treat-xrefs-as-reverse-genus-differentia(Prefix Rel-ID Class-ID)	xref(Prefix:B) ∈ Frame[A] and intersection_of(...) ∉ Frame[B]	intersection_of(X:A) ∈ Frame[B] and intersection_of(Rel-ID Class-ID) ∈ Frame[B]
treat-xrefs-as-relationship(Prefix Rel-ID)	xref(Prefix:B) ∈ Frame[A]	relationship(Rel-ID X:B) ∈ Frame[A]
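
Informative illustration: the following Python sketch applies the two simplest rows of the table above, treat-xrefs-as-equivalent and treat-xrefs-as-is_a. The frame model (a dict from frame id to a list of tag-value pairs) and the function name expand_xref_macros are hypothetical. The added clauses are returned separately rather than written back into the source frames, on the assumption that the caller places them in a fresh ontology and leaves the original xref clauses untouched.

```python
# Hypothetical model: frames maps a frame id to a list of (tag, value) clauses.

def expand_xref_macros(frames, equivalent_prefixes=(), is_a_prefixes=()):
    """Return the clauses that the two simplest header macros would add."""
    added = {}
    for frame_id, clauses in frames.items():
        for tag, value in clauses:
            if tag != "xref":
                continue
            # The ID prefix is the part of the xref before the first ':'.
            prefix = value.split(":", 1)[0]
            if prefix in equivalent_prefixes:
                added.setdefault(frame_id, []).append(("equivalent_to", value))
            if prefix in is_a_prefixes:
                added.setdefault(frame_id, []).append(("is_a", value))
    return added
```

The remaining macros follow the same pattern but also need the Rel-ID and Class-ID arguments from the header clause, and the genus-differentia variants must first check that the target frame has no existing intersection_of clause.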

4.5 Tag Cardinality Constraints

Note that the id tag is not specified in this table. Each frame has exactly one id tag, and the id tag uniquely identifies the frame. These constraints must hold after frames with identical ids have been combined (see above).

Any tag not mentioned has free cardinality (zero, one or many).

Cardinality Constraints
Tag	Frame	Cardinality (Normative)	Informative Notes
ontology	Header	zero or one	recommended for all docs
format-version	Header	zero or one	-
date	Header	zero or one	-
default-namespace	Header	zero or one	-
saved-by	Header	zero or one	-
auto-generated-by	Header	zero or one	-
is_anonymous	*	zero or one	-
name	*	zero or one	names are recommended for all frames
namespace	*	one	-
def	*	zero or one	-
comment	*	zero or one	-
domain	Typedef	zero or one	use union constructs in place of multiple values
range	Typedef	zero or one	use union constructs in place of multiple values
is_anti_symmetric	*	zero or one	-
is_cyclic	*	zero or one	-
is_reflexive	*	zero or one	-
is_symmetric	*	zero or one	-
is_transitive	*	zero or one	-
is_functional	*	zero or one	-
is_inverse_functional	*	zero or one	-
is_obsolete	*	zero or one	-
instance_of	Instance	zero or one	use intersection constructs in place of multiple values
intersection_of	Term or Typedef	zero or (two or more)	-
union_of	Term or Typedef	zero or (two or more)	-
created_by	*	zero or one	-
creation_date	*	zero or one	-
is_metadata_tag	Typedef	zero or one	-
is_class_level	Typedef	zero or one	-

If a frame with an intersection_of clause has exactly one intersection_of(Class-ID) clause, then the frame is said to have a genus-differentia definition.
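
Informative illustration: a structural checker for a few rows of the cardinality table could look like the following Python sketch. Only a subset of tags is covered, the names SINGLE_VALUED, ZERO_OR_TWO_PLUS and cardinality_violations are hypothetical, and the check is assumed to run after frames have been merged and default namespaces assigned.

```python
from collections import Counter

# A subset of the table above: tag -> (min, max) for "zero or one" / "one".
# Hypothetical names; intersection_of and union_of are handled separately.
SINGLE_VALUED = {"name": (0, 1), "namespace": (1, 1), "def": (0, 1),
                 "comment": (0, 1), "is_obsolete": (0, 1),
                 "created_by": (0, 1), "creation_date": (0, 1)}
ZERO_OR_TWO_PLUS = {"intersection_of", "union_of"}


def cardinality_violations(clauses):
    """Return the sorted tags whose counts break the constraints checked here."""
    counts = Counter(tag for tag, _ in clauses)
    bad = {tag for tag, (low, high) in SINGLE_VALUED.items()
           if not low <= counts[tag] <= high}
    # "zero or (two or more)": exactly one occurrence is the only illegal count.
    bad |= {tag for tag in ZERO_OR_TWO_PLUS if counts[tag] == 1}
    return sorted(bad)
```

Tags that are absent from both collections are treated as having free cardinality, matching the rule stated before the table.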

5 OBO Semantics: Mapping to OWL2-DL

On completion, this section will define the semantics of the entirety of OBO via mappings to OWL2. The mappings could also be used to specify a translation procedure and/or an interface to OWL tools (such as OWL reasoners).

The translation is defined using a translation function T which translates (a fragment of) OBO into OWL DL. The definition of T is often recursive, but it will eventually "ground out" in (a fragment of) OWL DL.

5.0 Ontologies

An OBO-Doc translates to an OWL Ontology. The Ontology IRI is generated from the value of the "ontology" tag in the header frame.

5.0.1 Ontology Documents

Mapping of Ontology Document
OBO	Translation - T(S)
OBO-Doc(Header1...HeaderN Frame1..FrameM)	Ontology(T^OntID(Header[@ontology],Header[@data-version]) T^H(Header1)...T^H(HeaderN) T^F(Frame1)...T^F(FrameM) )

5.0.2 Ontology IRIs

An OBO ontology ID can be either an entire IRI or an abbreviated form such as 'go'. The abbreviated form is translated by prefixing the standard obolibrary PURL scheme and appending ".owl".

T^OntID(IRI) = IRI

T^OntID(abbreviated-ID) = 'http://purl.obolibrary.org/obo/' abbreviated-ID '.owl'

An ontology ID is in abbreviated form if it contains only alphanumeric characters, plus '_', '-' and '.'.
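
Informative illustration: the T^OntID rules can be written as a small Python function. The names ontology_iri, OBO_PURL_BASE and ABBREVIATED_ID are hypothetical, and the sketch simply assumes that any value that is not in abbreviated form is already a full IRI, without validating it.

```python
import re

OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"
# Alphanumeric characters plus '_', '-' and '.', as stated above.
ABBREVIATED_ID = re.compile(r"[A-Za-z0-9_.-]+")


def ontology_iri(ontology_id):
    """Translate the value of the header 'ontology' tag into an ontology IRI."""
    if ABBREVIATED_ID.fullmatch(ontology_id):
        return OBO_PURL_BASE + ontology_id + ".owl"
    return ontology_id  # assumed to already be a full IRI
```

With this sketch the header value 'go' yields http://purl.obolibrary.org/obo/go.owl, while a value containing ':' or '/' is passed through unchanged.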

5.0.3 Ontology Headers

Mapping of Ontology Headers
OBO	Translation - T^H(S)
data-version(String)	-
import(IRI)	Import(IRI)
remark(String)	Annotation(T(remark) T(String))
subsetdef(ID String)	AnnotationProperty(T(ID)) SubAnnotationPropertyOf( T(ID) T(subsetdef) ) AnnotationAssertion(T(name) T(ID) ID) AnnotationAssertion(T(comment) T(ID) String)
synonymtypedef(ID String [Scope])	AnnotationProperty(T(ID)) SubAnnotationPropertyOf( T(ID) T(subsetdef) ) AnnotationAssertion(T(name) T(ID) T(String)) [AnnotationAssertion(T(hasScope) T(ID) T(Scope))]
<Tag>(String Qualifiers)	AnnotationAssertion(T(Qualifiers) T(<Tag>) T(ID) T(String))

5.0.4 Untranslatable OWL axioms

An ontology can contain any number of arbitrary OWL axioms encoded in OWL functional syntax in the ontology header using the owl-axioms tag. The values of all owl-axioms clauses are concatenated together (with newline characters appended) in order of appearance in the document and interpreted as if they were enclosed within an "Ontology( ... )" production in functional syntax. This allows any OWL ontology to be roundtripped through OBO-format, albeit in such a way that expressive OWL axioms are opaque to many OBO-format applications.
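
Informative illustration: the concatenation rule described above can be sketched in Python as follows. The function name owl_axioms_document and the representation of header clauses as a list of tag-value pairs are assumptions of the example. The resulting string would still have to be handed to an OWL functional syntax parser together with the pre-declared prefixes, which is outside the scope of this sketch.

```python
# Hypothetical model: header_clauses is a list of (tag, value) pairs in
# document order.

def owl_axioms_document(header_clauses):
    """Join owl-axioms values, in document order, inside Ontology( ... )."""
    body = "".join(value + "\n"
                   for tag, value in header_clauses if tag == "owl-axioms")
    return "Ontology(\n" + body + ")"
```

Because the values are joined before parsing, an axiom split across several owl-axioms clauses is reassembled into the same text as an axiom written on a single line.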

Prefixes

The following prefixes are assumed to be pre-declared - other prefixes cannot be declared and thus full IRIs must be used:

Prefix: xsd: <http://www.w3.org/2001/XMLSchema#>
Prefix: owl: <http://www.w3.org/2002/07/owl#>
Prefix: : <http://purl.obolibrary.org/obo/>
Prefix: oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
Prefix: xml: <http://www.w3.org/XML/1998/namespace>
Prefix: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
Prefix: dc: <http://purl.org/dc/elements/1.1/>
Prefix: rdfs: <http://www.w3.org/2000/01/rdf-schema#>

It is recommended that writers use this set of prefixes to write CURIES (shortened URIs), but this is not required.

Example

format-version: 1.2
ontology: go
date: 17:04:2012 15:38
...
remark: this is an example of what a header in the GO might look like
remark: We have two axioms, the first declares that nothing is part of
remark: both a nucleus and a cytoplasm, the second declares that
remark: nothing is part of both a cytoplasm and a plasma membrane.
remark: (Note we could have chosen to use a single axiom with 3
remark: arguments, which is better as it also gives us the spatial
remark: disjointness between nucleus and PM)
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634))
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))

[Term]
...

Behavior for OBO-writers (normative)

When translating an OWL ontology to an OBO-document, any axiom that is not translated according to the standard obo2owl rules must be added as an owl-axioms clause.

Recommendations for OBO-writers (informative)

Note that individual axioms may be broken over multiple lines; alternatively, multiple axioms may be concatenated as one line. It is recommended that writers write each axiom on its own line/clause.

Allowed

owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737)
owl-axioms: ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634))
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737)
owl-axioms: ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))

Allowed

owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634)) DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))

It is strongly recommended that writers only generate owl-axioms clauses for axioms that fall outside of the OBO-subset of OWL2-DL. Remember that most applications which consume obo-format will ignore the owl-axioms headers.

5.1 Declarations

Mapping of frames/stanzas to OWL
OBO	Conditions	Translation - T^F(S)
Term(Class-id Clause-1..Clause-n)		Class(T(Class-id)) T^C(Clause-1)..T^C(Clause-n)
Typedef(Rel-ID Clause-1..Clause-n)	is_metadata_tag(true) ∈ TypedefFrames[Rel-ID]	AnnotationProperty(T(Rel-ID)) T^P(Clause-1)..T^P(Clause-n)
Typedef(Rel-ID Clause-1..Clause-n)	is_metadata_tag(true) ∉ TypedefFrames[Rel-ID]	ObjectProperty(T(Rel-ID)) T^P(Clause-1)..T^P(Clause-n)
Instance(Instance-id Clause-1..Clause-n)		NamedIndividual(T(Instance-id)) T^I(Clause-1)..T^I(Clause-n)

5.2 Mapping Term Frames - Class Axioms

5.2.1 Class Axioms, Basic Table

Term frames are mapped to Class declarations and Class-level axioms. We list the translations that are specific to Term frames here - generic translations making annotation assertions are applied later on.

Translation of Term frames to OWL
Term frame with id:Class-ID	Translation - T^C(S)
is_a(SubClass-ID Qualifiers)	SubClassOf(T(Qualifiers) T(Class-ID) T(SubClass-ID))
relationship(Rel-ID TargetClass-ID Qualifiers) on condition: is_metadata_tag(true) ∈ TypedefFrames[Rel-ID]	AnnotationAssertion(T(Qualifiers) T(Rel-ID) T(Class-ID) T(TargetClass-ID) )
relationship(Rel-ID TargetClass-ID Qualifiers) on condition: is_metadata_tag(true) ∉ TypedefFrames[Rel-ID]	SubClassOf(T(Qualifiers) T(Class-ID) T(rel(Rel-ID TargetClass-ID Qualifiers)) )
intersection_of(GenusClass1 Qualifiers1) ... intersection_of(GenusClassN QualifiersN) ... intersection_of(Rel-ID1 TargetClass-ID1 QualifiersN+1) ... intersection_of(Rel-IDM TargetClass-IDM QualifiersN+M)	EquivalentClasses(T(Qualifiers1 .. QualifiersN+M) T(Class-ID) ObjectIntersectionOf( T(GenusClass1) ... T(GenusClassN) T(rel(Rel-ID1 TargetClass-ID1 Qualifiers)) ... T(rel(Rel-IDM TargetClass-IDM Qualifiers))))
union_of(UnionClass1 Qualifiers1) ... union_of(UnionClassN QualifiersN)	EquivalentClasses(T(Qualifiers1 .. QualifiersN) T(Class-ID) ObjectUnionOf( T(UnionClass1) ... T(UnionClassN) ))
disjoint_from(TargetClass-ID Qualifiers)	DisjointClasses(T(Qualifiers) T(Class-ID) T(TargetClass-ID))
equivalent_to(TargetClass-ID Qualifiers)	EquivalentClasses(T(Qualifiers) T(Class-ID) T(TargetClass-ID))

5.2.2 Treatment of gci_relation qualifier

If Qualifiers contains the tags gci_relation and gci_filler then T(Class-ID) in 5.2.1 is translated as a class expression:

ObjectIntersectionOf(T(Class-ID) ObjectSomeValuesFrom(T(GCI-Relation) T(GCI-Filler)))

This effectively allows a limited form of General Class Inclusion (GCI) axioms in OBO Documents. For example, in:

id: CL:0000232
name: erythrocyte
relationship: has_part GO:0005634 {gci_relation="part_of", gci_filler="NCBITaxon:7955"} ! nucleus

the relationship is translated as:

SubClassOf( ObjectIntersectionOf( ObjectSomeValuesFrom(T('part_of') T('NCBITaxon:7955')) T('CL:0000232')) ObjectSomeValuesFrom(T('has_part') T('GO:0005634')) )

The resulting axiom would be (in Manchester-like syntax, informative):

(erythrocyte and part_of some 'Danio rerio') SubClassOf has_part some nucleus

5.3 Class Expressions

The relationship and intersection_of tags can take pairs of values; these pairs map to OWL Class Expressions according to the following table.

Mappings are listed in order of precedence - an expression cannot match more than one production rule, only the first match counts.

Translation of class expressions to OWL
Expression	Condition	Translation - T(S)
rel(Rel-ID Class-ID Qualifiers)	cardinality(X) ∈ Qualifiers	ObjectExactCardinality(X T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers)	cardinality(0) ∈ Qualifiers or maxCardinality(0) ∈ Qualifiers	ObjectAllValuesFrom(T(Rel-ID) ObjectComplementOf(T(Class-ID)))
rel(Rel-ID Class-ID Qualifiers)	minCardinality(A) ∈ Qualifiers and maxCardinality(B) ∈ Qualifiers	ObjectIntersectionOf( ObjectMinCardinality(A T(Rel-ID) T(Class-ID)) ObjectMaxCardinality(B T(Rel-ID) T(Class-ID)))
rel(Rel-ID Class-ID Qualifiers)	minCardinality(A) ∈ Qualifiers	ObjectMinCardinality(A T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers)	maxCardinality(A) ∈ Qualifiers	ObjectMaxCardinality(A T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers)	all_only(true) ∈ Qualifiers and all_some(true) ∈ Qualifiers	ObjectIntersectionOf( ObjectSomeValuesFrom(T(Rel-ID) T(Class-ID)) ObjectAllValuesFrom(T(Rel-ID) T(Class-ID)) )
rel(Rel-ID Class-ID Qualifiers)	all_only(true) ∈ Qualifiers	ObjectAllValuesFrom(T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers)	is_class_level(true) ∈ TypedefFrames[Rel-ID]	ObjectHasValue(T(Rel-ID) T(Class-ID))
rel(Rel-ID Class-ID Qualifiers)		ObjectSomeValuesFrom(T(Rel-ID) T(Class-ID))
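
Informative illustration: the first-match-wins reading of the table above can be sketched in Python as a chain of tests in table order. The function name translate_rel is hypothetical, qualifier values are assumed to be held as strings, and the arguments rel and cls stand for the already translated T(Rel-ID) and T(Class-ID). Because the sketch follows the table order literally, a cardinality qualifier of 0 is caught by the first row rather than the second.

```python
def translate_rel(rel, cls, qualifiers=None, is_class_level=False):
    """Render rel(Rel-ID Class-ID Qualifiers) as an OWL class expression.

    rel and cls stand for the already translated T(Rel-ID) and T(Class-ID).
    Rules are tried in table order; the first match wins.
    """
    q = qualifiers or {}
    if "cardinality" in q:
        return f"ObjectExactCardinality({q['cardinality']} {rel} {cls})"
    if q.get("maxCardinality") == "0":
        return f"ObjectAllValuesFrom({rel} ObjectComplementOf({cls}))"
    if "minCardinality" in q and "maxCardinality" in q:
        return ("ObjectIntersectionOf("
                f"ObjectMinCardinality({q['minCardinality']} {rel} {cls}) "
                f"ObjectMaxCardinality({q['maxCardinality']} {rel} {cls}))")
    if "minCardinality" in q:
        return f"ObjectMinCardinality({q['minCardinality']} {rel} {cls})"
    if "maxCardinality" in q:
        return f"ObjectMaxCardinality({q['maxCardinality']} {rel} {cls})"
    all_only = q.get("all_only") == "true"
    if all_only and q.get("all_some") == "true":
        return ("ObjectIntersectionOf("
                f"ObjectSomeValuesFrom({rel} {cls}) "
                f"ObjectAllValuesFrom({rel} {cls}))")
    if all_only:
        return f"ObjectAllValuesFrom({rel} {cls})"
    if is_class_level:
        return f"ObjectHasValue({rel} {cls})"
    # Default row: an existential restriction.
    return f"ObjectSomeValuesFrom({rel} {cls})"
```

The final return corresponds to the last row of the table, which is why a relationship with no qualifiers is read as an existential restriction.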

5.4 Property Axioms

Note that OWL2 does not have the ability to declare EquivalentProperties between a named object property and either an intersection or a union. We translate these OBO constructs to a weaker axiom and add an AnnotationAssertion - we will later provide an appendix for handling these in FOL. This is also true for equivalent_to_chain.

Translation of Typedef frames to OWL
Typedef frame with id:Rel-ID	Translation - T^P(S)
is_a(SubRel-ID Qualifiers)	SubObjectPropertyOf(T(Qualifiers) T(Rel-ID) T(SubRel-ID))
domain(TargetClass-ID Qualifiers)	ObjectPropertyDomain(T(Qualifiers) T(Rel-ID) T(TargetClass-ID))
range(TargetClass-ID Qualifiers)	ObjectPropertyRange(T(Qualifiers) T(Rel-ID) T(TargetClass-ID))
disjoint_from(TargetRel-ID Qualifiers)	DisjointObjectProperties(T(Qualifiers) T(Rel-ID) T(TargetRel-ID))
inverse_of(TargetRel-ID Qualifiers)	InverseProperties(T(Qualifiers) T(Rel-ID) T(TargetRel-ID))
equivalent_to(TargetRel-ID Qualifiers)	EquivalentObjectProperties(T(Qualifiers) T(Rel-ID) T(TargetRel-ID))
relationship(MetaRel-ID TargetRel-ID Qualifiers)	AnnotationAssertion(T(Qualifiers) T(MetaRel-ID) T(Rel-ID) T(TargetRel-ID) )
intersection_of(TargetRel-ID Qualifiers)	SubObjectPropertyOf(T(Qualifiers) T(Rel-ID) T(TargetRel-ID)) AnnotationAssertion(T(Qualifiers) T(relation_intersection_of) T(Rel-ID) T(TargetRel-ID))
union_of(TargetRel-ID Qualifiers)	SubObjectPropertyOf(T(Qualifiers) T(TargetRel-ID) T(Rel-ID)) AnnotationAssertion(T(Qualifiers) T(union_intersection_of) T(Rel-ID) T(TargetRel-ID))
transitive_over(Rel2-ID Qualifiers)	SubObjectPropertyOf(T(Qualifiers) ObjectPropertyChain( T(Rel-ID) T(Rel2-ID) ) T(Rel-ID) )
holds_over_chain(Rel1-ID Rel2-ID Qualifiers)	SubObjectPropertyOf(T(Qualifiers) ObjectPropertyChain( T(Rel1-ID) T(Rel2-ID) ) T(Rel-ID) )
equivalent_to_chain(Rel1-ID Rel2-ID Qualifiers)	SubObjectPropertyOf(T(Qualifiers) ObjectPropertyChain( T(Rel1-ID) T(Rel2-ID) ) T(Rel-ID) ) AnnotationAssertion( TODO )
disjoint_over(OverRel-ID Qualifiers)	AnnotationAssertion(T(Qualifiers) T(disjoint_over) T(Rel-ID) T(OverRel-ID))
is_cyclic(true Qualifiers)	AnnotationAssertion(T(Qualifiers) T(is_cyclic) T(Rel-ID) T(true))
is_transitive(true Qualifiers)	TransitiveObjectProperty(T(Qualifiers) T(Rel-ID))
is_anti_symmetric(true Qualifiers)	AnnotationAssertion(T(Qualifiers) T(is_anti_symmetric) T(Rel-ID) T(true))
is_reflexive(true Qualifiers)	ReflexiveObjectProperty(T(Qualifiers) T(Rel-ID))
is_symmetric(true Qualifiers)	SymmetricObjectProperty(T(Qualifiers) T(Rel-ID))
is_asymmetric(true Qualifiers)	AsymmetricObjectProperty(T(Qualifiers) T(Rel-ID))
is_functional(true Qualifiers)	FunctionalObjectProperty(T(Qualifiers) T(Rel-ID))
is_inverse_functional(true Qualifiers)	InverseFunctionalObjectProperty(T(Qualifiers) T(Rel-ID))
expand_expression_to(String Xrefs Qualifiers)	AnnotationAssertion(T(Qualifiers) T(expand_expression_to) T(Rel-ID) T(String))
expand_assertion_to(String Xrefs Qualifiers)	AnnotationAssertion(T(Qualifiers) T(expand_assertion_to) T(Rel-ID) T(String))

Inferred Axioms

The OBO Format equivalent_to_chain clause can only be partially expressed in OWL2. We can do some inference as part of translation - see this prover9 file for a proof.

Additional axioms generated by equivalent_to_chain clauses
Typedef frame with id:Rel-ID	Translation - T^P(S)
equivalent_to_chain(Rel1-ID Rel2-ID Qualifiers) is_transitive(Rel2-ID)	SubObjectPropertyOf(ObjectPropertyChain( T(Rel-ID) T(Rel2-ID) ) T(Rel-ID) )
equivalent_to_chain(Rel1-ID Rel2-ID Qualifiers) is_transitive(Rel1-ID)	SubObjectPropertyOf(ObjectPropertyChain( T(Rel1-ID) T(Rel-ID) ) T(Rel-ID) )

5.5 Individual Axioms

Translation of Instance frames to OWL
Instance frame with id:Instance-ID	Translation - T(S)
instance_of(Class-ID Qualifiers)	ClassAssertion(T(Qualifiers) T(Class-ID) T(Instance-ID))
relationship(Rel-ID ID Qualifiers)	ObjectPropertyAssertion(T(Qualifiers) T(Rel-ID) T(Instance-ID) T(ID))

5.6 Common Elements to Annotation Properties

Translation of common frame elements
Frame with id:ID	Translation - T(S)
name(String Qualifiers)	AnnotationAssertion(T(Qualifiers) T(name) T(ID) T(String))
def(String Xrefs Qualifiers)	AnnotationAssertion(T(ann(Xrefs Qualifiers)) T(def) T(ID) T(String))
synonym(String 'EXACT' [Type] Xrefs Qualifiers)	AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(exactSynonym) T(ID) T(String))
synonym(String 'NARROW' [Type] Xrefs Qualifiers)	AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(narrowSynonym) T(ID) T(String))
synonym(String 'RELATED' [Type] Xrefs Qualifiers)	AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(relatedSynonym) T(ID) T(String))
synonym(String 'BROAD' [Type] Xrefs Qualifiers)	AnnotationAssertion(T(ann( [Type] Xrefs Qualifiers)) T(broadSynonym) T(ID) T(String))
creation_date(ISO-Date Qualifiers)	AnnotationAssertion(T(Qualifiers) T(creation_date) T(ID) T(ISO-Date))
xref(X-ID Qualifiers)	AnnotationAssertion(T(Qualifiers) T(xref) T(ID) T(X-ID))
property_value(Rel-ID Entity-ID Qualifiers)	AnnotationAssertion(T(Qualifiers) T(Rel-ID) T(ID) T(Entity-ID))
property_value(Rel-ID Value XSD-Type Qualifiers)	AnnotationAssertion(T(Qualifiers) T(Rel-ID) T(ID) T^literal(Value XSD-Type))
subset(Subsetdef-ID)	AnnotationAssertion(T(subset) T(ID) T(Subsetdef-ID))
<Tag>(String Qualifiers)	AnnotationAssertion(T(Qualifiers) T(<Tag>) T(ID) T(String))

5.7 Translation of OBO Qualifiers to Annotations

Translation of qualifier expressions
Expression	Translation - T(S)
Qualifier1 ... QualifierN	T(Qualifier1) ... T(QualifierN)
Qualifier(Name,Value)	Annotation( T(Name) T(Value))
ann(Xref1...XrefN Qualifiers)	Annotation( T(xref) T(Xref1)) ... Annotation( T(xref) T(XrefN)) T(Qualifiers)
ann( Type Xrefs Qualifiers)	Annotation( T(synonym-type) T(Type)) T(ann(Xrefs Qualifiers))

The following qualifiers are not translated to annotations:

cardinality
minCardinality
maxCardinality
gci_relation
gci_filler
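
As a rough illustration of the qualifier rules above, the following Python sketch turns a list of qualifiers into Annotation(...) strings and skips the five names that are not translated. The function name `translate_qualifiers`, the (name, value) pair representation and the rendering of values as quoted strings are assumptions made for this example; a real translator would apply T() to both name and value.

```python
# Qualifier names that carry logical meaning and are therefore not
# emitted as annotations (taken from the list above).
UNTRANSLATED = {
    "cardinality", "minCardinality", "maxCardinality",
    "gci_relation", "gci_filler",
}

def translate_qualifiers(qualifiers):
    """Map (name, value) pairs to Annotation(...) strings.

    Values are rendered as plain quoted strings, which is a simplification.
    """
    return [
        f'Annotation({name} "{value}")'
        for name, value in qualifiers
        if name not in UNTRANSLATED
    ]
```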

5.8 Translation of Annotation Vocabulary

Annotation properties are drawn from the following vocabularies:

IAO - http://purl.obolibrary.org/obo/IAO_
oboInOwl - http://www.geneontology.org/formats/oboInOwl#
rdfs - http://www.w3.org/2000/01/rdf-schema#
dc - http://purl.org/dc/elements/1.1/

Translation of identifiers - hardcoded values
ID	Translation - T(S)
is_obsolete	owl:deprecated
name	rdfs:label
comment	rdfs:comment
remark	rdfs:comment
date	dc:date
expand_expression_to	obo:IAO_0000424
expand_assertion_to	obo:IAO_0000425
definition	obo:IAO_0000115
is_anti_symmetric	obo:IAO_0000427
replaced_by	obo:IAO_0100001
consider	oboInOwl:consider
hasScope	oboInOwl:hasScope
synonym-type	oboInOwl:hasSynonymType
exactSynonym	oboInOwl:hasExactSynonym
narrowSynonym	oboInOwl:hasNarrowSynonym
broadSynonym	oboInOwl:hasBroadSynonym
relatedSynonym	oboInOwl:hasRelatedSynonym
synonymtypedef	oboInOwl:SynonymTypeProperty
xref	oboInOwl:hasDbXref
subsetdef	oboInOwl:SubsetProperty
subset	oboInOwl:inSubset
alt_id	oboInOwl:hasAlternativeId
shorthand	oboInOwl:shorthand
created_by	dc:creator
creation_date	dc:date
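
The hardcoded table lends itself to a simple lookup followed by prefix expansion. The sketch below shows one way to do this in Python, using a handful of rows from the table. The function name `annotation_property_iri` is hypothetical, and the bindings for the `obo` and `owl` prefixes are the conventional ones, assumed here because they are not listed among the vocabularies above.

```python
PREFIXES = {
    "obo": "http://purl.obolibrary.org/obo/",  # assumed conventional binding
    "owl": "http://www.w3.org/2002/07/owl#",   # assumed conventional binding
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A subset of the hardcoded translations, copied from the table.
HARDCODED = {
    "is_obsolete": "owl:deprecated",
    "name": "rdfs:label",
    "comment": "rdfs:comment",
    "definition": "obo:IAO_0000115",
    "xref": "oboInOwl:hasDbXref",
    "subset": "oboInOwl:inSubset",
    "created_by": "dc:creator",
}

def annotation_property_iri(tag):
    """Return the full IRI for a hardcoded tag, or None if the tag is not listed."""
    curie = HARDCODED.get(tag)
    if curie is None:
        return None
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local
```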

Note that we have no cognate of IAO:definition_source. Instead, the source of a definition is recorded as a generic xref annotation on the definition annotation assertion.

5.8.1 Semantics of synonyms

Mapping of OBO Format synonym clauses is described in 5.6. These map to annotation properties without any semantics in OWL. This section provides semantics external to OWL.

Informal Description (Informative)

In OBO Format, the term "synonym" is used loosely for any kind of alternative label for a class (the "name" tag is used for the community preferred label). A label is a synonym for a class if there exists some user or user community (existing or historic) for which this label unambiguously denotes the class.

Synonyms are always scoped into one of four disjoint categories: EXACT, BROAD, NARROW, RELATED. A synonym is EXACT if it is a "true" synonym - most members of the community served by the ontology regard the exact synonym as substitutable for the primary label. If an ontology contains two classes which share either a primary label (name) or an exact synonym, then these classes are equivalent. A synonym for a class C is BROAD if it denotes a broader class than C. Here, broader than is an informal notion that encompasses both subsumption and possibly mereological and temporal containment. For example, "skull" could conceivably be a BROAD synonym for the class cranium. Conversely, a synonym for a class C is NARROW if C denotes a broader class than the synonym. If a synonym is not EXACT, NARROW or BROAD, then it is RELATED.

Scoping of synonyms has proven extremely useful for OBO format ontologies - most ontologies use this feature. The ability to designate a synonym as EXACT, in particular, is very useful for introducing additional terminological precision and reducing ambiguity in ontologies. These have also proven useful for NLP purposes.

5.9 Translation of Identifiers

5.9.1. Pre-processing

Every OBO-Document is automatically populated with two clauses:

idspace(RO "http://purl.obolibrary.org/obo/RO_")
idspace(BFO "http://purl.obolibrary.org/obo/BFO_")

(see 4.4.1)

5.9.2. Translation of identifiers

The following table is used when translating an OBO identifier to an IRI

Translation of identifiers
ID	Translation - T(S)
Canonical-Prefixed-ID	T^prefix(Canonical-IDPrefix) Canonical-LocalID
NonCanonical-Prefixed-ID	T^prefix(Any-IDPrefix) '#' Any-LocalID
Unprefixed-ID	OntologyIRI '#' Unprefixed-ID
URL-as-ID	URL-as-ID

5.9.3. Special Rules for Relations

An exception is made to 5.9.2 for translating relation IDs:

If an Unprefixed-ID is declared in a Typedef frame, and that Typedef frame has an xref clause, then the value of that xref clause is used in the translation.
If there are multiple xref clauses in the Typedef frame, then precedence is as follows:
- xrefs in the 'BFO' and 'RO' ID-space take priority over all others
- xrefs for which there is a corresponding idspace declaration in the header of the same document take precedence over xrefs for which there is no such declaration
If these rules produce no clear winner, the source document is invalid.

A side effect of this substitution is the generation of an annotation assertion that preserves the original shorthand relation: AnnotationAssertion(T(shorthand) T(Formal-ID) "Original-ID"^^xsd:string)

These rules are normative. An informative summary is provided in section 8.2.2.
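
The precedence rules for multiple xrefs can be sketched as a tiered selection. In the Python below, the function name `pick_relation_xref` and its argument shapes are hypothetical. It treats "no clear winner" as "more than one candidate remains in the highest non-empty tier", which is one reading of the rules above rather than a definitive one.

```python
def pick_relation_xref(xrefs, declared_idspaces):
    """Choose the xref that replaces a shorthand relation ID.

    xrefs: xref values from the Typedef frame, e.g. ["BFO:0000050"].
    declared_idspaces: ID prefixes with an idspace clause in the header.
    Returns None if there are no xrefs; raises ValueError if ambiguous.
    """
    def prefix(xref):
        return xref.split(":", 1)[0]

    xrefs = list(dict.fromkeys(xrefs))  # drop repeats, keep order
    tiers = [
        [x for x in xrefs if prefix(x) in ("BFO", "RO")],
        [x for x in xrefs if prefix(x) in declared_idspaces],
        xrefs,
    ]
    for tier in tiers:
        if len(tier) == 1:
            return tier[0]
        if len(tier) > 1:
            raise ValueError(f"no clear winner among xrefs: {tier}")
    return None
```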

Mapping ID Prefixes

If the OBODoc header tag contains an idspace clause idspace(IDPrefix URIPrefix), then the value of T^prefix(IDPrefix) is URIPrefix
If the OBODoc header tag contains no matching idspace clause then the value of T^prefix(prefix) is 'http://purl.obolibrary.org/obo/prefix_'

Examples (informative)
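
The table in 5.9.2 and the prefix-mapping rules can be combined into a small translation function. The Python sketch below is illustrative only. The function names, the regular expression used to recognise a Canonical-Prefixed-ID (letters or underscores, a colon, then digits) and the restriction of URL-as-ID to http and https schemes are assumptions, and the special rules for relations in 5.9.3 are not applied.

```python
import re

DEFAULT_BASE = "http://purl.obolibrary.org/obo/"

# Assumed shape of a Canonical-Prefixed-ID; see section 2.5 for the grammar.
CANONICAL = re.compile(r"^([A-Za-z_]+):([0-9]+)$")

def t_prefix(prefix, idspaces):
    """Use the header idspace mapping if present, else the default OBO PURL base."""
    return idspaces.get(prefix, f"{DEFAULT_BASE}{prefix}_")

def translate_id(obo_id, ontology_iri, idspaces=None):
    """Translate an OBO identifier to an IRI string."""
    # RO and BFO are implicitly declared (5.9.1); header clauses may override.
    idspaces = {
        "RO": DEFAULT_BASE + "RO_",
        "BFO": DEFAULT_BASE + "BFO_",
        **(idspaces or {}),
    }
    if obo_id.startswith(("http:", "https:")):   # URL-as-ID
        return obo_id
    match = CANONICAL.match(obo_id)
    if match:                                    # Canonical-Prefixed-ID
        return t_prefix(match.group(1), idspaces) + match.group(2)
    if ":" in obo_id:                            # NonCanonical-Prefixed-ID
        prefix, local = obo_id.split(":", 1)
        return t_prefix(prefix, idspaces) + "#" + local
    return ontology_iri + "#" + obo_id           # Unprefixed-ID
```

Under these assumptions, `translate_id("GO:0008150", "http://example.org/my.owl")` yields `http://purl.obolibrary.org/obo/GO_0008150`, while the unprefixed `part_of` is placed in the ontology's own namespace as `http://example.org/my.owl#part_of`.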

5.10 Post-Processing OWL DL

Post-processing OWL2 DL
Remove Axiom	If Ontology Contains	Replace With
ObjectExactCardinality(n P CE), n>0	TransitiveObjectProperty(P)	ObjectSomeValuesFrom(P CE)
ObjectMinCardinality(n P CE), n>0	TransitiveObjectProperty(P)	ObjectSomeValuesFrom(P CE)
ObjectMaxCardinality(n P CE)	TransitiveObjectProperty(P)	-
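
These rewrites keep the output within OWL2 DL, which does not permit cardinality restrictions on transitive properties. A minimal Python sketch of the three rows follows. The tuple encoding of expressions and the function name `post_process` are hypothetical, and cases the table does not mention (such as n = 0 for exact or min cardinality) are returned unchanged.

```python
def post_process(expr, transitive):
    """Apply the post-processing table to one cardinality expression.

    expr: (kind, n, prop, filler) with kind in {"exact", "min", "max"}.
    transitive: set of property IDs declared transitive.
    Returns the replacement expression, or None if it is removed.
    """
    kind, n, prop, filler = expr
    if prop not in transitive:
        return expr                      # table only applies to transitive P
    if kind in ("exact", "min") and n > 0:
        return ("some", prop, filler)    # weaken to an existential restriction
    if kind == "max":
        return None                      # no replacement: drop the expression
    return expr
```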

5.11 OWL to OBO

Not every OWL ontology can be converted to OBO Format. This may be because the OWL ontology uses constructs that cannot be expressed directly in OBO, or it may be because the corresponding OBO ontology would violate OBO document structure constraints (see section 4.4).

If a Frame has more than one name after translation, an arbitrary one is selected and a warning is issued.
If a Frame has more than one comment after translation, the distinct comments for that frame are concatenated, separated by the \n escape sequence.
If an expression cannot be translated, it is either dropped or converted into an OWL Macro (see section 7).

Note that all classes in an OBO Document are assumed to be satisfiable. Any axioms referring to owl:Thing should be removed. An ontology with unsatisfiable classes is not translatable.

6 OBO Sublanguages

We introduce the concept of an OBO sublanguage. This is similar to an OWL profile, but the goal is not to provide sublanguages for reasoning purposes. Sublanguages are intended for legacy applications that make unwarranted assumptions about OBO format. The assumptions do not hold about OBO documents as a whole, but they may hold for particular sublanguages.

At this time, only one sublanguage is specified - OBO basic. To define this sublanguage, we first introduce a number of sublanguage characteristics.

6.1 OBO Sublanguage Characteristics

6.1.1 DAG

An ontology is a DAG if the graph formed by its logical relationships does not contain any cycles. Formally: let V be a set of nodes corresponding to all Term frames in the ontology (OWL Classes). Let E be a set of edge pairs A,B where both A and B are in V and either

A is_a B (i.e. SubClassOf(A B)) or
A has a clause relationship: R B (i.e. SubClassOf(A ObjectSomeValuesFrom(R B)))

Note that the relationship type (i.e. ObjectProperty) is irrelevant here.

The ontology is a DAG if the transitive closure of E contains no cycles.

Note that any ontology that contains an EquivalentClasses axiom between two named classes is not a DAG, as an equivalence axiom is the same as a reciprocal SubClassOf pair.

Rationale: the original Gene Ontology paper described the GO as a DAG. Consequently a large body of legacy software exists that assumes the ontology is a DAG. This software breaks or loops indefinitely if there is a cycle involving any set of is_a or relationship tags.
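
A check for this characteristic amounts to standard cycle detection over the edge set E. The Python sketch below uses an iterative depth-first search so that deep ontologies do not hit recursion limits. The function name `is_dag` and the representation of E as (A, B) pairs are assumptions for this example; relationship types are ignored, as described above.

```python
def is_dag(edges):
    """Return True if the directed graph given by (A, B) pairs has no cycle.

    edges: pairs collected from is_a and relationship clauses.
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())

    WHITE, GREY, BLACK = 0, 1, 2        # unvisited, on current path, finished
    colour = dict.fromkeys(graph, WHITE)

    for start in graph:
        if colour[start] != WHITE:
            continue
        colour[start] = GREY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            nxt = next(successors, None)
            if nxt is None:             # all successors explored
                colour[node] = BLACK
                stack.pop()
            elif colour[nxt] == GREY:   # back edge: a cycle exists
                return False
            elif colour[nxt] == WHITE:
                colour[nxt] = GREY
                stack.append((nxt, iter(graph[nxt])))
    return True
```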

6.1.2 Dangling Clauses

A dangling clause is an is_a, relationship, intersection_of, union_of, transitive_over, equivalent_to_chain, holds_over_chain, disjoint_from, domain or range clause in which any of the values is not declared in the ontology document. Here declaration means there exists a frame with the corresponding identifier.

We can define 3 sub-characteristics:

Dangling Classes
Dangling Relations
Dangling Individuals
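
Detecting dangling clauses reduces to comparing referenced identifiers against the set of declared frame identifiers. The following Python sketch assumes a simplified in-memory model, a dict from frame ID to a list of (tag, referenced IDs) pairs; that model and the function name `dangling_references` are hypothetical.

```python
# Tags whose values must be declared, taken from the definition above.
CHECKED_TAGS = {
    "is_a", "relationship", "intersection_of", "union_of",
    "transitive_over", "equivalent_to_chain", "holds_over_chain",
    "disjoint_from", "domain", "range",
}

def dangling_references(frames):
    """Return the sorted IDs that are referenced but have no frame.

    frames: dict of frame ID -> list of (tag, [referenced IDs]).
    """
    declared = set(frames)
    return sorted({
        ref
        for clauses in frames.values()
        for tag, refs in clauses
        if tag in CHECKED_TAGS
        for ref in refs
        if ref not in declared
    })
```

An empty result means the document has no dangling clauses. Splitting the result by frame type would give the three sub-characteristics listed above.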

6.1.3 Unidirectional

An ontology is said to be multidirectional if it contains a class assertion involving property P and a class assertion involving property P', where P is the inverse of P'. An ontology is unidirectional if it is not multidirectional.

For example, if an ontology contains statements using both part_of and has_part properties, then the ontology is not unidirectional.
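
A simple test for this characteristic checks whether any declared inverse pair has both members in use. In the Python sketch below, `is_unidirectional`, the set of used relation IDs and the `inverse_of` dict (built from inverse_of clauses) are all assumed representations.

```python
def is_unidirectional(used_relations, inverse_of):
    """Return True if no relation is used together with its declared inverse.

    used_relations: relation IDs appearing in relationship clauses.
    inverse_of: dict of relation ID -> inverse relation ID.
    """
    used = set(used_relations)
    # Treat each inverse declaration as an unordered pair.
    pairs = {frozenset((a, b)) for a, b in inverse_of.items()}
    return not any(pair <= used for pair in pairs)
```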

Note that multidirectional ontologies will frequently be non-DAGs, according to the definition above, but this need not hold.

6.1.4 Fully asserted

An ontology is fully asserted if the graph formed by is_a and relationship clauses (i.e. SubClassOf axioms) is not lacking any relationships that can be inferred from the full set of axioms in the ontology (e.g. EquivalentClasses axioms, or DisjointClasses axioms).

More formally: O is an ontology, and O' is an ontology consisting only of is_a and relationship tags asserted in O. O is fully-asserted if there is no SubClassOf axiom that is entailed by O but not by O'.

If an ontology is fully asserted, then basic applications can use solely is_a and relationship tags for graph traversal. intersection_of tags can be removed with no loss of functionality in these basic applications. Ontology publishers may wish to keep two versions - an edit version, in which only minimal assertions are made, and a release version, which is fully asserted, with the assertions added by a reasoner.

6.1.5 Fully Labeled

An OBO ontology is fully labeled if and only if every frame has a "name" tag.

6.1.6 Uniquely Labeled

An OBO ontology is uniquely labeled if and only if no two entities share the same name, considering both the source ontology and its full import closure.

6.1.7 Locally Uniquely Labeled

An OBO ontology is locally uniquely labeled if and only if no two entities share the same name, considering only the source ontology but not its import closure.
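
Both labeling characteristics can be tested with the same grouping step; only the input differs (source ontology alone, or source plus import closure). The Python sketch below assumes frames are available as (id, name) pairs and uses a hypothetical function name, `duplicate_names`.

```python
from collections import defaultdict

def duplicate_names(frames):
    """Return {name: [ids]} for every name shared by more than one entity.

    frames: iterable of (frame ID, name) pairs.
    """
    by_name = defaultdict(set)
    for frame_id, name in frames:
        by_name[name].add(frame_id)
    return {
        name: sorted(ids)
        for name, ids in by_name.items()
        if len(ids) > 1
    }
```

An empty dict indicates that the supplied set of frames is uniquely labeled.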

6.1.8 No equivalence axioms

An OBO document has no equivalence axioms if the OWL translation contains no EquivalentClasses axioms.

6.1.9 No obo-namespace crossing

An OBO document has no obo-namespace crossing if there are no is_a, relationship, intersection_of (genus portion) or union_of clauses that span two obo-namespaces. Note that an obo-namespace is not the same as an OWL namespace. It is specified by the namespace tag in OBOF. An example is found in the GO, which is a single ontology partitioned into 3 obo-namespaces. Some legacy software assumes no namespace crossings, so the basic version of the GO released by the GOC has no obo-namespace crossings.

6.1.10 Singly labeled edges

An OBO document has singly labeled edges if there are no classes A and B such that two or more asserted relationships connect A and B.

6.1.11 No disjointness axioms

An ontology contains no disjointness axioms if it has no DisjointClasses, DisjointUnion or DisjointObjectProperties axioms. On the obo-format level this manifests as no "disjoint_from:" tags.

Rationale: Some OBO-format based tools do not handle disjointness axioms well

6.1.11 No qualifier lists

An OBO document has no qualifier lists if no '{...}' block appears in the qualifier list position of any clause. That is, the following grammar rule (from section 3.2):

QualifierBlock ::= '{' QualifierList '}'

Can be replaced by: QualifierBlock ::=

On the OWL document level, this is equivalent to an OWL document with no Axiom Annotations except those axiom annotations required to support OBO metadata constructs, such as definitions and synonyms.

6.1.12 No owl-axioms header

See 5.0.4. An ontology has no owl-axioms header if the header does not contain an "owl-axioms:" tag.

6.1.13 No imports

An ontology has no imports if the header does not contain any "import:" tags

6.2 OBO Basic

OBO Basic is a sub-language of OBO format (i.e. every OBO Basic document is automatically an OBO Document). It allows basic applications to continue to make certain simplifying assumptions; many of these simplifying assumptions were based on the initial version of the Gene Ontology, and have become enshrined in many popular and useful tools such as term enrichment tools.

Examples of such assumptions include: traversing the ontology graph ignoring relationship types using a naive algorithm will not lead to cycles (i.e. the ontology is a DAG); every referenced term is declared in the ontology (i.e. there are no dangling clauses).

An ontology is OBO Basic if and only if it has the following characteristics:

DAG
Unidirectional
No Dangling Clauses
Fully Asserted
Fully Labeled
No equivalence axioms
Singly labeled edges
No qualifier lists
No disjointness axioms
No owl-axioms header
No imports

6.3 Conversion to sub-languages

In general, translation to a sublanguage is lossy, but this is not always the case. An OBO-Dangling-Clause ontology can be converted to a semantically equivalent ontology with declarations for all undeclared referenced frames (note that this typically happens as a side-effect of obo-to-owl-to-obo roundtripping). Note that this means the target ontology is no longer OBO-Fully-Labeled.

7 OWL Macros

This section can be considered an independent component that can be used for any OWL ontology. It may eventually form its own separate specification. It should be seen as informative for now.

Macros are specified as the value of an AnnotationAssertion on a property. They take the form of a string encoding either an OWL axiom or OWL expression in an extended Manchester Syntax format [Manchester OWL DL Syntax]. This extension allows the use of variable markers (written ?X or ?Y) in place of classes or class expressions. When these variable markers are replaced by actual OWL classes or expressions (again serialized in Manchester Syntax) as part of a Template Substitution Operation Subst, the resulting string must be valid Manchester Syntax.

Examples are provided in [OWL Macros].

7.1 Macro Annotation Assertions

Two IRIs are used to specify the two different types of macro expansion.

Macros
Macro	Annotation Property	Variables	Generates
expand_expression_to	obo:IAO_0000424	?Y	OWL2 Class Expression
expand_assertion_to	obo:IAO_0000425	?X,?Y	OWL2 Axiom

7.2 Macro Expansion and Replacement

OWL Macro Expansion Rules
Expression or Axiom to be Replaced	Condition	Replace With
ObjectSomeValuesFrom(P CE)	AnnotationAssertion(expand_expression_to P Template)	Subst(Template)[CE]
ObjectAllValuesFrom(P CE)	AnnotationAssertion(expand_expression_to P Template)	Subst(Template)[CE]
AnnotationAssertion(P A B)	AnnotationAssertion(expand_assertion_to P Template)	Subst(Template)[A,B]
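
The Subst operation in the table above is plain textual substitution of variable markers. A minimal Python sketch follows. The function name `subst` and the example templates in the usage note are hypothetical, and no check is made that the result is valid Manchester Syntax.

```python
import re

def subst(template, *args):
    """Replace variable markers in a macro template.

    One argument binds ?Y (expand_expression_to);
    two arguments bind ?X and ?Y (expand_assertion_to).
    """
    names = ["?Y"] if len(args) == 1 else ["?X", "?Y"]
    binding = dict(zip(names, args))
    # A single pass: text inserted for one marker is never re-scanned.
    return re.sub(
        r"\?[XY]\b",
        lambda m: binding.get(m.group(0), m.group(0)),
        template,
    )
```

For instance, with an illustrative template `has_part exactly 0 ?Y`, calling `subst(template, "nucleus")` gives `has_part exactly 0 nucleus`. Doing the replacement in one pass avoids accidental re-expansion when a substituted expression itself contains a marker-like token.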

7.3 Macro-based GCI Enhancement

An alternative strategy is to use the occurrence of macro relations to add General Class Inclusion (GCI) axioms to the ontology

GCI Addition Rules
Matched Expression or Axiom	Condition	GCI to Add
ObjectSomeValuesFrom(P CE)	AnnotationAssertion(expand_expression_to P Template)	EquivalentClasses(ObjectSomeValuesFrom(P CE) Subst(Template)[CE])
ObjectAllValuesFrom(P CE)	AnnotationAssertion(expand_expression_to P Template)	EquivalentClasses(ObjectAllValuesFrom(P CE) Subst(Template)[CE])
AnnotationAssertion(P A B)	AnnotationAssertion(expand_assertion_to P Template)	use previous table

7.4 Placement of Generated Axioms in External Ontologies

The behavior of a macro-expansion engine regarding whether to replace expressions or add new GCIs is implementation-dependent, as is the decision of whether to place generated axioms in a new ontology or the source ontology.

The recommended behavior is:

The 2nd (GCI) table should be used.
The source ontology should be left unmodified.
GCIs generated from expand_expression_to templates and new axioms generated by expand_assertion_to should be placed in a separate ontology or ontologies. The recommended IRI of this ontology will be specified in the id-policy document.

8 Recommendations (informative)

8.1 Generating OBO Format Documents

These guidelines are provided to minimize character-level differences between OBO Documents generated by different procedures.

8.1.1 Whitespace

For every tag-value line, the initial ':' character should be followed by a space character (' ').
Every successive pair of frames should be separated by a single empty line.
The initial frame should be separated from the header block by a single empty line.
Each document should have an ontology declaration in the header.
The ontology id should be all lower case, and should match the primary ID space used in the ontology (for example: 'go', 'cl').

8.1.2 File Comments

If a frame references an identifier, and that identifier is opaque (i.e. it conforms to the Canonical-Prefixed-ID production rule), then the generator should add comments, adding a label for every opaque identifier. For example:
```
relationship: part_of ABC:1234567 ! hand
relationship: R:9999999 ABC:1234567 ! part_of hand
```
All file comments should be preceded by a tag-value pair, and there should be exactly one space character on either side of the '!' character.
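
A generator could produce such lines along the following pattern. This Python sketch is illustrative: the function name `relationship_line` is hypothetical, and `labels` is assumed to hold names only for the opaque identifiers that should be commented.

```python
def relationship_line(rel_id, target_id, labels):
    """Serialize a relationship clause with a trailing label comment.

    labels: dict of opaque ID -> name; IDs absent from it get no label.
    """
    line = f"relationship: {rel_id} {target_id}"
    parts = [labels[i] for i in (rel_id, target_id) if i in labels]
    # Exactly one space on either side of '!'
    return f"{line} ! {' '.join(parts)}" if parts else line
```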

8.1.3 Tag Ordering

See the serializer conventions in the OBO format guide

8.1.4 Escaping characters

When writing solitary 'xref' tags, the comma character should be escaped.

8.2 Identifiers

8.2.1 Class (Term) Identifiers

All class identifiers should follow OBO Foundry ID Policy, i.e. all ID-spaces must be registered and approved (email obo-admin at obofoundry.org to register), and the entire ID should consist of the ID space followed by ':' followed by a zero-padded numeric local identifier (a minimum of 7 digits is recommended).
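
The syntactic part of this recommendation can be checked with a regular expression. The Python sketch below assumes that an ID space consists of ASCII letters only, which the text above does not state. It cannot verify that the ID space is registered.

```python
import re

# Assumed shape: letters, ':', then at least 7 digits.
RECOMMENDED_ID = re.compile(r"^[A-Za-z]+:[0-9]{7,}$")

def follows_id_recommendation(class_id):
    """Return True if class_id has the recommended ID-space:digits form."""
    return RECOMMENDED_ID.match(class_id) is not None
```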

8.2.2 Relation (Property) Identifiers

Relation identifiers should in general follow the same guidelines as class identifiers. Note however that the use of symbolic identifiers such as 'part_of' is common in almost all OBO format ontologies, and has a precedent stretching back over ten years. A large body of software now expects symbolic identifiers for relations, and ontology maintainers are understandably reluctant to change these to numeric identifiers.

This specification provides a means of using numeric identifiers globally whilst retaining symbolic identifiers within the context of a single file. Refer to section 5.9.3 for details.

Every symbolic relation identifier (e.g. 'part_of') should have an xref tag to a formal relation identifier. E.g.
```
[Typedef]
id: has_part
name: has_part
xref: BFO:0000051
```
This xref should refer to either BFO (http://purl.obolibrary.org/obo/bfo.owl) or to RO (http://purl.obolibrary.org/obo/ro.owl). Xrefs to the old RO can be provided for historic purposes, but are otherwise discouraged.
According to the rules in section 5.9.3, the symbolic relation identifier can be used as a shorthand for the formal relation identifier.
When roundtripping the OBO file, the symbolic identifiers should be preserved.

8.3 Individuals

Individuals are retained in OBOF1.4 for compatibility with previous versions of the format. OBOF is not a good choice for modeling with individuals or for knowledge bases that use them, and ontologies should not contain individuals.

8.4 Imports

When translating between OBO and OWL, owl:imports in an OWL ontology header are translated directly to an 'import:' header in an OBO document. The contents of the field MUST NOT be modified during translation (normative).

As a consequence of this, URIs that resolve to OWL documents must be used if the resulting OWL is to be consumed by an OWL parser. Here is an example of a valid OBO document header (taken from the GO editors file):

import: http://purl.obolibrary.org/obo/go/extensions/chebi_import.owl
import: http://purl.obolibrary.org/obo/go/extensions/cl_import.owl
import: http://purl.obolibrary.org/obo/go/extensions/po_import.owl
ontology: go

The onus is on the OBO parser to be able to handle these OWL documents, either by translating them or by mapping the URIs to OBO documents that have been translated in advance.

Most OBO tools are not equipped to do this, so imports should be used with caution in OBO format. We strongly recommend using OWL whenever imports are required, and providing OBO versions with the imports closure merged. Note that this can be done with owltools:

owltools myont.owl --merge-imports-closure -o -f obo myont.obo

Note that all versions of OBO-Edit prior to 2.3b7 should be considered broken for imports. Version 2.3b7 introduces the use of oboedit.catalog files, which allow OWL URIs to be used in import directives. These are not part of the OBO Format specification, but details are included here for completeness.

8.4.1 OBO-Edit catalog files

An OBO-Edit catalog file contains newline-separated entries mapping an OWL URI to a local file containing an OBO document, such that the OBO document can be substituted for the OWL URI with no loss of content. Each mapping occupies a single line, with the URI and the file path separated by whitespace. The file can optionally include lines starting with "#", which are ignored by parsers and intended for human readers. An example:

# GO imports mappings
http://purl.obolibrary.org/obo/go/extensions/chebi_import.owl ../extensions/chebi_import.obo
http://purl.obolibrary.org/obo/go/extensions/cl_import.owl ../extensions/cl_import.obo
http://purl.obolibrary.org/obo/go/extensions/po_import.owl ../extensions/po_import.obo

When an OBO parser encounters an import directive, it may choose to load the OBO document found in the file on the right in place of the document at the URI on the left. The oboedit.catalog file should live in the same directory as the source OBO ontology.
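
Reading a catalog file in the layout described above takes only a few lines. The Python sketch below is a hypothetical reader, not the OBO-Edit implementation; the function name `parse_catalog` and the choice to raise an error on malformed lines are assumptions.

```python
def parse_catalog(text):
    """Parse catalog text into a dict of OWL URI -> local OBO file path."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment
        parts = line.split(None, 1)       # split on the first whitespace run
        if len(parts) != 2:
            raise ValueError(f"malformed catalog line: {raw!r}")
        uri, path = parts
        mapping[uri] = path
    return mapping
```

A parser that supports catalogs would look up each import URI in the resulting dict and resolve the path relative to the directory holding the catalog file.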

9 References

9.1 Normative References

[BCP 47]: BCP 47 - Tags for Identifying Languages. A. Phillips and M. Davis, eds. IETF, September 2006. http://www.rfc-editor.org/rfc/bcp/bcp47.txt
[OWL 2 Specification]: OWL 2 Web Ontology Language: Structural Specification and Functional-Style Syntax Boris Motik, Peter F. Patel-Schneider, Bijan Parsia, eds. W3C Recommendation, 27 October 2009, http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/. Latest version available at http://www.w3.org/TR/owl2-syntax/.
[RDF Test Cases]: RDF Test Cases. Jan Grant and Dave Beckett, eds. W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-testcases-20040210/. Latest version available as http://www.w3.org/TR/rdf-testcases/.
[RFC 3629]: RFC 3629: UTF-8, a transformation format of ISO 10646. F. Yergeau. IETF, November 2003, http://www.ietf.org/rfc/rfc3629.txt
[RFC 3987]: RFC 3987: Internationalized Resource Identifiers (IRIs). M. Duerst and M. Suignard. IETF, January 2005, http://www.ietf.org/rfc/rfc3987.txt
[SPARQL]: SPARQL Query Language for RDF. Eric Prud'hommeaux and Andy Seaborne, eds. W3C Recommendation, 15 January 2008, http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/. Latest version available as http://www.w3.org/TR/rdf-sparql-query/.
[UNICODE]: The Unicode Standard. The Unicode Consortium, Version 5.1.0, ISBN 0-321-48091-0, as updated from time to time by the publication of new versions. (See http://www.unicode.org/unicode/standard/versions/ for the latest version and additional information on versions of the standard and of the Unicode Character Database).

9.2 Non-normative References

[OBO-OWL Horrocks]: OBO Flat File Format Syntax and Semantics and Mapping to OWL Web Ontology Language. Ian Horrocks, March 2007. http://www.cs.man.ac.uk/~horrocks/obo/
[OBO-OWL Golbreich]: The OBO to OWL mapping, GO to OWL 1.1. Christine Golbreich, Ian Horrocks, 2007. http://en.scientificcommons.org/43278282
[OBO-OWL NCBO]: NCBO Mapping OBO to OWL. Dilvan Moreira, John Day-Richter, Chris Mungall, Nigam H. Shah 2007, http://www.bioontology.org/wiki/index.php/OboInOwl:Main_Page
[Obolog]: Obolog specification. Chris Mungall. Aug 2008, http://oboedit.org/obolog/spec/obolog-spec.pdf
[OBO-OWL WG]: Working Group on OBO-OWL interconversion. http://www.obofoundry.org/wiki/index.php/Working_Group_on_OBO-OWL_interconversion
[OBO OWL JBMS]: Mapping between the OBO and OWL ontology languages. Syed Hamid Tirmizi, Stuart Aitken, Dilvan Moreira, Chris Mungall, Juan Sequeda, Nigam H. Shah and Daniel P. Miranker. In: Proceedings of Semantic Web Applications and Tools for Life Sciences Workshop, Amsterdam, The Netherlands,
[OBO OWL SWAT4LS]: OBO & OWL: Roundtrip Ontology Transformations. Syed Hamid Tirmizi, Stuart Aitken, Dilvan Moreira, Chris Mungall, Juan Sequeda, Nigam H. Shah and Daniel P. Miranker. In: Proceedings of Semantic Web Applications and Tools for Life Sciences Workshop, Amsterdam, The Netherlands,
[OBO Format 1.4 Guide]: OBO Format 1.4 Guide. Chris Mungall and John Day-Richter, August 2010, http://oboformat.googlecode.com/svn/trunk/doc/GO.format.obo-1_4.html
[Manchester OWL DL Syntax]: OWL 2 Web Ontology Language Manchester Syntax. Matthew Horridge, Peter F. Patel-Schneider, October 2009, http://www.w3.org/TR/owl2-manchester-syntax/
[OWL Macros]: OWL 2 Macros. Chris Mungall, David Osumi-Sutherland, Alan Ruttenberg, June 2010, http://precedings.nature.com/documents/5292/version/1
[OWLDEF]: OWLDEF: Integrating OBO and OWL. R Hoehndorf, A Oellrich, M Dumontier, J Kelso, H Herre, http://leechuck.de/pub/or-sig.pdf
[Relational Patterns]: Relational patterns in OWL and their application to OBO. R Hoehndorf, A Oellrich, M Dumontier, J Kelso, H Herre, http://www.webont.org/owled/2010/papers/owled2010_submission_3.pdf
