This document provides a BNF specification and mapping to OWL2 of the OBO Flat File Format, version 1.4
This is a working draft, for comment by the community.
Comments should be sent to obo-format@lists.sourceforge.net (archive) to ensure wide visibility.
Most of this document can be regarded as stable. The parts that are likely to be unstable are marked red.
There is a working reference implementation available from http://oboformat.googlecode.com/.
This document specifies the syntax and semantics of OBO Format (OBOF).
The first part of this specification describes the syntax and structure of OBO Format using BNF. This states how strings of characters in OBOF Files are parsed into abstract OBO documents, and describes constraints on the structure of these documents. The second part of this specification describes the semantics of an OBO document via a mapping to OWL2-DL and community metadata vocabularies.
This specification builds on previous attempts to specify the syntax and/or semantics of OBOF.
This document is derived from and indebted to a previous document by Ian Horrocks [OBO-OWL Horrocks], later described in an accompanying paper [OBO-OWL Golbreich]. This document provided a semantics for OBO Format in terms of OWL-DL and what was then called OWL1.1 (later to become OWL2). The syntax was underspecified and the mapping was incomplete as there was no treatment of annotation properties.
implementation: This mapping was implemented in the OWLAPI, and is consequently used by current versions of Protege4
The OboInOwl mapping attempted to fill in the gaps in the Horrocks translation by providing a full translation of all of OBOF 1.2, including synonyms and definitions[OBO-OWL NCBO]. This mapping requires use of the n-ary relations pattern to represent OBO metadata in its entirety. This has undesirable side-effects, such as introducing modeling classes and individuals into an ontology, affecting both reasoning and usability.
implementation: This mapping was implemented via an XSLT translation from OBO-XML. All class and ontology URIs are of the form http://purl.org/obo/owl.
OBO Format 1.3 was accompanied by a grammar and mapping to FOL [Obolog]. This mapping was based on the 2005 version of the OBO Relation Ontology and OBOF1.3/Obolog attempted to do justice to dual instance/type level relations and ternary temporally-qualified instance-level relations. However, this introduced too large an impedance mismatch between OBOF and OWL, and as a consequence OBOF1.3 and Obolog have been deprecated.
implementation:Deprecated.
More recent work by Tirmizi provides an implementation of a fully roundtrippable mapping between OBO and OWL [OBO-OWL SWAT4LS] [OBO-OWL JBMS]. This did not attempt to provide a full grammar, and implemented the same specification as 1.1.2.
implementation:OBO-Edit 2.0 (removed from OBO-Edit-2.1)
The OBO Format 1.2 guide is too loose to be regarded as a full specification. It served as a basis for (1.1.1).
The [OBO Format 1.4 guide] accompanies this document. It is intended as a guide, not a normative specification.
This document supersedes previous efforts. The goal is to provide a complete, normative specification of both the syntax and semantics of OBOF1.4. The syntax and semantics specified here also hold for any OBOF1.2 document.
The mapping provided here is consistent with the original Golbreich/Horrocks translation (1.1.1). It improves on this translation by specifying the grammar in full, and by providing a full mapping of all OBOF constructs to OWL2, through a combined use of the Information Artefact Ontology (IAO) and the OboInOwl vocabulary.
This mapping uses a different means of translating OBOF terminological elements (synonyms, definitions, etc.) to OWL2 than the one provided in 1.1.2. The n-ary relations pattern has been abandoned, and replaced by the use of OWL2 AxiomAnnotations. Many of the OboInOwl vocabulary elements have been replaced by IAO annotation properties. This mapping also uses a different URI scheme. In place of the deprecated http://purl.org/obo/owl URIs, the OBO Foundry standard http://purl.obolibrary.org/obo scheme is now used.
The semantics specified in this document are inconsistent with the abandoned OBOF1.3 specification 1.1.3, which is strongly deprecated.
This specification also introduces new constructs absent from previous versions of OBOF, and clarifies many syntactic and semantic elements that were previously unclear.
implementation: the reference implementation of this ontology can be found at the oboformat site
More material can be found at the oboformat site
OBO-Format 1.4 is defined using a standard BNF notation, which is summarized in the table below.
Construct | Syntax | Example |
---|---|---|
non-terminal symbols | boldface | QuotedString |
terminal symbols | single quoted | 'Term' |
zero or more | curly braces | { entity-frame } |
zero or one | square brackets | [ ws SynonymType-ID ] |
alternative | vertical bar | Class-ID | Relation-ID |
grouping | parentheses | ( ) |
complementation | minus symbol | ( character - NewLineChar ) |
Because generic tag-value pairs occur in very many places in the syntax, the grammar saves space by using meta-productions for basic Tag-Value Pairs (TVPs), Boolean Tags (BTs) and for the tags themselves (T):
<T>-TVP ::= '<T>:' [ws] UnquotedString
<T>-BT ::= '<T>:' [ws] ( 'true' | 'false' )
<T>-Tag ::= '<T>:' [ws]
We provide an additional List meta-production, for comma-delimited lists
<NT>List ::= [<NT> {',' ws <NT> }]
For example, QualifierList should be interpreted as: [ Qualifier { ',' ws Qualifier} ]
Documents in the obo-format consist of sequences of Unicode characters [UNICODE] and are encoded in UTF-8 [RFC 3629].
Alpha-Char ::= a |b |c |d |e |f |g |h |i |j |k |l |m |n |o |p |q |r |s |t |u |v |w |x |y |z | A |B |C |D |E |F |G |H |I |J |K |L |M |N |O |P |Q |R |S |T |U |V |W |X |Y |Z
Digit ::= 0 |1 |2 |3 |4 |5 |6 |7 |8 |9
WhiteSpaceChar ::= ' ' | \t | U+0020 | U+0009
ws ::= WhiteSpaceChar { WhiteSpaceChar }
NewLineChar ::= \r | \n | U+000A | U+000C | U+000D
nl ::= [ws] NewLineChar
nl* ::= { nl }
We use the nonterminal UniCodeChar to represent any Unicode character.
An OBOChar is either any non-newline unicode character, or a unicode character escaped by a backslash. This can be used to encode newline characters and other special characters in different contexts.
UniCodeChar ::= any Unicode character
OBOChar ::= '\:' | '\"' | '\!' | '\' Alpha-Char | ( UniCodeChar - (NewLineChar | '\') )
NonWsChar ::= ( OBOChar - WhiteSpaceChar - NewLineChar )
Each tag-value clause in OBOF is line separated. The line can optionally be ended by a HiddenComment, indicated by the '!' character - this is semantically silent and can be ignored by the parser. Each clause can also have zero or more comma-separated tag-value trailing qualifiers between a '{' and a '}' - this is called a QualifierBlock.
EOL ::= {WhiteSpaceChar} [ QualifierBlock ] {WhiteSpaceChar} [ HiddenComment ] {WhiteSpaceChar} NewLineChar
HiddenComment ::= '!' { ( UniCodeChar - NewLineChar ) }
QualifierBlock ::= '{' QualifierList '}'
QualifierList ::= Qualifier {',' ws Qualifier }
Qualifier ::= Rel-ID '=' QuotedString
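As an informal illustration of the EOL production, the following Python sketch splits one physical clause line into its body, its optional qualifier block and its optional hidden comment. The function name `split_clause_line` is hypothetical, and the sketch makes simplifying assumptions that the grammar does not: it treats every unescaped double quote as opening or closing a QuotedString, and it does not parse the individual qualifiers.

```python
def split_clause_line(line):
    """Split a clause line into (body, qualifier_text, hidden_comment).

    Assumption: '!' and '{' only count when they are neither escaped
    with a backslash nor inside a double-quoted string.
    """
    text = line.rstrip("\r\n")
    in_quotes = False
    brace_at = None      # index of the last unquoted, unescaped '{'
    comment = None
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":   # escaped character: skip it entirely
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "{":
            brace_at = i
        elif not in_quotes and ch == "!":
            comment = text[i + 1:].strip()
            text = text[:i]
            break
        i += 1
    body = text.rstrip()
    qualifiers = None
    if brace_at is not None and body.endswith("}"):
        qualifiers = body[brace_at + 1:-1].strip()
        body = body[:brace_at].rstrip()
    return body, qualifiers, comment
```

A production parser would instead follow the grammar tag by tag, since whether a value is quoted or unquoted depends on the tag.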
Depending on the tag, the values for clauses may be written as UnquotedStrings (in which case the entire line after the tag and before any qualifier block is used as the clause value), or as a QuotedString, in which case it is enclosed within double quotes.
QuotedString ::= DblQuote { ( OBOChar - DblQuote ) } DblQuote
UnquotedString ::= { OBOChar }
We introduce 3 rules for class IDs, relation IDs and instance IDs - these all have the same structure.
IDs fall into one of 3 categories: unprefixed, prefixed or URLs. The prefixed category is divided into two: canonical and non-canonical prefixed IDs. Whilst canonical prefixed IDs are preferred, many OBO documents make use of non-canonical prefixed IDs or unprefixed IDs in a variety of contexts - relation, subsetdef and synonymtypedef IDs have traditionally been unprefixed; xrefs frequently do not conform to the canonical prefixed ID pattern.
Class-ID ::= ID
Rel-ID ::= ID
Instance-ID ::= ID
Person-ID ::= ID
Subset-ID ::= ID
SynonymType-ID ::= ID - ( '[' { UniCodeChar } )
ID ::= Prefixed-ID | Unprefixed-ID | URL-as-ID
URL-as-ID ::= ( 'http:' | 'https:' ) { NonWsChar }
Unprefixed-ID ::= { ( NonWsChar - ':' ) }
Prefixed-ID ::= Canonical-Prefixed-ID | NonCanonical-Prefixed-ID
Canonical-IDPrefix ::= Alpha-Char { ( '_' | Alpha-Char ) }
Canonical-LocalID ::= { Digit }
Canonical-Prefixed-ID ::= Canonical-IDPrefix ':' Canonical-LocalID
NonCanonical-Prefixed-ID ::= ( Any-IDPrefix ':' Any-LocalID ) - Canonical-Prefixed-ID
Any-IDPrefix ::= { ( NonWsChar - ':' ) }
Any-LocalID ::= { NonWsChar }
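The ID categories can be illustrated with a short Python sketch. This is an informal reading of the productions above, not part of the specification: the function name `classify_id` and the returned labels are hypothetical, the URL check is applied first as an assumption about precedence, and whitespace and backslash escapes are not validated.

```python
import re

# Canonical-Prefixed-ID: a letter, then letters or underscores, ':', then digits only.
CANONICAL = re.compile(r"[A-Za-z][A-Za-z_]*:[0-9]*")

def classify_id(identifier):
    """Return a label for the ID category (labels are illustrative only)."""
    if identifier.startswith(("http:", "https:")):
        return "url"
    if ":" not in identifier:
        return "unprefixed"
    if CANONICAL.fullmatch(identifier):
        return "canonical-prefixed"
    return "non-canonical-prefixed"
```

For example, `GO:0005634` is classified as canonical, `part_of` as unprefixed, and an xref such as `Wikipedia:Cell_nucleus` as non-canonical.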
An XrefList is a comma-separated list of zero or more Xrefs inside square brackets (these are mandatory). Each Xref is a string of characters, optionally followed by a quoted description. Note that commas and square brackets in the xref need to be escaped. We include an extra production rule for when solitary Xrefs are provided (i.e. with the 'xref' tag) - in which case escaping is not needed.
Xref ::= ID [ ws QuotedString ]
XrefList ::= '[' [ XrefListItem { ',' ws XrefListItem } ] [ ws ] ']'
XrefListItem ::= XrefChar { XrefChar } [ ws QuotedString ]
XrefChar ::= (NonWsChar - ',' - ']')
An OBO document consists of a header frame followed by zero or more entity frames. Each frame has a set of clauses, or <tag,value> pairs. A tag is a token which may be drawn from the set of defined OBOF tags. The value can be atomic or multi-valued. In the OBO abstract syntax a clause is written <Tag>(<V1>...<Vn>).
Each frame should by convention be separated by an empty line, but this is not enforced in the syntax.
Note that the term "frame" replaces the previously used "stanza" in order to be more consistent with Manchester Syntax.
OBO-Doc ::= header-frame { entity-frame } nl*
entity-frame ::= term-frame | typedef-frame | instance-frame
header-frame ::= { header-clause nl* }
header-clause ::= format-version-TVP
  | data-version-TVP
  | date-Tag DD:MM:YYYY ws hh:mm
  | saved-by-TVP
  | auto-generated-by-TVP
  | import-Tag IRI | filepath
  | subsetdef-Tag Subset-ID ws QuotedString
  | synonymtypedef-Tag SynonymType-ID ws QuotedString [ SynonymScope ]
  | default-namespace-Tag OBONamespace
  | idspace-Tag IDPrefix ws IRI [ ws QuotedString ]
  | treat-xrefs-as-equivalent-Tag IDPrefix
  | treat-xrefs-as-genus-differentia-Tag IDPrefix ws Rel-ID ws Class-ID
  | treat-xrefs-as-reverse-genus-differentia-Tag IDPrefix ws Rel-ID ws Class-ID
  | treat-xrefs-as-relationship-Tag IDPrefix Rel-ID
  | treat-xrefs-as-is_a-Tag IDPrefix
  | treat-xrefs-as-has-subclass-Tag IDPrefix
  | property_value-Tag Relation-ID ( QuotedString XSD-Type | ID ) {WhiteSpaceChar} [ QualifierBlock ] {WhiteSpaceChar} [ HiddenComment ]
  | remark-TVP
  | ontology-TVP
  | owl-axioms-TVP
  | UnreservedToken ':' [ ws ] UnquotedString
Term frames introduce and define the meaning of terms (AKA classes).
term-frame ::= nl* '[Term]' nl id-Tag Class-ID EOL { term-frame-clause EOL }
term-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] XrefList
  | xref-Tag Xref
  | builtin-BT
  | property_value-Tag Relation-ID ws ( QuotedString ws XSD-Type | ID )
  | is_a-Tag Class-ID
  | intersection_of-Tag Class-ID
  | intersection_of-Tag Relation-ID ws Class-ID
  | union_of-Tag Class-ID
  | equivalent_to-Tag Class-ID
  | disjoint_from-Tag Class-ID
  | relationship-Tag Relation-ID ws Class-ID
  | is_obsolete-BT
  | replaced_by-Tag Class-ID
  | consider-Tag ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
Typedef frames introduce and define the meaning of relations (AKA properties).
typedef-frame ::= [ nl ] '[Typedef]' nl id-Tag Relation-ID EOL { typedef-frame-clause EOL }
typedef-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] ws XrefList
  | xref-Tag Xref
  | property_value-Tag Relation-ID ws ( QuotedString ws XSD-Type | ID )
  | domain-Tag Class-ID
  | range-Tag Class-ID
  | builtin-BT
  | holds_over_chain-Tag Relation-ID ws Relation-ID
  | is_anti_symmetric-BT
  | is_cyclic-BT
  | is_reflexive-BT
  | is_symmetric-BT
  | is_transitive-BT
  | is_functional-BT
  | is_inverse_functional-BT
  | is_a-Tag Rel-ID
  | intersection_of-Tag Rel-ID
  | union_of-Tag Rel-ID
  | equivalent_to-Tag Rel-ID
  | disjoint_from-Tag Rel-ID
  | inverse_of-Tag Rel-ID
  | transitive_over-Tag Relation-ID
  | equivalent_to_chain-Tag Relation-ID ws Relation-ID
  | disjoint_over-Tag Rel-ID
  | relationship-Tag Rel-ID Rel-ID
  | is_obsolete-BT
  | replaced_by-Tag Rel-ID
  | consider-Tag ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
  | expand_assertion_to-Tag QuotedString ws XrefList
  | expand_expression_to-Tag QuotedString ws XrefList
  | is_metadata_tag-BT
  | is_class_level-BT
Instance frames introduce and define the meaning of instances (AKA individuals).
instance-frame ::= [ nl ] '[Instance]' nl id-Tag Instance-ID EOL { instance-frame-clause EOL }
instance-frame-clause ::= is_anonymous-BT
  | name-TVP
  | namespace-Tag OBONamespace
  | alt_id-Tag ID
  | def-Tag QuotedString ws XrefList
  | comment-TVP
  | subset-Tag Subset-ID
  | synonym-Tag QuotedString ws SynonymScope [ ws SynonymType-ID ] ws XrefList
  | xref-Tag Xref
  | property_value-Tag Relation-ID ID
  | instance_of-Tag Class-ID
  | PropertyValueTagValue
  | relationship-Tag Rel-ID ws ID
  | created_by-TVP
  | creation_date-Tag ISO-8601-DateTime
  | is_obsolete-BT
  | replaced_by-Tag Instance-ID
  | consider-Tag ID
SynonymScope ::= 'EXACT' | 'BROAD' | 'NARROW' | 'RELATED'
This section defines structural constraints on an OBO Document. These structural constraints hold on an abstract OBO Document - the result of parsing a physical OBO document file.
Each frame in an abstract OBO document can be accessed by its identifier - the value of the id tag. A frame can only be of one type. A document containing two frames of different types with the same id is illegal.
Although it is strongly recommended that an OBOF document does not contain two frames with the same identifier, this is syntactically valid. If two frames have the same identifier, then they are combined. Only frames of the same type can be combined - if a document uses the same ID for two frames of different types, the document is structurally invalid.
When two frames F1 and F2 are combined, the new frame F3 has a set of tag-values consisting of the union of the set of tag-values from F1 and the set of tag-values from F2. If two tag-values have identical tags and identical values, they are considered a single tag-value.
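The combination rule can be sketched in Python as follows. The frame representation (a dict with hypothetical keys `type`, `id` and `clauses`, where each clause is a `(tag, value)` tuple) is an assumption made for this illustration only.

```python
def combine_frames(f1, f2):
    """Combine two frames with the same id and type into a new frame F3.

    The clause set of the result is the union of both clause sets;
    identical (tag, value) pairs collapse into a single clause.
    """
    if f1["id"] != f2["id"]:
        raise ValueError("only frames with the same id are combined")
    if f1["type"] != f2["type"]:
        raise ValueError("same id used for frames of different types")
    # dict.fromkeys removes duplicates while keeping first-seen order
    merged = list(dict.fromkeys(list(f1["clauses"]) + list(f2["clauses"])))
    return {"type": f1["type"], "id": f1["id"], "clauses": merged}
```

Preserving first-seen order is a convenience of this sketch; the text above only requires set union.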
If a Typedef frame has a clause is_metadata_tag(true), then that Typedef id may never be used as a Rel-ID in an intersection_of clause.
Note that OBO namespaces are not the same as OWL namespaces - the analog of OWL namespaces are OBO ID spaces. OBO namespaces are semantics-free properties of a frame that allow partitioning of an ontology into sub-ontologies. For example, the GO is partitioned into 3 ontologies (3 OBO namespaces, 1 OWL namespace).
Every frame must have exactly one namespace. However, these do not need to be explicitly assigned. After parsing an OBO Document, any frame without a namespace is assigned the default-namespace, from the OBO Document header. If this is not specified, the Parser assigns a namespace arbitrarily. It is recommended this is equivalent to the URL or file path from which the document was retrieved.
Every OBODoc should have an "ontology" tag specified in the header. If this is not specified, then the parser should supply a default value. This value should be derived from the URL of the source of the ontology (typically using http or file schemes).
The following table shows how header macros are used to insert new clauses into the ontology document. If the pre-condition is satisfied (middle column) and the header contains the specified macro (first column), then clauses are added to the ontology such that the post-condition is satisfied (right hand column). Note that the original xref tags are not removed from the source ontology.
These clauses are expanded into a fresh ontology, which can be optionally imported.
In addition, any Typedef frames for relations used in a header macro are also copied into the corresponding bridge ontology
Note that the id tag is not specified in this table. Each frame has exactly one id tag, and the id tag uniquely identifies the frame. These constraints must hold after frames with identical ids have been combined (see above).
Any tag not mentioned has free cardinality (zero, one or many).
On completion, this section will define the semantics of the entirety of OBO via mappings to OWL2. The mappings could also be used to specify a translation procedure and/or an interface to OWL tools (such as OWL reasoners).
The translation is defined using a translation function T which translates (a fragment of) OBO into OWL DL. The definition of T is often recursive, but it will eventually "ground out" in (a fragment of) OWL DL.
An OBO-Doc translates to an OWL Ontology. The Ontology IRI is generated by extracting the value of the "ontology" tag in the header frame.
An OBO ontology ID can be either an entire IRI or an abbreviated form such as 'go'. The abbreviated form is translated by prefixing the standard obolibrary PURL scheme and appending ".owl".
TOntID(IRI) = IRI
TOntID(abbreviated-ID) = 'http://purl.obolibrary.org/obo/' abbreviated-ID '.owl'
An ontology is in abbreviated form if it contains only alphanumeric characters, plus '_', '-' and '.'
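A minimal Python sketch of TOntID follows. The function name `translate_ontology_id` is hypothetical, and "alphanumeric" is assumed here to mean ASCII letters and digits.

```python
import re

# Abbreviated form: only alphanumeric characters plus '_', '-' and '.'
ABBREVIATED = re.compile(r"[A-Za-z0-9_.\-]+")

def translate_ontology_id(ontology_id):
    """Map the value of the 'ontology' header tag to an ontology IRI."""
    if ABBREVIATED.fullmatch(ontology_id):
        return "http://purl.obolibrary.org/obo/" + ontology_id + ".owl"
    return ontology_id  # already a full IRI: returned unchanged
```

Under these assumptions, `translate_ontology_id("go")` gives `http://purl.obolibrary.org/obo/go.owl`, while a value containing ':' or '/' is passed through as an IRI.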
An ontology can contain any number of arbitrary OWL axioms encoded in OWL functional syntax in the ontology header using the owl-axioms tag. The values of all owl-axioms clauses are concatenated together (with newline characters appended) in order of appearance in the document and interpreted as if they were enclosed within an "Ontology( ... )" production in functional syntax. This allows any OWL ontology to be roundtripped through OBO-format, albeit in such a way that expressive OWL axioms are opaque to many OBO-format applications.
The following prefixes are assumed to be pre-declared - other prefixes cannot be declared and thus full IRIs must be used:
Prefix: xsd: <http://www.w3.org/2001/XMLSchema#>
Prefix: owl: <http://www.w3.org/2002/07/owl#>
Prefix: : <http://purl.obolibrary.org/obo/>
Prefix: oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
Prefix: xml: <http://www.w3.org/XML/1998/namespace>
Prefix: rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
Prefix: dc: <http://purl.org/dc/elements/1.1/>
Prefix: rdfs: <http://www.w3.org/2000/01/rdf-schema#>
It is recommended that writers use this set of prefixes to write CURIEs (shortened URIs), but this is not required.
format-version: 1.2
ontology: go
date: 17:04:2012 15:38
...
remark: this is an example of what a header in the GO might look like
remark: We have two axioms, the first declares that nothing is part of
remark: both a nucleus and a cytoplasm, the second declares that
remark: nothing is part of both a cytoplasm and a plasma membrane.
remark: (Note we could have chosen to use a single axiom with 3
remark: arguments, which is better as it also gives us the spatial
remark: disjointness between nucleus and PM)
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005634))
owl-axioms: DisjointClasses(ObjectSomeValuesFrom(:BFO_0000050 :GO_0005737) ObjectSomeValuesFrom(:BFO_0000050 :GO_0005886))

[Term]
...
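The concatenation of owl-axioms values can be sketched as follows. The function names and the `(tag, value)` representation of header clauses are hypothetical, and the sketch does not emit the pre-declared prefixes or an ontology IRI; a real implementation would hand the result to an OWL functional syntax parser.

```python
def owl_axioms_body(header_clauses):
    """Concatenate owl-axioms values in document order, one newline after each."""
    return "".join(
        value + "\n" for tag, value in header_clauses if tag == "owl-axioms"
    )

def as_ontology_production(header_clauses):
    """Wrap the concatenated axioms as if enclosed in 'Ontology( ... )'."""
    return "Ontology(\n" + owl_axioms_body(header_clauses) + ")"
```

Clauses with other tags, such as remark, are skipped, so interleaving them with owl-axioms clauses does not affect the result.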
When translating an OWL ontology to an OBO document, any axiom that is not translated according to the standard obo2owl rules must be added as an owl-axioms clause.
It is strongly recommended that writers only generate owl-axioms clauses for axioms that fall outside of the OBO-subset of OWL2-DL. Remember that most applications which consume obo-format will ignore the owl-axioms headers.
Term frames are mapped to Class declarations and Class-level axioms. We list the translations that are specific to Term frames here - generic translations making annotation assertions are applied later on.
If Qualifiers contains the tags gci_relation and gci_filler then T(Class-ID) in 5.2.1 is translated as a class expression:
ObjectIntersectionOf(T(Class-ID) ObjectSomeValuesFrom(T(GCI-Relation) T(GCI-Filler)))
This effectively allows a limited form of General Class Inclusion (GCI) axioms in OBO Documents. For example, in:
id: CL:0000232
name: erythrocyte
relationship: has_part GO:0005634 {gci_relation="part_of", gci_filler="NCBITaxon:7955"} ! nucleus
the relationship is translated as:
SubClassOf( ObjectIntersectionOf( ObjectSomeValuesFrom(T('part_of') T('NCBITaxon:7955')) T('CL:0000232')) ObjectSomeValuesFrom(T('has_part') T('GO:0005634')) )
The resulting axiom would be (in Manchester-like syntax, informative):
(erythrocyte and part_of some 'Danio rerio') SubClassOf has_part some nucleus
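The GCI qualifier handling can be sketched in Python as a string-building function. This is an illustration under stated assumptions: the function name is hypothetical, only the existential (SubClassOf ... ObjectSomeValuesFrom) reading of a relationship clause shown in the example is covered, and T(...) is left unexpanded rather than resolved to IRIs. The operands of ObjectIntersectionOf follow the order of the class expression given before the example; the example lists them in the opposite order, which does not change the meaning of an intersection.

```python
def translate_relationship(class_id, relation, filler, qualifiers=None):
    """Render a relationship clause as a SubClassOf axiom string.

    If both gci_relation and gci_filler qualifiers are present, the
    subclass side becomes an intersection with an existential restriction.
    """
    qualifiers = qualifiers or {}
    sub = f"T('{class_id}')"
    if "gci_relation" in qualifiers and "gci_filler" in qualifiers:
        sub = (
            f"ObjectIntersectionOf({sub} ObjectSomeValuesFrom("
            f"T('{qualifiers['gci_relation']}') T('{qualifiers['gci_filler']}')))"
        )
    sup = f"ObjectSomeValuesFrom(T('{relation}') T('{filler}'))"
    return f"SubClassOf({sub} {sup})"
```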
The relationship and intersection_of tags can take pairs of values; these pairs map to OWL Class Expressions according to the following table.
Mappings are listed in order of precedence - an expression cannot match more than one production rule, only the first match counts.
Note that OWL2 does not have the ability to declare EquivalentProperties between a named object property and either an intersection or a union. We translate these OBO constructs to a weaker axiom and add an AnnotationAssertion - we will later provide an appendix for handling these in FOL. This is also true for equivalent_to_chain
The OBO Format equivalent_to_chain clause can only be partially expressed in OWL2. We can do some inference as part of translation - see this prover9 file for a proof.
Note we have no cognate of IAO:definition_source - we would have a generic xref annotation on the definition annotation assertion
Mapping of OBO Format synonym clauses is described in 5.6. These map to annotation properties without any semantics in OWL. This section provides semantics external to OWL.
In OBO Format, the term "synonym" is used loosely for any kind of alternative label for a class (the "name" tag is used for the community preferred label). A label is a synonym for a class if there exists some user or user community (existing or historic) for which this label unambiguously denotes the class.
Synonyms are always scoped into one of four disjoint categories: EXACT, BROAD, NARROW, RELATED. A synonym is EXACT if it is a "true" synonym - most members of the community served by the ontology regard the exact synonym as substitutable for the primary label. If an ontology contains two classes which share either primary label (name) or exact synonym, then these classes are equivalent. A synonym for a class C is BROAD if it denotes a broader class than C. Here, broader than is an informal notion that encompasses both subsumption and possibly mereological and temporal containment. For example, "skull" could conceivably be a BROAD synonym for the class cranium. Conversely a synonym for a class C is NARROW if C denotes a broader class than the synonym. If a synonym is neither EXACT, NARROW or BROAD, then it is RELATED.
Scoping of synonyms has proven extremely useful for OBO format ontologies - most ontologies use this feature. The ability to designate a synonym as EXACT, in particular, is very useful for introducing additional terminological precision and reducing ambiguity in ontologies. These have also proven useful for NLP purposes.
The following table is used when translating an OBO identifier to an IRI
A side effect of this substitution is the generation of an annotation assertion that preserves the original shorthand relation: AnnotationAssertion(T(shorthand) T(Formal-ID) "Original-ID"^^xsd:string)
These rules are normative. An informative summary is provided in section 8.2.2.
Not every OWL ontology can be converted to OBO Format. This may be because the OWL ontology uses constructs that cannot be expressed directly in OBO, or it may be because the corresponding OBO ontology would violate OBO document structure constraints (see section 4.4).
Note that all classes in an OBO Document are assumed to be satisfiable. Any axioms referring to owl:Thing should be removed. An ontology with unsatisfiable classes is not translatable.
We introduce the concept of an OBO sublanguage. This is similar to an OWL profile, but the goal is not to provide sublanguages for reasoning purposes. Sublanguages are intended for legacy applications that make unwarranted assumptions about OBO format. The assumptions do not hold about OBO documents as a whole, but they may hold for particular sublanguages.
At this time, only one sublanguage is specified - OBO basic. To define this sublanguage, we first introduce a number of sublanguage characteristics.
The ontology is a DAG if the transitive closure of E contains no cycles.
Note that any ontology that contains an EquivalenceAxiom between two named classes is not a DAG, as an equivalence axiom is the same as a reciprocal SubClassOf pair.
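A DAG check of this kind can be sketched with a depth-first search. The sketch assumes that the edge set is given as `(child, parent)` pairs derived from is_a and relationship clauses with the relation type ignored; the function name `is_dag` is hypothetical.

```python
def is_dag(edges):
    """Return True if the directed graph given as (source, target) pairs has no cycle."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    state = dict.fromkeys(graph, WHITE)
    for start in graph:
        if state[start] != WHITE:
            continue
        state[start] = GREY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if state[nxt] == GREY:      # back edge: a cycle exists
                    return False
                if state[nxt] == WHITE:
                    state[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:                            # all successors explored
                state[node] = BLACK
                stack.pop()
    return True
```

An equivalence between two named classes, represented as a reciprocal pair of edges, makes the check fail, in line with the note above.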
Rationale: the original Gene Ontology paper described the GO as a DAG. Consequently a large body of legacy software exists that assumes the ontology is a DAG. This software breaks or loops indefinitely if there is a cycle involving any set of is_a or relationship tags.

A dangling clause is an is_a, relationship, intersection_of, union_of, transitive_over, equivalent_to_chain, holds_over_chain, disjoint_from, domain or range clause in which any of the values is not declared in the ontology document. Here declaration means there exists a frame with the corresponding identifier.
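Detection of dangling clauses can be sketched as follows. The frame representation is an assumption of this illustration: each frame is a dict with hypothetical keys `id` and `clauses`, and each clause is a `(tag, values)` pair in which `values` is a tuple containing only identifiers.

```python
REFERENCE_TAGS = {
    "is_a", "relationship", "intersection_of", "union_of", "transitive_over",
    "equivalent_to_chain", "holds_over_chain", "disjoint_from", "domain", "range",
}

def dangling_clauses(frames):
    """Return (frame_id, tag, values) for every clause that references an undeclared id."""
    declared = {frame["id"] for frame in frames}
    found = []
    for frame in frames:
        for tag, values in frame["clauses"]:
            if tag in REFERENCE_TAGS and any(v not in declared for v in values):
                found.append((frame["id"], tag, values))
    return found
```

An empty result means the document has no dangling clauses in the sense defined above.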
An ontology is said to be multidirectional if it contains a class assertion involving property P and a class assertion involving property P', where P is the inverse of P'. An ontology is unidirectional if it is not multidirectional.
For example, if an ontology contains statements using both part_of and has_part properties, then the ontology is not unidirectional.
Note that multidirectional ontologies will frequently be non-DAGs, according to the definition above, but this need not hold.
An ontology is fully asserted if the graph formed by is_a and relationship clauses (i.e. SubClassOf axioms) is not lacking any relationships that can be inferred from the full set of axioms in the ontology (e.g. EquivalentClasses axioms, or DisjointFrom Axioms).
More formally: O is an ontology, and O' is an ontology consisting only of is_a and relationship tags asserted in O. O is fully-asserted if there is no SubClassOf axiom that is entailed by O but not by O'.
If an ontology is fully asserted, then basic applications can use solely is_a and relationship tags for graph traversal. intersection_of tags can be removed with no loss of functionality in these basic applications. Ontology publishers may wish to keep two versions - an edit version, in which only minimal assertions are made, and a release version, which is fully asserted, with the assertions added by a reasoner.
An OBO ontology is fully labeled if and only if every frame has a "name" tag.
An OBO ontology is uniquely labeled if and only if no two entities share the same name, considering both the source ontology and its full import closure.
An OBO ontology is locally uniquely labeled if and only if no two entities share the same name, considering only the source ontology and not its import closure.
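The labeling characteristics can be checked with a few lines of Python. The sketch assumes each frame is a dict with an optional hypothetical `name` key and covers only the local case; checking the non-local variant would require passing in the frames of the full import closure as well.

```python
from collections import Counter

def is_fully_labeled(frames):
    """True if every frame carries a non-empty name."""
    return all(frame.get("name") for frame in frames)

def duplicate_names(frames):
    """Return the sorted list of names used by more than one frame."""
    counts = Counter(frame["name"] for frame in frames if frame.get("name"))
    return sorted(name for name, n in counts.items() if n > 1)

def is_locally_uniquely_labeled(frames):
    """True if no two frames in this document share a name."""
    return not duplicate_names(frames)
```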
An OBO document has no equivalence axioms if the OWL translation contains no EquivalentClasses axioms.
An OBO document has no obo-namespace crossing if there are no is_a, relationship, intersection_of (genus portion) or union_of clauses that span two obo-namespaces. Note that an obo-namespace is not the same as an OWL namespace. It is specified by the namespace tag in OBOF. An example is found in the GO, which is a single ontology partitioned into 3 obo-namespaces. Some legacy software assumes no namespace crossings, so the basic version of the GO released by the GOC has no obo-namespace crossings.
An OBO document has singly labeled edges if there are no classes A and B such that there are two asserted relationships connecting A and B
An ontology contains no disjointness axioms if it has no DisjointClasses, DisjointUnion or DisjointObjectProperties axioms. On the obo-format level this manifests as no "disjoint_from:" tags.
Rationale: Some OBO-format based tools do not handle disjointness axioms well
An OBO document has no qualifier blocks if the OBO syntax concretization has no '{...}'s appearing in the qualifier list position. I.e. the following grammar rule (from section 3.2):
QualifierBlock ::= '{' QualifierList '}'
Can be replaced by: QualifierBlock ::=
On the OWL document level, this is equivalent to an OWL document with no Axiom Annotations except those axiom annotations required to support OBO metadata constructs, such as definitions and synonyms.
See 5.0.4. An ontology has no owl-axioms header if the header does not contain an "owl-axioms:" tag.
An ontology has no imports if the header does not contain any "import:" tags
OBO Basic is a sub-language of OBO format (i.e. every OBO Basic document is automatically an OBO Document). It allows basic applications to continue to make certain simplifying assumptions; many of these simplifying assumptions were based on the initial version of the Gene Ontology, and have become enshrined in many popular and useful tools such as term enrichment tools.
Examples of such assumptions include: traversing the ontology graph ignoring relationship types using a naive algorithm will not lead to cycles (i.e. the ontology is a DAG); every referenced term is declared in the ontology (i.e. there are no dangling clauses).
In general translation to a sublanguage is lossy, but this is not always the case. An OBO-Dangling-Clause ontology can be converted to a semantically equivalent ontology with declarations for all undeclared referenced frames (note that this typically happens as a side-effect of obo-to-owl-to-obo roundtripping). Note that this means the target ontology is no longer OBO-Fully-Labeled.
This section can be considered an independent component that can be used for any OWL ontology. It may eventually form its own separate specification. It should be seen as informative for now.
Macros are specified as the range of an AnnotationAssertion on a property. They take the form of a string encoding either an OWL axiom or OWL expression in an extended Manchester Syntax format [BCP 47]. This extension allows the use of variable markers (written ?X or ?Y) in place of classes or class expressions. When these variable markers are replaced by actual OWL classes or expressions (again serialized in Manchester Syntax) as part of a Template Substitution Operation Subst, the resulting string must be valid Manchester Syntax.
Examples are provided in [OWL Macros].
Two IRIs are used to specify the two different types of macro expansion.
Macro | Annotation Property | Variables | Generates |
---|---|---|---|
expand_expression_to | obo:IAO_0000424 | ?Y | OWL2 Class Expression |
expand_assertion_to | obo:IAO_0000425 | ?X,?Y | OWL2 Axiom |
An alternative strategy is to use the occurrence of macro relations to add General Class Inclusion (GCI) axioms to the ontology
The behavior of a macro-expansion engine regarding whether to replace or add new GCIs is implementation-dependent, as is the decision of whether to place generated axioms in a new ontology or the source ontology.
These guidelines are provided to minimize character-level differences between OBO Documents generated by different procedures.
relationship: part_of ABC:1234567 ! hand
relationship: R:9999999 ABC:1234567 ! part_of hand
See the serializer conventions in the OBO format guide
When writing solitary 'xref' tags, the comma character should be escaped.
All class identifiers should follow OBO Foundry ID Policy, i.e. all ID-spaces must be registered and approved (email obo-admin at obofoundry.org to register), and the entire ID should consist of the ID space followed by ':' followed by a zero-padded numeric local identifier (a minimum of 7 digits is recommended).
Relation identifiers should follow the same guidelines as class identifiers. Note however that the use of symbolic identifiers such as 'part_of' is common in almost all OBO format ontologies, and has a precedent stretching back over ten years. A large body of software now expects symbolic identifiers for relations, and ontology maintainers are understandably reluctant to change these to numeric identifiers.
This specification provides a means of using numeric identifiers globally whilst retaining symbolic identifiers within the context of a single file. Refer to section 5.9.3 for details.
[Typedef]
id: has_part
name: has_part
xref: BFO:0000051
Individuals are retained in OBOF1.4 for compatibility with previous versions of the format. OBOF is not a good choice for modeling using individuals. Ontologies should not contain individuals, and OBOF is not a good choice for knowledge bases that use individuals.
When translating between OBO and OWL, owl:imports in an OWL ontology header are translated directly to an 'import:' header in an OBO document. The contents of the field MUST NOT be modified during translation (normative).
As a consequence of this, URIs that resolve to OWL documents must be used if the resulting OWL is to be consumed by an OWL parser. Here is an example of a valid OBO document header (taken from the GO editors file):
import: http://purl.obolibrary.org/obo/go/extensions/chebi_import.owl
import: http://purl.obolibrary.org/obo/go/extensions/cl_import.owl
import: http://purl.obolibrary.org/obo/go/extensions/po_import.owl
ontology: go

The onus is on the OBO parser to be able to handle these OWL documents - either by translating them or by mapping the URIs to OBO documents that have been translated in advance.
Most OBO tools are not equipped to do this, so imports should be used with caution in OBO format. We strongly recommend using OWL whenever imports are required, and providing OBO versions with the imports closure merged. Note that this can be done with owltools:
owltools myont.owl --merge-imports-closure -o -f obo myont.obo

Note that all versions of OBO-Edit prior to 2.3b7 should be considered broken for imports. 2.3b7 introduces the use of oboedit.catalog files which allow OWL URIs to be used in import directives. These are not part of the OBOFormat spec, but details are included here for completeness.
# GO imports mappings
http://purl.obolibrary.org/obo/go/extensions/chebi_import.owl ../extensions/chebi_import.obo
http://purl.obolibrary.org/obo/go/extensions/cl_import.owl ../extensions/cl_import.obo
http://purl.obolibrary.org/obo/go/extensions/po_import.owl ../extensions/po_import.obo

When an OBO parser encounters an import directive it may choose to load the document obtained by mapping the URI on the left to the OBO document found in the file on the right. The oboedit.catalog file should live in the same directory as the source OBO ontology.
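A catalog of this shape could be read with a few lines of Python. The format is inferred here from the example alone (lines starting with '#' are comments, and each mapping line holds a URI, whitespace, then a relative path); the function name `parse_catalog` is hypothetical and this is not a description of how OBO-Edit itself reads the file.

```python
def parse_catalog(text):
    """Parse catalog text into a dict mapping import URIs to local OBO file paths."""
    mappings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # URI, then the rest of the line as the path
        if len(parts) == 2:
            uri, path = parts
            mappings[uri] = path.strip()
    return mappings
```

A parser that finds an import URI in the returned dict can load the mapped OBO file, resolved relative to the directory of the source ontology, instead of fetching the OWL document.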