Whenever I need to programmatically manipulate an OWL ontology, my toolkit of choice is usually Java and the OWLAPI, and I will typically write the code as a ROBOT pluggable command, so that I can delegate the boring stuff (e.g. loading the ontology from disk, parsing the command-line options) to ROBOT and focus instead on the interesting bits of the task at hand. This is especially useful and efficient if said task is supposed to be part of a larger ROBOT pipeline.
But when the task is supposed to be part of a larger Python-based pipeline instead, using Java is suddenly much less practical, and I’d rather perform the task directly in Python if possible.
This raises the question of which Python library to use to manipulate ontologies. The same question in Java is a no-brainer for me, because the OWLAPI is the one library to rule them all. But in Python, things are much less clear-cut.
In this post, I put several ontology-related Python libraries to the test by using them to perform a simple task.
Given an ontology (ideally in any format) and a list of terms, I need to check for each term whether it corresponds exactly to the label or to an exact synonym of one of the classes in the ontology. If it corresponds to a label, then I need to get the shortened identifier (“CURIE”) of the matching class; if it corresponds to an exact synonym, then I need to get both the label and the shortened identifier of the matching class; if it doesn’t correspond to anything, I must get an “unknown term” error.
For example, if the ontology is the Drosophila Anatomy Ontology (hereafter “FBbt”) and the list of terms is as follows:
```
adult dorsal vessel
T neuron T2
frobnicator muscle
```
then the expected output should be:
```
adult dorsal vessel ; FBbt:00003152
T neuron T2 -> T2 neuron ; FBbt:00003728
Unknown term: frobnicator muscle
```
because “adult dorsal vessel” is the label of the FBbt:00003152 class, “T neuron T2” is an exact synonym for the FBbt:00003728 class (whose label is “T2 neuron”), and there is no such thing as a “frobnicator muscle” (at least not in Drosophila).
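Independently of any library, the contract can be sketched with plain dictionaries. In this miniature sketch, the two lookup tables are filled by hand for illustration; each library tested below is essentially a different way of building them from the ontology file:

```python
# Hypothetical miniature of the task: the ontology reduced to two
# hand-filled lookup tables (label -> CURIE, exact synonym -> label).
labels = {"adult dorsal vessel": "FBbt:00003152", "T2 neuron": "FBbt:00003728"}
synonyms = {"T neuron T2": "T2 neuron"}


def lookup(term: str) -> str:
    if term in labels:
        return f"{term} ; {labels[term]}"
    if term in synonyms:
        label = synonyms[term]
        return f"{term} -> {label} ; {labels[label]}"
    return f"Unknown term: {term}"


for t in ["adult dorsal vessel", "T neuron T2", "frobnicator muscle"]:
    print(lookup(t))
```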
The first tested library is Pronto, because it happens to be the one I already knew about and had already used.
Here’s what code performing the task described above could look like with Pronto:
```python
import sys

from pronto import Ontology
from pronto.term import Term


class OntologyWrapper:
    ont: Ontology
    terms_by_label: dict[str, Term] = {}
    terms_by_synonym: dict[str, Term] = {}

    def __init__(self, path: str, prefix: str):
        self.ont = Ontology(path)
        for term in self.ont.terms():
            if not term.id.startswith(prefix):
                # Ignore non-FBbt terms
                continue
            if term.name is None:
                # Should not happen because all terms in FBbt should
                # have a label, but Mypy does not know that, so guarding
                # against the absence of label keeps Mypy happy
                continue
            self.terms_by_label[term.name] = term
            for synonym in [s for s in term.synonyms if s.scope == "EXACT"]:
                self.terms_by_synonym[synonym.description] = term

    def lookup(self, s: str) -> str:
        term = self.terms_by_label.get(s)
        if term:
            return f"{term.name} ; {term.id}"
        else:
            term = self.terms_by_synonym.get(s)
            if term:
                return f"{s} -> {term.name} ; {term.id}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.obo", "FBbt:")
for file in sys.argv[1:]:
    with open(file, "r") as f:
        for line in f:
            print(wrapper.lookup(line.strip()))
```
This is pretty straightforward. We define an OntologyWrapper class to do two things:1 upon initialisation, it indexes all the terms of the ontology by their label and by their exact synonyms; its lookup method then resolves a query string against those indexes.
Pronto supports three different input formats: OBO, OBOGraph-JSON, and RDF/XML. However, the best performance is obtained with the OBO format. Loading an ontology in OBOGraph-JSON or RDF/XML is, according to my tests, three to five times slower than loading the same ontology in OBO (on my machine and with the latest development version of FBbt,2 the code above takes ~1.5s in OBO, versus ~5s in JSON or RDF/XML). More importantly, loading from RDF/XML causes the library to emit lots of warnings.
Unfortunately, because Pronto relies on Fastobo (see further below) for OBO parsing, it will fail to parse some OBO files that make use of syntactic constructs recently allowed in the OWLAPI OBO parser and serialiser, but that are not described in the latest available specification for the format.
It so happens that FBbt, the main ontology I work with, makes no use of such constructs (yet?), and so its OBO version is fully parseable with Fastobo and therefore with Pronto. But ontologies like the Uberon anatomy ontology and the Cell Ontology (CL) do use constructs that make them unusable with Fastobo/Pronto.
Verdict: Pronto is good when working with OBO files, with the important caveat that not all OBO files will be supported. If your files are supported, then Pronto is quite fast and has a reasonably intuitive interface.
Fastobo is a Rust library (with Python bindings) specifically intended to parse OBO files. As noted above, it is the backend used by the Pronto library when loading from OBO files.
Here’s a Fastobo version of the OntologyWrapper class we’ve seen above with Pronto:3
```python
import fastobo
from fastobo.doc import OboDoc
from fastobo.term import TermFrame, NameClause, SynonymClause


class OntologyWrapper:
    ont: OboDoc
    curies_by_label: dict[str, str] = {}
    curies_by_synonym: dict[str, str] = {}
    labels_by_curie: dict[str, str] = {}

    def __init__(self, path: str, prefix: str):
        self.ont = fastobo.load(path)
        for term in [
            frame
            for frame in self.ont
            if type(frame) == TermFrame and frame.id.prefix == prefix
        ]:
            curie = term.id.prefix + ":" + term.id.local
            for clause in term:
                if type(clause) == NameClause:
                    self.curies_by_label[clause.name] = curie
                    self.labels_by_curie[curie] = clause.name
                elif type(clause) == SynonymClause:
                    if clause.synonym.scope == "EXACT":
                        self.curies_by_synonym[clause.synonym.desc] = curie

    def lookup(self, s: str) -> str:
        curie = self.curies_by_label.get(s)
        if curie:
            return f"{s} ; {curie}"
        else:
            curie = self.curies_by_synonym.get(s)
            if curie:
                label = self.labels_by_curie[curie]
                return f"{s} -> {label} ; {curie}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.obo", "FBbt")
```
Compared to Pronto, Fastobo has a lower-level interface that is much closer to the structure of an OBO file. It helps to be familiar with that structure (e.g. to know what “frames” and “clauses” are) to understand how the library can be used. For example, if a class has a logical definition, what would normally be represented as a single EquivalentClasses axiom in a higher-level library will be represented in Fastobo as a list of IntersectionOfClause objects, because logical definitions are represented in the OBO format as a list of intersection_of tags.
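For illustration, here is what such a logical definition looks like in OBO syntax (a made-up frame, not an actual FBbt term): each intersection_of line becomes its own clause object when parsed by Fastobo.

```
[Term]
id: FBbt:99999999
name: example term
intersection_of: FBbt:00005069 ! some genus class
intersection_of: part_of FBbt:00003152 ! adult dorsal vessel
```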
Performance-wise, Fastobo lives up to its name: the library is fast. In fact, it is the fastest of all the libraries tested here. It reads FBbt in less than half a second – no other library comes in under the one-second mark.
Unfortunately, Fastobo suffers from the same problem as Pronto (expectedly, since Pronto uses Fastobo under the hood): it will not be able to read all OBO files, including the OBO versions of Uberon and CL.
In fairness to Fastobo, it’s not really the library’s fault. Fastobo implements the OBO format as described in the closest thing to a formal specification that the format has: the OBO Flat File Format 1.4 Syntax and Semantics. But that document has little normative value. OBO hackers don’t seem to feel particularly constrained by it (certainly less than an OWL hacker feels compelled to follow the OWL specifications). In effect, the OBO format is practically defined by what is produced and accepted by the OBO serialiser and parser of the OWLAPI library – the de facto reference implementation of the format – regardless of how much this deviates from the tentative specification.4
This is in fact one of the reasons I am strongly in favour of ditching the OBO format and using any of the other OWL serialisation formats (OWL Functional Syntax, Manchester Syntax, RDF/XML…) instead. Those formats are much better defined than the OBO format, which is, at its core, a hack – a hack that has served its purpose and which should now be allowed to rest in peace.
Verdict: Fastobo is good if ① you have only compatible OBO files, ② you know enough of the OBO format to be happy with the low-level interface, and ③ you need top-notch performance. I can’t really fault it for supporting only the OBO format (the way I would normally do for other libraries), since it is specifically an OBO library, not a generic ontology library.
Next up is Owlready2. Without further ado, here’s the code:
```python
from os.path import realpath

from owlready2 import get_ontology
from owlready2.namespace import Ontology
from owlready2.entity import ThingClass


class OntologyWrapper:
    ont: Ontology
    terms_by_label: dict[str, ThingClass] = {}
    terms_by_synonym: dict[str, ThingClass] = {}
    prefix_len: int
    prefix_name: str

    def __init__(self, path: str, prefix_name: str, prefix: str):
        self.ont = get_ontology("file://" + realpath(path))
        self.ont.load()
        for klass in [k for k in self.ont.classes() if k.iri.startswith(prefix)]:
            for label in klass.label:
                self.terms_by_label[label] = klass
            for syn in klass.hasExactSynonym:
                self.terms_by_synonym[syn] = klass
        self.prefix_len = len(prefix)
        self.prefix_name = prefix_name

    def to_curie(self, iri: str) -> str:
        return self.prefix_name + ":" + iri[self.prefix_len:]

    def lookup(self, s: str) -> str:
        term = self.terms_by_label.get(s)
        if term:
            curie = self.to_curie(term.iri)
            return f"{term.label[0]} ; {curie}"
        else:
            term = self.terms_by_synonym.get(s)
            if term:
                curie = self.to_curie(term.iri)
                return f"{s} -> {term.label[0]} ; {curie}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.owl", "FBbt", "http://purl.obolibrary.org/obo/FBbt_")
```
Owlready2 supports reading from RDF/XML, OWL/XML, and N-Triples. It does not support the OBO format (which I don’t mind – I’d rather have a library that supports RDF/XML but not OBO than the other way around). More annoyingly, it does not support OWL Functional Syntax.
Regardless of the syntax, Owlready2 does not support the presence of punned entities. It is a known issue that is seemingly going to be left unfixed on purpose. This unfortunately means the library cannot be used to work with a standard release of CL, which does contain a few punned entities.
Also, because it is an OWL library and not an OBO library, it has – expectedly – no built-in concept of “CURIE”. Entities are only ever identified by their full-length IRI; if you need CURIEs for some reason, you have to shorten the identifiers yourself (as the code above does). This may come as an annoyance to the OBO folks, but I personally don’t mind – in fact, I prefer it that way.
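The shortening is simple enough to do by hand. A minimal sketch, assuming OBO Foundry-style IRIs and a hand-maintained prefix map (the helper name and the CL entry are illustrative, not from any library):

```python
# Hypothetical helper: shorten full-length IRIs to CURIEs using a
# hand-maintained prefix map (only the FBbt entry is needed for this post).
PREFIXES = {
    "FBbt": "http://purl.obolibrary.org/obo/FBbt_",
    "CL": "http://purl.obolibrary.org/obo/CL_",
}


def to_curie(iri: str) -> str:
    for name, expansion in PREFIXES.items():
        if iri.startswith(expansion):
            return name + ":" + iri[len(expansion):]
    return iri  # No known prefix: return the full IRI unchanged


print(to_curie("http://purl.obolibrary.org/obo/FBbt_00003152"))
# FBbt:00003152
```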
However, I do dislike the way annotations are made accessible in the interface: they appear as built-in attributes of the object representing the entity, under a name derived from the name of the annotation property (as in klass.hasExactSynonym to access an annotation with the http://www.geneontology.org/formats/oboInOwl#hasExactSynonym property). From experience, this kind of syntactic sugar regularly ends up doing more harm than good. For example, what would happen if the ontology had two different annotation properties with an identical local name, but in two different namespaces? I did the test with a class carrying both a http://www.geneontology.org/formats/oboInOwl#hasExactSynonym annotation and a https://example.org/hasExactSynonym annotation: klass.hasExactSynonym only returns the latter annotation value, and I have no idea how to get the former, or if it is even possible. (Granted, I do not have any real ontology where this happens, but I don’t think this is a far-fetched scenario.)
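The root of the problem is easy to model: if annotation values end up keyed on the local name of the property alone, two properties sharing a local name silently overwrite each other. A contrived sketch of the effect (this is an illustration of the failure mode, not Owlready2’s actual internals):

```python
# Contrived illustration: store annotation values under the local name
# of their property, as attribute-based access effectively does.
def local_name(iri: str) -> str:
    # Strip everything up to the last '#' or '/'
    return iri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]


annotations: dict[str, str] = {}
for prop, value in [
    ("http://www.geneontology.org/formats/oboInOwl#hasExactSynonym", "T neuron T2"),
    ("https://example.org/hasExactSynonym", "bogus synonym"),
]:
    annotations[local_name(prop)] = value  # the second assignment silently wins

print(annotations["hasExactSynonym"])
# bogus synonym
```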
Performance-wise, Owlready2 reads the RDF/XML version of FBbt in about 7 seconds on my machine, which is not too bad (some other libraries fare far worse) but is at the upper end of what I am willing to accept.
Verdict: Owlready2 is good if ① you are not working with OBO files, ② you need not worry about the presence of punned entities (lucky you!), and ③ you do not mind the quirks of the interface. For my part, I will avoid it, since I do mind those quirks, as explained above, and I do have punned entities in at least some of the ontologies I work with.
PyOBO leaves me with very mixed feelings.
On the surface, that library seemingly offers the easiest way to do exactly what I need, with the pyobo.ground high-level function:
```
>>> pyobo.ground("fbbt", "T neuron T2")
NormalizedNamableReference(prefix="fbbt", identifier="00003728", name="T2 neuron")
```
Nice, right?
Well, except for one thing: this function is automatically using the latest published version of FBbt, downloaded from the Internet if it is not already in PyOBO’s local cache. That’s great if you are a user of FBbt, but in my case I edit the ontology and most of the time, I want to work with the “development” version of the ontology – the version that is checked out locally on my computer and that may contain dozens, sometimes hundreds, of edits that have not yet found their way to a published version.
Unfortunately, there doesn’t seem to be a way to force pyobo.ground to use a local version of an ontology. I was hoping it would be possible to manually load an ontology from file and put it into PyOBO’s cache, like this:
```
>>> fbbt = pyobo.from_obo_path("fbbt.obo", prefix="FBbt", version="dev")
>>> fbbt.write_cache()
```
But while this does write a bunch of files in the cache, it has no effect on pyobo.ground, which will always attempt to download FBbt from the Internet no matter what is in the cache. At this point I am not sure if it’s a bug, or if what I’m trying to do is simply not a supported use case – it certainly doesn’t look like a supported use case according to what little documentation is available.
So, I have to forget about pyobo.ground, and more generally about almost all the high-level functions in pyobo (which share the same tendency to always download a fresh version from the Internet), and instead work with the ontology object returned by pyobo.from_obo_path – an object whose methods will at least use the data read from the provided file.
Here’s then the PyOBO version of the OntologyWrapper class:
```python
from pyobo import from_obo_path, Obo, Term


class OntologyWrapper:
    ont: Obo
    prefix: str
    terms_by_label: dict[str, Term] = {}
    terms_by_synonym: dict[str, Term] = {}

    def __init__(self, path: str, prefix: str):
        self.ont = from_obo_path(path, prefix, version="dev")
        self.prefix = prefix
        for term in self.ont.iter_terms():
            self.terms_by_label[term.name] = term
            for synonym in [s for s in term.synonyms if s.specificity == "EXACT"]:
                self.terms_by_synonym[synonym.name] = term

    def lookup(self, s: str) -> str:
        term = self.terms_by_label.get(s)
        if term:
            return f"{term.name} ; {self.prefix}:{term.identifier}"
        else:
            term = self.terms_by_synonym.get(s)
            if term:
                return f"{s} -> {term.name} ; {self.prefix}:{term.identifier}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.obo", "FBbt")
```
The interface overall is very similar to that of other libraries like Pronto. But the performance is painfully bad: it takes 45 seconds to read FBbt, which makes PyOBO the slowest of the OBO parsers I have tested, by a large margin.
Verdict: I suppose PyOBO must be very useful for people who just want to use information from OBO ontologies without ever worrying about how said information is obtained – in fact, all the examples given in the project’s README do just that. But it is clearly not intended for working efficiently with local ontologies, which happens to be what I need to do the most.
Then comes FunOWL, which seems much more useful for building ontologies than for querying one – as readily acknowledged in the project’s README: the library is firstly intended as a generator, and only secondarily as a consumer. As far as I can tell, the library offers no function to, say, get a given class in an ontology, or even to get all classes. Once an ontology has been parsed, all we can get from it is the complete list of axioms, which we must sift through to obtain the information we need. So it’s a very low-level interface, a bit like Fastobo for OBO files.
Here’s what the OntologyWrapper class could look like with FunOWL:
```python
from funowl import AnnotationAssertion, OntologyDocument
from funowl.converters.functional_converter import to_python


class OntologyWrapper:
    ont: OntologyDocument
    curies_by_label: dict[str, str] = {}
    curies_by_synonym: dict[str, str] = {}
    labels_by_curie: dict[str, str] = {}

    def __init__(self, path: str, prefix: str):
        self.ont = to_python(path)
        for axiom in [
            ax
            for ax in self.ont.ontology.axioms
            if type(ax) == AnnotationAssertion and ax.subject.v.v.startswith(prefix)
        ]:
            curie = self.to_curie(axiom.subject.v.v)
            if axiom.property.v == "rdfs:label":
                label = axiom.value.v.v
                self.curies_by_label[label] = curie
                self.labels_by_curie[curie] = label
            elif axiom.property.v == "oboInOwl:hasExactSynonym":
                self.curies_by_synonym[axiom.value.v.v] = curie

    def to_curie(self, s: str) -> str:
        return ":".join(s.split(":", 1)[1].split("_", 1))

    def lookup(self, s: str) -> str:
        curie = self.curies_by_label.get(s)
        if curie:
            return f"{s} ; {curie}"
        else:
            curie = self.curies_by_synonym.get(s)
            if curie:
                label = self.labels_by_curie[curie]
                return f"{s} -> {label} ; {curie}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.ofn", "obo:FBbt_")
```
Alas, FunOWL’s parsing performance is horrendously bad: it reads the Functional Syntax version of FBbt in… 3 minutes and 30 seconds! Also, it fails to parse the latest version of CL for some reason (some uncaught AttributeError; I didn’t investigate further, since at this point it was clear I was not going to use the library anyway).
Verdict: I will make no judgment on FunOWL’s usefulness as an OWL generator (its primary intended use), as I did not test that aspect of the library at all (generating ontologies is not something I often need to do, or rather not something I often need to do in Python). But as a Functional Syntax parser, it is practically unusable.
Py-Horned-OWL is a set of Python bindings for the Horned-OWL Rust library.
As usual, here is the Py-Horned-OWL version of the OntologyWrapper class:
```python
from pyhornedowl import open_ontology, PyIndexedOntology


class OntologyWrapper:
    ont: PyIndexedOntology
    iris_by_label: dict[str, str] = {}
    iris_by_synonym: dict[str, str] = {}
    labels_by_iri: dict[str, str] = {}

    def __init__(self, path: str, prefix_name: str, prefix: str):
        self.ont = open_ontology(path)
        self.ont.add_prefix_mapping(prefix_name, prefix)
        self.ont.add_prefix_mapping("rdfs", "http://www.w3.org/2000/01/rdf-schema#")
        self.ont.add_prefix_mapping(
            "oio", "http://www.geneontology.org/formats/oboInOwl#"
        )
        self.ont.build_indexes()
        for klass in [c for c in self.ont.get_classes() if c.startswith(prefix)]:
            for label in self.ont.get_annotations(klass, "rdfs:label"):
                self.iris_by_label[label] = klass
                self.labels_by_iri[klass] = label
            for synonym in self.ont.get_annotations(klass, "oio:hasExactSynonym"):
                self.iris_by_synonym[synonym] = klass

    def lookup(self, s: str) -> str:
        klass = self.iris_by_label.get(s)
        if klass:
            curie = self.ont.get_id_for_iri(klass)
            return f"{s} ; {curie}"
        else:
            klass = self.iris_by_synonym.get(s)
            if klass:
                curie = self.ont.get_id_for_iri(klass)
                label = self.labels_by_iri[klass]
                return f"{s} -> {label} ; {curie}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.owl", "FBbt", "http://purl.obolibrary.org/obo/FBbt_")
```
The library supports the RDF/XML, OWL/XML, and Functional Syntax formats, can read without a glitch all the ontologies I routinely work with, and does so with very decent performance (about 2 seconds to read the RDF/XML version of FBbt).
Its interface is somewhat reminiscent of that of the Java OWLAPI, for example in that the annotations for a given entity must be obtained from the top-level ontology object, rather than from an object representing the entity itself (as can be seen in the wrapper above: ont.get_annotations(klass, ...), rather than klass.get_annotations(...)). I suspect many OBO folks might not like that and would prefer a Pronto- or Owlready2-like interface, but I personally don’t mind at all. In fact, for someone used to the OWLAPI, Py-Horned-OWL’s interface has a nice “homey” feeling.
Verdict: Py-Horned-OWL is great if you ① are working with any kind of OWL files, ② have no need for OBO support, and ③ are a defector from the land of Java who misses the OWLAPI. ;) In fact, Py-Horned-OWL is the closest thing to the ideal Python OWL library that I have been looking for – I only wish I had discovered it sooner.
Finally, there’s the Ontology Access Kit (OAK).
This one is a bit special in that it does not provide its own parsers (apart from the “simpleobo” OBO parser), but instead aims to provide a common, high-level interface to several underlying libraries and their parsers.
Here’s the OAK-based OntologyWrapper class:
```python
from oaklib import get_adapter
from oaklib.datamodels.search import SearchConfiguration, SearchProperty
from oaklib.interfaces import SearchInterface


class OntologyWrapper:
    ont: SearchInterface
    cfg: SearchConfiguration = SearchConfiguration(
        properties=[SearchProperty.LABEL, SearchProperty.ALIAS]
    )
    prefix: str

    def __init__(self, selector: str, prefix: str):
        self.ont = get_adapter(selector)
        self.prefix = prefix

    def lookup(self, s: str) -> str:
        found = [
            c
            for c in self.ont.basic_search(s, config=self.cfg)
            if c.startswith(self.prefix)
        ]
        if not found:
            return f"Unknown term: {s}"
        found = found[0]
        label = self.ont.label(found)
        if label == s:
            return f"{label} ; {found}"
        else:
            return f"{s} -> {label} ; {found}"


wrapper = OntologyWrapper("pronto:fbbt.obo", "FBbt")
```
It is significantly different from all the other versions, since the OAK high-level interface spares us from having to do most of the heavy lifting.
We can’t really discuss the performance and limitations (e.g. in terms of supported formats) of the Ontology Access Kit itself, since they are almost entirely dependent on the library used under the hood.
In the example above, the backend library is Pronto, and we thus inherit the aforementioned limitations of that library, including the inability to parse some OBO files (such as recent versions of CL) and slower performance when parsing an RDF/XML file compared to an OBO file.
The best performance, sometimes by a large margin, is obtained with the sqlite backend (which seems clearly intended as the “primary” backend), which requires that the ontology be converted to the SemSQL format first. Assuming a fbbt.owl file exists in the current directory, this can be done with:

```
$ semsql make fbbt.db
```
On my machine, this takes approximately 3 minutes and yields a SQLite file of about 2.5GB (from a 112MB RDF/XML file). But this only needs to be done once (at least as long as the ontology doesn’t change), and it is really the only way to ① load any ontology (bypassing all the limitations of the other backends) and ② get decent performance.
You can also let OAK automatically download and cache pre-built SemSQL versions of the most common OBO ontologies, if you do not want to do that yourself. But in my case, as explained above when discussing PyOBO, this is not an option, as I need to work with my local version of FBbt, not the latest published version.
Verdict: The OAK is good if ① you want/need the high-level interface or features and ② you do not mind having to generate SemSQL versions of your ontologies.
For completeness, let us try RDFLib. It is not, strictly speaking, a library for manipulating ontologies, but since an OWL ontology can be represented as an RDF graph, it can also be manipulated as one.
Its use is very similar to that of FunOWL above, except that instead of sifting through axioms, we have to sift through triples – which in effect makes little difference for the use case considered here, since we are only interested in annotation assertion axioms, and each one of those corresponds to exactly one triple.
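For instance, the label of the class from the earlier example boils down to a single triple, shown here in Turtle syntax for readability:

```
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix obo: <http://purl.obolibrary.org/obo/> .

obo:FBbt_00003152 rdfs:label "adult dorsal vessel" .
```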
Here’s the RDFLib-based version of the OntologyWrapper class:
```python
from typing import ClassVar

from rdflib import Graph, URIRef


class OntologyWrapper:
    ont: Graph
    iris_by_label: dict[str, str] = {}
    iris_by_synonym: dict[str, str] = {}
    labels_by_iri: dict[str, str] = {}
    prefix_name: str
    prefix_len: int

    LABEL: ClassVar = URIRef("http://www.w3.org/2000/01/rdf-schema#label")
    SYNONYM: ClassVar = URIRef(
        "http://www.geneontology.org/formats/oboInOwl#hasExactSynonym"
    )

    def __init__(self, path: str, prefix_name: str, prefix: str):
        self.ont = Graph().parse(path)
        self.prefix_name = prefix_name
        self.prefix_len = len(prefix)
        for subject in [
            s
            for s in self.ont.subjects(unique=True)
            if type(s) == URIRef and s.startswith(prefix)
        ]:
            for label in self.ont.objects(subject=subject, predicate=self.LABEL):
                self.iris_by_label[str(label)] = str(subject)
                self.labels_by_iri[str(subject)] = str(label)
            for synonym in self.ont.objects(subject=subject, predicate=self.SYNONYM):
                self.iris_by_synonym[str(synonym)] = str(subject)

    def to_curie(self, iri: str) -> str:
        return self.prefix_name + ":" + iri[self.prefix_len:]

    def lookup(self, s: str) -> str:
        iri = self.iris_by_label.get(s)
        if iri:
            curie = self.to_curie(iri)
            return f"{s} ; {curie}"
        else:
            iri = self.iris_by_synonym.get(s)
            if iri:
                label = self.labels_by_iri[iri]
                curie = self.to_curie(iri)
                return f"{s} -> {label} ; {curie}"
            else:
                return f"Unknown term: {s}"


wrapper = OntologyWrapper("fbbt.owl", "FBbt", "http://purl.obolibrary.org/obo/FBbt_")
```
This works for all the ontologies I need to work with, but, performance-wise, it is a bit too slow for my taste (about 20 seconds to parse FBbt). Still, it would have been a reasonable fallback for working with CL or Uberon if Py-Horned-OWL had not been available.