Ontology

Concept

The pyhpo.Ontology is the main component of hpo3. It contains references to all pyhpo.HPOTerm, pyhpo.Gene and disease (pyhpo.Omim, pyhpo.Orpha) objects. It is provided as a singleton and must be instantiated once to load all terms and annotations. Afterwards, the complete Ontology is available in the global scope across all submodules.

hpo3 ships with a built-in Ontology that contains all terms, genes and diseases. You can also load your own version with custom annotations.

Instantiation

The Ontology must be instantiated once in every running program. This loads all HPO terms, their connections and annotations into memory.

Omitting all arguments will automatically load the built-in version. Alternatively, you can specify a binary data file or a folder that contains the JAX standard HPO data files.

Ontology()

Parameters
data_folder

(str) Path to the source files (default: None). Leave blank to load the built-in Ontology (recommended).

from_obo_file

(bool) Whether the input is in the standard JAX HPO format (default: True). Set to False to load a binary data source.

transitive

(bool) Load the ontology transitively, i.e. use the phenotype_to_genes.txt source to link terms to genes. This means that if an HPO term is linked to a gene, all its parent terms are automatically linked to that gene as well (default: False).
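To illustrate what transitive linking means, here is a minimal standalone sketch that propagates gene links from each term to all of its ancestors. It does not use hpo3 and is not how the library implements this. The PARENTS and DIRECT_GENES mappings, the term labels, the gene names and the transitive_genes helper are all made up for illustration.

```python
# Toy hierarchy: child term -> list of parent terms (not real HPO data)
PARENTS = {
    "C": ["B"],
    "B": ["A"],
    "A": [],
}

# Genes annotated directly to each term (invented for illustration)
DIRECT_GENES = {"C": {"GENE1"}, "B": {"GENE2"}, "A": set()}


def transitive_genes(parents, direct):
    """Link every gene of a term to all ancestors of that term as well."""
    linked = {term: set(genes) for term, genes in direct.items()}
    for term, genes in direct.items():
        stack = list(parents.get(term, []))
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue
            seen.add(ancestor)
            linked.setdefault(ancestor, set()).update(genes)
            stack.extend(parents.get(ancestor, []))
    return linked


links = transitive_genes(PARENTS, DIRECT_GENES)
print(sorted(links["A"]))  # ['GENE1', 'GENE2']
```

The root term "A" has no direct genes, but ends up linked to the genes of both of its descendants.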

Returns

None (calling Ontology instantiates the global Ontology singleton)

Examples

from pyhpo import Ontology

# load built-in default ontology
Ontology()

# check the release date of the HPO
Ontology.version()
# ==> '2024-04-26'

term = Ontology.hpo(11968)
term.name    # ==> 'Feeding difficulties'
term.id      # ==> 'HP:0011968'
int(term)    # ==> 11968

from pyhpo import Ontology

# load custom data from a local directory
Ontology("/path/to/folder/")
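The choice between the built-in data, a folder of JAX files and a binary source could be wrapped in a small helper such as the sketch below. The load_ontology function and its binary flag are hypothetical. The keyword from_obo_file is taken from the parameter list above, and the ontology argument is expected to be pyhpo.Ontology, passed in so that the helper itself has no import-time dependency.

```python
def load_ontology(ontology, data_folder=None, binary=False):
    """Instantiate the ontology singleton from the built-in or a custom source.

    `ontology` is expected to be `pyhpo.Ontology` (assumption: it accepts
    `data_folder` positionally and `from_obo_file` as a keyword, as listed
    in the parameter description).
    """
    if data_folder is None:
        # No arguments: load the built-in ontology (recommended)
        ontology()
    else:
        # from_obo_file=False selects a binary data source
        ontology(data_folder, from_obo_file=not binary)
```

With hpo3 installed, load_ontology(Ontology) would load the built-in version, and load_ontology(Ontology, "/path/to/folder/") would load the JAX files from that folder.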

The following code with multiple modules works, because the Ontology needs to be loaded only once:

File main.py

from pyhpo import Ontology

import submodule
from submodule import foo

Ontology()

foo()
submodule.bar()

File submodule.py

from pyhpo import Ontology

def foo():
    print(len(Ontology))

def bar():
    print(len(Ontology))
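If a submodule might also run on its own (for example in a test) before main.py has loaded the Ontology, a guard along these lines could be used. This is only a sketch: ensure_loaded is a hypothetical helper, and it relies on version() raising NameError before construction, as documented in the API section.

```python
def ensure_loaded(ontology):
    """Load the built-in ontology unless it has already been constructed."""
    try:
        # version() is documented to raise NameError before construction
        ontology.version()
    except NameError:
        ontology()
    return ontology
```

Calling ensure_loaded(Ontology) at the top of foo() or bar() would then load the built-in data only when no other module has done so yet.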

API

Due to a limitation of Sphinx (or my understanding of it), the Ontology object here is written as _Ontology. Please disregard the underscore :)

class _Ontology
genes

A list of all genes included in the ontology

Returns

All genes that are associated with a pyhpo.HPOTerm in the ontology

Return type

list[pyhpo.Gene]

Important

The return type of this method will very likely change into an Iterator of Gene. (Info about likely API changes)

Raises

NameError – Ontology not yet constructed

get_hpo_object(query)

Returns a single HPOTerm based on its name or id

Parameters

query (str or int) –

  • str HPO term (e.g.: Scoliosis)

  • str HPO-ID (e.g.: HP:0002650)

  • int HPO term id (e.g.: 2650)

Returns

A single matching HPO term instance

Return type

pyhpo.HPOTerm

Raises
  • NameError – Ontology not yet constructed

  • RuntimeError – No HPO term is found for the provided query

  • TypeError – The provided query is an unsupported type and can’t be properly converted

  • ValueError – The provided HPO ID cannot be converted to the correct integer representation

Examples

from pyhpo import Ontology
Ontology()

# Search by ID (int)
Ontology.get_hpo_object(3)
# >> HP:0000003 | Multicystic kidney dysplasia

# Search by HPO-ID (string)
Ontology.get_hpo_object('HP:0000003')
# >> HP:0000003 | Multicystic kidney dysplasia

# Search by term (string)
Ontology.get_hpo_object('Multicystic kidney dysplasia')
# >> HP:0000003 | Multicystic kidney dysplasia

Note

This method differs slightly from pyhpo, because it does not fall back to the synonym for searching
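The relationship between the HPO-ID string and the integer ID accepted by this method can be sketched as follows. The helpers to_int_id and to_hpo_id are hypothetical and are not part of hpo3. They only illustrate the conversion and assume the seven-digit, zero-padded form used in the examples.

```python
def to_int_id(query):
    """Convert 'HP:0002650' or 2650 to the integer 2650.

    Raises ValueError for strings that are not of the form 'HP:<digits>'.
    """
    if isinstance(query, int):
        return query
    prefix, sep, digits = query.partition(":")
    if prefix != "HP" or not sep or not digits.isdigit():
        raise ValueError(f"Not an HPO-ID: {query!r}")
    return int(digits)


def to_hpo_id(term_id):
    """Format an integer such as 2650 as the HPO-ID 'HP:0002650'."""
    return f"HP:{term_id:07d}"


print(to_int_id("HP:0002650"))  # 2650
print(to_hpo_id(2650))          # HP:0002650
```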

hpo(id)

Returns the HPOTerm with the provided id

Parameters

id (int) – ID of the term as int (HP:0000123 –> 123)

Returns

The HPO-Term

Return type

pyhpo.HPOTerm

Raises
  • NameError – Ontology not yet constructed

  • KeyError – No HPO term is found for the provided query

Examples

from pyhpo import Ontology

Ontology()

term = Ontology.hpo(11968)
term.name    # >> 'Feeding difficulties'
term.id      # >> 'HP:0011968'
int(term)    # >> 11968

match(query)

Returns a single HPOTerm based on its name

Parameters

query (str) – Name of the HPO term, e.g. Scoliosis

Returns

A single matching HPO term instance

Return type

pyhpo.HPOTerm

Raises
  • NameError – Ontology not yet constructed

  • RuntimeError – No HPO term is found for the provided query

Examples

from pyhpo import Ontology
Ontology()

Ontology.match('Multicystic kidney dysplasia')
# >> HP:0000003 | Multicystic kidney dysplasia

omim_diseases

A list of all Omim Diseases included in the ontology

Returns

All Omim diseases that are associated with a pyhpo.HPOTerm in the ontology

Return type

list[pyhpo.Omim]

Important

The return type of this method will very likely change into an Iterator of Omim. (Info about likely API changes)

Raises

NameError – Ontology not yet constructed

orpha_diseases

A list of all Orpha Diseases included in the ontology

Returns

All Orpha diseases that are associated with a pyhpo.HPOTerm in the ontology

Return type

list[pyhpo.Orpha]

Important

The return type of this method will very likely change into an Iterator of Orpha. (Info about likely API changes)

Raises

NameError – Ontology not yet constructed

path(query1, query2)

Returns the shortest path from one HPO term to another

Parameters
  • query1 (str or int) – Name (e.g. Abnormality of the nervous system), HPO-ID (e.g. HP:0040064) or integer ID of the source term

  • query2 (str or int) – Name (e.g. Abnormality of the nervous system), HPO-ID (e.g. HP:0040064) or integer ID of the target term

Returns

  • int – Length of path

  • list – List of HPOTerms in the path

  • int – Number of steps from query1 to the common parent (Not implemented. Returns 0)

  • int – Number of steps from query2 to the common parent (Not implemented. Returns 0)

Raises
  • NameError – Ontology not yet constructed

  • RuntimeError – No HPO term is found for the provided query

  • TypeError – The provided query is an unsupported type and can’t be properly converted

  • ValueError – The provided HPO ID cannot be converted to the correct integer representation
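Since path returns a 4-tuple, callers will usually unpack it. A minimal sketch, assuming a loaded Ontology is passed in as the ontology argument; path_summary is a hypothetical helper, not part of hpo3.

```python
def path_summary(ontology, source, target):
    """Return the path length and the integer IDs of the terms along the path."""
    # path() returns a 4-tuple; the last two values are documented as
    # not implemented (always 0), so they are ignored here
    length, terms, _, _ = ontology.path(source, target)
    return length, [int(term) for term in terms]
```

In the documented example for this method the length is 8 and the list holds 9 terms, so the length counts the steps between terms rather than the terms themselves.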

Examples

from pyhpo import Ontology
Ontology()

Ontology.path(40064, 'Multicystic kidney dysplasia')
# >> (
# >>     8,
# >>     [
# >>         <HpoTerm (HP:0040064)>, <HpoTerm (HP:0000118)>,
# >>         <HpoTerm (HP:0000119)>, <HpoTerm (HP:0000079)>,
# >>         <HpoTerm (HP:0010935)>, <HpoTerm (HP:0000077)>,
# >>         <HpoTerm (HP:0012210)>, <HpoTerm (HP:0000107)>,
# >>         <HpoTerm (HP:0000003)>
# >>     ],
# >>     0,
# >>     0
# >> )

search(query)

Returns a list of HPOTerms that match the query

Parameters

query (str) – Query for substring search of HPOTerms

Returns

All terms matching the query string

Return type

list[HPOTerm]

Important

The return type of this method will very likely change into an Iterator of HPOTerm. (Info about likely API changes)

Raises

NameError – Ontology not yet constructed
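Because the return type may change from a list to an iterator, code that only iterates over the result (and avoids len() or indexing) should keep working either way. A minimal sketch, with the hypothetical helper search_labels and a loaded Ontology passed in as ontology:

```python
def search_labels(ontology, query):
    """Return the printable form of every term matching `query`.

    Only iterates over the result, so it works whether search() returns
    a list or an iterator.
    """
    return [str(term) for term in ontology.search(query)]
```

The same pattern applies to the other members marked as likely to change, such as genes, omim_diseases and orpha_diseases.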

Examples

from pyhpo import Ontology
Ontology()

for term in Ontology.search("kidney dis"):
    print(term)

# >> HP:0003774 | Stage 5 chronic kidney disease
# >> HP:0012622 | Chronic kidney disease
# >> HP:0012623 | Stage 1 chronic kidney disease
# >> HP:0012624 | Stage 2 chronic kidney disease
# >> HP:0012625 | Stage 3 chronic kidney disease
# >> HP:0012626 | Stage 4 chronic kidney disease

set_custom_information_content(scores)

Adds custom Information content scores to the Ontology

This method sets custom information content scores for the whole Ontology. It can be called only once and cannot be used multiple times.

Important

This method might not be thread safe. If you use multi-threading, please ensure that no background tasks are running before setting custom information content scores.

Parameters

scores (list[(int, float)]) – A list of tuples containing the HPOTermId and the score

Returns

Replaces the Ontology with a new one

Return type

None

Raises
  • NameError – Ontology not yet constructed

  • KeyError – No HPO term is found for the provided query

  • RuntimeError – The custom scores could not be added. This could be the case if custom scores were already assigned previously. Assigning custom scores is only possible once and can’t be done multiple times.
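As a rough sketch of how the scores argument could be prepared, the helper below builds the list of (term ID, score) tuples from annotation counts. Everything here is an assumption for illustration: scores_from_counts is hypothetical, the counts are invented, the score is computed as ln(total / count) (one common definition of information content), and terms with a count of zero are given a score of 0.0.

```python
import math


def scores_from_counts(counts, total):
    """Build (term_id, score) tuples using IC = ln(total / count).

    Terms with a count of zero get a score of 0.0 to avoid dividing by zero.
    """
    scores = []
    for term_id, count in counts.items():
        ic = math.log(total / count) if count > 0 else 0.0
        scores.append((term_id, ic))
    return scores


# Hypothetical counts: term 118 annotated in all 4 records, term 3 in 1 of 4
scores = scores_from_counts({118: 4, 3: 1}, total=4)
print(scores)
```

The resulting list has the shape expected by set_custom_information_content: one tuple per term, with an integer ID and a float score.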

Examples

from pyhpo import Ontology

Ontology()

# Assign random information scores based on the term name length...
# For demonstration purposes, we do this more verbosely than needed
scores = []
for term in Ontology:
    term_id = int(term)
    custom_ic_score = len(term.name) / 100
    score = (term_id, custom_ic_score)
    scores.append(score)

Ontology.set_custom_information_content(scores)

print(Ontology.hpo(118).information_content.custom)
# >> 0.2199999988079071

version()

Returns the HPO version

Returns

The HPO version, e.g. 2023-04-05

Return type

str

Raises

NameError – Ontology not yet constructed

Examples

from pyhpo import Ontology

Ontology()

Ontology.version()
# >> "2023-04-05"
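If the release needs to be compared against a date, the version string can be parsed with the standard library. This sketch assumes the version always has the YYYY-MM-DD form shown in the examples; release_date is a hypothetical helper, and a differently formatted version would raise ValueError.

```python
from datetime import date


def release_date(ontology):
    """Parse the version string (e.g. '2023-04-05') into a datetime.date."""
    return date.fromisoformat(ontology.version())
```

For example, release_date(Ontology) >= date(2023, 1, 1) could be used to check that the loaded data is recent enough.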

Iterating

You can iterate over all HPOTerms in the ontology. The iteration occurs in random order.

from pyhpo import Ontology
Ontology()

for term in Ontology:
    term.name    # e.g. 'Feeding difficulties'
    term.id      # e.g. 'HP:0011968'
    int(term)    # e.g. 11968
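When a reproducible order is needed, the terms can be sorted by their integer ID. A minimal sketch, with the hypothetical helper terms_by_id and a loaded Ontology passed in as ontology; it relies only on int(term), as used in the examples above.

```python
def terms_by_id(ontology):
    """Return all terms sorted by their integer ID for a reproducible order."""
    return sorted(ontology, key=int)
```

Note that this builds a list of all terms in memory, which is usually acceptable because the whole Ontology is already loaded.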

Length

The length of the Ontology is the number of HPOTerms it contains.

from pyhpo import Ontology
Ontology()

len(Ontology)
# ==> 18961