HPOSet
An HPOSet is a collection of HPOTerm objects that can be used to document the clinical information of a patient. The phenotypes associated with genes and diseases are also represented as HPOSets.
An HPOSet can be instantiated in multiple ways, depending on the available data types. Whichever way you choose, you must instantiate the Ontology beforehand.
Instantiation
- from_queries(queries)
Instantiate an HPOSet from various inputs
This is the most common way to instantiate an HPOSet because it accepts several kinds of input. Callers must ensure that each query parameter matches a single HPOTerm.
- Parameters
queries (list[str or int]) – Each query is one of:
str HPO term name (e.g.: Scoliosis)
str HPO-ID (e.g.: HP:0002650)
int HPO term id (e.g.: 2650)
- Returns
A new HPOSet
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
ValueError – query cannot be converted to HpoTermId
RuntimeError – No HPO term is found for the provided query
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet.from_queries([
    "HP:0002650",
    118,
    "Thoracolumbar scoliosis"
])
len(my_set)  # >> 3
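The three accepted query forms can be told apart before any lookup takes place. The sketch below is a standalone illustration, not pyhpo's implementation: `classify_query` is a hypothetical helper that only classifies the input and does not resolve it against the Ontology.

```python
from typing import Tuple, Union


def classify_query(query: Union[str, int]) -> Tuple[str, Union[str, int]]:
    """Classify a query as an integer id or as a term name.

    Hypothetical helper for illustration; it never consults the Ontology.
    """
    if isinstance(query, int):
        # Already an integer term id, e.g. 2650
        return ("id", query)
    if isinstance(query, str) and query.startswith("HP:"):
        # "HP:0002650" -> 2650; int() raises ValueError on malformed digits
        return ("id", int(query[3:]))
    if isinstance(query, str):
        # Anything else is treated as a term name, e.g. "Scoliosis"
        return ("name", query)
    raise ValueError(f"unsupported query type: {type(query).__name__}")
```

A name still has to match exactly one term in the Ontology, which is why the caller is responsible for unambiguous queries.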
- from_serialized(pickle)
Instantiate an HPOSet from a serialized HPOSet
This method is used when you have a serialized form of the HPOSet to share between applications. See pyhpo.HPOSet.serialize()
- Parameters
pickle (str) – A pickled string of all HPOTerms, e.g. 118+2650
- Returns
A new HPOSet
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
ValueError – pickled item cannot be converted to HpoTermId
KeyError – No HPO term is found for the provided query
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet.from_serialized("7+118+152+234+271+315+478+479+492+496")
len(my_set)  # >> 10
- from_gene(gene)
Instantiate an HPOSet from a Gene
- Parameters
gene (pyhpo.Gene) – A gene from the ontology
- Returns
A new HPOSet
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
Examples
from pyhpo import Ontology, HPOSet
Ontology()
gene_set = HPOSet.from_gene(Ontology.genes[0])
len(gene_set)  # >> 118
- from_disease(disease)
Instantiate an HPOSet from an Omim disease
- Parameters
disease (pyhpo.Omim) – An Omim disease from the ontology
- Returns
A new HPOSet
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
Examples
from pyhpo import Ontology, HPOSet
Ontology()
disease_set = HPOSet.from_disease(Ontology.omim_diseases[0])
len(disease_set)  # >> 18
Instance methods
- class HPOSet(terms)
- add(term)
Add an HPOTerm to the HPOSet
- Parameters
term (HPOTerm or int) – The term to add, either as an actual HPOTerm or as its integer representation
- Raises
NameError – Ontology not yet constructed
KeyError – (only when an int is used as input) HPOTerm does not exist
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet([])
my_set.add(Ontology[118])
len(my_set)  # >> 1
my_set.add(2650)
len(my_set)  # >> 2
- all_genes()
Returns a set of associated genes
- Returns
The union of genes associated with terms in the HPOSet
- Return type
set[pyhpo.Gene]
- Raises
NameError – Ontology not yet constructed
Examples
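from pyhpo import Ontology, HPOSet
Ontology()
disease = list(Ontology.omim_diseases)[0]
# Build the HPOSet of the disease, then collect the genes of its terms
disease_set = HPOSet.from_disease(disease)
for gene in disease_set.all_genes():
    print(gene.name)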
- child_nodes()
Returns a new HPOSet that does not contain ancestor terms
If a set contains HPOTerms that are ancestors of other terms in the set, those ancestors are removed. This method is useful to create a set that contains only the most specific terms.
- Returns
A new HPOSet that contains only the most specific terms
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet.from_queries([
    'HP:0002650',
    'HP:0010674',
    'HP:0000925',
    'HP:0009121'
])
child_set = my_set.child_nodes()
len(my_set)  # >> 4
len(child_set)  # >> 1
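The ancestor-removal idea can be shown on a toy hierarchy without loading the Ontology. This is a minimal sketch under stated assumptions: the `PARENTS` mapping is made-up data, and `ancestors` and `most_specific` are hypothetical helpers, not part of pyhpo.

```python
from typing import Dict, Set

# Hypothetical toy hierarchy: term -> direct parents (not real HPO data)
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "skeletal": {"root"},
    "spine": {"skeletal"},
    "scoliosis": {"spine"},
    "eye": {"root"},
}


def ancestors(term: str) -> Set[str]:
    """Collect all direct and indirect parents of a term."""
    found: Set[str] = set()
    stack = list(PARENTS[term])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found


def most_specific(terms: Set[str]) -> Set[str]:
    """Drop every term that is an ancestor of another term in the set."""
    redundant: Set[str] = set()
    for term in terms:
        redundant |= ancestors(term)
    return terms - redundant


print(sorted(most_specific({"skeletal", "spine", "scoliosis", "eye"})))
# prints ['eye', 'scoliosis']
```

Terms on separate branches are both kept, because neither is an ancestor of the other.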
- information_content(kind='omim')
Returns basic information content stats about the HPOTerms within the set
- Parameters
kind (str, default: omim) – Which kind of information content should be calculated. Options are ['omim', 'gene']
- Returns
Dict with the following items
mean - float - Mean information content
max - float - Maximum information content value
total - float - Sum of all information content values
all - list of float - List with all information content values
- Return type
dict
- Raises
NameError – Ontology not yet constructed
Examples
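The shape of the returned dictionary can be illustrated without the Ontology. The sketch below computes the four documented keys from a plain list of values; `ic_summary` is a hypothetical helper, the input values are made up, and the handling of an empty list is an assumption rather than documented behavior.

```python
from typing import Dict, List, Union


def ic_summary(values: List[float]) -> Dict[str, Union[float, List[float]]]:
    """Summarize information content values with the documented keys."""
    total = sum(values)
    return {
        # Assumption: an empty input yields 0.0 instead of raising
        "mean": total / len(values) if values else 0.0,
        "max": max(values, default=0.0),
        "total": total,
        "all": list(values),
    }


summary = ic_summary([1.0, 2.0, 3.0])
print(summary["mean"], summary["max"], summary["total"])
# prints 2.0 3.0 6.0
```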
- omim_diseases()
Returns a set of associated diseases
- Returns
The union of Omim diseases associated with terms in the HPOSet
- Return type
set[pyhpo.Omim]
- Raises
NameError – Ontology not yet constructed
Examples
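from pyhpo import Ontology, HPOSet
Ontology()
gene = list(Ontology.genes)[0]
# Build the HPOSet of the gene, then collect the diseases of its terms
gene_set = HPOSet.from_gene(gene)
for disease in gene_set.omim_diseases():
    print(disease.name)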
- remove_modifier()
Returns a new HPOSet that does not contain any modifier terms
This method removes all terms that are not children of HP:0000118 | Phenotypic abnormality
- Returns
A new HPOSet that contains only phenotype terms
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet.from_queries([
    'HP:0002650',
    'HP:0010674',
    'HP:0000925',
    'HP:0009121',
    'HP:0012823',
])
pheno_set = my_set.remove_modifier()
len(my_set)  # >> 5
len(pheno_set)  # >> 4
- replace_obsolete()
Returns a new HPOSet that replaces all obsolete terms with their replacement
If an obsolete term has a replacement term defined, it is replaced; otherwise it is removed.
- Returns
A new HPOSet in which obsolete terms are replaced or removed
- Return type
HPOSet
- Raises
NameError – Ontology not yet constructed
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet.from_queries([
    'HP:0002650',
    'HP:0010674',
    'HP:0000925',
    'HP:0009121',
    'HP:0410003',
])
active_set = my_set.replace_obsolete()
len(my_set)  # >> 5
len(active_set)  # >> 5
Ontology[410003] in my_set  # >> True
Ontology[410003] in active_set  # >> False
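The replace-or-remove rule can be sketched on plain integer ids. This is an illustration only: the `REPLACEMENTS` table holds made-up ids, and `drop_or_replace_obsolete` is a hypothetical helper, not the library's code.

```python
from typing import Dict, Optional, Set

# Hypothetical data: obsolete id -> replacement id (None = no replacement)
REPLACEMENTS: Dict[int, Optional[int]] = {9001: 118, 9002: None}


def drop_or_replace_obsolete(term_ids: Set[int]) -> Set[int]:
    """Replace obsolete ids where possible, drop them otherwise."""
    result: Set[int] = set()
    for term_id in term_ids:
        if term_id not in REPLACEMENTS:
            result.add(term_id)  # active term, kept unchanged
            continue
        replacement = REPLACEMENTS[term_id]
        if replacement is not None:
            result.add(replacement)  # obsolete term with a replacement
        # obsolete term without a replacement is dropped
    return result
```

The result can be smaller than the input, either because a term had no replacement or because a replacement was already in the set.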
- serialize()
Returns a serialized string representing the HPOSet
- Returns
A serialized string uniquely representing the HPOSet, e.g.: 3+118+2650
- Return type
str
Examples
from pyhpo import Ontology
Ontology()
gene_sets = [g.hpo_set() for g in Ontology.genes]
gene_sets[0].serialize()  # >> 7+118+152+234+271+315+478+479+492+496.....
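The examples suggest that the serialized form is a `+`-joined list of integer term ids. The round-trip sketch below rests on that reading; sorting and de-duplicating the ids is an assumption made so that equal sets yield equal strings, and both function names are hypothetical.

```python
from typing import Iterable, List


def serialize_ids(term_ids: Iterable[int]) -> str:
    """Join unique term ids in ascending order with '+' (assumed format)."""
    return "+".join(str(term_id) for term_id in sorted(set(term_ids)))


def deserialize_ids(serialized: str) -> List[int]:
    """Split a '+'-joined string back into integer term ids."""
    if not serialized:
        return []
    # int() raises ValueError for parts that are not valid integers
    return [int(part) for part in serialized.split("+")]


print(serialize_ids([2650, 118, 3, 118]))
# prints 3+118+2650
```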
- similarity(other, kind, method, combine)
Calculate similarity between this and another HPOSet
This method runs in parallel on all available CPUs
- Parameters
other (pyhpo.HPOSet) – The HPOSet to calculate the similarity to
kind (str, default: omim) – Which kind of information content to use for similarity calculation
Available options:
omim
gene
method (str, default: graphic) – The method to use to calculate the similarity.
Available options:
resnik - Resnik P, Proceedings of the 14th IJCAI, (1995)
lin - Lin D, Proceedings of the 15th ICML, (1998)
jc - Jiang J, Conrath D, ROCLING X, (1997). This differs from PyHPO
jc2 - Jiang J, Conrath D, ROCLING X, (1997). Same as jc, but kept for backwards compatibility
rel - Relevance measure - Schlicker A, et al., BMC Bioinformatics, (2006)
ic - Information coefficient - Li B, et al., arXiv, (2010)
graphic - Graph based Information coefficient - Deng Y, et al., PLoS One, (2015)
dist - Distance between terms
combine (str, default: funSimAvg) – The method to combine individual term similarities into HPOSet similarities.
Available options:
funSimAvg
funSimMax
BMA
- Returns
Similarity scores
- Return type
float
- Raises
NameError – Ontology not yet constructed
AttributeError – Invalid kind
RuntimeError – Invalid method or combine
Examples
from pyhpo import Ontology
Ontology()
gene_sets = [g.hpo_set() for g in Ontology.genes]
gene_sets[0].similarity(gene_sets[1])  # >> 0.29546087980270386
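The `combine` options describe how pairwise term similarities are reduced to one set-level score. The sketch below uses the commonly cited definitions of funSimAvg, funSimMax and best-match average; it is an illustration under that assumption, the library's own implementation may differ in detail, and `combine_scores` and the input matrix are hypothetical.

```python
from statistics import mean
from typing import List


def combine_scores(matrix: List[List[float]], combine: str = "funSimAvg") -> float:
    """Combine a term-by-term similarity matrix into one set-level score.

    matrix[i][j] is the similarity of term i in set A to term j in set B.
    """
    row_best = [max(row) for row in matrix]        # best match for each term of A
    col_best = [max(col) for col in zip(*matrix)]  # best match for each term of B
    if combine == "funSimAvg":
        return (mean(row_best) + mean(col_best)) / 2
    if combine == "funSimMax":
        return max(mean(row_best), mean(col_best))
    if combine == "BMA":
        return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))
    raise RuntimeError(f"Invalid combine method: {combine}")


# Made-up similarities between a 3-term set and a 2-term set
scores = [[1.0, 0.0], [0.5, 0.0], [0.0, 0.5]]
for name in ("funSimAvg", "funSimMax", "BMA"):
    print(name, combine_scores(scores, name))
```

The three options only differ when the sets have different sizes or unevenly distributed best matches, as in the matrix above.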
- similarity_scores(other, kind, method, combine)
Calculate similarity between this HPOSet and a list of other HPOSets
This method runs in parallel on all available CPUs
- Parameters
other (list[pyhpo.HPOSet]) – Calculate similarity between self and every provided HPOSet
kind (str, default: omim) – Which kind of information content to use for similarity calculation
Available options:
omim
gene
method (str, default: graphic) – The method to use to calculate the similarity.
Available options:
resnik - Resnik P, Proceedings of the 14th IJCAI, (1995)
lin - Lin D, Proceedings of the 15th ICML, (1998)
jc - Jiang J, Conrath D, ROCLING X, (1997). This differs from PyHPO
jc2 - Jiang J, Conrath D, ROCLING X, (1997). Same as jc, but kept for backwards compatibility
rel - Relevance measure - Schlicker A, et al., BMC Bioinformatics, (2006)
ic - Information coefficient - Li B, et al., arXiv, (2010)
graphic - Graph based Information coefficient - Deng Y, et al., PLoS One, (2015)
dist - Distance between terms
combine (str, default: funSimAvg) – The method to combine individual term similarities into HPOSet similarities.
Available options:
funSimAvg
funSimMax
BMA
- Returns
Similarity scores for every comparison
- Return type
list[float]
- Raises
NameError – Ontology not yet constructed
KeyError – Invalid kind
RuntimeError – Invalid method or combine
Examples
from pyhpo import Ontology
Ontology()
gene_sets = [g.hpo_set() for g in Ontology.genes]
similarities = gene_sets[0].similarity_scores(gene_sets)
similarities[0:4]  # >> [1.0, 0.5000048279762268, 0.29546087980270386, 0.5000059008598328]
- terms()
Returns the HPOTerms in the set
- Returns
A list of every term in the set
- Return type
list[pyhpo.HPOTerm]
Important
The return type of this method will very likely change into an Iterator of HPOTerm. (Info about likely API changes)
- Raises
NameError – Ontology not yet constructed
KeyError – No HPO term is found for the provided query
Examples
from pyhpo import Ontology
Ontology()
my_set = list(Ontology.genes)[0].hpo_set()
for term in my_set.terms():
    print(term.name)
- toJSON(verbose)
Returns a dict/JSON representation of the HPOSet
- Parameters
verbose (bool) – Indicates if each HPOTerm should contain verbose information. See pyhpo.HPOTerm.toJSON()
- Returns
Dict representation of all HPOTerms in the set that can be used for JSON serialization
- Return type
Dict
- Raises
NameError – Ontology not yet constructed
KeyError – No HPO term is found for the provided query
Examples
from pyhpo import Ontology, HPOSet
Ontology()
my_set = HPOSet.from_serialized("7+118+152+234+271+315+478+479+492+496")
my_set.toJSON()
# >> [
# >>     {'name': 'Autosomal recessive inheritance', 'id': 'HP:0000007', 'int': 7},
# >>     {'name': 'Phenotypic abnormality', 'id': 'HP:0000118', 'int': 118},
# >>     {'name': 'Abnormality of head or neck', 'id': 'HP:0000152', 'int': 152},
# >>     {'name': 'Abnormality of the head', 'id': 'HP:0000234', 'int': 234},
# >>     {'name': 'Abnormality of the face', 'id': 'HP:0000271', 'int': 271},
# >>     {'name': 'Abnormality of the orbital region', 'id': 'HP:0000315', 'int': 315},
# >>     {'name': 'Abnormality of the eye', 'id': 'HP:0000478', 'int': 478},
# >>     {'name': 'Abnormal retinal morphology', 'id': 'HP:0000479', 'int': 479},
# >>     {'name': 'Abnormal eyelid morphology', 'id': 'HP:0000492', 'int': 492},
# >>     {'name': 'Abnormality of eye movement', 'id': 'HP:0000496', 'int': 496}
# >> ]
Not yet implemented
The following instance methods are not yet implemented for pyhpo.HPOSet
- variance(self, /)
Calculates the distances between all its term pairs. It also provides basic statistics on the variance among the pairs.
- Returns
Tuple with the variance metrics
float Average distance between pairs
int Smallest distance between pairs
int Largest distance between pairs
list of int List of all distances between pairs
- Return type
tuple of (float, int, int, list of int)
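Since this method is not yet implemented here, the documented return tuple can only be approximated. The sketch below is one way to compute it for illustration: `pair_distance_stats` and the `distance` callable are hypothetical, each pair is counted once, and the all-zero result for fewer than two terms is an assumption.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def pair_distance_stats(
    terms: Sequence[T], distance: Callable[[T, T], int]
) -> Tuple[float, int, int, List[int]]:
    """Return (average, smallest, largest, all) distances over term pairs."""
    # Assumption: every unordered pair contributes exactly one distance
    distances = [distance(a, b) for a, b in combinations(terms, 2)]
    if not distances:
        # Assumption: fewer than two terms yields zeros and an empty list
        return (0.0, 0, 0, [])
    average = sum(distances) / len(distances)
    return (average, min(distances), max(distances), distances)


# Stand-in distance on integers; a real distance would walk the ontology graph
stats = pair_distance_stats([1, 4, 6], lambda a, b: abs(a - b))
print(stats[1], stats[2], stats[3])
# prints 2 5 [3, 5, 2]
```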
- combinations(self, /)
Helper generator function that yields all possible pairwise combinations of its terms
This function is direction dependent: every pair appears twice, once for each direction
See also
- Yields
Tuple of pyhpo.HPOTerm – Tuple containing the following items
HPOTerm 1 of the pair
HPOTerm 2 of the pair
- combinations_one_way(self, /)
Helper generator function that yields all possible pairwise combinations of its terms
This method reports each pair only once
See also
- Yields
Tuple of term.HPOTerm – Tuple containing the following items
HPOTerm instance 1 of the pair
HPOTerm instance 2 of the pair
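The difference between the two generators maps onto two functions in Python's standard `itertools` module. The sketch below uses placeholder strings instead of HPOTerm instances and is not the library's implementation.

```python
from itertools import combinations, permutations

terms = ["A", "B", "C"]  # placeholders standing in for HPOTerm instances

# Direction dependent: each pair appears once per direction
both_directions = list(permutations(terms, 2))

# One way: each unordered pair appears exactly once
one_way = list(combinations(terms, 2))

print(len(both_directions), len(one_way))
# prints 6 3
print(one_way)
# prints [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

For a set of n terms, the direction-dependent variant yields n * (n - 1) pairs and the one-way variant yields half as many.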
BasicHPOSet
A BasicHPOSet is like a normal pyhpo.HPOSet, but:
only child terms are retained, non-specific parent terms are removed
obsolete terms are replaced or removed
all modifier terms are removed
HPOPhenoSet
An HPOPhenoSet is like a normal pyhpo.HPOSet, but:
obsolete terms are replaced or removed
all modifier terms are removed
Term | HPOSet | BasicHPOSet | HPOPhenoSet
---|---|---|---
obsolete | ✅ | ❌ | ❌
modifier | ✅ | ❌ | ❌
parents | ✅ | ❌ | ✅