Abstract

PyHMMER provides Python bindings, written in Cython, to HMMER, the popular profile hidden Markov model (HMM) software. It allows protein sequences to be annotated with profile HMMs, and new profile HMMs to be built, directly from Python. PyHMMER makes HMMER more flexible to use: queries can be created directly in Python code, searches can be launched and their results obtained without intermediate file I/O, and previously unavailable statistics such as uncorrected P-values become accessible. A new parallelization model greatly improves the performance of multithreaded searches while producing exactly the same results as HMMER.
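
To make the distinction between an uncorrected P-value and the E-value that HMMER normally reports concrete, the sketch below uses the standard relationship E = P × Z, where Z is the number of target sequences searched. It is a minimal illustration in plain Python and does not call PyHMMER; the function name `evalue_from_pvalue` and the example numbers are hypothetical and chosen only for illustration.

```python
def evalue_from_pvalue(pvalue: float, n_targets: int) -> float:
    """Scale an uncorrected P-value by the number of targets searched.

    Assumes the usual HMMER convention E = P * Z, where Z is the number
    of target sequences. The function name is hypothetical and is not
    part of the PyHMMER API.
    """
    if not 0.0 <= pvalue <= 1.0:
        raise ValueError("pvalue must lie in [0, 1]")
    if n_targets < 1:
        raise ValueError("n_targets must be a positive integer")
    return pvalue * n_targets


# The same per-sequence P-value gives different E-values depending on
# how many target sequences were searched (made-up numbers).
small_db = evalue_from_pvalue(1e-6, 10_000)
large_db = evalue_from_pvalue(1e-6, 10_000_000)
print(small_db, large_db)
```

Because the uncorrected P-value does not depend on the size of the target database, it can be compared across searches run against databases of different sizes, whereas the E-value cannot.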

PyHMMER has been used in the following publications:

  1. Accurate de novo identification of biosynthetic gene clusters with GECCO (preprint).
  2. Identification of microbial metabolic functional guilds from large genomic datasets.
  3. Automated model building and protein identification in cryo-EM maps (preprint).
  4. Homologous Pairs of Low and High Temperature Originating Proteins Spanning the Known Prokaryotic Universe.
