21 Dec 2009

PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium

Protein Analysis THrough Evolutionary Relationships (PANTHER) is a comprehensive software system for inferring the functions of genes based on their evolutionary relationships. Phylogenetic trees of gene families form the basis for PANTHER and these trees are annotated with ontology terms describing the evolution of gene function from ancestral to modern-day genes. One of the main applications of PANTHER is in accurate prediction of the functions of uncharacterized genes, based on their evolutionary relationships to genes with functions known from experiment. The PANTHER website, freely available at http://www.pantherdb.org, also includes software tools for analyzing genomic data relative to known and inferred gene functions. Since 2007, there have been several new developments to PANTHER: (i) improved phylogenetic trees, explicitly representing speciation and gene duplication events, (ii) identification of gene orthologs, including least diverged orthologs (best one-to-one pairs), (iii) coverage of more genomes (48 genomes, up to 87% of genes in each genome; see http://www.pantherdb.org/panther/summaryStats.jsp), (iv) improved support for alternative database identifiers for genes, proteins and microarray probes and (v) adoption of the SBGN standard for display of biological pathways. In addition, PANTHER trees are being annotated with gene function as part of the Gene Ontology Reference Genome project, resulting in an increasing number of curated functional annotations.


(source URL, Via NAR - Advance Access.)

12 Dec 2009

HMMCONVERTER 1.0: a toolbox for hidden Markov models

Hidden Markov models (HMMs) and their variants are widely used in Bioinformatics applications that analyze and compare biological sequences. Designing a novel application requires the insight of a human expert to define the model's architecture. The implementation of prediction algorithms and algorithms to train the model's parameters, however, can be a time-consuming and error-prone task. We here present HMMConverter, a software package for setting up probabilistic HMMs, pair-HMMs as well as generalized HMMs and pair-HMMs. The user defines the model itself and the algorithms to be used via an XML file which is then directly translated into efficient C++ code. The software package provides linear-memory prediction algorithms, such as the Hirschberg algorithm, banding and the integration of prior probabilities and is the first to present computationally efficient linear-memory algorithms for automatic parameter training. Users of HMMConverter can thus set up complex applications with a minimum of effort and also perform parameter training and data analyses for large data sets.


(source URL, Via NAR.)
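
To make the idea of a linear-memory HMM algorithm concrete, here is a minimal Python sketch of the forward algorithm that keeps only one column of the dynamic-programming table at a time. It is not taken from HMMConverter (which generates C++ from an XML description) and it is not the Hirschberg algorithm mentioned in the abstract; the function name, the two-state model and all of its probabilities are made up for illustration.

```python
def forward_likelihood(obs, states, start_p, trans_p, emit_p):
    """Return P(obs) under an HMM, storing one DP column at a time.

    Memory use grows with the number of states, not the sequence length.
    """
    if not obs:
        return 1.0
    # prev[s] = P(obs[0..t], state at position t is s)
    prev = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        # Build the next column from the previous one, then discard the old column.
        prev = {
            s: emit_p[s][symbol] * sum(prev[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(prev.values())


# Hypothetical toy model: an AT-rich and a GC-rich region of DNA.
STATES = ("AT_rich", "GC_rich")
START = {"AT_rich": 0.5, "GC_rich": 0.5}
TRANS = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}
EMIT = {
    "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}

if __name__ == "__main__":
    print(forward_likelihood("ACGT", STATES, START, TRANS, EMIT))
```

For long sequences the raw probabilities underflow, so a practical version would work in log space or rescale each column.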

Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology

Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry.


(source URL, Via Briefings in Bioinformatics - Advance Access.)
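
The reachability analysis mentioned in the abstract can be pictured as a forward expansion over a reaction network: starting from a set of seed metabolites, keep firing every reaction whose substrates are all available until nothing new is produced. The Python sketch below illustrates that general idea only; it is not Pathway Tools code, and the function name and the tiny reaction list are hypothetical.

```python
def reachable_metabolites(reactions, seeds):
    """Return every metabolite producible from `seeds`.

    `reactions` is a list of (substrates, products) pairs, each a set of names.
    A reaction fires only when all of its substrates have been reached.
    """
    reached = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= reached and not products <= reached:
                reached |= products
                changed = True
    return reached


# Hypothetical toy network (names chosen for illustration only).
TOY_REACTIONS = [
    ({"glucose", "ATP"}, {"G6P", "ADP"}),
    ({"G6P"}, {"F6P"}),
    ({"X"}, {"Y"}),  # never fires: X is not a seed and nothing produces it
]

if __name__ == "__main__":
    print(sorted(reachable_metabolites(TOY_REACTIONS, {"glucose", "ATP"})))
```

The loop stops once a full pass over the reactions adds nothing new, so it always terminates on a finite network.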

The Gene Interaction Miner (GIM): A new tool for data mining contextual information for protein-protein interaction analysis

Motivation: This work was motivated by the need for an automated tool for discovery of genetic networks and the availability of extensive contextual protein-protein interaction information in the iHOP repository. At the moment, this information cannot be explored to its full potential due to the lack of software tools to reliably collect, process and display that information in a way that lets life scientists quickly analyze genes of interest and search for potential interaction networks. Commercial tools can perform a similar job, but results appear to be less informative than those obtained using contextual information.

Results: The Gene Interaction Miner (GIM) could successfully uncover complex network structures of protein-protein interactions for a test dataset composed of genes already related to Alzheimer's disease. That same set, when examined using two other analysis tools, namely STRING and Pathway Studio, resulted in incomplete protein-protein interaction networks, which indicates that the use of curated databases only gives a partial picture of the biological processes behind the disease.

Availability: The dataset used in this work and a running version of the software tool are available for download from the website http://www.cs.newcastle.edu.au/~mendes/softwareGIM.html.
(source URL, Via Bioinformatics - Advance Access.)

ConceptGen: a gene set enrichment and gene set relation mapping tool

Motivation: The elucidation of biological concepts enriched with differentially expressed genes has become an integral part of the analysis and interpretation of genomic data. Of additional importance is the ability to explore networks of relationships among previously defined biological concepts from diverse information sources, and to explore results visually from multiple perspectives. Accomplishing these tasks requires a unified framework for agglomeration of data from various genomic resources, novel visualizations, and user functionality.

Results: We have developed ConceptGen, a web-based gene set enrichment and gene set relation mapping tool that is streamlined and simple to use. ConceptGen offers over 20,000 concepts comprising fourteen different types of biological knowledge, including data not currently available in any other gene set enrichment or gene set relation mapping tool. We demonstrate the functionalities of ConceptGen using gene expression data modeling TGF-beta-induced epithelial-mesenchymal transition and metabolomics data comparing metastatic versus localized prostate cancers.

Availability: ConceptGen is part of the NIH's National Center for Integrative Biomedical Informatics (NCIBI) and is freely available at http://conceptgen.ncibi.org.

For Terms of Use, visit http://portal.ncibi.org/gateway/pdf/Terms of use-web.pdf
(source URL, Via Bioinformatics - Advance Access.)
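
As a rough illustration of what "gene set enrichment" means in practice, the Python sketch below scores the overlap between a list of differentially expressed genes and one concept with a one-sided hypergeometric test. The abstract does not say which statistic ConceptGen uses, so this is a generic textbook approach rather than the tool's method; the function name and gene identifiers are hypothetical, and both sets are assumed to be drawn from the same gene universe.

```python
from math import comb


def enrichment_pvalue(gene_list, concept_genes, universe_size):
    """One-sided hypergeometric p-value for the overlap of two gene sets.

    Returns P(overlap >= observed) when len(gene_list) genes are drawn at
    random from a universe containing len(concept_genes) concept members.
    Assumes both sets are subsets of that universe.
    """
    observed = len(gene_list & concept_genes)
    n = len(gene_list)       # genes drawn (e.g. differentially expressed)
    K = len(concept_genes)   # concept members in the universe
    N = universe_size
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(observed, min(n, K) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    de_genes = {"g1", "g2", "g3"}
    concept = {"g1", "g2", "g3"}
    print(enrichment_pvalue(de_genes, concept, universe_size=10))
```

When many concepts are tested at once, as in a tool offering thousands of them, the resulting p-values would also need a multiple-testing correction.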

5 Dec 2009

Bionet: integrating data on proteins and genetic diseases

The Bionet system was created by Samuel Brando Oldra as a final-year project (Trabalho de Conclusão de Curso 2) for the Bacharelado em Sistemas de Informação course at the Universidade de Caxias do Sul, in order to integrate data on proteins and genetic diseases. Starting from a search for a genetic disease, it lets you retrieve the network of interacting proteins, save the files needed to work with that network in Cytoscape and document the entire search process. To do this, Bionet integrates data from the OMIM and STRING sites, which simplifies the specialist's search process.

Still a work in progress, but a promising tool; one to follow up on.
(BioNet Home Page)

29 Nov 2009

Eukaryotic Protein Domains as Functional Units of Cellular Evolution

...Modular protein domains are functional units that can be modified through the acquisition of new intrinsic activities or by the formation of novel domain combinations, thereby contributing to the evolution of proteins with new biological properties. Here, we assign proteins to groups with related domain compositions and functional properties, termed "domain clubs," which we use to compare multiple eukaryotic proteomes....

(source URL, Via Science Signaling.)