New Tools developed by i2b2 faculty members:
- Genephony is an online tool for the manipulation of large datasets of genomic information. It can be used as a browser for genomic data, as a high-throughput annotation tool, and as a knowledge discovery tool. It is designed to be easy to use, flexible and extensible. Its knowledge management engine provides fine-grained control over individual data elements, as well as efficient operations on large datasets. Access at http://genome.ufl.edu/gp/
- Gene Network Enrichment Analysis (GNEA) is an approach that identifies transcriptionally deregulated biological processes in diseases. A network signature, corresponding to putative deregulated networks of interacting proteins, is identified from mRNA gene expression microarray data and protein-protein interactions reported in the literature. Biological processes are tested for over-representation in the signature, and those identified as enriched are considered transcriptionally deregulated. The approach has been applied to type 2 diabetes and insulin resistance datasets and is described in greater detail in: Liu M, Liberzon A, Kong SW, Lai WR, Park PJ, et al. "Network-Based Analysis of Type 2 Diabetes" PLoS Genet. Jun 15;3(6):e96. It can be accessed at http://genomics10.bu.edu/manwayl/.
- Polymorphism Phenotyping (PolyPhen) is an automatic tool for predicting the possible impact of an amino acid substitution on the structure and function of a human protein. The prediction is based on straightforward empirical rules applied to the sequence, phylogenetic and structural information characterizing the substitution. Access at http://genetics.bwh.harvard.edu/pph/.
- SNP2RFLP (Mouse SNPs Between Strains that Create RFLPs). Single nucleotide polymorphism (SNP) markers give high resolution in genetic mapping in mouse because they are abundant and easily typed. Initial localization via a genome-wide SNP panel often defines a large chromosomal interval but leaves too few informative markers with which to proceed to fine-mapping. To further refine the interval containing a mutation that is causative for a phenotype of interest, SNP2RFLP extracts region-specific SNPs from the NCBI mouse SNP database that are informative between the mouse strains used in the cross. SNP2RFLP then identifies those SNPs that create restriction fragment length polymorphisms (RFLPs), which can be easily assayed at the benchtop via restriction enzyme digestion of SNP-containing PCR products. Access at http://genetics.bwh.harvard.edu/snp2rflp/.
- SCONE - a sequence conservation evaluation algorithm that has already been applied in the framework of the Chromatin and Replication data analysis group of the ENCODE project (under construction). All SCONE scores for ENCODE regions may be obtained via the UCSC Genome Browser at http://genome.ucsc.edu/ENCODE/.
- Galaxy is a web-accessible workbench designed to support reproducible, translational genomic research. It allows biologists to perform complicated bioinformatic tasks using a rapidly growing collection of Galaxy tools. These tools are available through a simple but efficient and powerful interface that requires little or no specialized training. While each tool usually performs a relatively simple and specific task, the output from one tool can be "chained" into other tools, allowing very complicated workflows to be constructed quickly. Galaxy is tightly and transparently integrated with the major genomic annotation resources, including the UCSC Genome Browser and BioMart. i2b2's Ross Lazarus is a member of the core Galaxy development team (http://g2.trac.bx.psu.edu/wiki/GalaxyTeam) and is responsible for developing a suite of statistical genetics tools with support from i2b2 and his own Rgenetics project. These will be demonstrated at two sessions during this year's ASHG meeting in San Diego.
- The Rgenetics project is an international, collaborative, open-source statistical genetics software development project supported by a BISTI R01 grant on which the PI is Dr. Ross Lazarus (PI on the i2b2 Airways Disease DBP). The goal of the project is to create an easy-to-use framework for statistical genetics software and analyses, similar to the well-known Bioconductor project for gene expression analysis software. Co-investigators include Dr. Vincent Carey and Dr. Robert Gentleman, both founders of Bioconductor. To improve access to the software applications created in the Rgenetics and Bioconductor projects, and to integrate best-of-breed external statistical genetics software, external data and sources of annotation, Dr. Lazarus has also been working to lower the practical barriers for translational researchers seeking transparent access to these resources by developing new methods for integrating statistical software applications into easy-to-use frameworks for biologists. Check here for releases.
- Knots Detector - a web server that detects knots in protein structures (http://knots.mit.edu); see Kolesov G, Virnau P, Kardar M and Mirny LA. Protein knot server: detection of knots in protein structures. Nucleic Acids Research, 35(10), 2007.
- Predictor for specificity-determining residues (SDRs) - two web servers that predict SDRs in a user-supplied alignment of protein sequences, each using a different technique (http://tamm.mit.edu/SDR/). See Levine J., Kueh H-Y and Mirny LA. Intrinsic Fluctuations, Robustness and Tunability in Signaling Cycles. Biophysical Journal, 2007.
- BETASCAN - calculates a priori probabilities of beta-strand formation and pairwise beta-beta sheet interactions (http://groups.csail.mit.edu/cb/betascan) (manuscript in preparation).
- PARTIFOLD - a computational tool for protein structure exploration based on efficient Boltzmann partition function estimation; applicable to protein families with semi-regular geometric fold constraints; uses the primary amino acid residue sequence (http://partifold.csail.mit.edu/TMB/index.html) (manuscript in press).
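
The SNP2RFLP entry above describes picking out SNPs that create restriction fragment length polymorphisms. The core check can be sketched roughly as follows. This is a minimal illustration and not the SNP2RFLP implementation: the function names, the flank-plus-allele input layout, and the four-enzyme table are assumptions made for the example (the recognition sequences themselves are the standard ones for these enzymes). A SNP is treated as RFLP-forming when a recognition site overlapping the SNP position is present for one allele but not the other.

```python
# Illustrative sketch only; not the SNP2RFLP implementation.
# Small, hypothetical enzyme table with standard recognition sequences.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
}


def sites_overlapping_snp(sequence, snp_index, site):
    """Return start positions of `site` in `sequence` that cover snp_index."""
    hits = []
    # Only windows that include the SNP position can be affected by it.
    first = max(0, snp_index - len(site) + 1)
    last = min(snp_index, len(sequence) - len(site))
    for start in range(first, last + 1):
        if sequence[start:start + len(site)] == site:
            hits.append(start)
    return hits


def rflp_enzymes(left_flank, allele_a, allele_b, right_flank, enzymes=ENZYMES):
    """List (enzyme, cut_allele) pairs where exactly one allele is cut at the SNP."""
    snp_index = len(left_flank)
    seq_a = (left_flank + allele_a + right_flank).upper()
    seq_b = (left_flank + allele_b + right_flank).upper()
    informative = []
    for name, site in enzymes.items():
        cuts_a = bool(sites_overlapping_snp(seq_a, snp_index, site))
        cuts_b = bool(sites_overlapping_snp(seq_b, snp_index, site))
        if cuts_a != cuts_b:  # site present for one allele only
            informative.append((name, allele_a if cuts_a else allele_b))
    return informative
```

For example, `rflp_enzymes("ACGTGAAT", "T", "C", "CAGGT")` returns `[("EcoRI", "T")]`, because only the T allele completes the GAATTC site. A real pipeline would also need to handle reverse-complement sites for non-palindromic enzymes and check for additional cut sites elsewhere in the PCR product.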
Existing Tools developed by i2b2 faculty members:
Tools recommended by i2b2 members:
- Multiple Experiment Viewer - a freely available suite of the most popular analytic methods for microarrays. (http://www.tigr.org/software/tm4/mev.html)
- Bioconductor - a large module written in the R language that has been a major success of the open-source bioinformatics community for microarray analysis. (http://www.bioconductor.org)
- VISTA - a suite of comparative genomics tools for studying sequence conservation, pre-loaded with many useful data sets. (http://genome.lbl.gov/vista)