[1]
|
Swindells, M., Rae, M., Pearce, M., Moodie, S., Miller, R. and Leach, P. (2002) Application of high throughput computing in bioinformatics. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences, 360, 1179-1189. doi:10.1098/rsta.2002.0987
|
[2]
|
Kann, M.G. (2010) Advances in translational bioinformatics: Computational approaches for the hunting of disease genes. Briefings in Bioinformatics, 11, 96-110.
doi:10.1093/bib/bbp048
|
[3]
|
Ley, T.J., Mardis, E.R., Ding, L., Fulton, B., McLellan, M.D., et al. (2008) DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature, 456, 66-72. doi:10.1038/nature07485
|
[4]
|
Isakov, O., Modai, S. and Shomron, N. (2011) Pathogen detection using short-RNA deep sequencing subtraction and assembly. Bioinformatics, 27, 2027-2030.
doi:10.1093/bioinformatics/btr349
|
[5]
|
Mardis, E.R. (2008) The impact of next-generation sequencing technology on genetics. Trends in Genetics, 24, 133-141. doi:10.1016/j.tig.2007.12.007
|
[6]
|
Koboldt, D.C., Ding, L., Mardis, E.R. and Wilson, R.K. (2010) Challenges of sequencing human genomes. Briefings in Bioinformatics, 11, 484-498. doi:10.1093/bib/bbq016
|
[7]
|
Clarke, S.C. (2005) Pyrosequencing: Nucleotide sequencing technology with bacterial genotyping applications. Expert Review of Molecular Diagnostics, 5, 947-953.
doi:10.1586/14737159.5.6.947
|
[8]
|
Claesson, M.J., O’Sullivan, O., Wang, Q., Nikkilä, J., Marchesi, J.R., Smidt, H., de Vos, W.M., Ross, R.P. and O’Toole, P.W. (2009) Comparative analysis of pyrosequencing and a phylogenetic microarray for exploring microbial community structures in the human distal intestine. PLoS One, 4, e6669.
doi:10.1371/journal.pone.0006669
|
[9]
|
Hamady, M., Lozupone, C. and Knight, R. (2010) Fast UniFrac: Facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. The ISME Journal, 4, 17-27.
doi:10.1038/ismej.2009.97
|
[10]
|
Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y-J. and Chen, Z. (2005a) Genome sequencing in microfabricated high-density picolitre reactors. Nature, 437, 376-380. doi:10.1038/nature03959
|
[11]
|
Bentley, D.R., Balasubramanian, S., Swerdlow, H.P., Smith, G.P., Milton, J., et al. (2008) Accurate whole human genome sequencing using reversible terminator chemistry. Nature, 456, 53-59. doi:10.1038/nature07517
|
[12]
|
McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., et al. (2010) The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20, 1297-1303. doi:10.1101/gr.107524.110
|
[13]
|
McKernan, K.J., Peckham, H.E., Costa, G.L., McLaughlin, S.F., Fu, Y., et al. (2009) Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Research, 19, 1527-1541.
doi:10.1101/gr.091868.109
|
[14]
|
Eid, J., Fehr, A., Gray, J., Luong, K., Lyle, J., et al. (2009) Real-time DNA sequencing from single polymerase molecules. Science, 323, 133-138.
doi:10.1126/science.1162986
|
[15]
|
Chan, E.Y. (2009) Next-generation sequencing methods: Impact of sequencing accuracy on SNP discovery. Methods in Molecular Biology, 578, 95-111.
doi:10.1007/978-1-60327-411-1_5
|
[16]
|
Dalloul, R.A., Long, J.A., Zimin, A.V., Aslam, L., Beal, K., et al. (2010) Multi-platform next generation sequencing of the domestic turkey (Meleagris gallopavo): Genome assembly and analysis. PLoS Biology, 8, e1000475.
doi:10.1371/journal.pbio.1000475
|
[17]
|
Nothnagel, M., Herrmann, A., Wolf, A., Schreiber, S., Platzer, M., Siebert, R., Krawczak, M. and Hampe, J. (2011) Technology-specific error signatures in the 1000 Genomes Project data. Human Genetics, 130, 505-516.
doi:10.1007/s00439-011-0971-3
|
[18]
|
Ewing, B., Hillier, L., Wendl, M.C. and Green, P. (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research, 8, 175-185.
doi:10.1101/gr.8.3.175
|
[19]
|
Castellana, S., Romani, M., Valente, E.M. and Mazza, T. (2012) A solid quality-control analysis of AB SOLiD short-read sequencing data. Briefings in Bioinformatics, 13, 1-12. doi:10.1093/bib/bbs048
|
[20]
|
Parkinson, N.J., Maslau, S., Ferneyhough, B., Zhang, G., Gregory, L., Buck, D., Ragoussis, J., Ponting, C.P. and Fischer, M.D. (2012) Preparation of high-quality next-generation sequencing libraries from picogram quantities of target DNA. Genome Research, 22, 125-133.
doi:10.1101/gr.124016.111
|
[21]
|
Allen, J.E., Pertea, M. and Salzberg, S.L. (2004) Computational gene prediction using multiple sources of evidence. Genome Research, 14, 142-148.
doi:10.1101/gr.1562804
|
[22]
|
Sleator, R.D. (2010) An overview of the current status of eukaryote gene prediction strategies. Gene, 461, 1-4.
doi:10.1016/j.gene.2010.04.008
|
[23]
|
Tompa, M. (1999) An exact method for finding short motifs in sequences, with application to the ribosome binding site problem. International Conference on Intelligent Systems for Molecular Biology, 1999, 262-271.
|
[24]
|
Tompa, M., Li, N., Bailey, T.L., Church, G.M., De Moor, B., et al. (2005) Assessing computational tools for the discovery of transcription factor binding sites. Nature Biotechnology, 23, 137-144. doi:10.1038/nbt1053
|
[25]
|
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. Journal of Molecular Biology, 215, 403-410.
doi:10.1016/S0022-2836(05)80360-2
|
[26]
|
Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., et al. (2009) BLAST+: Architecture and applications. BMC Bioinformatics, 10, 421.
doi:10.1186/1471-2105-10-421
|
[27]
|
Flicek, P. and Birney, E. (2009) Sense from sequence reads: Methods for alignment and assembly. Nature Methods, 6, S6-S12. doi:10.1038/nmeth.1376
|
[28]
|
Lassmann, T., Hayashizaki, Y. and Daub, C.O. (2011) SAMStat: Monitoring biases in next generation sequencing data. Bioinformatics, 27, 130-131.
doi:10.1093/bioinformatics/btq614
|
[29]
|
Krawitz, P., Rödelsperger, C., Jäger, M., Jostins, L., Bauer, S. and Robinson, P.N. (2010) Microindel detection in short-read sequence data. Bioinformatics, 26, 722-729. doi:10.1093/bioinformatics/btq027
|
[30]
|
Li, H. and Durbin, R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. doi:10.1093/bioinformatics/btp324
|
[31]
|
Pasaniuc, B., Zaitlen, N. and Halperin, E. (2011) Accurate estimation of expression levels of homologous genes in RNA-seq experiments. Journal of Computational Biology, 18, 459-468. doi:10.1089/cmb.2010.0259
|
[32]
|
Durbin, R.M., Abecasis, G.R., Altshuler, D.L., Auton, A., Brooks, L.D., et al. (2010) A map of human genome variation from population-scale sequencing. Nature, 467, 1061-1073. doi:10.1038/nature09534
|
[33]
|
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Research, 25, 3389-3402. doi:10.1093/nar/25.17.3389
|
[34]
|
Hulo, N., Bairoch, A., Bulliard, V., Cerutti, L., Cuche, B.A., de Castro, E., Lachaize, C., Langendijk-Genevaux, P.S. and Sigrist, C.J. (2008) The 20 years of PROSITE. Nucleic Acids Research, 36, D245-D249.
doi:10.1093/nar/gkm977
|
[35]
|
Finn, R.D., Mistry, J., Tate, J., Coggill, P., Heger, A., Pollington, J.E., Gavin, O.L., Gunasekaran, P., Ceric, G., Forslund, K., Holm, L., Sonnhammer, E.L., Eddy, S.R. and Bateman, A. (2010) The Pfam protein families database. Nucleic Acids Research, 38, D211-D222.
doi:10.1093/nar/gkp985
|
[36]
|
Pirovano, W. and Heringa, J. (2010) Protein secondary structure prediction. Methods in Molecular Biology, 609, 327-348. doi:10.1007/978-1-60327-241-4_19
|
[37]
|
Raghava, G.P., Searle, S.M., Audley, P.C., Barber, J.D. and Barton, G.J. (2003) OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy. BMC Bioinformatics, 4, 47.
doi:10.1186/1471-2105-4-47
|
[38]
|
Stebbings, L.A. and Mizuguchi, K. (2004) HOMSTRAD: Recent developments of the homologous protein structure alignment database. Nucleic Acids Research, 32, D203-D207. doi:10.1093/nar/gkh027
|
[39]
|
Edgar, R.C. (2004b) MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792-1797. doi:10.1093/nar/gkh340
|
[40]
|
Thompson, J.D., Koehl, P., Ripp, R. and Poch, O. (2005) BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark. Proteins, 61, 127-136.
doi:10.1002/prot.20527
|
[41]
|
Van Walle, I., Lasters, I. and Wyns, L. (2005) SABmark: A benchmark for sequence alignment that covers the entire known fold space. Bioinformatics, 21, 1267-1268.
doi:10.1093/bioinformatics/bth493
|
[42]
|
Subramanian, A.R., Weyer-Menkhoff, J., Kaufmann, M. and Morgenstern, B. (2005) DIALIGN-T: An improved algorithm for segment-based multiple sequence alignment. BMC Bioinformatics, 6, 66. doi:10.1186/1471-2105-6-66
|
[43]
|
Stinchcombe, J.R. and Hoekstra, H.E. (2008) Combining population genomics and quantitative genetics: Finding the genes underlying ecologically important traits. Heredity, 100, 158-170. doi:10.1038/sj.hdy.6800937
|
[44]
|
Fridman, E. and Pichersky, E. (2005) Metabolomics, genomics, proteomics, and the identification of enzymes and their substrates and products. Current Opinion in Plant Biology, 8, 242-248. doi:10.1016/j.pbi.2005.03.004
|
[45]
|
Middleton, F.A., Rosenow, C., Vailaya, A., Kuchinsky, A., Pato, M.T. and Pato, C.N. (2007) Integrating genetic, functional genomic, and bioinformatics data in a systems biology approach to complex diseases: Application to schizophrenia. Methods in Molecular Biology, 401, 337-364. doi:10.1007/978-1-59745-520-6_18
|
[46]
|
Lähdesmäki, H., Hautaniemi, S., Shmulevich, I. and Yli-Harja, O. (2006) Relationships between probabilistic Boolean networks and dynamic Bayesian networks as models of gene regulatory networks. Signal Processing, 86, 814-834. doi:10.1016/j.sigpro.2005.06.008
|
[47]
|
Goble, C. and Stevens, R. (2008) State of the nation in data integration for bioinformatics. Journal of Biomedical Informatics, 41, 687-693. doi:10.1016/j.jbi.2008.01.008
|
[48]
|
Zhang, Z., Cheung, K.H. and Townsend, J.P. (2009) Bringing Web 2.0 to bioinformatics. Briefings in Bioinformatics, 10, 1-10. doi:10.1093/bib/bbn041
|
[49]
|
Shah, S.P., Huang, Y., Xu, T., Yuen, M.M.S., Ling, J. and Ouellette, B.F.F. (2005) Atlas—A data warehouse for integrative bioinformatics. BMC Bioinformatics, 6, 34.
doi:10.1186/1471-2105-6-34
|
[50]
|
Lee, T.J., Pouliot, Y., Wagner, V., Gupta, P., Stringer-Calvert, D.W.J., Tenenbaum, J.D. and Karp, P.D. (2006) BioWarehouse: A bioinformatics database warehouse toolkit. BMC Bioinformatics, 7, 170.
doi:10.1186/1471-2105-7-170
|
[51]
|
Birkland, A. and Yona, G. (2006) BIOZON: A hub of heterogeneous biological data. Nucleic Acids Research, 34, D235-D242. doi:10.1093/nar/gkj153
|
[52]
|
Trissl, S., Rother, K., Müller, H., Steinke, T., Koch, I., Preissner, R., Frömmel, C. and Leser, U. (2005) Columba: An integrated database of proteins, structures, and annotations. BMC Bioinformatics, 6, 81.
doi:10.1186/1471-2105-6-81
|
[53]
|
Hariharaputran, S., Töpel, T., Brockschmidt, B. and Hofestädt, R. (2007) VINEdb: A data warehouse for integration and interactive exploration of life science data. Journal of Integrative Bioinformatics, 4, 63.
|
[54]
|
Haider, S., Ballester, B., Smedley, D., Zhang, J., Rice, P. and Kasprzyk, A. (2009) BioMart central portal-unified access to biological data. Nucleic Acids Research, 37, W23-W27. doi:10.1093/nar/gkp265
|
[55]
|
Haas, L.M., Schwarz, P.M., Kodali, P., Kotlar, E., Rice, J.E. and Swope, W.C. (2001) DiscoveryLink: A system for integrated access to life sciences data sources. IBM Systems Journal, 40, 489-511. doi:10.1147/sj.402.0489
|
[56]
|
Chung, S.Y. and Wong, L. (1999) Kleisli: A new tool for data integration in biology. Trends in Biotechnology, 17, 351-355. doi:10.1016/S0167-7799(99)01342-6
|
[57]
|
Hekkelman, M.L. and Vriend, G. (2005) MRS: A fast and compact retrieval system for biological data. Nucleic Acids Research, 33, W766-W769. doi:10.1093/nar/gki422
|
[58]
|
Crasto, C.J. and Shepherd, G.M. (2007) Managing knowledge in neuroscience. Methods in Molecular Biology, 401, 3-21. doi:10.1007/978-1-59745-520-6_1
|
[59]
|
Bota, M. and Swanson, L.W. (2010) Collating and curating neuroanatomical nomenclatures: Principles and use of the brain architecture knowledge management system (BAMS). Frontiers in Neuroinformatics, 4, 3.
doi:10.3389/fninf.2010.00003
|
[60]
|
Cheung, K.H., White, K., Hager, J., Gerstein, M., Reinke, V., Nelson, K., et al. (2002) YMD: A microarray database for large-scale gene expression analysis. AMIA Annual Symposium Proceedings, 2002, 140-144.
|
[61]
|
Zdobnov, E.M., Lopez, R., Apweiler, R. and Etzold, T. (2002) The EBI SRS server-recent developments. Bioinformatics, 18, 368-373.
doi:10.1093/bioinformatics/18.2.368
|
[62]
|
Sigrist, C.J.A., Cerutti, L., De Castro, E., Langendijk-Genevaux, P.S., Bulliard, V., Bairoch, A. and Hulo, N. (2010) PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Research, 38, D161-D166. doi:10.1093/nar/gkp885
|
[63]
|
BioMoby Consortium, Wilkinson, M.D., Senger, M., Kawas, E., Bruskiewich, R., et al. (2008) Interoperability with Moby 1.0—It’s better than sharing your toothbrush. Briefings in Bioinformatics, 9, 220-231.
doi:10.1093/bib/bbn003
|
[64]
|
Jenkinson, A.M., Albrecht, M., Birney, E., Blankenburg, H., Down, T., et al. (2008) Integrating biological data—The Distributed Annotation System. BMC Bioinformatics, 9, S3. doi:10.1186/1471-2105-9-S8-S3
|
[65]
|
Messina, D.N. and Sonnhammer, E.L. (2009) DASher: A stand-alone protein sequence client for DAS, the Distributed Annotation System. Bioinformatics, 25, 1333-1334.
doi:10.1093/bioinformatics/btp153
|
[66]
|
Olason, P.I. (2005) Integrating protein annotation resources through the Distributed Annotation System. Nucleic Acids Research, 33, W468-W470.
doi:10.1093/nar/gki463
|
[67]
|
Oinn, T., Addis, M., Ferris, J., Marvin, D., Senger, M., et al. (2004) Taverna: A tool for the composition and enactment of bioinformatics workflows. Bioinformatics, 20, 3045-3054. doi:10.1093/bioinformatics/bth361
|
[68]
|
Hendler, J. (2003) Science and the semantic web. Science, 299, 520-521. doi:10.1126/science.1078874
|
[69]
|
Belleau, F., Nolin, M.A., Tourigny, N., Rigault, P. and Morissette, J. (2008) Bio2RDF: Towards a mashup to build bioinformatics knowledge systems. Journal of Biomedical Informatics, 41, 706-716.
doi:10.1016/j.jbi.2008.03.004
|
[70]
|
Cheung, K.H., Yip, K.Y., Smith, A., Deknikker, R., Masiar, A. and Gerstein, M. (2005) YeastHub: A semantic web use case for integrating data in the life sciences domain. Bioinformatics, 21, i85-i96.
doi:10.1093/bioinformatics/bti1026
|
[71]
|
Ruttenberg, A., Clark, T., Bug, W., Samwald, M., Bodenreider, O., et al. (2007) Advancing translational research with the semantic web. BMC Bioinformatics, 8, S2.
doi:10.1186/1471-2105-8-S3-S2
|
[72]
|
Schadt, E.E., Linderman, M.D., Sorenson, J., Lee, L. and Nolan, G.P. (2010) Computational solutions to large-scale data management and analysis. Nature Reviews Genetics, 11, 647-657. doi:10.1038/nrg2857
|
[73]
|
Wilkinson, M.D., McCarthy, L., Vandervalk, B., Withers, D., Kawas, E. and Samadian, S. (2010) SADI, SHARE, and the in silico scientific method. BMC Bioinformatics, 11, S7. doi:10.1186/1471-2105-11-S12-S7
|
[74]
|
Lee, T.L. (2008) Big data: Open-source format needed to aid wiki collaboration. Nature, 455, 461.
doi:10.1038/455461c
|
[75]
|
Potthast, M., Stein, B. and Gerling, R. (2008) Automatic vandalism detection in Wikipedia. Advances in Information Retrieval, 4956, 663-668.
doi:10.1007/978-3-540-78646-7_75
|
[76]
|
Kislyuk, A.O., Katz, L.S., Agrawal, S., Hagen, M.S., Conley, A.B., et al. (2010) A computational genomics pipeline for prokaryotic sequencing projects. Bioinformatics, 26, 1819-1826.
doi:10.1093/bioinformatics/btq284
|
[77]
|
Li, L., Shiga, M., Ching, W.K. and Mamitsuka, H. (2010) Annotating gene functions with integrative spectral clustering on microarray expressions and sequences. Genome Informatics, 22, 95-120.
doi:10.1142/9781848165786_0009
|
[78]
|
Lorenzi, H.A., Puiu, D., Miller, J.R., Brinkac, L.M., Amedeo, P., Hall, N. and Caler, E.V. (2010) New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information. PLoS Neglected Tropical Diseases, 4, e716. doi:10.1371/journal.pntd.0000716
|
[79]
|
Meyer, F., Goesmann, A., McHardy, A.C., Bartels, D., Bekel, T., et al. (2003) GenDB—An open source genome annotation system for prokaryote genomes. Nucleic Acids Research, 31, 2187-2195. doi:10.1093/nar/gkg312
|
[80]
|
Stothard, P. and Wishart, D.S. (2006) Automated bacterial genome analysis and annotation. Current Opinion in Microbiology, 9, 505-510.
doi:10.1016/j.mib.2006.08.002
|
[81]
|
Stein, L. (2001) Genome annotation: From sequence to biology. Nature Reviews Genetics, 2, 493-503.
doi:10.1038/35080529
|
[82]
|
Overbeek, R., Begley, T., Butler, R.M., Choudhuri, J.V., Chuang, H.Y., et al. (2005) The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Research, 33, 5691-5702.
doi:10.1093/nar/gki866
|
[83]
|
Gilks, W.R., Audit, B., De Angelis, D., Tsoka, S. and Ouzounis, C.A. (2002) Modeling the percolation of annotation errors in a database of protein sequences. Bioinformatics, 18, 1641-1649.
doi:10.1093/bioinformatics/18.12.1641
|
[84]
|
Prosdocimi, F. (2003) Bioinformática: Manual do usuario. Biotecnologia Ciência & Desenvolvimento, 2, 2.
|
[85]
|
Pareja, E., Pareja-Tobes, P., Manrique, M., Pareja-Tobes, E., Bonal, J. and Tobes, R. (2006) ExtraTrain: A database of extragenic regions and transcriptional information in prokaryotic organisms. BMC Microbiology, 6, 29.
doi:10.1186/1471-2180-6-29
|
[86]
|
Lerat, E. and Ochman, H. (2005) Recognizing the pseudogenes in bacterial genomes. Nucleic Acids Research, 33, 3125-3132. doi:10.1093/nar/gki631
|
[87]
|
Baxevanis, A.D. and Ouellette, F.F. (2001) A practical guide to the analysis of genes and proteins. Wiley: Bioinformatics, 2, 260-262.
|
[88]
|
Mazumder, R. and Vasudevan, S. (2008) Structure-guided comparative analysis of proteins: Principles, tools, and applications for predicting function. PLoS Computational Biology, 4, e1000151.
doi:10.1371/journal.pcbi.1000151
|
[89]
|
Karasavvas, K.A., Baldock, R. and Burger, A. (2004) Bioinformatics integration and agent technology. Journal of Biomedical Informatics, 37, 205-219.
doi:10.1016/j.jbi.2004.04.003
|
[90]
|
Li, A. (2006) Facing the challenges of data integration in biosciences. Engineering Letters, 13, 3.
|
[91]
|
Demir, E., Cary, M.P., Paley, S., Fukuda, K., Lemer, C., et al. (2010) The BioPAX community standard for pathway data sharing. Nature Biotechnology, 28, 935-942.
doi:10.1038/nbt.1666
|
[92]
|
Rubin, D.L., Shah, N.H. and Noy, N.F. (2008) Biomedical ontologies: A functional perspective. Briefings in Bioinformatics, 9, 75-90. doi:10.1093/bib/bbm059
|
[93]
|
Sarkar, I.N., Egan, M.G., Coruzzi, G., Lee, E.K. and DeSalle, R. (2008) Automated simultaneous analysis phylogenetics (ASAP): An enabling tool for phylogenomics. BMC Bioinformatics, 9, 103.
doi:10.1186/1471-2105-9-103
|
[94]
|
Clark, T. (2007) Knowledge integration in biomedicine: Technology and community. Briefings in Bioinformatics, 8, E1-E3. doi:10.1093/bib/bbm019
|