CLOUD COMPUTING AND BIG DATA ANALYTICS APPROACH TO GAIN INSIGHT FROM FISHERIES GENOMIC AND PROTEOMICS DATA


Sustainable Management of Aquatic Resources, Pages 427-449
Edited by: B.K. Mahapatra, A.K. Roy and N.C. Pramanik
Copyright © 2018, Narendra Publishing House, Delhi, India

Ajit Kumar Roy

In the post-genomic era, a new vocabulary has emerged for the new biology: genomics, functional genomics, proteomics, cDNA, microarrays, and global gene expression patterns. New computational tools are used for sequencing, analyzing experimental data, searching, pattern matching, data mining, gene discovery, and function discovery, with the aims of classification, pattern identification, prediction, model building, assessment and comparison, optimization, and better use of existing knowledge. High-throughput techniques such as high-throughput protein crystallization, massively parallel sequencing, mass spectrometry, microarrays, high-throughput cell imaging, and high-throughput in vivo screening are applied to generate massive data on nucleotide sequences, protein sequences, protein sequence patterns or motifs, macromolecular 3D structures, gene expression, metabolic pathways, and proteomics. Major breakthroughs in bioinformatics have been achieved through innovations in mathematics or statistics, such as FASTA, BLAST, Phred/Phrap, BLOSUM, GenScan, PSI-BLAST, threading, and GRAIL. Science is changing because of the impact of information technology. Experimental, theoretical, and computational science are all being affected by the data deluge, and a fourth, “data-intensive” science paradigm is emerging. The revolution is mostly about treating biology as an information science, not only as a set of specific biochemical technologies. As we enter a period in which science is driven by a data explosion, cloud computing has become a fundamental enabling technology for advancing human knowledge. The new wave of high-throughput technologies in genomics and proteomics is constantly improving and generating an unprecedented amount of data that can be termed Big Data: large data sets characterized by their volume, variety, velocity, variability, veracity, and complexity.
Bioinformatics researchers currently face the huge challenge of handling, processing, and moving these large-scale biological data, a problem that will only grow in the coming years. Cloud computing therefore holds great promise for effectively addressing the large-scale data generated by high-throughput technologies in genomics, proteomics, and other areas of biological research. Cloud-based bioinformatics resources have changed the way huge datasets are approached, providing much faster data acquisition, analysis, and storage. New cloud-based bioinformatics computing tools, algorithms, and workflows are continually being developed and successfully deployed. Basically, the cloud refers to software and services that run on the internet instead of on one's own computer.
It is well recognized that fisheries catches have reached a plateau in recent years. Because of the high demand for fish and shellfish on the global market, aquaculture production contributes an increasing share of the food supply. In the post-genomic era, the use of DNA microarray technology in fish biology and aquaculture may have great significance and may be applied to the discovery of novel genes, gene expression profiling of fish species of interest, and identification of genomic responses to environmental stimulation in aquaculture. To achieve dramatic breakthroughs, new computational approaches will be required. We need to embrace the next, fourth paradigm of science, which will help bring about a profound transformation of scientific research and insight. For scientists, this will mean deeper scientific insight, richer discovery, and faster breakthroughs. This paper reviews the current development of cloud-based computational technologies that can be applied, and pinpoints their potential beneficial applications as well as their implications for fisheries management and aquaculture development. It also highlights big data analytics platforms that offer implementations of the MapReduce computational pattern, e.g., Hadoop and Dryad, which make it easy for developers to perform data-intensive computations at scale.
Keywords: Big Data, Cloud Computing, Bioinformatics, Genomics, Proteomics, Fisheries, Insight
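
As a small illustration of the MapReduce pattern mentioned in the abstract, the sketch below counts k-mers (short subsequences) across a handful of sequence reads on a single machine. It is not taken from the chapter: the function names, the choice of k-mer counting as the task, and the example reads are all assumptions made for illustration, and a platform such as Hadoop would distribute the same map, shuffle, and reduce phases across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key (the k-mer)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the grouped values for each k-mer."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k=3):
    """Run map, shuffle, and reduce in sequence over all reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    # Hypothetical reads, invented purely for this example.
    reads = ["ACGTAC", "GTACGT"]
    print(count_kmers(reads, k=3))
```

Because each read is mapped independently and each k-mer is reduced independently, both phases can in principle run in parallel, which is what makes the pattern attractive for large sequence data sets.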