CLOUD COMPUTING AND BIG DATA ANALYTICS APPROACH TO GAIN INSIGHT FROM FISHERIES GENOMIC AND PROTEOMICS DATA
Sustainable Management of Aquatic Resources, Pages 427-449
Edited by: B.K. Mahapatra, A.K. Roy and N.C. Pramanik
Copyright © 2018, Narendra Publishing House, Delhi, India
In the post-genomic era, a new vocabulary has emerged for the new biology:
genomics, functional genomics, proteomics, cDNA, microarrays, and global gene
expression patterns. New computational tools are used for sequencing,
analysing experimental data, searching, pattern matching, data mining, gene
discovery, and function discovery, with the aims of classification, pattern
identification, modelling and prediction, assessment and comparison,
optimization, and better utilization of existing knowledge. High-throughput
techniques such as high-throughput protein crystallization, massively parallel
sequencing, mass spectrometry, microarrays, high-throughput cell imaging, and
high-throughput in vivo screening are applied to generate massive data on
nucleotide sequences, protein sequences, protein sequence patterns or motifs,
macromolecular 3D structures, gene expression, metabolic pathways, and
proteomics. Major breakthroughs in bioinformatics have been achieved through
innovations in mathematics and statistics, such as FASTA, BLAST, Phred/Phrap,
BLOSUM, GenScan, PSI-BLAST, threading, and GRAIL. Science is
changing because of the impact of information technology. Experimental,
theoretical, and computational science are all being affected by the data
deluge, and a fourth, “data-intensive” science paradigm is emerging. The
revolution is mostly about treating biology as an information science, not only
about specific biochemical technologies. As we enter the period in which science is
being driven by a data explosion, cloud computing has become a fundamental new
enabling technology to advance human knowledge. The new wave of high-throughput
technologies in genomics and proteomics is constantly improving and generating
an unprecedented amount of data that can be termed Big Data, meaning data sets
that are large in terms of volume, variety, velocity, variability, veracity,
and complexity. Bioinformatics researchers
are currently confronted with a huge challenge of handling, processing and
moving these large-scale biological data, a problem that will only increase in
coming years. Therefore, cloud computing
holds great promise for effectively addressing the challenges of large-scale data
generated by high-throughput technologies in the fields of genomics, proteomics
and other biological research areas. Cloud-based bioinformatics resources have
changed the approaches toward huge datasets, providing much faster data
acquisition, analysis, and storage. New cloud-based bioinformatics
computing tools, algorithms, and workflows are consistently being developed and
successfully deployed. Basically, the cloud refers to software and services
that run on the internet instead of on one's own computer.
It is well recognized that fisheries catches have reached a plateau in recent
years. Due to the high demand for fish and shellfish on the global market,
aquaculture production contributes an increasing amount to the food supply. In
the post-genomic era, the use of DNA microarray technology in fish biology and
aquaculture may have great significance and may be applied to the discovery of
novel genes, gene expression profiling in fish species of interest, and
identification of genomic responses to environmental stimuli in
aquaculture. To achieve dramatic breakthroughs, new computational approaches
will be required. We need to embrace the next, fourth paradigm of science that
will help bring about a profound transformation of scientific research and
insight. For scientists, this will mean deeper scientific insight, richer
discovery, and faster breakthroughs. This paper reviews the current development
of cloud-based computational technologies that can be applied and pinpoints
their potential beneficial applications as well as implications for fisheries
management and aquaculture development. Big data analytics platforms that offer
implementations of the MapReduce computational pattern (e.g., Hadoop and Dryad),
which make it easy for developers to perform data-intensive computations at
scale, are also highlighted.
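To make the MapReduce pattern mentioned above concrete, the following is a minimal single-machine sketch in plain Python that counts k-mers (short subsequences) across sequencing reads. It is an illustration only and does not use the Hadoop or Dryad APIs. The function names (map_kmers, shuffle, reduce_counts, kmer_count), the choice of k = 3, and the two toy reads are all assumptions introduced here; they do not come from the chapter.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-length window in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def kmer_count(reads, k=3):
    """Run map, shuffle, and reduce in sequence over a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical toy reads, not real fisheries data.
reads = ["ATGCGA", "GCGATG"]
counts = kmer_count(reads)
print(counts)
```

In a real deployment the map and reduce functions would run in parallel on many nodes, with the framework handling the shuffle, data distribution, and fault tolerance. The sketch keeps all three phases in one process so that the data flow is easy to follow.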
Keywords: Big Data, Cloud Computing, Bioinformatics, Genomics, Proteomics,
Fisheries, Insight