Abstract: A substantial part of the AGRON-OMICS consortium's work is devoted to profiling the growing Arabidopsis leaf under a number of environmental conditions. The TiMet consortium studies the link between the circadian clock and metabolism, focusing on both primary and isoprenoid metabolism. These international multi-institute projects generate a diverse range of quantitative molecular and phenotypic data. Vital to our analytical pipelines are adaptable database integrations that exploit standard and advanced features of the MySQL database engine and tools. These implementations are used for data and metadata capture, validation, provenance tracking, certain statistical, mathematical, and structural data transformations, integration with R, and generating visualizations. Our systems provide access-controlled user workspaces and the ability to run high-performance queries across multiple data sets, some of them high-volume. Interpreting novel data sets also requires the integration of pre-existing knowledge, and consequently a range of annotations and classifications are included. Where detailed annotations were lacking, the Knowtator tool was used for curating phenotype-genotype-environment relations using ontologies. A number of scientific use cases are presented that demonstrate the pivotal role that coherent integration can play in data quality control, project management, and data analysis. Since the database engine and tools are freely available, the data, code, and documentation can be simply and rapidly replicated for community dissemination and/or extension. These developments provide a useful template for a computational platform that has analytical value during a project and beyond.
Walsh S, Baerenfaller K, Graf A, Coman D, Hirsch-Hoffmann M, Kartal Ö, Sulpice R, Szakonyi D, Zielinski T, Granier C, Stitt M, Millar A, Hilson P, Gruissem W. Presented at ECCB’12, the European Conference on Computational Biology (http://www.eccb12.org/).