Data management has emerged as one of the central issues in the high-throughput processes of taking a protein target sequence through to a protein sample. To simplify this task, and following extensive consultation with the international structural genomics community, we describe here a model of the data related to protein production. The model is suitable for both large and small facilities for use in tracking samples, experiments, and results through the many procedures involved. The model is described in Unified Modeling Language (UML). In addition, we present relational database schemas derived from the UML. These relational schemas are already in use in a number of data management projects.
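To make the idea of a relational schema for tracking samples, experiments, and results concrete, the sketch below uses Python's standard `sqlite3` module. It is not the published model or its derived schemas. The table names (`target`, `sample`, `experiment`), their columns, and the helper functions are all hypothetical simplifications chosen for illustration, and the sequence and result values in the usage section are placeholders.

```python
import sqlite3

# Hypothetical, heavily simplified tables. These are NOT the published schema;
# every table and column name here is an assumption made for illustration.
SCHEMA = """
CREATE TABLE target (
    target_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    sequence  TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    target_id INTEGER NOT NULL REFERENCES target(target_id),
    label     TEXT NOT NULL
);
CREATE TABLE experiment (
    experiment_id    INTEGER PRIMARY KEY,
    step_name        TEXT NOT NULL,
    input_sample_id  INTEGER REFERENCES sample(sample_id),
    output_sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
    result           TEXT
);
"""


def open_db():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    return conn


def add_target(conn, name, sequence):
    """Insert a target and return its id."""
    cur = conn.execute(
        "INSERT INTO target (name, sequence) VALUES (?, ?)", (name, sequence)
    )
    return cur.lastrowid


def add_sample(conn, target_id, label):
    """Insert a sample belonging to a target and return its id."""
    cur = conn.execute(
        "INSERT INTO sample (target_id, label) VALUES (?, ?)", (target_id, label)
    )
    return cur.lastrowid


def record_experiment(conn, step_name, output_sample_id,
                      input_sample_id=None, result=None):
    """Record one procedure that produced output_sample_id."""
    cur = conn.execute(
        "INSERT INTO experiment (step_name, input_sample_id, output_sample_id, result)"
        " VALUES (?, ?, ?, ?)",
        (step_name, input_sample_id, output_sample_id, result),
    )
    return cur.lastrowid


def sample_history(conn, sample_id):
    """Walk back from a sample through the experiments that produced it.

    Returns (step_name, result) pairs, earliest step first.
    """
    steps, seen, current = [], set(), sample_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in the data
        row = conn.execute(
            "SELECT step_name, result, input_sample_id FROM experiment"
            " WHERE output_sample_id = ?",
            (current,),
        ).fetchone()
        if row is None:
            break
        steps.append((row[0], row[1]))
        current = row[2]
    steps.reverse()
    return steps


if __name__ == "__main__":
    conn = open_db()
    target = add_target(conn, "T0001", "MKV")  # placeholder name and sequence
    pellet = add_sample(conn, target, "cell pellet")
    protein = add_sample(conn, target, "purified protein")
    record_experiment(conn, "expression", pellet, result="soluble")
    record_experiment(conn, "purification", protein,
                      input_sample_id=pellet, result="purified")
    for step_name, result in sample_history(conn, protein):
        print(step_name, result)
```

Linking each experiment to an input and an output sample is one simple way to let a sample's provenance be reconstructed by following foreign keys. A real schema would need far more detail, for example protocols, parameters, and multiple inputs per experiment.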

Original publication

DOI

10.1002/prot.20303

Type

Journal article

Journal

Proteins

Publication Date

02/2005

Volume

58

Pages

278 - 284

Addresses

EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

Keywords

Proteins; Data Interpretation, Statistical; Protein Engineering; Proteomics; Systems Biology; Genomics; Amino Acid Sequence; Algorithms; Models, Biological; Research; Internet; Software; Programming Languages; Software Design; Unified Medical Language System; Databases, Protein