April 2004 Newsletter
- The next meeting of SEED Developers
- Subsystem Annotations
The Next Meeting of SEED Developers
The next meeting of the SEED Developers will be in Bielefeld, Germany on July 5-10. The meeting will be split into two parts:
July 5-6 will be devoted to Annotation of Subsystems
The main activity during these two days will be an intensive tutorial for researchers who wish to annotate specific subsystems using the SEED. We will help people begin to analyze their subsystem of interest as it manifests itself in the 250+ genomes in the current SEED. This tutorial, as well as the whole topic of subsystem annotation, is discussed below.
A second component of these two days will be an open discussion among researchers currently using the SEED to annotate subsystems. The topics will include:
- Which subsystems are now being analyzed?
- How should we prioritize the work?
- How should we be presenting the research conjectures that are already apparent from the existing efforts?
July 7-10 will be largely devoted to linking systems
There are at least three systems that we will be working with: GenDB (the annotation system being developed at Bielefeld), the SEED, and Niels Larsen's "tree viewer". An oversimplified summary would be that GenDB is a "DNA-oriented" system, allowing one to identify and manipulate features on chromosomes (or contigs), the SEED is largely "protein-oriented", and Niels has focused on tools for presenting data.
We hope to merge the capabilities of GenDB and the SEED by designing a fairly general interface for both systems (allowing GenDB to interface with other "protein-oriented" systems and the SEED to interface with other "DNA-oriented" systems). Ideally, a user will be able to flip back and forth between environments, and work done in each environment will be communicated back to the other.
Niels' interface tools will be needed to display and browse large trees. This is particularly useful when overlaying gene sets onto functional overviews (e.g., during interpretation of microarray data, examination of complementary metabolisms between host and symbiont, and so forth). Niels is also building open source tools, so we believe that this represents a good overlap in interests. He will be visiting from Denmark.
The success of the efforts to merge the systems, get everyone familiar with the new Bielefeld server for calling genes in prokaryotes, and discuss servers for computing similarities will depend to some extent on how much time we devote in June to getting ready. We are planning on having weekly discussions via conference calls (access grid communication is definitely needed and hopefully will become the basis for these meetings before too long).
It is likely that in parallel to these activities Ross, with help from his friends, will offer a tutorial on the use of the SEED for students at Bielefeld. We previously had a very good experience with a 2-day intensive class at Franklin and Marshall College. This time, we plan on making it less intensive (1 hour per day) over a somewhat longer period. The basic idea would be to introduce the students to the SEED, suggest ways to find research topics, and so forth.
We do not yet have an accurate idea of who will be coming. It seems likely that a number of people will come for only the first few days. In any event, please contact Alice McHardy, Folker Meyer, or Ross Overbeek if you plan on attending. Everyone will be welcome, but we do have to get a sense of who will be there in order to handle local arrangements. Alice can help you by suggesting a hotel.
Subsystem Annotations
Annotation of subsystems across many genomes is rapidly becoming a central component of the FIG collaboration. Researchers from at least six distinct institutions are now actively working on the project. This is pretty exciting, since the initial tools were released only last month.
We believe that the current tools are working well in the sense that
- backups now occur automatically,
- the peer-to-peer exchange of subsystems between completely different versions of the SEED (e.g., with different versions of RefSeq) seems to work well, and
- weekly updates introduce features that are rapidly reducing the effort required to construct a reasonable spreadsheet for a subsystem.
We believe that major new tools that will significantly increase the productivity of participants should be available within two weeks. After those have been released, tested, and incrementally improved, we will make the entire set of tools available to researchers on a public server (probably at the University of Chicago, initially).
The key idea of subsystem annotation is simple: a person wishing to study a subsystem using comparative analysis determines exactly which organisms it occurs in, which alternative variants can be distinguished, which functional roles are present in each of the organisms, and which genes implement those roles. This is roughly the raw data behind what you see in many biological review articles. By organizing the data in a form that can be supported within the SEED, it becomes much easier to analyze new genomes as they become available, to clarify outstanding uncertainties, and to curate the system over time. We believe that something like what we are now implementing will be the foundation for many researchers' analyses over the coming years. As thousands of diverse genomes become available, this becomes the framework for a person to maintain an accurate portrait of the system or systems with which he or she works.
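To make the shape of that data concrete, here is a minimal sketch of a subsystem spreadsheet in Python: rows are genomes, columns are functional roles, each cell holds the genes that implement a role, and each genome carries a variant label. This is an illustration only, not the SEED's actual storage format or API. The class name `SubsystemSheet`, its methods, and all genome, role, and gene identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SubsystemSheet:
    """Hypothetical model: rows are genomes, columns are functional roles."""
    name: str
    roles: List[str]                                           # spreadsheet columns
    variants: Dict[str, str] = field(default_factory=dict)     # genome -> variant label
    cells: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)  # (genome, role) -> genes

    def assign(self, genome: str, role: str, gene: str) -> None:
        """Record that `gene` implements `role` in `genome`."""
        if role not in self.roles:
            raise ValueError(f"unknown role: {role}")
        self.cells.setdefault((genome, role), []).append(gene)

    def missing_roles(self, genome: str) -> List[str]:
        """Roles with no gene assigned in this genome (open questions to curate)."""
        return [r for r in self.roles if not self.cells.get((genome, r))]

    def row(self, genome: str) -> List[str]:
        """One spreadsheet row: genome, variant, then one cell per role."""
        variant = self.variants.get(genome, "unassigned")
        genes = [",".join(self.cells.get((genome, r), [])) or "-" for r in self.roles]
        return [genome, variant] + genes


# Made-up identifiers, for illustration only.
sheet = SubsystemSheet("Example subsystem", ["role A", "role B", "role C"])
sheet.variants["genome_1"] = "variant 1"
sheet.assign("genome_1", "role A", "gene_0012")
sheet.assign("genome_1", "role C", "gene_0340")
sheet.assign("genome_1", "role C", "gene_0977")

print(sheet.row("genome_1"))
print(sheet.missing_roles("genome_1"))
```

In this sketch, a newly sequenced genome becomes a new row, and `missing_roles` points to the roles that still need a gene assigned, or whose absence suggests a different variant.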
We will hold a tutorial on how to use the system in early July in Bielefeld, as we mentioned above. We will probably hold one at Argonne National Lab in Chicago before then. The tutorial will include a general overview of how to use the SEED, so we suspect that it will be of general interest. If you would be interested in attending a 2-3 day tutorial on the use of the SEED and annotation of subsystems, please contact Veronika or Ross.


