Web Site Status Reports Archives

October 6, 2006

Development Server Status

So that development can proceed on version 15 while the load is taking place, the Development Site has been connected to the version 14 database. It still uses, however, the latest version 15 web pages, templates and code. Once the table load completes I will switch back to the new database.

October 8, 2006

Version 15 Development Site is Now Available

The version 15 database has been loaded and the appropriate files generated for The Development Site. The difference report is below the fold.

This time there is not a lot of new stuff. Seventy-five property name/value pairs were deleted and 17 new ones were added.
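A count like this can be thought of as a set comparison between the property tables of the two versions. The sketch below is only an illustration of that idea: the function name and the sample pairs are hypothetical, and it is not the code that produces the actual difference report.

```python
def diff_pairs(old_pairs, new_pairs):
    """Return (deleted, added) property name/value pairs between two versions."""
    old_set, new_set = set(old_pairs), set(new_pairs)
    deleted = sorted(old_set - new_set)  # present before, gone now
    added = sorted(new_set - old_set)    # absent before, present now
    return deleted, added


# Hypothetical sample data, not taken from the real database.
v14 = [("essential", "yes"), ("virulence", "high"), ("iedb", "epitope")]
v15 = [("essential", "yes"), ("iedb", "epitope"), ("evidence", "curated")]

deleted, added = diff_pairs(v14, v15)
print(f"{len(deleted)} deleted, {len(added)} added")
```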

Continue reading "Version 15 Development Site is Now Available" »

October 21, 2006

Version 15 is Now Live

Development on version 16 will begin later this weekend.

October 22, 2006

Version 16 now loading

The version 16 data is now loading into the Development Web Site. This load incorporates three changes to the handling of features:

  1. Subsystem roles are now included in the search keyword list.
  2. Complex hyphenated compound names are stored in the keyword list both in their original form and split on hyphen boundaries.
  3. The primary functional assignment for a non-peg feature is taken from the alias list instead of the annotations.
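The second change in the list above can be sketched roughly as follows. This is a minimal illustration, assuming every hyphenated name is split; the real loader's rule for what counts as a "complex" name is not described here, and the function name is hypothetical.

```python
def keyword_forms(name):
    """Return the keywords stored for a name: the original plus hyphen-split parts."""
    forms = [name]
    if "-" in name:
        # Keep non-empty pieces only, so a leading or trailing hyphen adds nothing.
        forms.extend(part for part in name.split("-") if part)
    # Preserve order while dropping duplicates.
    return list(dict.fromkeys(forms))


print(keyword_forms("UDP-N-acetylglucosamine"))
```

Storing both forms means a search for either the full name or one of its pieces can find the same record.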

October 29, 2006

New URLs for the NMPDR Development Blog

I have added subdomain URLs to make the Development Blog easier to navigate.

October 30, 2006

Version 16 Reloaded

The data for version 16 has been reloaded. The difference report is substantial this time, since over 160 subsystems dropped out when we changed the inclusion criteria. I have included it below the fold (which is blogspeak for "on the other side of the Continue Reading link").

You can now search for keywords like essential and iedb. There are still some glitches in the searching. In particular, when you ask for essential, a hyperlinked list of essentiality values should show up in the results. I will investigate this further tomorrow (as noted on the current to-do list).

Continue reading "Version 16 Reloaded" »

November 2, 2006

A Short Delay

Bruce went to see the dentist today and he is feeling much better; however, he was unable to return to work due to a mishap involving a tube of super-glue and a broken car mirror.

Respectfully submitted,

Ferdinand T. Cat

November 20, 2006

Version 16 Reloaded

I reloaded the NMPDR 16 database. The new difference report is below the fold.

Continue reading "Version 16 Reloaded" »

November 21, 2006

Version 16 moves to the Staging Site

After much delay, heartache, and gnashing of teeth, version 16 of the NMPDR has been moved to the Staging Site. Please give it a once-over to make sure your favorite features still work. If there are no problems, it will go live some time on Thursday.

The new drug target data is not present in this version. It will be added for the next version, which is scheduled to go live on December 4.

November 28, 2006

Version 17 Available on the Development Server

Version 17 of the NMPDR is now available on the development server. It includes the new drug targets pages, which can be seen at http://web-1.nmpdr.org/next/FIG/targets.cgi, though the information there has not yet been completely curated.

Not all of the attributes we want are available. Once the attribute system is fixed (hopefully in a day or two), I will reload the NMPDR property table.

December 2, 2006

Version 17 on Staging Server

NMPDR version 17 is now on the staging server.

December 6, 2006

Version 17 is Now Live

Bigger, faster, and richer in content, version 17 is now live.

To celebrate, I will be spending the next few hours huddled in a corner whimpering and shaking uncontrollably.

December 18, 2006

Version 18

This morning I will begin building version 18. There are several important changes I need to make before version 18 can be loaded, so for a while there will be no official data in the Development NMPDR; however, I don't want to make radical code changes in version 17 now that it's live, so we will have to limp along for a while.

January 2, 2007

We're Back

The holiday is over and NMPDR version 18 is now officially loaded on the development server. In addition, the keyword search now supports three-letter words. For example, in the old system, adenine RNA and adenine both return the same result set because the three-letter word RNA is ignored. In the new system, adenine returns 1702 records and adenine RNA returns only 44 records.
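The effect of the word-length cutoff can be illustrated with a small sketch. The thresholds of 4 and 3 are assumptions inferred from the behavior described above, and the function is hypothetical rather than part of the actual search engine.

```python
def search_terms(query, min_length):
    """Split a query into words, dropping words shorter than min_length."""
    return [word for word in query.split() if len(word) >= min_length]


# Assumed cutoffs: 4 mimics the old system, 3 mimics the new one.
print(search_terms("adenine RNA", min_length=4))
print(search_terms("adenine RNA", min_length=3))
```

With the higher cutoff, RNA is silently dropped and the query collapses to adenine alone, which is why both queries used to return the same result set.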

The difference report for version 18 is here.

February 21, 2007

NMPDR version 18 Endgame

I am currently reloading the Feature table to ensure that the RNAs have the correct assignments. Once that is done, I will run the standard post-load scripts to rebuild the cover pages. My goal is to have the staging site up some time tomorrow and then bring up version 18 on the following Tuesday (February 27). Leslie's serotype data will be lost, but I hope to have this information available on the attribute server next week so it can be made part of the load process in version 19.

February 22, 2007

Version 18 Moves to the Staging Server

NMPDR version 18 is now available for preview on the Staging Server. A complete comparison of the data differences between version 17 and version 18 is available in the revised difference report.

The current plan is to roll v18 into production on Tuesday, February 27. In the meantime, you can have some fun by typing riboswitch into the search box.

March 12, 2007

Version 19 in Gestation

Version 19 of the NMPDR is now set up on the Development server. At this point no data has been loaded into the database, however, so nothing works.

The goal for this release is to make the NMPDR compatible with the SEED viewer. This is a significant task that may require several database reloads, but it is very different from actually making the SEED viewer available on the NMPDR. All that has to be accomplished for this release is that the Sprout software support the requirements of the viewer.

March 29, 2007

Signature Genes Tool Upgrade plus Some Problems

The search system has been retooled so that while the search is in progress, status messages are sent to the user in real time. This means that if the search is taking a long time, then every so often text will be presented to the user explaining what is happening. This makes a long search more palatable, and it also prevents the long searches from timing out and presenting the user with an internal server or proxy error. When the search is complete, the search results page will immediately pop up. For fast searches, the results come up so quickly that the status messages never show up. I have not tested this change thoroughly, but I hope to later today. The word search and the signature genes search should work if you want to see how the new system operates.
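The general idea of interleaving status messages with the search work can be sketched with a generator. This is an illustration only, not the NMPDR implementation: the function name, the batch structure, and the reporting interval are all assumptions.

```python
import time


def run_search(batches, report_every=2.0, clock=time.monotonic):
    """Yield ('status', text) messages while working, then ('results', rows)."""
    results = []
    last_report = clock()
    for number, batch in enumerate(batches, start=1):
        results.extend(batch)  # stand-in for the real search work
        if clock() - last_report >= report_every:
            yield ("status", f"{len(results)} matches found after {number} batches")
            last_report = clock()
    yield ("results", results)


# A zero interval forces a status message after every batch.
for kind, payload in run_search([["a"], ["b", "c"]], report_every=0.0):
    print(kind, payload)
```

Because each message reaches the client as soon as it is produced, the connection never sits idle long enough to time out, and a search that finishes within the interval yields only its results.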

There have been three additions to the Signature Genes Tool.

  1. You can specify that only PEGs should be considered. This has become an issue because of all the new gene types that are appearing in NMPDR genomes.
  2. The scores are now between 0 and 1 instead of between 1 and 2. This makes them more intuitive in the case where the statistical algorithm is not being used.
  3. You can now discriminate using similarities instead of BBHs. This is a much slower process, but the new real-time status thing makes it less painful.

The NMPDR organism pages have every genome marked as new, which means the counts on the front page are almost certainly wrong. I will investigate this when I wake up this afternoon.

The Annotation button is still in place because there are still some fixes I need to make to the SEED Viewer support. The next thing I have on my list is fixing the BLAST search.

I have been told there are nine incorrect genomes in the NMPDR. As soon as they are fixed, I will reload, run the difference report, and begin the cutover and testing process. The next thing after the BLAST fixes is the drug target objects. Whether those get into v19 or v20 will depend on how long I have before the reload starts.

April 3, 2007

Download-Search Capability

I have added a link to the search results page that allows you to download the entire search output as a tab-delimited file. I am adding features one at a time and testing them. Suggestions are welcome, but because it is a work in progress there is no need to worry that any anomalies are cast in stone.
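For readers unfamiliar with the format, a tab-delimited download is simply one header line followed by one line per result, with fields separated by tabs. The sketch below shows one way to produce such text; the column names and the sample row are hypothetical and do not reflect the actual download layout.

```python
import csv
import io


def to_tab_delimited(columns, rows):
    """Render search results as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical columns and row, for illustration only.
text = to_tab_delimited(
    ["feature_id", "organism", "function"],
    [["fig|83333.1.peg.14", "Escherichia coli K12", "Chaperone protein DnaK"]],
)
print(text, end="")
```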

April 12, 2007

Version 19 now on Staging Server

NMPDR version 19 has now been moved to the staging server. There was a slight delay in the original schedule due to a problem Leslie discovered with the organism pages. Bob fixed it this morning, however, so now we are back on track.

I have pushed the cut-over date forward to April 17 so that there will still be two full business days for testing.

April 18, 2007

Final Push for v19 in Progress

I am in the midst of applying the following changes.

  • The subsystem count at the bottom of the organism summary page will no longer be active as a link. Previously, this took the user into the SEED and there was no real way to get back from that. It was easier to simply disable the link in NMPDR than convert the entire subsys_vector script.
  • I am hacking in temporary fixes for the download page. The FTP directory for Listeria now has the full set of organisms. I am creating ZIP files on web-1 to be used as targets for the HTTP downloads. We need some automated way to keep this data up to date, and I have a request into Bob for any ideas he might have.
  • I'm adding an operon search. (I should not have done this, but I felt I had to code it before I forgot how.) This will not be open to the public, but will be made available for Ron Taylor so he can download the SEED data he wants from the NMPDR.

Once these fixes are in and I've tested them on the development server, I will copy them to the staging server, test them there, and then roll the version.

April 19, 2007

NMPDR v19 is Now Live

and the world is automatically a better place.

May 1, 2007

Version 20 Fully Loaded

The version 20 web site is now available on the Development Server. Nothing has been tested, but if you don't look at it too hard it should work fine.

May 15, 2007

Version 20: A Very Special Rollover

Over the weekend, the power will be turned off somewhere because of something, with the result that nmpdr-3 and web-3 will be unavailable. Therefore, this coming Friday (May 18), we will be redirecting http://www.nmpdr.org to point to the development server. There will therefore be no development server over the weekend. Since this is also my development sandbox, I will not be working on the 18th and 19th. On Monday the 21st, I will meet with Bob and Bill again to restore the normal order of things.

The key thing is that this time there will not be a Staging Server. Instead, we have to hammer on the Development Server as much as we can from now until then.

May 18, 2007

NMPDR Version 20 is Now Live

NMPDR version 20, the first to use our new single-server technology, is now live. There is now a much shorter delay before the search progress page shows up, and this makes the whole thing seem snappier and more responsive. A word search for Vibrio presented the progress page after 10 seconds and completed the search (47912 results) in only 28 seconds. Searches with smaller result sets (e.g., dnaK) respond in only a few seconds.

Bill is currently working with the configuration to reduce the 10-second delay, but now that the entire NMPDR web site is on a single server, we have a lot more options than we did before.

May 24, 2007

Version 21 Now Loading

Version 21 of the NMPDR is now loading on the Development Server. The current target date to go live is June 11. The load is expected to finish around May 28.

June 4, 2007

Attribute Analysis

I have created a page on the Development Server that contains information useful in understanding the meaning and format of the various attributes. The report is, for now, restricted to attributes of features and attributes related to drug targets.

At the bottom of the page is a report showing how many of the NMPDR features and how many of the NMPDR Core features have which attributes. The report gives us an idea of which attributes may be practical for use in searching. For example, we have CELLO data on 68% of the NMPDR core features, so it would be reasonable to allow a user to search on or ask for CELLO data if he's restricting his attention to core genomes. On the other hand, we have molecular weight information on 18% of the total features and less than 2% of the core features, so an attempt to search on molecular weight would not provide any useful results.
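The coverage figures in such a report are simple percentages: the number of features carrying an attribute divided by the total number of features. Here is a minimal sketch of that calculation; the sample features and the 50% cutoff for "practical to search" are assumptions made for illustration.

```python
def attribute_coverage(features, attribute):
    """Return the percentage of features that carry the given attribute."""
    if not features:
        return 0.0
    have = sum(1 for attrs in features.values() if attribute in attrs)
    return 100.0 * have / len(features)


# Hypothetical features mapped to the set of attributes each one has.
features = {
    "feature-1": {"CELLO", "CDD"},
    "feature-2": {"CELLO"},
    "feature-3": {"CELLO", "molecular_weight"},
    "feature-4": set(),
}

for name in ("CELLO", "molecular_weight"):
    pct = attribute_coverage(features, name)
    verdict = "worth searching" if pct >= 50.0 else "too sparse"
    print(f"{name}: {pct:.0f}% ({verdict})")
```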

At this point it looks like the only practical things to add would be the CELLO and CDD attributes. Therefore, I am adding these to the database and they should appear on the Advanced Search Page by the end of the week.

June 7, 2007

Version 21 Loading, Version 20 Getting Help

Sprout version 21 is currently intended to be the first version from which a PPO database can be generated. Among the changes were

  • Ripping out the whole PCH family of tables in favor of the PCH server
  • Adding a CDD table and putting CELLO data in the Feature table
  • Converting all of the keyed array fields (feature alias, compound name, role EC number) into separate entities

This last was because PPO does not support keyed array fields. Because of this change, we should be able to generate a working PPO database from the NMPDR database definition, and this will give us a template to shoot for in accomplishing the integration.
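Converting a keyed array field into a separate entity amounts to turning one list-valued field into many (key, value) rows. The sketch below illustrates the transformation with hypothetical feature records; the field names and peg numbers are made up, and this is not the Sprout loader code.

```python
def split_keyed_array(records, key_field, array_field):
    """Turn a list-valued field into (key, value) rows for a separate table."""
    rows = []
    for record in records:
        for value in record[array_field]:
            rows.append((record[key_field], value))
    return rows


# Hypothetical records with aliases stored as an array field.
features = [
    {"id": "fig|83333.1.peg.1", "aliases": ["thrL", "b0001"]},
    {"id": "fig|83333.1.peg.2", "aliases": ["thrA"]},
]

alias_rows = split_keyed_array(features, "id", "aliases")
print(alias_rows)
```

Each output row can then be loaded into its own table, which is a shape that a system without array fields can represent.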

In the meantime, I am working on bug fixes to the live NMPDR using the mirror version of NMPDR. One big problem was that the Sprout attribute call did not support the full capabilities of the new attribute system. This caused a problem with incorrect literature counts on the subsystem display page. In addition, it was causing CDD codes to appear in the evidence column for the commentary of a pin page.

The final problem has to do with an incompatibility of the diagrams. I hope to resolve this tomorrow and will then copy the mirror to the live site over the weekend.

It would be tricky, but possible, to reload the version 20 database to get the latest information. This would fix the problem with outdated abbreviations and stuff in the subsystems. It would also give us an opportunity to add serotypes to the names of the core genomes and possibly get any core genome updates out of the pipeline. At the current time, however, I am assuming this would not be done. Please correct me if I'm wrong.

June 8, 2007

Mirror Site Diagrams Fixed

The subsystem diagrams in the Sprout now use the new diagram technology. There is, unfortunately, a slight glitch due to the fact that the roles stored in the version 20 Sprout do not necessarily match the ones in the SEED. The symptom is that some of the role tooltips don't work. This has been fixed on the Mirror Site. If you go to this subsystem page, the first diagram (De Novo &c) is an old-style diagram, and the second (Arginine &c) is a new-style diagram. Both diagrams will invoke the proper CGI script so that they display correctly.

This is the last of the problems reported by Olga in her email of a few days ago. The fixes will most likely be transferred to the live site some time on Sunday.

June 11, 2007

Version 21 Load Complete, Version 20 Becomes Favorite Child

I had to reload several table groups in Sprout version 21 due to misunderstandings on my part as to how some relationships work. In particular, some RNA roles are still in the aliases, so aliases are a many-to-many relationship rather than one-to-many.

It is also the case that some CAS IDs and compound names belong to multiple compounds. These relationships are now many-to-many.
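A many-to-many relationship is usually modeled with a linking table that holds one row per pairing. The sketch below uses an in-memory SQLite database purely for illustration; the table names, column names, and sample data are hypothetical and are not the Sprout schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Compound (id TEXT PRIMARY KEY);
    CREATE TABLE CompoundName (name TEXT PRIMARY KEY);
    -- Linking table: one row per (compound, name) pairing.
    CREATE TABLE HasName (
        compound_id TEXT REFERENCES Compound(id),
        name TEXT REFERENCES CompoundName(name),
        PRIMARY KEY (compound_id, name)
    );
""")
conn.executemany("INSERT INTO Compound VALUES (?)", [("cpd1",), ("cpd2",)])
conn.executemany("INSERT INTO CompoundName VALUES (?)", [("glucose",), ("dextrose",)])
conn.executemany(
    "INSERT INTO HasName VALUES (?, ?)",
    [("cpd1", "glucose"), ("cpd2", "glucose"), ("cpd1", "dextrose")],
)

# Names that belong to more than one compound.
shared = conn.execute(
    "SELECT name, COUNT(*) FROM HasName GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()
print(shared)
```

Under a one-to-many design the second "glucose" row would have been rejected or would have overwritten the first, which is the kind of problem that forces a reload.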

There is an EC number with a tab inside it: 2.6.1.2. I don't know which subsystem it's in, but the role name is putative alanine transaminase (glutamyc pyruvic transaminase). I've added code in the Sprout loader to fix this, but it's probably better to fix the actual subsystem spreadsheet if someone can find it.
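A fix of this kind typically strips embedded whitespace and then checks that what remains still looks like an EC number. The sketch below is an assumption about how such a cleanup might look, not the code added to the Sprout loader; the position of the tab in the sample input is also a guess, and the pattern is simplified to four dot-separated fields of digits or a dash.

```python
import re

# Simplified EC format: four fields, each a number or "-" after the first.
EC_PATTERN = re.compile(r"^\d+(\.(\d+|-)){3}$")


def clean_ec_number(raw):
    """Remove embedded whitespace (including tabs) and validate the EC format."""
    cleaned = re.sub(r"\s+", "", raw)
    if not EC_PATTERN.match(cleaned):
        raise ValueError(f"not a valid EC number: {raw!r}")
    return cleaned


print(clean_ec_number("2.6.1.\t2"))
```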

There is also a feature somewhere that has a right curly brace in its alias list. Because of the way the aliases are generated, the right curly brace does not have any features attached to it in Sprout, so it's not going to affect anything; however, the downside is that I have no idea which feature has the bogus alias.

It will be several days before version 21 is ready for testing. Because version 21 contains a major database change, I am currently targeting changes to version 20. The plan is to slipstream in a version 20a in the next few days. The development copy of version 20a is on the mirror site. The major change for 20a is the incorporation of the scan-for-matches search into the BLAST search page. There are also some bug reports from Leslie and Claudia that I am still researching.

The Mirror Site is currently all messed up, but the plan is to have it ready for hammering in a few hours. (Ready for hammering means the bug fixes are in but the new search is not yet done.) I will do another blog post when it's hammer time, at which time we need to find any bugs so I can fix them before we post to the live site.

Version 20a Available on Mirror Site

The new version 20 is now available for testing at The Mirror Site. The missing ingredient at this time is the new, improved Blast/Pattern search. Everything else, however, should be investigated to make sure we're ready for the upcoming conference-like thing.

June 22, 2007

Reload in Progress

The Development Site is going to be completely unusable for the next few days. First and foremost, I am reloading the subsystems table to fix the Role abbreviation problem. Previously, abbreviations were properties of the Role: each Role used the same abbreviation in all subsystems. This has been changed so that the Role can have a different abbreviation in each subsystem, which more closely matches the SEED data structures.
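The difference between the two models can be shown with a pair of lookup tables. Everything in this sketch is hypothetical, including the subsystem names, the role, and the abbreviations; it only illustrates moving the abbreviation from the role to the pairing of subsystem and role.

```python
# Old model: one abbreviation per role, shared by every subsystem.
old_abbreviations = {"Alanine transaminase": "AlaT"}

# New model: the abbreviation belongs to the (subsystem, role) pairing.
new_abbreviations = {
    ("Subsystem A", "Alanine transaminase"): "AlaT",
    ("Subsystem B", "Alanine transaminase"): "ALT",
}


def abbreviation_for(subsystem, role):
    """Look up the abbreviation a role uses within one particular subsystem."""
    return new_abbreviations[(subsystem, role)]


print(abbreviation_for("Subsystem A", "Alanine transaminase"))
print(abbreviation_for("Subsystem B", "Alanine transaminase"))
```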

There are also numerous search problems which result from the search code being in an incomplete state at the time I started the load. I will get back to the search fixes as soon as I finish doing the laundry.

June 28, 2007

Attribute Report for v21

I passed out before taking my medications last night. I've been told it's Thursday, but I'm not yet sure which Thursday that is.

In the meantime, the latest attribute report is now available here. For NMPDR core organisms, we have 92% coverage on CELLO data and 50% coverage on TMPRED, which is much better than we had in the previous report.

July 9, 2007

Pattern Scan Update

Mark and I spent the afternoon tuning the pattern scan and the compare regions on the development NMPDR. The entry point for the pattern scan / blast search is here. From the results page, you can get to a standard protein page using the NMPDR button and Mark's context display using the Context button. We were consistently able to get the response time under 20 seconds for both DNA and protein searches.

I am currently working on the code that allows the user to decide which ID (uniprot, locus tag, etc.) should be displayed in the results. Once that's done, I will fold it into the display code for the various feature searches as part of getting them fixed and adding rollover hints to the search forms. (This is all based on a discussion with Folker and Liz last week.) In the meantime, I am running an attribute report to find out which attributes can be added to the searching. Once that's done, I will need to reload the feature table to get the new attributes, at which point the site will be mostly working except for the help pages and stuff.

July 12, 2007

Progress

Things are slowly coming together for NMPDR version 21. The word search is mostly working, and I am finishing the fixes to the other feature-based search. Once this is done, the remaining tasks are

  • Adding an XML download format.
  • Reinstalling the FASTA download format for feature-based searches.
  • Fixing the Drug Search. This is not a search for drug targets, but rather a report on the docking results we already have.

There was a bug in the pattern search because the formatter was removing spaces from the pattern string. This has been fixed.

I am currently running the difference report. I added coupling and BBH statistics, both of which have slowed the report considerably, which is why it is still going on. Since this report is not run very frequently, I don't consider that a big problem. Once the report is done, the data will appear here.

This weekend I will re-run the attribute report, which will give me the information I need to begin implementing a drug target search (as opposed to a docking result report). I don't want to start the attribute report until the difference report is finished.

Finally, I am meeting with Ross on Monday to discuss implementing a close-strain comparison tool. This has become a little more urgent because one of the site users requested it.

July 18, 2007

Ready to Roll on NMPDR v21

Version 21 is now considered stable enough for testing. Among the new features that will be available in this release are:

  • BlastN, DNA Scan, and Protein Scan are now available on the Tool search page (which replaces the old BLAST page).
  • Search output for the Tool Search page contains more information (including the matching sequence for scans and the alignment for BLAST). Each row includes a link to a compare-region display.
  • It is now possible to download search results in XML format.
  • The search results use real buttons instead of fake buttons simulated using styles. The buttons are automatically converted to URLs when the results are downloaded in XML or tab-delimited format.
  • You can specify the type of ID to be used when displaying feature IDs in search results. If a feature does not have an ID of the specified type, it will fall back to the FIG ID.
  • Coloring is available on new-style subsystem diagrams.
  • The numbers required for the semi-annual report will now automatically be generated as part of the version change process.
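The ID fallback rule from the list above can be expressed in a few lines. This is a sketch under assumptions: the function name, the alias type names, and the sample feature are hypothetical.

```python
def display_id(fig_id, aliases, preferred_type):
    """Return the alias of the preferred type, or the FIG ID if none exists."""
    return aliases.get(preferred_type, fig_id)


# Hypothetical feature with a locus tag but no UniProt ID.
fig_id = "fig|83333.1.peg.14"
aliases = {"locus_tag": "b0014"}

print(display_id(fig_id, aliases, "locus_tag"))
print(display_id(fig_id, aliases, "uniprot"))
```

Falling back to the FIG ID guarantees that every row in the results has some identifier, whatever ID type the user asked for.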

There was also a major upgrade to the underlying code for the search. Previously, the search was heavily biased toward searches that return features. Under the new system, the type of result is decoupled from the type of search, which makes it easier to create searches for new types of objects (such as subsystems and genomes) in the future.
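One common way to achieve this kind of decoupling is to give each result type its own small formatting helper and let any search hand its hits to any helper. The sketch below illustrates that pattern only; the class names, function names, and sample data are hypothetical and are not taken from the NMPDR code.

```python
class FeatureResults:
    """Formats feature objects for the results page (hypothetical helper)."""
    columns = ["feature_id", "function"]

    def row(self, item):
        return [item["id"], item["function"]]


class GenomeResults:
    """Formats genome objects; added without touching any search code."""
    columns = ["genome_id", "name"]

    def row(self, item):
        return [item["id"], item["name"]]


def execute(find_items, result_helper):
    """Run any search function and format its hits with any result helper."""
    rows = [result_helper.row(item) for item in find_items()]
    return result_helper.columns, rows


def genomes_named_vibrio():
    # Stand-in for a real database query; the data is made up.
    return [{"id": "genome-1", "name": "Vibrio example strain"}]


print(execute(genomes_named_vibrio, GenomeResults()))
```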

The SEED Viewer has been re-activated, and can be found here. There is also a link to it in the sidebar on the front page of the Development Blog. This is the old viewer, and it is currently being used to ensure that the NMPDR can support the functions needed by the SEED Viewer. The idea here is that when the new viewer is ready, we will be in a better position to couple it to the rest of the site. In the meantime, because there is no direct link to it on the website, we can test and modify it without disrupting anything on the real web site.

August 25, 2007

NMPDR v21 Finally Available

NMPDR version 21 is officially live. In addition, I have added SOPs to the development wiki for staging an NMPDR for testing, propagating corrections to the staged site, and rolling over a new version.

I will probably not be able to come in on Monday as that is the day the plumbers are coming to pump the water out from the basement and fix the damage.

September 9, 2007

NMPDR Development Server in Motion

I have been in the process of setting up anno-2.nmpdr.org as the new NMPDR development server. This is a faster server, and nmpdr-1 is slated for some different sort of thing that I don't fully understand. For a while, things will be a little crazy as I sort out which links point to where.

September 10, 2007

Final Slipstream Fixes

Two fixes have been slipstreamed into the production NMPDR.

  • The hit locations in the context view now show up as gray instead of bright green. To see this, click here and then select the Context button for the first search result.
  • The drug targets pages were not displaying any results because the drug target data files had not been copied over to the production server. This has been fixed. Click here to see the results.

The next step is to start setting up version 22 on the new development server.

September 15, 2007

Version 22 Now on the New Development Server

Version 22 of the NMPDR has been created on the New Development Server at http://anno-2.nmpdr.org/next/. Note that at this time, no data has been loaded into the databases. The intent is to bring the Sprout database one step closer to looking like a PPO database before we load it.

November 26, 2007

Version 22 Status Report

  • Data Load. The new NMPDR data is now fully loaded, and we show features in subsystems for all of the new organism groups. The difference report is here. The report shows what has changed in each of the core organism groups, the new and deleted subsystems, the new and deleted organisms, and the statistics used on our semi-annual reports. In addition, it flags unusual situations. In this case, there were 16 organisms without BBHs. Bob is currently fixing them.
  • Performance. It takes about 3 seconds to display a protein page. The SEED viewer genome browser takes about 8 seconds. This is comparable to the performance of GBrowse on both the Mirror system (which is the same server as the new version) and the Live system. There has been some concern about the SEED viewer performance, but during the weekend I was running the difference report, which would have slowed things down considerably.
  • Functionality. We were having problems with Image::Magick, but this is now fixed. The plan is for Tobi and me to stress-test the viewer this afternoon and fix any problems we find.
  • Schedule. The current plan is to stage the new NMPDR on Wednesday the 28th after the programmers' meeting, then roll it over on November 30. If any problems occur, we will roll over on a later date and change the front page to reflect the new information.

About Web Site Status Reports

This page contains an archive of all entries posted to NMPDR Development Blog in the Web Site Status Reports category. They are listed from oldest to newest.
