A few fixes to the lucene search have been posted to the Development NMPDR.
RNAs once again work correctly on the NMPDR Development version.
To test the fix, type fig|100226.1.rna.10 into the search box.
Subsystem diagrams now display as NMPDR pages when invoked from Sprout. To test this fix, go to this subsystem page, then click on the diagram link.
Subsystems displayed in the signature genes tool now contain a link to the appropriate subsystem page. To test this, go here, then click Find Discriminating Proteins. You should see at least three linked subsystems in the discriminating set.
The essentiality page works again. You can find it here.
Last Monday, after a phone conference, I removed the property and feature type controls from the feature filter in the new search system in order to ensure we only returned meaningful results and to reduce the problem that most property searches return no results. This greatly streamlined the code, but it broke the essentiality page, which was using the feature filter to search for genes with specific properties. To fix the problem, I added a new search that looks for the occurrence of specific property name/value pairs in a chosen genome. This is not a very useful search for users visiting the site, but it does make it possible to do the essentiality searches using the old essentiality properties. I am now leaning toward the idea that the generated search page (found here on the development server) will only be used by developers, and we can have another page for the user-friendly searches that can have more involved explanations and examples.
Anyway, the important thing is that essentiality is back.
The organism page search boxes have been fixed. To test this, go to the main page on the Development Server and type 2.7.6.3 into the search box. Several hundred results will come back. Next, go to the campylobacter page and try the same search. You will get a much smaller set of results, and all of them will be for campylobacters.
Hopefully this will be the last Lucene fix and we'll be using the new search in version 16.
I have changed the delimiter for subsystem classifications from space to colon. This fixes the problem with the classification names being truncated to one word. In addition, I updated the genome statistics page to show the name as well as the number of the genome.
To test this, go to the subsystem summaries page, select Listeria monocytogenes EGD-e, and click Show Subsystems.
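The effect of the delimiter change can be sketched in a few lines of Python. This is only an illustration, not the actual NMPDR code: the function name split_classification and the stored string format are assumptions, and the class names are borrowed from the Vibrio cholerae example elsewhere in this archive.

```python
def split_classification(raw, delimiter=":"):
    """Split a subsystem classification string into its levels.

    Hypothetical helper: the real storage format is assumed, not confirmed.
    """
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


# Assumed stored form: class and subclass joined by the delimiter.
raw = "Amino Acids and Derivatives:Alanine, serine, and glycine"

# Colon delimiter: multi-word names survive intact.
levels = split_classification(raw, ":")

# Space delimiter: the first level collapses to a single word.
truncated = split_classification(raw, " ")[0]
```

With a space delimiter the first level comes back as just "Amino", which matches the one-word truncation described above.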
PSI-Blast is sort of working on the NMPDR Development Site.
PSI-Blast was reported non-functional in v14. The cause was that the NCBI tools assume the first form on the web page is the NCBI tool's form, when in fact it was the third form once the results were loaded into an NMPDR template. All of the protein page tools use a single template, so the fix was to change that template to omit the search and bug reporting forms normally present on every NMPDR page.
Unfortunately, the Sprout database is still loading, so this fix can't be tested directly. Instead, you must use the URL you would get if you clicked on the PSI-Blast link at the bottom of a protein page once the protein page is working. For the infamous fig|83333.1.peg.4, click here to get the desired page. Clicking on the FORMAT button from this location used to take you to a blank search results page; now it takes you to the PSI-Blast results.
My intent is to test this as soon as the load completes and then slipstream the fix into the live site. If the load is still going on tomorrow, then I will slipstream without first testing on the development site, which is scary, but required if we're to be ready for a Wednesday demo.
Two web page changes have been slipstreamed into the live site.
These changes have also been applied to the development site.
The inclusion criterion for subsystems has been changed. Starting with the next load, only NMPDR subsystems will be included in the Sprout database. Previously, any usable subsystem was included. The total for this load would have been 549 under the old scheme; now it will be 393.
This is actually a reversal of a change made on July 13 of this year. The What Goes Into Sprout article has been updated accordingly.
The subsystem classification problem has been fixed in version 16. To test this, go to the Subsystem Summaries Page, select Vibrio cholerae O1 biovar eltor str. N16961 and click Show Subsystems. Alanine, serine, and glycine will now be a subclass of Amino Acids and Derivatives rather than part of the classification name.
The default keyword search mode has changed from OR to AND. I accomplished this by programming the search to put a plus sign in front of each bare word. The result is that Vibrio 2.7.6.3 will now only return features with EC role 2.7.6.3 in Vibrio organisms. To get features in Vibrio OR with role 2.7.6.3, you put parentheses around the words: (Vibrio) (2.7.6.3).
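Here is a minimal Python sketch of the plus-sign rewrite described above. It is not the production search code: the function name require_bare_words and the tokenizing rule are assumptions. It treats anything that is not quoted, parenthesized, or already prefixed with + or - as a bare word, and it does not handle nested parentheses.

```python
import re

# A token is a parenthesized group, a quoted phrase, or a run of non-space characters.
TOKEN = re.compile(r'\([^)]*\)|"[^"]*"|\S+')


def require_bare_words(query):
    """Prefix each bare word with '+' so that every bare word is required (AND mode).

    Parenthesized groups, quoted phrases, and words that already carry
    a '+' or '-' operator are passed through unchanged.
    """
    rewritten = []
    for token in TOKEN.findall(query):
        if token[0] in '("+-':
            rewritten.append(token)
        else:
            rewritten.append("+" + token)
    return " ".join(rewritten)


and_query = require_bare_words("Vibrio 2.7.6.3")      # "+Vibrio +2.7.6.3"
or_query = require_bare_words("(Vibrio) (2.7.6.3)")   # unchanged, so still OR
```

Because parenthesized words are left alone, wrapping each word in parentheses restores the old OR behavior.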
The search form now appears after the results on the search page. This turned out to be a very easy fix: I just changed the template (SproutSearch_tmpl.php).
The Search button in the Genome Menu has been removed. When doing a string-select search in the genome menu, the selection will take place as soon as the focus moves away from the text box. The instructions have been updated to say that you type in a string and press TAB instead of typing in a string and clicking the button. The search has also been made case-insensitive.
The Advanced button has been sprinkled around the site again. It takes the user to the FidSearch module. An advanced-search button is also now available on the word search page, since that's where a standard search request will land the user.
Campylobacter "membrane protein". Previously, this would return everything in Campylobacter, because the membrane protein phrase was considered optional. With the fix in place, only membrane proteins for Campylobacter will be returned. To get the old behavior, use parentheses: Campylobacter ("membrane protein").
templates/targets_tmpl.php.
Later tonight I will run through the attribute stuff and reload the property table.
The text index on the Feature table was missing, which was making all the searches very slow. I have rebuilt the indexes on both the Development and Staging servers, and it helps.
fig|100226.1.% would compute BBHs for all features of Streptomyces coelicolor. To control the bandwidth in cases like this, you can also specify a list of target genomes. Only BBHs that land in the target genomes will be returned by the server. This dramatically improves the performance of the Signature Genes Tool.
These fixes, along with a whole bunch of web page improvements by Leslie, have been moved to the staging server.
The load files have been created for all the tables except the Feature table. The Feature table load stalled because of the latency required to communicate with the attribute server. I changed it to retrieve attributes once per genome instead of once per feature and it is now moving considerably faster.
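To show why fetching once per genome helps, here is a small Python sketch that counts round trips to a stand-in attribute server. Everything in it is hypothetical: the FakeAttributeServer class, its method names, and the "essential" attribute are placeholders, not the real attribute server interface.

```python
class FakeAttributeServer:
    """Stand-in for the remote attribute server; it only counts round trips."""

    def __init__(self, attributes):
        self.attributes = attributes  # {feature_id: [(key, value), ...]}
        self.round_trips = 0

    def for_feature(self, feature_id):
        # One network round trip per feature.
        self.round_trips += 1
        return list(self.attributes.get(feature_id, []))

    def for_genome(self, genome_id):
        # One network round trip for every feature in the genome.
        self.round_trips += 1
        prefix = "fig|" + genome_id + "."
        return {fid: list(vals) for fid, vals in self.attributes.items()
                if fid.startswith(prefix)}


def load_per_feature(server, feature_ids):
    """Old approach: ask the server separately for each feature."""
    return {fid: server.for_feature(fid) for fid in feature_ids}


def load_per_genome(server, genome_id, feature_ids):
    """New approach: fetch the whole genome once, then look up locally."""
    by_feature = server.for_genome(genome_id)
    return {fid: by_feature.get(fid, []) for fid in feature_ids}


features = ["fig|83333.1.peg.4", "fig|83333.1.peg.5", "fig|83333.1.peg.6"]
data = {"fig|83333.1.peg.4": [("essential", "yes")]}  # made-up attribute
slow, fast = FakeAttributeServer(data), FakeAttributeServer(data)
load_per_feature(slow, features)
load_per_genome(fast, "83333.1", features)
print(slow.round_trips, fast.round_trips)
```

Both functions build the same mapping, but the per-genome version pays the latency cost once instead of once for every feature.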
I have commented out the overflow-x and overflow-y rules in the main NMPDR CSS file. The directional overflow rules are still in the not-quite-supported category of CSS. In the Mac version of Firefox 2, when the content height became too large, the browser went berserk and created a ghastly blank page with things popping in and out of existence. This is, in defiance of all sense and logic, only a problem on the Mac. On Windows the page displayed just fine.
The page that exhibited the problem was the database design page.
Thanks to Ross and Kaitlyn for donating their laptops so I could test this problem.
In other news, there will be a delay in the schedule for implementing attributes and rolling out NMPDR version 18 because I still can't use VPN. Finding a way to work around that will be my main focus tomorrow.
The festival will continue later, after I fix an emergency problem with attributes on the NMPDR.
The ShowCounts script has been updated to output the genome list control to all_genomes.inc instead of subsystems.inc. The organism summary page was not being updated properly due to the inconsistency of the file names.
The organism group pages are now fixed. The problem was that the structure of the Genome records changed, so attempts to compare the new data to the old data got ridiculous results. I believe I know how to overcome this problem, but it will take some work. For now, the important thing is that all of the pages are correct.
The subsystem tree has been changed so that show-clusters is on by default.
More to come later...
I fixed the version-comparison utilities so that the old database uses the old version of the DBD. This fixed the problem with the bogus NEW! markers in the group pages. I am running the main difference report even as we speak, so we can see what's changed on a genome-by-genome basis between version 18 and version 19.
The tentative cutover date is April 6, after Leslie and Claudia's visit.
I've added an animated banner to the intermediate search status page. My experience when I was in industry is that if there's a moving picture involved, it's easier to believe the search is working. I have also disabled the Annotation button so it won't show up in the final result.
The difference report is now available. It has also been put in the Quick Links list on Development Blog. The main anomaly is that none of the 29 new genomes have BBHs.
I am currently working on a glitch in the subsystem tree and a method for downloading search results. (This latter I hope will help us to understand the peculiarities of the signature genes tool.)
The annotation history links have been missing from the NMPDR Protein Page for some time. This is now fixed.
We now have the ability to download search results to a tab-delimited file. On each search results page (for example, this one), there is now a link titled Click here to download the full search results. Clicking the link will open a download-file box and then download the search results to a specified location on your workstation. The downloaded text is cleaned of links and does not include the button columns. There is one anomaly. In Firefox, the OK button in the download dialog box appears disabled, but will work properly when clicked. (Even Tobi does not know how to fix this problem, so I don't expect an answer soon.) In the meantime, this is a useful feature I have wanted for some time and is worth keeping.
I am still working on the problem Ross found with the NMPDR PINS display not matching the SEED display. I know which subroutine in the Chromosomal Clusters CGI is failing, but I don't yet know where or why.
application/octet-stream instead of text/plain. This means it is more well-behaved in the browser, but it also means the line-endings have to be set to the proper value by the search CGI script rather than by the browser. The web page uses Javascript to detect the client operating system and then passes that information to the CGI script, which puts the correct line-endings in the file.
Remaining tasks include finding the problem in the Sprout version of the Pins (I now know it has something to do with setting the colors) and adding the serotypes to the NMPDR organism group pages.
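The server-side half of the line-ending handling described above might look roughly like the following Python sketch. The real search CGI is not written this way: the function names, the form of the operating-system string passed from the browser, and the choice of a plain newline for every non-Windows client are all assumptions made for illustration.

```python
def line_ending_for(client_os):
    """Pick a line ending from the OS name reported by the web page.

    Assumption: Windows clients get CRLF and everything else gets LF.
    """
    return "\r\n" if client_os.strip().lower().startswith("win") else "\n"


def tab_delimited(rows, client_os):
    """Render rows as tab-delimited text with the client's line endings."""
    eol = line_ending_for(client_os)
    return "".join("\t".join(str(cell) for cell in row) + eol for row in rows)


# Example rows using IDs mentioned elsewhere in this archive.
rows = [["fig|83333.1.peg.4", "2.7.6.3"], ["fig|100226.1.rna.10", ""]]
windows_text = tab_delimited(rows, "Windows")
other_text = tab_delimited(rows, "Linux")
```

Since the file is served as application/octet-stream, the browser saves the bytes untouched, so whatever line ending the script writes is what ends up on the workstation.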
DBObject to ERDBObject so that it no longer conflicts with PPO, the new Sprout methods for SEED Viewer support, the fix to Pins and Compare Regions so that they work properly in Sprout, and a bunch of other small stuff.
The Development NMPDR is now code complete for version 19 (which is due to move to the staging server on Wednesday night). This means all we have left are bug fixes and cover page improvements.
The PEGs Only check box has been removed from the signature genes tool. In almost all cases it makes no difference. When that situation changes, we can add it back.
uni, gi) for searches. The favored alias type is shown at the beginning of the alias list and is also used as the gene ID in the FASTA downloads.
These changes have been copied to the Staging Server.
All of this has been moved to the staging server, and the links herein are staging server links.
There are still some things to be fixed, but I'm working on it.
I believe this covers the NMPDR-specific issues Leslie had with version 21. The remaining issues deal with the protein page and will be addressed separately.
This page contains an archive of all entries posted to NMPDR Development Blog in the Fixes category. They are listed from oldest to newest.