I have completed proof-of-concept testing for keyword searching in Sprout. Each feature will have a keyword list in the database, which MySQL processes into a text-search index. Beyond simply listing keywords, you can put modifiers on individual words. For example, dnaK -hypothetical would return all dnaK features that are not hypothetical. A complete description of the operators is available here. The point is that we are already ahead of what Lucene can deliver, and the search is better controlled. For example, if you ask for all hypothetical features for NMPDR genomes that belong to a specific subsystem, the search tool will combine the keyword search with the other criteria, so we get the full benefit of the database indexing.
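As a rough illustration of how a keyword expression might be combined with other criteria in a single query, here is a minimal Python sketch. It assumes the modifiers map onto MySQL's boolean full-text mode (`MATCH ... AGAINST (... IN BOOLEAN MODE)`). The table name `Feature`, the columns `keywords`, `genome_id`, and `id`, the helper `build_feature_query`, and the genome IDs are all hypothetical and do not reflect the actual Sprout schema or code.

```python
def build_feature_query(keywords, genome_ids=None):
    """Build a parameterized SQL query for a keyword search.

    Hypothetical sketch: table and column names are assumptions,
    not the real Sprout schema. Placeholders use the %s style
    accepted by common MySQL drivers for Python.
    """
    # The keyword expression (e.g. "dnaK -hypothetical") is passed
    # through unchanged to MySQL's boolean full-text search.
    clauses = ["MATCH (f.keywords) AGAINST (%s IN BOOLEAN MODE)"]
    params = [keywords]

    # Optional extra criterion: restrict the results to a set of genomes.
    if genome_ids:
        placeholders = ", ".join(["%s"] * len(genome_ids))
        clauses.append(f"f.genome_id IN ({placeholders})")
        params.extend(genome_ids)

    sql = "SELECT f.id FROM Feature AS f WHERE " + " AND ".join(clauses)
    return sql, params


# Illustrative genome IDs only.
sql, params = build_feature_query("dnaK -hypothetical", ["83333.1", "224308.1"])
```

Because the keyword match and the genome filter sit in the same WHERE clause, the database can choose how to use its indexes for the whole query rather than having the results filtered afterwards in application code.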
The keyword search is currently inoperative while I load the keywords into the feature table. However, once it's ready, it will be automatically incorporated into all search tools that support feature filtering. There will also be a special keyword-only search tool designed to replace the ubiquitous NMPDR search box.