FIGRules

Table of Contents

FIG Rules Module
    Introduction
    Public Methods
        BatchBBHs
        BBHData
        ComputeEol
        DecodeScore
        EncodeScore
        FIGCompare
        GetBBHServerURL
        GetHopeReactions
        GetNetworkSims
        GetTempFileName
        NetCouplingData
        NewSessionID
        nmpdr_mode
        NormalizeAlias
        ParseFeatureID
        SortedFids
        Upstream
        wikipedia_link

FIG Rules Module

Introduction

This module contains methods that are shared by both FIG.pm and Sprout.pm.

Public Methods

NormalizeAlias

my ($newAlias, $flag) = FIGRules::NormalizeAlias($alias);

Convert a feature alias to a normalized form. The incoming alias is examined to determine whether it is a FIG feature name, a UNIPROT feature name, or a GenBank feature name. A prefix is then applied to convert the alias to the form in which it occurs in the Sprout database. The supported feature name styles are as follows.

fig|dd..d.dd..d.peg.dd..d where "dd..d" is a sequence of one or more digits, is a FIG feature name.

dd..d where "dd..d" is a sequence of one or more digits, is a GenBank feature name.

XXXXXX where "XXXXXX" is a sequence of exactly 6 letters and/or digits, is a UNIPROT feature name.

alias
Alias to be converted to its normal form.

RETURN
Returns a two-element list. The first element (newAlias) is the normalized alias; the second (flag) is 1 if the alias is a FIG feature name, 0 if it is not. Thus, if the flag value is 1, the alias will be expected in the Feature(id) field of the Sprout data, and if it is 0, the alias will be expected in the Feature(alias) field.
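The three alias styles above can be told apart with simple patterns. The following is a minimal sketch, not the module's source: the name sketch_normalize_alias is hypothetical, the "gi|" and "uni|" prefixes are assumptions (the text does not name the prefixes), and the sample aliases are illustrative only. Note that a six-digit alias fits both the GenBank and UNIPROT descriptions; this sketch arbitrarily tests the GenBank style first.

```perl
use strict;
use warnings;

# Hypothetical sketch of the classification described above.
# The "gi|" and "uni|" prefixes are assumptions, not taken from the text.
sub sketch_normalize_alias {
    my ($alias) = @_;
    # FIG feature name: returned unchanged, flag 1.
    if ($alias =~ /^fig\|\d+\.\d+\.peg\.\d+$/) {
        return ($alias, 1);
    }
    # GenBank style: one or more digits.
    if ($alias =~ /^\d+$/) {
        return ("gi|$alias", 0);
    }
    # UNIPROT style: exactly 6 letters and/or digits.
    if ($alias =~ /^[A-Za-z0-9]{6}$/) {
        return ("uni|$alias", 0);
    }
    # Unrecognized style: passed through with flag 0 (assumption).
    return ($alias, 0);
}

for my $alias ('fig|100226.1.peg.3361', '15675062', 'P0A7Z4') {
    my ($newAlias, $flag) = sketch_normalize_alias($alias);
    print "$alias => $newAlias (flag $flag)\n";
}
```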

Upstream

my $dna = FIGRules::Upstream($fig, $genome, $location, $upstream, $coding);

Return the DNA immediately upstream of a location. This method contains code lifted from the upstream.pl script.

fig
FIG-like object that can be used to access DNA and feature data. For example, an SFXlate object or a true FIG object.

genome
ID of the genome containing the location's contig.

location
Location string describing the location whose upstream data is desired, in the standard form contig_begdirend used throughout FIG and Sprout.

upstream
Number of base pairs considered upstream.

coding
Number of base pairs inside the feature to be included in the upstream region.

RETURN
Returns the DNA sequence upstream of the location's begin point and extending into the coding region. Letters inside a feature are in upper case and inter-genic letters are in lower case. A hyphen separates the true upstream letters from the coding region.

FIGCompare

my $cmp = FIGCompare($aPeg, $bPeg);

Compare two FIG IDs. This method is designed for use in sorting a list of FIG-style feature IDs. For example, to sort the list @pegs, you would use:

my @sortedPegs = sort { &FIGCompare($a,$b) } @pegs;

aPeg
First feature ID to compare.

bPeg
Second feature ID to compare.

RETURN
Returns a negative number if aPeg should sort before bPeg, a positive number if aPeg should sort after bPeg, and zero if both should sort to the same place.
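To illustrate why a dedicated comparator is useful, here is a minimal sketch of one plausible ordering: genome ID first, then feature type, then feature number compared numerically. The names sketch_parts and sketch_fig_compare are hypothetical, and the ordering rule is an assumption; the text above specifies only the sign convention of the return value.

```perl
use strict;
use warnings;

# Hypothetical helper: split "fig|taxon.version.type.number" into its parts.
# Returns an empty list when the ID does not have that shape.
sub sketch_parts {
    my ($fid) = @_;
    return $fid =~ /^fig\|(\d+)\.(\d+)\.([^.]+)\.(\d+)$/ ? ($1, $2, $3, $4) : ();
}

# Assumed ordering: genome (numeric), then type (string), then number (numeric).
sub sketch_fig_compare {
    my ($aPeg, $bPeg) = @_;
    my @x = sketch_parts($aPeg);
    my @y = sketch_parts($bPeg);
    # Fall back to a plain string comparison for non-FIG IDs.
    return $aPeg cmp $bPeg unless @x && @y;
    return $x[0] <=> $y[0] || $x[1] <=> $y[1] || $x[2] cmp $y[2] || $x[3] <=> $y[3];
}

my @pegs = ('fig|100226.1.peg.10', 'fig|100226.1.peg.9', 'fig|83333.1.peg.2');
my @sortedPegs = sort { sketch_fig_compare($a, $b) } @pegs;
print join("\n", @sortedPegs), "\n";
```

A plain string sort would place peg.10 before peg.9; comparing the trailing number numerically avoids that.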

NetCouplingData

my @data = FIGRules::NetCouplingData($function, %parms);

Request data from the PCH server. The PCH server takes as input a function and a set of parameters, and returns one or more lines of tab-separated n-tuples. The n-tuples are then parsed out and returned by this method in the form of a list.

function
Name of the coupling function to invoke. These are coupled_to to get a list of coupled PEGs for a given PEG, coupling_evidence for a list of physically close homologs for a given pair of coupled PEGs, and coupling_and_evidence for a list of coupled PEGs, each with a list of evidence.

parms
Hash of the parameters to pass, keyed by parameter name.

RETURN
Returns a list of n-tuples transmitted by the server.
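The parsing step, turning lines of tab-separated fields into a list of tuples, might look like the sketch below. The function name sketch_parse_tuples is hypothetical, and the response text is made up for illustration; it is not real PCH server output, and the meaning of each field depends on the coupling function requested.

```perl
use strict;
use warnings;

# Hypothetical parser: one array reference per non-empty line,
# with one element per tab-separated field.
sub sketch_parse_tuples {
    my ($text) = @_;
    my @tuples;
    for my $line (split /\r?\n/, $text) {
        next if $line eq '';
        # The -1 limit keeps trailing empty fields.
        push @tuples, [ split /\t/, $line, -1 ];
    }
    return @tuples;
}

# Made-up response body, for illustration only.
my $response = "fig|100226.1.peg.2\t35\nfig|100226.1.peg.7\t12\n";
for my $tuple (sketch_parse_tuples($response)) {
    print join(' | ', @$tuple), "\n";
}
```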

ParseFeatureID

my ($genomeID,$type,$pegNum) = FIGRules::ParseFeatureID($fid);

Parse out the components of a FIG feature ID.

fid
FIG ID of a feature.

RETURN
Returns a three-element list consisting of the feature's parent genome ID, its type, and its ID number.
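A FIG feature ID such as fig|100226.1.peg.3361 splits naturally with one regular expression. The sketch below is a hypothetical stand-in named sketch_parse_feature_id; returning an empty list for a malformed ID is an assumption, since the text does not say how invalid input is handled.

```perl
use strict;
use warnings;

# Hypothetical sketch: genome ID is "taxon.version", followed by the
# feature type and the feature number.
sub sketch_parse_feature_id {
    my ($fid) = @_;
    if ($fid =~ /^fig\|(\d+\.\d+)\.(\w+)\.(\d+)$/) {
        return ($1, $2, $3);
    }
    # Assumption: an unparseable ID yields an empty list.
    return ();
}

my ($genomeID, $type, $pegNum) = sketch_parse_feature_id('fig|100226.1.peg.3361');
print "genome=$genomeID type=$type number=$pegNum\n";
```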

BBHData

my @bbhList = FIGRules::BBHData($peg, $cutoff);

Return a list of the bidirectional best hits relevant to the specified PEG.

peg
ID of the feature whose bidirectional best hits are desired.

cutoff
Similarity cutoff. If omitted, 1e-10 is used.

RETURN
Returns a list of 3-tuples. The first element of each tuple is the best-hit PEG; the second element is the score. A lower score indicates a better match. The third element is the normalized bit score for the pair, and is normalized to the length of the protein.

BatchBBHs

my @bbhList = FIGRules::BatchBBHs($pattern, $cutoff, @targets);

Return a list of bidirectional best hits. The BBHs will be for features whose ID matches the specified SQL pattern, are below a specified cutoff score, and are in at least one of the specified target genomes. If no target genomes are specified, all BBHs for matching features will be returned.

pattern
SQL pattern to match against feature IDs. Generally, this will be either a real feature ID or something like fig|100226.1.% to get all features for a specified genome.

cutoff
Maximum permissible score for a BBH to be returned.

targets
A list of zero or more genome IDs. Only BBHs that land in the target genomes will be returned.

RETURN
Returns a list of 4-tuples. Each tuple will contain an originating feature ID, a target feature ID, a P-score, and an N-score.
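The three selection rules (SQL pattern, cutoff, target genomes) can be illustrated on an in-memory list of 4-tuples. This is only a sketch of the filtering logic, not of how the method obtains its data: sketch_like_to_regex and sketch_filter_bbhs are hypothetical names, the tuples are made up, and treating the cutoff as inclusive is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: translate an SQL LIKE pattern into an anchored regex.
# "%" matches any run of characters and "_" matches a single character.
sub sketch_like_to_regex {
    my ($pattern) = @_;
    my $re = join '', map { $_ eq '%' ? '.*' : $_ eq '_' ? '.' : quotemeta($_) }
        split //, $pattern;
    return qr/^$re$/s;
}

# Hypothetical filter over tuples of [fromFid, toFid, pScore, nScore].
sub sketch_filter_bbhs {
    my ($tuples, $pattern, $cutoff, @targets) = @_;
    my $re = sketch_like_to_regex($pattern);
    my %wanted = map { $_ => 1 } @targets;
    return grep {
        my ($from, $to, $pScore) = @$_;
        my ($genome) = $to =~ /^fig\|(\d+\.\d+)\./;
        $from =~ $re && $pScore <= $cutoff
            && (!@targets || (defined $genome && $wanted{$genome}));
    } @$tuples;
}

# Made-up tuples, for illustration only.
my @all = (
    [ 'fig|100226.1.peg.1', 'fig|83333.1.peg.5',  1e-50, 0.9 ],
    [ 'fig|100226.1.peg.2', 'fig|224308.1.peg.7', 1e-5,  0.4 ],
    [ 'fig|83333.1.peg.9',  'fig|100226.1.peg.1', 1e-60, 0.9 ],
);
my @hits = sketch_filter_bbhs(\@all, 'fig|100226.1.%', 1e-10, '83333.1');
print "$_->[0] -> $_->[1]\n" for @hits;
```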

GetBBHServerURL

my $url = FIGRules::GetBBHServerURL();

Return the URL of the BBH server.

GetNetworkSims

my $sims = FIGRules::GetNetworkSims($fig, $id, \%seen, $maxN, $maxP, $select, $max_expand, $filters);

Retrieve similarities from the network similarity server. The similarity retrieval is performed using an HTTP user agent that returns similarity data in multiple chunks. An anonymous subroutine is passed to the user agent that parses and reformats the chunks as they come in. The similarities themselves are returned as Sim objects. Sim objects are actually list references with 15 elements. The Sim object methods allow access to the elements by name.

Similarities can be either raw or expanded. The raw similarities are basic hits between features with similar DNA. Expanding a raw similarity drags in any features considered substantially identical. So, for example, if features A1, A2, and A3 are all substantially identical to A, then a raw similarity [C,A] would be expanded to [C,A] [C,A1] [C,A2] [C,A3].

Specify the trace type nsims to trace this method.

fig
An object that supports the is_deleted_fid method to determine whether or not a feature exists in the data store, or a raw Sprout object.

id
ID of the feature whose similarities are desired, or reference to a list of the IDs of the features whose similarities are desired.

seen
Reference to a hash keyed by feature ID that returns a value of 1 for features to be discarded when constructing the return list. This parameter is provided so that the caller can avoid doing any hard work on similarities that are already known.

maxN
Maximum number of similarities to return.

maxP
The maximum allowable similarity score.

select
Selection criterion: raw means only raw similarities are returned; figall means all expanded similarities are returned; and figx means similarities are expanded until the number of FIG features equals the maximum.

max_expand
The maximum number of features to expand.

filters
Reference to a hash containing filter information, or a subroutine that can be used to filter the sims.

RETURN
Returns a reference to a list of similarity objects, or undef if an error occurred.
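The difference between raw and expanded similarities can be shown with plain two-element pairs in place of the 15-element Sim objects. The sketch below reproduces the [C,A] example from the description; sketch_expand_sims and the lookup hash of substantially identical features are hypothetical, and real expansion happens on the similarity server rather than in client code like this.

```perl
use strict;
use warnings;

# Hypothetical expansion step. $identical maps a feature to the list of
# features considered substantially identical to it.
sub sketch_expand_sims {
    my ($rawSims, $identical) = @_;
    my @expanded;
    for my $sim (@$rawSims) {
        my ($query, $hit) = @$sim;
        # Keep the raw similarity, then add one pair per identical feature.
        push @expanded, [ $query, $hit ];
        push @expanded, map { [ $query, $_ ] } @{ $identical->{$hit} || [] };
    }
    return \@expanded;
}

my $expanded = sketch_expand_sims([ [ 'C', 'A' ] ], { A => [ 'A1', 'A2', 'A3' ] });
print join(' ', map { '[' . join(',', @$_) . ']' } @$expanded), "\n";
# prints: [C,A] [C,A1] [C,A2] [C,A3]
```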

wikipedia_link

my $url = FIGRules::wikipedia_link($organism_name);

Return the URL of a Wikipedia page for the specified organism or undef if no Wikipedia page exists.

organism_name
Word or phrase to look for in Wikipedia.

RETURN
Returns the Wikipedia URL for the specified organism, or undef if no Wikipedia page for the organism exists.

SortedFids

my @fids = FIGRules::SortedFids(@fidList);

Convert a list of feature IDs to a sorted list with duplicates removed.

fidList
A list of feature IDs.

RETURN
Returns the original list, sorted in feature order with no duplicates.
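Removing duplicates and then sorting can be sketched in a few lines. The names sketch_fid_key and sketch_sorted_fids are hypothetical, and "feature order" is assumed here to mean genome, then type, then feature number compared numerically; the text does not define it.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a FIG ID into [taxon, version, type, number],
# or return undef when the ID does not have that shape.
sub sketch_fid_key {
    my ($fid) = @_;
    return $fid =~ /^fig\|(\d+)\.(\d+)\.([^.]+)\.(\d+)$/ ? [ $1, $2, $3, $4 ] : undef;
}

sub sketch_sorted_fids {
    my (@fidList) = @_;
    # Drop duplicates, keeping the first occurrence of each ID.
    my %seen;
    my @unique = grep { !$seen{$_}++ } @fidList;
    # Assumed feature order; non-FIG IDs fall back to string comparison.
    my @sorted = sort {
        my ($x, $y) = (sketch_fid_key($a), sketch_fid_key($b));
        ($x && $y)
            ? ($x->[0] <=> $y->[0] || $x->[1] <=> $y->[1]
               || $x->[2] cmp $y->[2] || $x->[3] <=> $y->[3])
            : $a cmp $b;
    } @unique;
    return @sorted;
}

my @fids = sketch_sorted_fids('fig|100226.1.peg.10', 'fig|100226.1.peg.9', 'fig|100226.1.peg.10');
print join("\n", @fids), "\n";
```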

EncodeScore

my $scoreString = FIGRules::EncodeScore($score);

Convert a BLAST score to a sortable string. The sortable string will float lower scores to the beginning, and is a fixed-length string so that there are no comparison anomalies.

score
Floating-point score to convert to a string. It must be a value greater than or equal to zero and less than 1.

RETURN
Sortable string created from the incoming score.
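A minimal sketch of such an encoding follows, assuming the XXX.YYY layout described under DecodeScore (XXX is 1000 plus the negative exponent, YYY is the mantissa times 100). The name sketch_encode_score is hypothetical, and encoding zero as 000.000 is an assumption made so that zero sorts first.

```perl
use strict;
use warnings;

# Hypothetical encoder for scores in the range 0 <= score < 1.
sub sketch_encode_score {
    my ($score) = @_;
    die "Score must be >= 0 and < 1: $score" unless $score >= 0 && $score < 1;
    # Assumption: zero gets the lowest possible string.
    return '000.000' if $score == 0;
    # Let sprintf do the rounding, then pick apart mantissa and exponent.
    my ($mantissa, $exponent) = sprintf('%.2e', $score) =~ /^(\d\.\d\d)e([-+]\d+)$/
        or die "Cannot encode score: $score";
    $mantissa =~ s/\.//;    # "8.10" becomes "810"
    return sprintf('%03d.%s', 1000 + $exponent, $mantissa);
}

print sketch_encode_score(8.1e-13), "\n";    # prints: 987.810
my @ordered = sort map { sketch_encode_score($_) } (1e-5, 0, 8.1e-13);
print join(' ', @ordered), "\n";             # prints: 000.000 987.810 995.100
```

Because every string has the same length, an ordinary string sort reproduces the numeric order of the scores.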

DecodeScore

my $score = FIGRules::DecodeScore($scoreString);

Convert a sortable score string to a real score. The sortable score string is of the form XXX.YYY where XXX is the exponent subtracted from 1000 and YYY is the mantissa multiplied by 100. So, for example, an incoming value of 987.810 turns out to be 8.1e-13.

scoreString
Sortable string encoding a BLAST score.

RETURN
The BLAST score corresponding to the sortable string, or undef if the string is invalid.
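The arithmetic in the example above (987 gives an exponent of 987 - 1000 = -13, and 810 gives a mantissa of 810 / 100 = 8.1) can be sketched as follows. The name sketch_decode_score is hypothetical, and the exact validity check is an assumption; the text says only that an invalid string yields undef.

```perl
use strict;
use warnings;

# Hypothetical decoder for strings of the form XXX.YYY.
sub sketch_decode_score {
    my ($scoreString) = @_;
    # Assumption: anything other than three digits, a dot, three digits is invalid.
    return undef
        unless defined $scoreString && $scoreString =~ /^(\d{3})\.(\d{3})$/;
    my ($biased, $scaled) = ($1, $2);
    my $exponent = $biased - 1000;    # 987 becomes -13
    my $mantissa = $scaled / 100;     # 810 becomes 8.1
    return $mantissa * 10 ** $exponent;
}

printf "%.1e\n", sketch_decode_score('987.810');
print defined(sketch_decode_score('bogus')) ? "valid\n" : "invalid\n";
```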

NewSessionID

my $id = FIGRules::NewSessionID();

Generate a new session ID for the current user.

GetTempFileName

my $fileName = FIGRules::GetTempFileName(%options);

Return a temporary file name. The file name will consist of a long, hashed-up hex string followed by a file name extension, and it will be in the FIG temporary directory. This method accepts a single hash as a parameter, with the following possible options.

sessionID
A string that may be used to generate a unique file name. If none is specified, a string will be generated using the NewSessionID method.

extension
The name to be used for the file extension. If none is specified, the extension will be tmp.
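One way to build such a name is to hash the session ID and append the extension. The sketch below is an assumption-laden illustration, not the module's code: sketch_temp_file_name is a hypothetical name, MD5 is an arbitrary choice of hash, the system temporary directory stands in for the FIG temporary directory, and the fallback session string replaces a call to NewSessionID.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;

# Hypothetical sketch of a hashed temporary file name.
sub sketch_temp_file_name {
    my (%options) = @_;
    # Assumption: time, process ID, and a random number stand in for NewSessionID.
    my $sessionID = $options{sessionID} // join('.', time(), $$, rand());
    my $extension = $options{extension} // 'tmp';
    # Assumption: the system temp directory replaces the FIG temporary directory.
    my $dir = File::Spec->tmpdir();
    return File::Spec->catfile($dir, md5_hex($sessionID) . ".$extension");
}

print sketch_temp_file_name(sessionID => 'example-session', extension => 'html'), "\n";
print sketch_temp_file_name(), "\n";
```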

ComputeEol

my $eol = FIGRules::ComputeEol($osType);

Compute the correct end-of-line character for the specified operating system.

osType
Operating system type, currently either MacIntosh, Windows, or Unix.

RETURN
Returns the end-of-line character string for the specified operating system. If the operating system name is not recognized, \n will be assumed.
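The mapping can be sketched as a small lookup table using the conventional line endings: carriage return for classic Macintosh, carriage return plus line feed for Windows, and line feed for Unix. The name sketch_compute_eol is hypothetical, and the case-insensitive matching of the operating system name is an assumption.

```perl
use strict;
use warnings;

# Hypothetical sketch using the conventional line endings for each system.
sub sketch_compute_eol {
    my ($osType) = @_;
    my %eol = (
        macintosh => "\r",
        windows   => "\r\n",
        unix      => "\n",
    );
    # Unrecognized or missing names fall back to "\n", as the text specifies.
    return $eol{ lc($osType // '') } // "\n";
}

for my $os ('MacIntosh', 'Windows', 'Unix', 'Plan9') {
    my $eol = sketch_compute_eol($os);
    printf "%-10s %s\n", $os, join(' ', map { sprintf '0x%02X', ord } split //, $eol);
}
```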

nmpdr_mode

my $flag = FIGRules::nmpdr_mode($cgi);

Return TRUE if this is the NMPDR environment and FALSE otherwise. An NMPDR environment is possible only if the NMPDR variables exist in FIG_Config. In that case, if a CGI object is specified or there is an HTTP_HOST provided in the environment variables, we look for a value for the query variable SPROUT and return that. If there is no CGI object or no HTTP_HOST provided in the environment variables, we look for a value for the SPROUT environment variable and return that. If none of these methods work, we return the value of FIG_Config::nmpdr_mode.

What this means is that each FIG installation has a default mode (NMPDR or SEED) based on its FIG_Config. For a command-line script, this mode can be overridden by an environment variable. For a CGI script, this mode can be overridden by a query variable. In both cases, the overriding variable is SPROUT.

Note that currently NMPDR mode determines our style of display and whether or not the data is coming from Sprout. This may not always be the case, in which case we'll have some serious updating to do.

cgi (optional)
CGI object used to access query parameters. If no object is specified and we are running or emulating a web script, one will be created and interrogated. Otherwise, it will be assumed we are running a command-line script and the CGI object will not be used.

RETURN
Returns TRUE if this is Sprout/NMPDR, else FALSE.
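The order of precedence described above (configuration gate, then the SPROUT override, then the configured default) can be sketched with plain values in place of FIG_Config, the CGI object, and the environment. Everything here is hypothetical: the function name sketch_nmpdr_mode and all of the hash keys are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sketch of the decision order. All keys are invented:
#   config_has_nmpdr  true if the NMPDR variables exist in the configuration
#   is_web            true when running or emulating a web script
#   query_sprout      value of the SPROUT query variable, if any
#   env_sprout        value of the SPROUT environment variable, if any
#   config_default    the configured default mode
sub sketch_nmpdr_mode {
    my (%in) = @_;
    # NMPDR mode is impossible without NMPDR settings in the configuration.
    return 0 unless $in{config_has_nmpdr};
    # Web scripts consult the query variable; command-line scripts the environment.
    my $override = $in{is_web} ? $in{query_sprout} : $in{env_sprout};
    return $override ? 1 : 0 if defined $override;
    # No override present: use the installation's default mode.
    return $in{config_default} ? 1 : 0;
}

print sketch_nmpdr_mode(config_has_nmpdr => 1, is_web => 1, query_sprout => 1), "\n";
print sketch_nmpdr_mode(config_has_nmpdr => 1, is_web => 0, config_default => 0), "\n";
```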

GetHopeReactions

my $reactionHash = FIGRules::GetHopeReactions($subsysObject, $directory);

This method returns a reference to a hash that maps each subsystem role to a list of EC numbers representing Hope reactions. These reactions are useful in analyzing scenarios.

subsysObject
Subsystem or SproutSubsys object for the subsystem in question.

directory
Directory for the subsystem in the FIG disk cluster.

RETURN
Returns a reference to a hash that maps role names to lists. The list for each role name contains the EC numbers for that role's Hope reactions.