OmegaScript adds processed text-generation commands to text templates (which will usually be HTML, but can be XML or another textual format). Commands take the form $command{comma,separated,arguments} or $simplecommand, for example:
<html>
<head><title>Sample</title></head>
<body>
<p>
You searched for '$html{$query}'.
</p>
</body>
</html>
Where appropriate, arguments themselves can contain OmegaScript commands. Where an argument is treated as a string, the string is precisely the contents of that argument - there is no string delimiter (such as the double-quote character '"' in C and similar languages). This can make complex OmegaScript slightly difficult to read at times.
When a command takes no arguments, the braces must be omitted (i.e. $msize rather than $msize{} - the latter is a command with a single empty argument). If you want to have the value of $msize immediately followed by a letter, digit, or "_", you can use an empty comment (${}) to prevent the parser treating the following character as part of a command name. E.g. _$msize${}_ rather than _$msize_
It is important to realise that all whitespace is significant in OmegaScript - e.g. if you put whitespace around a "," which separates two command arguments then the whitespace will be part of the respective arguments.
Note that (by design) OmegaScript has no unbounded looping constructs. You can loop over entries in a list, but you can't loop until some arbitrary condition is met. This means that it's not possible to accidentally (or deliberately!) write an OmegaScript template which contains an infinite loop.
$$ - literal '$'
$( - literal '{'
$) - literal '}'
$. - literal ','
In the following descriptions, a LIST is a string of tab-separated values.
return UTF-8 for the given Unicode codepoint, e.g. $chr{127866} should display as a beer mug if the font has a suitable glyph.
Since ASCII is a subset of Unicode, you can also produce control characters, e.g. $chr{13} gives a carriage return character.
To convert a UTF-8 character to a Unicode codepoint, see $ord.
Added in Omega 1.3.4.
number of other documents collapsed into current hit inside $hitlist, which might be used like so:
$if{$ne{$collapsed,0},at least $collapsed hidden results ($value{$cgi{COLLAPSE}})}
return position of first occurrence of STRING1 in STRING2, if present. Else return an empty string. Examples:
$contains{fish,goldfish} gives "4"
$contains{fish,shark} gives ""
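The two examples above can be modelled with a short Python sketch. This is an illustration of the documented behaviour only, not Omega's implementation; the function name `contains` is hypothetical, and treating the position as a byte offset is an assumption (it makes no difference for ASCII input).

```python
def contains(needle: str, haystack: str) -> str:
    """Model of $contains{STRING1,STRING2}: position as text, or ""."""
    # Assumption: positions are byte offsets into the UTF-8 encoding.
    pos = haystack.encode("utf-8").find(needle.encode("utf-8"))
    return "" if pos < 0 else str(pos)


print(repr(contains("fish", "goldfish")))
print(repr(contains("fish", "shark")))
```

Note that a match at the very start gives "0", which is a non-empty string, so it can still be told apart from the "not found" result.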
encode STRING for use as a field in a CSV file. By default, escaping is done as described in RFC4180, except that we treat any byte value not otherwise mentioned as being 'TEXTDATA' (so %x00-%x09, %x0B-%x0C, %x0E-%x1F, %x7F-%xFF are also permitted there). Examples:
$csv{Safe in CSV!} gives Safe in CSV!
$csv{Not "safe"} gives "Not ""safe"""
$csv{3$. 2$. 1} gives "3, 2, 1"
Some CSV consumers don't follow the RFC, in which case you may need to encode additional values. For this reason, $csv provides a highly conservative alternative mode in which any double quote characters in the string are doubled, and the result is always wrapped in double quotes. To select this mode, pass a second non-empty argument. Examples:
$csv{Quote anyway,1} gives "Quote anyway"
$csv{Not "safe",1} gives "Not ""safe"""
Added in Omega 1.3.4.
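As a rough model of both modes, the following Python sketch reproduces the examples above. It is not Omega's code: the name `csv_field` and the `always_quote` parameter are hypothetical, and the set of characters that trigger quoting (double quote, comma, CR, LF) is taken from the RFC4180 description given here.

```python
def csv_field(value: str, always_quote: bool = False) -> str:
    """Model of $csv{STRING} and $csv{STRING,1} as described above."""
    # Default mode: quote only when the field contains a character that
    # is not plain text data (double quote, comma, CR or LF).
    needs_quoting = always_quote or any(c in value for c in '",\r\n')
    if not needs_quoting:
        return value
    # Double any embedded quotes, then wrap the whole field in quotes.
    return '"' + value.replace('"', '""') + '"'


for text, conservative in [("Safe in CSV!", False), ('Not "safe"', False),
                           ("3, 2, 1", False), ("Quote anyway", True)]:
    print(csv_field(text, conservative))
```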
returns a list of docids of any documents with document length zero (such documents probably only contain scanned images, rather than machine readable text, or suggest the input filter isn't working well). If TERM is specified, only documents matching TERM are considered, otherwise all documents are considered (so passing Tapplication/pdf as TERM reports all PDF files for which no text was found).
If you're using omindex, note that it skips files with zero size, so these won't get reported here as they aren't present in the database.
lookup field NAME in document DOCID. If DOCID is omitted then the current hit is used (which only works inside $hitlist).
If multiple instances of field exist the field values are returned tab separated, which means you can pass the results to $map, e.g.:
$map{$field{keywords},<b>$html{$_}</b><br>}
list of all terms in the database with prefix PREFIX, intended to be used to allow drop-down lists and sets of radio buttons to be dynamically generated, e.g.:
Hostname:
<SELECT NAME="B">
<OPTION VALUE=""
$if{$map{$cgilist{B},$eq{$substr{$_,0,1},H}},,SELECTED}> Any
$map{$filterterms{H},
<OPTION VALUE="$html{$_}" $if{$find{$cgilist{B},$html{$_}},SELECTED}>
$html{$substr{$_,1}}
</OPTION>
}
</SELECT>
specify an additional HTTP header to be generated by Omega. For example:
$httpheader{Cache-Control,max-age=0$.private}
If Content-Type is not specified by the template, it defaults to text/html. Headers must be specified before any other output from the OmegaScript template - any $httpheader{} commands found later in the template will be silently ignored.
encode STRING as a JSON string (not including the enclosing quotes), e.g. $json{The path is "C:\"} gives The path is \"C:\\\"
Added in Omega 1.3.1.
encodes LIST (a string of tab-separated values) as a JSON array, e.g. $jsonarray{$split{a "b" c:\}} gives ["a","\"b\"","c:\\"]
Added in Omega 1.3.1, but buggy until 1.3.4.
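For comparison, here is a minimal Python sketch that gives the same output as the two JSON examples above, using the standard `json` module. The function names are hypothetical. Omega's escaping of control or non-ASCII characters may differ from Python's, and mapping an empty string to an empty array is an assumption.

```python
import json


def json_string(value: str) -> str:
    """Model of $json{STRING}: JSON-escape without the enclosing quotes."""
    return json.dumps(value, ensure_ascii=False)[1:-1]


def json_array(omega_list: str) -> str:
    """Model of $jsonarray{LIST}, where LIST is tab-separated."""
    # Assumption: an empty string is an empty list, giving [].
    items = omega_list.split("\t") if omega_list else []
    return json.dumps(items, ensure_ascii=False, separators=(",", ":"))


print(json_string('The path is "C:\\"'))
print(json_array('a\t"b"\tc:\\'))
```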
pretty print list. If LIST contains 1, 2, 3, 4 then:
"$list{LIST,$. }" = "1, 2, 3, 4"
"$list{LIST,$. , and }" = "1, 2, 3 and 4"
"$list{LIST,List ,$. ,.}" = "List 1, 2, 3, 4."
"$list{LIST,List ,$. , and ,.}" = "List 1, 2, 3 and 4."
NB $list returns an empty string for an empty list (so the last two forms aren't redundant, as they may at first appear).
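The four argument forms can be modelled in Python as follows. This is a sketch inferred from the examples above, not Omega's implementation. The name `pretty_list` is hypothetical, a Python list stands in for the tab-separated LIST, and returning a single item bare (with only prefix and suffix) is an assumption.

```python
def pretty_list(items, *args):
    """Model of $list with 1 to 4 formatting arguments after LIST."""
    if not items:
        return ""  # an empty list gives an empty string, with no prefix/suffix
    prefix = suffix = ""
    if len(args) == 1:
        sep = last = args[0]
    elif len(args) == 2:
        sep, last = args
    elif len(args) == 3:
        prefix, sep, suffix = args
        last = sep
    elif len(args) == 4:
        prefix, sep, last, suffix = args
    else:
        raise ValueError("expected 1 to 4 formatting arguments")
    if len(items) == 1:
        body = items[0]
    else:
        # The final separator may differ from the others (e.g. " and ").
        body = sep.join(items[:-1]) + last + items[-1]
    return prefix + body + suffix


print(pretty_list(["1", "2", "3", "4"], "List ", ", ", " and ", "."))
```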
Return the tag corresponding to key KEY in the CDB file CDBFILE. If the file doesn't exist, or KEY isn't a key in it, then $lookup expands to nothing. CDB files are compact disk-based hash tables. For more information and public domain software which can create CDB files, please visit: http://www.corpit.ru/mjt/tinycdb.html
An example of how this might be used is to map top-level domains to country names. Create a CDB file tld_en which maps "fr" to "France", "de" to "Germany", etc and then you can translate a country code to the English country name like so:
"$or{$lookup{tld_en,$field{tld}},.$field{tld}}"
If a tld isn't in the CDB (e.g. "com"), this will expand to ".com".
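The fallback logic of this expression can be sketched in Python, with a dictionary standing in for the CDB file. The names `TLD_EN`, `lookup`, `first_non_empty` and `country_name` are hypothetical, and `first_non_empty` models $or only as far as this example needs (it returns the first non-empty argument).

```python
# Hypothetical stand-in for the tld_en CDB file described above.
TLD_EN = {"fr": "France", "de": "Germany"}


def lookup(table: dict, key: str) -> str:
    """Model of $lookup: the stored value, or "" for a missing key."""
    return table.get(key, "")


def first_non_empty(*values: str) -> str:
    """Model of $or as used here: the first non-empty argument."""
    for value in values:
        if value:
            return value
    return ""


def country_name(tld: str) -> str:
    # Mirrors $or{$lookup{tld_en,TLD},.TLD}
    return first_non_empty(lookup(TLD_EN, tld), "." + tld)


print(country_name("fr"))
print(country_name("com"))
```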
You can take this further and prepare a set of CDBs mapping tld codes to names in other languages - tld_fr for French, tld_de for German. Then if you have the ISO language code in $opt{lang} you can replace tld_en with tld_$or{$opt{lang},en} and automatically translate into the currently set language, or English if no language is set.
map a list into the evaluated argument. If LIST is 1, 2 then:
"$map{LIST,x$_ = $_; }" = "x1 = 1; x2 = 2; "
Note that $map{} returns a list (this is a change from older versions). If the tabs are a problem, use $list{$map{...},} to get rid of them.
perform a regex match using Perl-compatible regular expressions. Returns true if a match is found, else it returns an empty string.
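The optional OPTIONS argument can contain zero or more of the letters imsx, which have the same meanings as the corresponding Perl regexp modifiers:
- i - make the pattern matching case-insensitive
- m - make ^/$ match after/before embedded newlines
- s - allow . in the pattern to match a linefeed
- x - allow whitespace and #-comments in the pattern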
return codepoint for first character of UTF-8 string. If the argument is an empty string, then an empty string is returned.
For example, $ord{One more time} gives 79.
To convert a Unicode code point into a UTF-8 string, see $chr.
Added in Omega 1.3.4.
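The relationship between $chr and $ord can be illustrated with Python's built-in `chr` and `ord`. The wrapper names `omega_chr` and `omega_ord` are hypothetical, and this is a model of the documented results rather than Omega's code.

```python
def omega_chr(codepoint: int) -> bytes:
    """Model of $chr: the UTF-8 bytes for a Unicode codepoint."""
    return chr(codepoint).encode("utf-8")


def omega_ord(text: str) -> str:
    """Model of $ord: codepoint of the first character, or "" if empty."""
    return str(ord(text[0])) if text else ""


print(omega_chr(127866))           # the four UTF-8 bytes of U+1F37A
print(omega_ord("One more time"))  # codepoint of "O"
```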
percentage score of current hit (in range 1-100).
You probably don't want to show these percentage scores to end users in new applications - they're not really a percentage of anything meaningful, and research seems to suggest that users don't find numeric scores in search results useful.
list of query strings for prefix PREFIX. Any tab characters in the query strings are converted to spaces before adding them to the list (since an OmegaScript list is a string with tabs in).
If PREFIX is omitted or empty, this is built from CGI P variable(s) plus possible added terms from ADD and X.
If PREFIX is non-empty, this is built from CGI P.PREFIX variables.
Note: In Omega < 1.3.3, $query simply joins together the query strings with spaces rather than returning a list.
set option value which may be looked up using $opt. You can use options as variables (for example, to store values you want to reuse without recomputing). There are also several which Omega looks at and which you can set or use:
Omega 1.2.5 and later support the following options, which can be set to a non-empty value to enable the corresponding QueryParser flag. Omega sets flag_default to true by default - you can set it to an empty value to turn it off ($set{flag_default,}):
Omega 1.2.7 added support for search fields with a probabilistic prefix, and you can set different QueryParser flags for each prefix - for example, for the XFOO prefix use XFOO:flag_pure_not, etc. The unprefixed constants provide a default value for these. If a flag is set in the default, the prefix specific flag can unset it if it is set to the empty value (e.g. $set{flag_pure_not,1}$set{XFOO:flag_pure_not,}).
You can use :flag_partial, etc to set or unset a flag just for unprefixed fields.
Similarly, XFOO:stemmer specifies the stemmer to use for field XFOO, with stemmer providing a default.
set a map of option values which may be looked up against using $opt{MAP,NAME} (maps with the same name are merged rather than the old map being completely replaced).
You can create and use maps in your own templates, but Omega also has several standard maps used to control building the query:
Omega uses the "prefix" map to set the prefixes understood by the query parser. So if you wish to translate a prefix of "author:" to "A" and "title:" to "S" you would use:
$setmap{prefix,author,A,title,S}
In Omega 1.3.0 and later, you can map a prefix in the query string to more than one term prefix by specifying an OmegaScript list, for example to search unprefixed and S prefix by default use this (this also shows how you can map from an empty query string prefix, and also that you can map to an empty term prefix - these don't require Omega 1.3.0, but become much more useful in combination with this new feature):
$setmap{prefix,,$split{ S}}
Similarly, if you want to be able to restrict a search with a boolean filter from the text query (e.g. "group:" to "G") you would use:
$setmap{boolprefix,group,G}
Don't be tempted to add whitespace around the commas, unless you want it to be included in the names and values!
Another map (added in Omega 1.3.4) allows specifying any boolean prefixes which are non-exclusive, i.e. multiple filters of that type should be combined with OP_AND rather than OP_OR. For example, if you have a boolean filter on "material" using the XM prefix, and the items being searched are made of multiple materials, you likely want multiple material filters to restrict to items matching all the materials (the default is to restrict to any of the materials). To specify this use $setmap{nonexclusiveprefix,XM,true} (any non-empty value can be used in place of true) - this feature affects both filters from B CGI parameters (e.g. B=XMglass&B=XMwood) and those from parsing the query (e.g. material:glass material:wood if $setmap{boolprefix,material,XM} is also in effect).
Note: you must set the prefix-related maps before the query is parsed. Parsing is deferred for as long as possible. The following commands require the query to be parsed: $prettyterm, $query, $querydescription, $queryterms, $relevant, $relevants, $setrelevant, $unstem. The following commands require the match to be run, which in turn requires the query to be parsed: $freqs, $hitlist, $last, $lastpage, $msize, $msizeexact, $terms, $thispage, $time, $topdoc, $topterms.
returns the elements from LIST at the positions listed in the second list POSITIONS. The first item is at position 0. Any positions which are out of range will be ignored.
For example, if LIST contains a, b, c, d then:
"$slice{LIST,2}" = "c"
"$slice{LIST,1 3}" = "b d"
"$slice{LIST,$range{1,3}}" = "b c d"
"$slice{LIST,$range{-10,10}}" = "a b c d"
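These examples can be reproduced with a small Python model in which Python lists stand in for tab-separated OmegaScript lists. The names `omega_slice` and `omega_range` are hypothetical. The inclusive upper bound of $range is inferred from the third example, and returning elements in the order given by POSITIONS is an assumption.

```python
def omega_range(start: int, end: int) -> list:
    """Model of $range{START,END} as used above: inclusive at both ends."""
    return list(range(start, end + 1))


def omega_slice(items: list, positions: list) -> list:
    """Model of $slice: pick items by 0-based position, skipping bad ones."""
    return [items[p] for p in positions if 0 <= p < len(items)]


letters = ["a", "b", "c", "d"]
print(omega_slice(letters, [1, 3]))
print(omega_slice(letters, omega_range(-10, 10)))
```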
$split{STRING}
returns a list by splitting the string STRING into elements at each occurrence of the substring SPLIT. If SPLIT isn't specified, it defaults to a single space. If SPLIT is empty, STRING is split into individual bytes.
For example:
"$split{one two three}" = "one two three"
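A minimal Python model of the splitting rules described above, working on bytes because the empty-SPLIT case is defined in terms of bytes. The name `omega_split` is hypothetical, and keeping empty elements (for example when STRING starts with the separator) is an assumption based on the $split{ S} example used with $setmap.

```python
def omega_split(data: bytes, sep: bytes = b" ") -> list:
    """Model of $split{STRING} and $split{SPLIT,STRING}."""
    if sep == b"":
        # An empty separator splits the string into individual bytes.
        return [data[i:i + 1] for i in range(len(data))]
    # Assumption: empty elements are kept, as with bytes.split.
    return data.split(sep)


print(omega_split(b"one two three"))
print(omega_split(b"a,b,c", b","))
print(omega_split(b"abc", b""))
```

In OmegaScript itself the resulting elements would be joined with tab characters to form the list.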
returns the substring of STRING which starts at byte position START (the start of the string being 0) and is LENGTH bytes long (or to the end of STRING if STRING is less than START+LENGTH bytes long). If LENGTH is omitted, the substring from START to the end of STRING is returned.
If START is negative, it counts back from the end of STRING (so $substr{hello,-1} is o).
If LENGTH is negative, it instead specifies the number of bytes to omit from the end of STRING (so "$substr{example,2,-2}" is "amp"). Note that this means that "$substr{STRING,0,N}$substr{STRING,N}" is "STRING" whether N is positive, negative or zero.
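The START and LENGTH rules, including the identity noted above, can be checked with a short Python model. This is a sketch rather than Omega's implementation: the name `substr` is hypothetical, and clamping out-of-range values to the ends of the string is an assumption.

```python
def substr(data: bytes, start: int, length=None) -> bytes:
    """Model of $substr{STRING,START[,LENGTH]} on a byte string."""
    n = len(data)
    if start < 0:
        start = max(n + start, 0)   # negative START counts from the end
    start = min(start, n)
    if length is None:
        end = n                     # LENGTH omitted: run to the end
    elif length < 0:
        end = max(n + length, start)  # negative LENGTH: bytes to omit
    else:
        end = min(start + length, n)
    return data[start:end]


print(substr(b"hello", -1))
print(substr(b"example", 2, -2))
```

With this model, `substr(s, 0, n) + substr(s, n)` gives back `s` for positive, negative and zero `n`, matching the note above.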
transform string using Perl-compatible regular expressions. This command is sort of like the Perl code:
my $string = STRING;
$string =~ s/REGEXP/SUBST/;
print $string;
In SUBST, \1 to \9 are substituted by the 1st to 9th bracket grouping (or are empty if there is no such bracket grouping). \\ is a literal backslash.
The optional OPTIONS argument is supported by Omega 1.3.4 and later. It can contain zero or more of the letters gimsx, which have the same meanings as the corresponding Perl regexp modifiers:
- g - replace all occurrences of the pattern in the string
- i - make the pattern matching case-insensitive
- m - make ^/$ match after/before embedded newlines
- s - allow . in the pattern to match a linefeed
- x - allow whitespace and #-comments in the pattern
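The effect of the option letters can be approximated with Python's `re` module. This is only a sketch: Python's regex dialect is not PCRE, the function name `transform` is hypothetical, and the argument order (REGEXP, SUBST, STRING, OPTIONS) is an assumption because the command's signature is not shown in this section. One known difference is that Python raises an error for a reference to a group that does not exist, rather than substituting an empty string.

```python
import re

# Map of option letters to the closest Python flags.
_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE,
          "s": re.DOTALL, "x": re.VERBOSE}


def transform(regexp: str, subst: str, string: str, options: str = "") -> str:
    """Approximate model of the regex transform command described above."""
    flags = 0
    for letter in options:
        if letter != "g":
            flags |= _FLAGS[letter]
    # Without "g" only the first match is replaced; count=0 means all.
    count = 0 if "g" in options else 1
    return re.sub(regexp, subst, string, count=count, flags=flags)


print(transform("o", "0", "foo boo"))
print(transform("o", "0", "foo boo", "g"))
print(transform(r"(\w+)@(\w+)", r"\2 at \1", "user@host"))
```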
truncate STRING to LEN bytes, but try to break after a word (unless that would mean truncating to much less than LEN). If we have to split a word, then IND is appended (if specified). If we have to truncate (but don't split a word) then IND2 is appended (if specified). For example:
$truncate{$field{text},500,..., ...}
converts a 4 byte big-endian binary string to a number, for example:
$date{$unpack{$value{0}}}
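In Python terms, the conversion looks roughly like the sketch below. The name `unpack` is hypothetical, and treating the value as unsigned is an assumption. The sample value is an arbitrary number of seconds, chosen to suit the $date example above.

```python
def unpack(data: bytes) -> int:
    """Model of $unpack: 4-byte big-endian binary string to a number."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 bytes")
    # Assumption: the value is interpreted as unsigned.
    return int.from_bytes(data, "big")


packed = (1600000000).to_bytes(4, "big")  # how such a value could be stored
print(unpack(packed))
```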
$mul{A,B,...}
multiply arguments together