Provides a language-independent way to break Unicode text into meaningful semantic units.
|void||start (in string characterSet)|
|boolean||next (in wstring text, in long length, in long pos, in boolean isLastBuffer, out long begin, out long end)|
|Gets the begin and end offsets of the next unit in the current text.|
|boolean nsISemanticUnitScanner::next (in wstring text, in long length, in long pos, in boolean isLastBuffer, out long begin, out long end)|
Gets the begin and end offsets of the next unit in the current text.
|text||the text to be scanned|
|length||the number of characters in the text to be processed|
|pos||the current position|
|isLastBuffer||whether this buffer is the last one|
|begin||the begin offset of the next unit|
|end||the end offset of the next unit|
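The calling pattern implied by the parameters above can be sketched in Python. This is only an illustration: `ToyUnitScanner` and `scan_units` are hypothetical names, the toy splits on whitespace instead of using real language heuristics, and the two `out` parameters are modeled as extra return values. It also assumes that the boolean result means "a unit was found", which the reference text does not state.

```python
class ToyUnitScanner:
    """Hypothetical stand-in for the interface; NOT the real implementation."""

    def start(self, character_set=None):
        # The hint is stored but unused by this toy scanner.
        self.character_set = character_set

    def next(self, text, length, pos, is_last_buffer):
        """Return (found, begin, end) for the next unit at or after pos."""
        # Skip whitespace before the next unit (toy rule, an assumption).
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            return False, pos, pos
        begin = pos
        while pos < length and not text[pos].isspace():
            pos += 1
        if pos == length and not is_last_buffer:
            # The unit might continue in a later buffer, so do not report it.
            return False, begin, length
        return True, begin, pos


def scan_units(text):
    """Collect every unit of a single, final buffer."""
    scanner = ToyUnitScanner()
    scanner.start(None)  # no character set hint
    units = []
    pos = 0
    while True:
        found, begin, end = scanner.next(text, len(text), pos, True)
        if not found:
            break
        units.append(text[begin:end])
        pos = end  # resume scanning where the last unit ended
    return units


if __name__ == "__main__":
    print(scan_units("break this text"))  # ['break', 'this', 'text']
```

The loop feeds the returned end offset back in as the next `pos`, so each call resumes where the previous unit stopped.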
Starts up the semantic unit scanner with an optional character set, which acts as a hint to optimize the heuristics used to determine the language(s) of the processed text.
|characterSet||the character set the text was originally encoded in (can be NULL)|
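A rough sketch of how start() and isLastBuffer might fit together when text arrives in several buffers is shown below. Everything here is an assumption made for illustration: `ToyUnitScanner` and `scan_chunks` are hypothetical, the toy splits on whitespace, `None` models the NULL character set, and the convention that a false result leaves the unfinished tail in `begin`..`end` is a choice of this toy, not documented behavior.

```python
class ToyUnitScanner:
    """Hypothetical whitespace-based stand-in; NOT the real implementation."""

    def start(self, character_set=None):
        # None models the NULL case: no hint is available.
        self.character_set = character_set

    def next(self, text, length, pos, is_last_buffer):
        """Return (found, begin, end) for the next unit at or after pos."""
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            return False, pos, pos
        begin = pos
        while pos < length and not text[pos].isspace():
            pos += 1
        if pos == length and not is_last_buffer:
            # Toy convention: report the unfinished tail without accepting it.
            return False, begin, length
        return True, begin, pos


def scan_chunks(chunks, character_set=None):
    """Scan text delivered as a list of buffers, carrying over partial units."""
    scanner = ToyUnitScanner()
    scanner.start(character_set)  # called once, before any next() call
    units = []
    pending = ""  # unfinished tail of the previous buffer
    for index, chunk in enumerate(chunks):
        buffer = pending + chunk
        is_last = index == len(chunks) - 1
        pos = 0
        while True:
            found, begin, end = scanner.next(buffer, len(buffer), pos, is_last)
            if not found:
                break
            units.append(buffer[begin:end])
            pos = end
        # Keep whatever the scanner could not decide on for the next round.
        pending = buffer[begin:end]
    return units


if __name__ == "__main__":
    print(scan_chunks(["seman", "tic unit"], "UTF-8"))  # ['semantic', 'unit']
```

In this sketch the word split across the two buffers is reported once, after the second buffer supplies the rest of it.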