DSIUnicodeLikeHelper Class Reference

Inherits Simba::DSI::DSIUnicodeCollator.

Inherited by DSIUnicodeLikeMatcher [private].

Static Public Member Functions

static void ParsePatternAndEscapeStrings (const void *in_pattern, simba_int32 in_patternLength, const void *in_escapeChar, simba_int32 in_escapeCharLength, EncodingType in_encoding, IndexVector &io_metaCharVector, simba_wstring &io_uPattern)
 Parse the pattern string and push all metacharacters to an Index Vector.

Static Public Attributes

static const simba_int32 CODE_UNIT_SIZE
 The size in bytes of one code unit in the pattern and match byte arrays. The arrays must use ICU's internal encoding.
static const simba_wstring MULTIPLE_WILDCARD
 Static variable for the multiple wild card metacharacter.
static const simba_wstring SINGLE_WILDCARD
 Static variable for the single wild card metacharacter.
static const simba_wstring SPACE_CHAR
 Static variable for the wide space character.

Protected Types

typedef SearchContext< DSIUnicodeLikeHelper > LikeSearchContext
 The type definition of the search optimization structure to use.

Protected Member Functions

 DSIUnicodeLikeHelper (const Simba::DSI::DSICollatingSequence &in_collatingSequence, EncodingType in_encoding=simba_wstring::GetInternalEncoding())
 Constructor; initializes DSIUnicodeCollator.
bool EndsWith (const LikeVector::const_iterator &in_LikeNodeIter, const void *in_string, const simba_int32 in_stringByteLength, const simba_int32 in_startIndexInBytes, simba_int32 &out_resultLengthInBytes) const
 Match the pattern to the end of the input string. Ignores all trailing spaces after the last character. All input must be in ICU's internal encoding.
EncodingType GetEncoding () const
 Get the encoding from the Unicode collator.
bool Search (simba_int32 in_offset, const void *in_pattern, const simba_int32 in_patternByteLength, const void *in_string, const simba_int32 in_stringByteLength, simba_int32 &out_resultLengthInBytes, simba_int32 &out_indexInBytes) const
 Search for pattern within string.
bool SkipN (const void *in_string, const simba_int32 in_stringByteLength, simba_int32 in_currentIndexInBytes, simba_int32 in_skipGraphemes, simba_int32 &out_numberofBytesSkipped) const
 Skip 'n' grapheme clusters in the input string.
bool StartsWith (const void *in_pattern, const simba_int32 in_patternByteLength, const void *in_string, const simba_int32 in_stringByteLength, const simba_int32 in_startIndexInBytes, simba_int32 &out_resultLengthInBytes) const
 Match the pattern to the beginning of the input string. All input must be in ICU's internal encoding.
 ~DSIUnicodeLikeHelper ()
 Destructor.

Member Typedef Documentation

typedef SearchContext< DSIUnicodeLikeHelper > LikeSearchContext [protected]

The type definition of the search optimization structure to use.


Constructor & Destructor Documentation

DSIUnicodeLikeHelper ( const Simba::DSI::DSICollatingSequence &  in_collatingSequence,
EncodingType  in_encoding = simba_wstring::GetInternalEncoding() 
) [protected]

Constructor; initializes DSIUnicodeCollator.

Parameters:
in_collatingSequence The current coerced collating sequence to use during collation and comparisons.
in_encoding The current encoding type.
~DSIUnicodeLikeHelper (  )  [protected]

Destructor.


Member Function Documentation

bool EndsWith ( const LikeVector::const_iterator &  in_LikeNodeIter,
const void *  in_string,
const simba_int32  in_stringByteLength,
const simba_int32  in_startIndexInBytes,
simba_int32 &  out_resultLengthInBytes 
) const [protected]

Match the pattern to the end of the input string. Ignores all trailing spaces after the last character. All input must be in ICU's internal encoding.

Parameters:
in_LikeNodeIter A const iterator pointing to the final node in the LikeVector.
in_string Void pointer to the string being searched. Cannot be NULL. (NOT OWN)
in_stringByteLength The length of the string in bytes.
in_startIndexInBytes The offset where searching will start, in bytes.
out_resultLengthInBytes Returns the length of the matched segment in bytes.
Returns:
true if a match is found, false otherwise.
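
The trailing-space rule can be illustrated with a minimal sketch. It is not the SDK implementation: it assumes UTF-16 code units, compares code units directly instead of using the collator, takes the suffix as a plain string rather than a LikeVector iterator, and assumes the reported length excludes the ignored trailing spaces. The name EndsWithSketch is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch: binary (non-collating) suffix match over UTF-16 code units.
bool EndsWithSketch(const std::u16string& suffix,
                    const char16_t* str,
                    std::int32_t strByteLength,
                    std::int32_t startIndexInBytes,
                    std::int32_t& outResultLengthInBytes)
{
    assert(str != nullptr); // mirrors "Cannot be NULL"
    const std::int32_t unit = static_cast<std::int32_t>(sizeof(char16_t));
    const std::int32_t begin = startIndexInBytes / unit;
    std::int32_t end = strByteLength / unit;

    // Ignore all trailing spaces after the last non-space character.
    while (end > begin && str[end - 1] == u' ')
    {
        --end;
    }

    outResultLengthInBytes = 0;
    const std::int32_t suffixLength = static_cast<std::int32_t>(suffix.size());
    if (suffixLength > end - begin)
    {
        return false; // not enough characters left to hold the suffix
    }
    if (suffix.compare(0, suffix.size(), str + (end - suffixLength), suffix.size()) != 0)
    {
        return false;
    }

    // Assumption: the matched length does not count the ignored trailing spaces.
    outResultLengthInBytes = suffixLength * unit;
    return true;
}
```

With the input u"hello world   " (14 code units, 28 bytes) and the suffix u"world", the sketch trims the three trailing spaces and reports a 10-byte match.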
EncodingType GetEncoding (  )  const [inline, protected]

Get the encoding from the Unicode collator.

Returns:
The encoding as defined in the DSIUnicodeCollator class.
static void ParsePatternAndEscapeStrings ( const void *  in_pattern,
simba_int32  in_patternLength,
const void *  in_escapeChar,
simba_int32  in_escapeCharLength,
EncodingType  in_encoding,
IndexVector &  io_metaCharVector,
simba_wstring &  io_uPattern 
) [static]

Parse the pattern string and push all metacharacters to an Index Vector.

Parameters:
in_pattern A byte array containing the pattern. Cannot be NULL. (NOT OWN)
in_patternLength The length of in_pattern in bytes.
in_escapeChar The escape character for the pattern. If NULL, assume no escape character. (NOT OWN)
in_escapeCharLength The length of in_escapeChar in bytes.
in_encoding The encoding of in_pattern and in_escapeChar.
io_metaCharVector An IndexVector which will be populated with all non-escaped metacharacters.
io_uPattern The cleaned pattern string (no escape characters).
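
The parsing step described above can be sketched as follows. This is an illustration under stated assumptions, not the SDK code: it works on UTF-16 code units instead of encoded byte arrays, assumes '%' and '_' are the multiple and single wildcards, assumes the recorded indexes refer to positions in the cleaned pattern, and treats a trailing escape character as a literal. ParsedLikePattern and ParseLikePatternSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type standing in for io_uPattern and io_metaCharVector.
struct ParsedLikePattern
{
    std::u16string cleaned;               // pattern with escape characters removed
    std::vector<std::size_t> metaIndexes; // positions in `cleaned` of unescaped wildcards
};

// escapeChar == nullptr means "no escape character", as in the parameter notes above.
ParsedLikePattern ParseLikePatternSketch(const char16_t* pattern,
                                         std::size_t patternLength,
                                         const char16_t* escapeChar)
{
    assert(pattern != nullptr); // mirrors "Cannot be NULL"
    ParsedLikePattern result;
    for (std::size_t i = 0; i < patternLength; ++i)
    {
        const char16_t c = pattern[i];
        if (escapeChar != nullptr && c == *escapeChar && i + 1 < patternLength)
        {
            // Escaped character: drop the escape and copy the next unit literally.
            ++i;
            result.cleaned.push_back(pattern[i]);
        }
        else
        {
            if (c == u'%' || c == u'_')
            {
                // Unescaped wildcard: remember where it lands in the cleaned pattern.
                result.metaIndexes.push_back(result.cleaned.size());
            }
            result.cleaned.push_back(c);
        }
    }
    return result;
}
```

For the pattern a\%b%_ with backslash as the escape character, the sketch yields the cleaned pattern a%b%_ and records indexes 3 and 4; the escaped '%' at index 1 is treated as a literal.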
bool Search ( simba_int32  in_offset,
const void *  in_pattern,
const simba_int32  in_patternByteLength,
const void *  in_string,
const simba_int32  in_stringByteLength,
simba_int32 &  out_resultLengthInBytes,
simba_int32 &  out_indexInBytes 
) const [protected]

Search for pattern within string.

The collations of the two given strings are already known at the creation time of the Collator. All input must be in ICU's internal encoding.

Parameters:
in_offset The offset to use when searching.
in_pattern Void pointer to the pattern. Cannot be NULL. (NOT OWN)
in_patternByteLength The length of the pattern in bytes.
in_string Void pointer to the string being searched. Cannot be NULL. (NOT OWN)
in_stringByteLength The length of the string in bytes.
out_resultLengthInBytes Returns the length of the matched segment in bytes.
out_indexInBytes The index position in bytes if a match is found. If no match is found, outputs SIMBA_NPOS.
Returns:
true if match was found.
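
The output contract of Search can be sketched with a plain code-unit search. This is only an approximation: the real method compares through the collator, while the sketch uses std::u16string::find; it also assumes in_offset is measured in bytes, and it uses a hypothetical sentinel kNotFound because the value of SIMBA_NPOS is not given in this reference.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for SIMBA_NPOS; the real value is not documented here.
const std::int32_t kNotFound = -1;

// Hypothetical sketch: binary search for `pattern` in `str`, all lengths in bytes.
bool SearchSketch(std::int32_t offsetInBytes,
                  const char16_t* pattern, std::int32_t patternByteLength,
                  const char16_t* str, std::int32_t strByteLength,
                  std::int32_t& outResultLengthInBytes,
                  std::int32_t& outIndexInBytes)
{
    assert(pattern != nullptr && str != nullptr); // mirrors "Cannot be NULL"
    const std::int32_t unit = static_cast<std::int32_t>(sizeof(char16_t));
    const std::u16string haystack(str, static_cast<std::size_t>(strByteLength / unit));
    const std::u16string needle(pattern, static_cast<std::size_t>(patternByteLength / unit));

    const std::u16string::size_type pos =
        haystack.find(needle, static_cast<std::size_t>(offsetInBytes / unit));
    if (pos == std::u16string::npos)
    {
        outResultLengthInBytes = 0;
        outIndexInBytes = kNotFound;
        return false;
    }

    // In a binary search the match is exactly as long as the pattern.
    outResultLengthInBytes = patternByteLength;
    outIndexInBytes = static_cast<std::int32_t>(pos) * unit;
    return true;
}
```

A collation-aware search can match a segment whose byte length differs from the pattern's (for example, composed versus decomposed forms), which is presumably why the matched length is a separate output rather than being implied by in_patternByteLength.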
bool SkipN ( const void *  in_string,
const simba_int32  in_stringByteLength,
simba_int32  in_currentIndexInBytes,
simba_int32  in_skipGraphemes,
simba_int32 &  out_numberofBytesSkipped 
) const [protected]

Skip 'n' grapheme clusters in the input string.

Uses grapheme clusters as defined by Unicode Standard Annex #29 (http://www.unicode.org/reports/tr29/). All input must be in ICU's internal encoding.

Parameters:
in_string Void pointer to the string being searched. Cannot be NULL. (NOT OWN)
in_stringByteLength The length of the string in bytes.
in_currentIndexInBytes The offset where skipping will start, in bytes.
in_skipGraphemes The number of grapheme clusters to skip.
out_numberofBytesSkipped The number of bytes skipped based on the grapheme cluster analysis of the string
Returns:
true if the skip succeeded.
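
The byte accounting in SkipN can be sketched with a deliberately simplified cluster rule. This is not a UAX #29 implementation: it assumes UTF-16 input and treats a cluster as one base character (a surrogate pair counts as one) followed by any combining marks in U+0300 to U+036F. It also assumes the skip fails, with zero bytes reported, when fewer than the requested number of clusters remain. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

static bool IsHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
static bool IsLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }
// Simplification: only the Combining Diacritical Marks block extends a cluster.
static bool IsCombiningMark(char16_t c) { return c >= 0x0300 && c <= 0x036F; }

// Hypothetical sketch: skip `skipGraphemes` simplified clusters, lengths in bytes.
bool SkipNSketch(const char16_t* str,
                 std::int32_t strByteLength,
                 std::int32_t currentIndexInBytes,
                 std::int32_t skipGraphemes,
                 std::int32_t& outBytesSkipped)
{
    assert(str != nullptr); // mirrors "Cannot be NULL"
    const std::int32_t unit = static_cast<std::int32_t>(sizeof(char16_t));
    const std::int32_t end = strByteLength / unit;
    const std::int32_t start = currentIndexInBytes / unit;
    std::int32_t i = start;

    for (std::int32_t n = 0; n < skipGraphemes; ++n)
    {
        if (i >= end)
        {
            outBytesSkipped = 0;
            return false; // fewer clusters remain than were requested
        }
        // Consume the base character; a surrogate pair is two code units.
        if (IsHighSurrogate(str[i]) && i + 1 < end && IsLowSurrogate(str[i + 1]))
        {
            i += 2;
        }
        else
        {
            i += 1;
        }
        // Consume any combining marks attached to the base character.
        while (i < end && IsCombiningMark(str[i]))
        {
            ++i;
        }
    }

    outBytesSkipped = (i - start) * unit;
    return true;
}
```

The point of counting clusters rather than code units is that a single-character wildcard should consume "e" plus a combining acute accent (4 bytes in UTF-16) as one unit, just as it consumes a plain "a" (2 bytes).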
bool StartsWith ( const void *  in_pattern,
const simba_int32  in_patternByteLength,
const void *  in_string,
const simba_int32  in_stringByteLength,
const simba_int32  in_startIndexInBytes,
simba_int32 &  out_resultLengthInBytes 
) const [protected]

Match the pattern to the beginning of the input string. All input must be in ICU's internal encoding.

Parameters:
in_pattern Void pointer to the pattern. Cannot be NULL. (NOT OWN)
in_patternByteLength The length of the pattern in bytes.
in_string Void pointer to the string being searched. Cannot be NULL. (NOT OWN)
in_stringByteLength The length of the string in bytes.
in_startIndexInBytes The offset where searching will start, in bytes.
out_resultLengthInBytes Returns the length of the matched segment in bytes.
Returns:
true if a match is found, false otherwise.

Member Data Documentation

const simba_int32 CODE_UNIT_SIZE [static]

The size in bytes of one code unit in the pattern and match byte arrays. The arrays must use ICU's internal encoding.

const simba_wstring MULTIPLE_WILDCARD [static]

Static variable for the multiple wild card metacharacter.

const simba_wstring SINGLE_WILDCARD [static]

Static variable for the single wild card metacharacter.

const simba_wstring SPACE_CHAR [static]

Static variable for the wide space character.


The documentation for this class was generated from the following file:

Generated on Wed May 17 14:21:15 2017 for SimbaEngine 10.1.3.1011 by simba