lightning-sunbird  0.9+nobinonly
txExprLexer Class Reference

A class for splitting an "Expr" string into tokens and performing basic lexical analysis.

#include <ExprLexer.h>


Public Types

enum  _TrivialTokens {
  D_QUOTE = '\"', S_QUOTE = '\'', L_PAREN = '(', R_PAREN = ')',
  L_BRACKET = '[', R_BRACKET = ']', L_ANGLE = '<', R_ANGLE = '>',
  COMMA = ',', PERIOD = '.', ASTERIX = '*', FORWARD_SLASH = '/',
  EQUAL = '=', BANG = '!', VERT_BAR = '|', AT_SIGN = '@',
  DOLLAR_SIGN = '$', PLUS = '+', HYPHEN = '-', COLON = ':',
  SPACE = ' ', TX_TAB = '\t', TX_CR = '\n', TX_LF = '\r'
}
 Trivial Tokens.
typedef nsASingleFragmentString::const_char_iterator iterator

Public Member Functions

 txExprLexer ()
 Lexical analyzer for XPath expressions.
 ~txExprLexer ()
 Destroys this instance of a txExprLexer.
nsresult parse (const nsASingleFragmentString &aPattern)
 Parse the given string.
Token* nextToken ()
 Functions for iterating over the TokenList.
Token* peek ()
void pushBack ()
PRBool hasMoreTokens ()

Public Attributes

iterator mPosition

Private Member Functions

void addToken (Token *aToken)
PRBool nextIsOperatorToken (Token *aToken)
 Returns true if the following Token should be an operator.

Static Private Member Functions

static PRBool isXPathDigit (PRUnichar ch)
 Returns true if the given character represents a numeric letter (digit). Implemented in ExprLexerChars.cpp.

Private Attributes

Token* mCurrentItem
Token* mFirstItem
Token* mLastItem
int mTokenCount

Detailed Description

A class for splitting an "Expr" String into tokens and performing basic Lexical Analysis.

This class was ported from XSL:P, an open-source, Java-based XSL processor.

Definition at line 163 of file ExprLexer.h.


Member Typedef Documentation

typedef nsASingleFragmentString::const_char_iterator txExprLexer::iterator

Definition at line 179 of file ExprLexer.h.


Member Enumeration Documentation

enum txExprLexer::_TrivialTokens

Trivial Tokens.

Enumerator:
D_QUOTE 
S_QUOTE 
L_PAREN 
R_PAREN 
L_BRACKET 
R_BRACKET 
L_ANGLE 
R_ANGLE 
COMMA 
PERIOD 
ASTERIX 
FORWARD_SLASH 
EQUAL 
BANG 
VERT_BAR 
AT_SIGN 
DOLLAR_SIGN 
PLUS 
HYPHEN 
COLON 
SPACE 
TX_TAB 
TX_CR 
TX_LF 

Definition at line 201 of file ExprLexer.h.

    {
        D_QUOTE        = '\"',
        S_QUOTE        = '\'',
        L_PAREN        = '(',
        R_PAREN        = ')',
        L_BRACKET      = '[',
        R_BRACKET      = ']',
        L_ANGLE        = '<',
        R_ANGLE        = '>',
        COMMA          = ',',
        PERIOD         = '.',
        ASTERIX        = '*',
        FORWARD_SLASH  = '/',
        EQUAL          = '=',
        BANG           = '!',
        VERT_BAR       = '|',
        AT_SIGN        = '@',
        DOLLAR_SIGN    = '$',
        PLUS           = '+',
        HYPHEN         = '-',
        COLON          = ':',
        //-- whitespace tokens
        SPACE          = ' ',
        TX_TAB            = '\t',
        TX_CR             = '\n',
        TX_LF             = '\r'
    };

Constructor & Destructor Documentation

txExprLexer::txExprLexer ( )

Lexical analyzer for XPath expressions.

Creates a new txExprLexer.

Definition at line 52 of file ExprLexer.cpp.

txExprLexer::~txExprLexer ( )

Destroys this instance of a txExprLexer.

Definition at line 63 of file ExprLexer.cpp.

{
  //-- delete tokens
  Token* tok = mFirstItem;
  while (tok) {
    Token* temp = tok->mNext;
    delete tok;
    tok = temp;
  }
  mCurrentItem = nsnull;
}

Member Function Documentation

void txExprLexer::addToken ( Token * aToken ) [private]

Definition at line 91 of file ExprLexer.cpp.

{
  if (mLastItem) {
    aToken->mPrevious = mLastItem;
    mLastItem->mNext = aToken;
  }
  if (!mFirstItem) {
    mFirstItem = aToken;
    mCurrentItem = aToken;
  }
  mLastItem = aToken;
  ++mTokenCount;
}
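
The list handling in addToken() and in the destructor can be tried outside the Mozilla tree with a small stand-in. The sketch below is only an illustration: `SketchToken` and `SketchTokenList` are hypothetical names, not part of ExprLexer.h, and only the link fields of the real Token class are modelled.

```cpp
#include <cassert>

// Hypothetical stand-in for Token; only the link fields matter here.
struct SketchToken {
    SketchToken* next = nullptr;
    SketchToken* previous = nullptr;
};

// Hypothetical stand-in for the list-related members of txExprLexer.
struct SketchTokenList {
    SketchToken* first = nullptr;    // plays the role of mFirstItem
    SketchToken* last = nullptr;     // plays the role of mLastItem
    SketchToken* current = nullptr;  // plays the role of mCurrentItem
    int count = 0;                   // plays the role of mTokenCount

    // Same steps as addToken(): link behind the tail, and seed the head
    // and the cursor when the list was empty.
    void add(SketchToken* token) {
        assert(token && "add() needs a token");
        if (last) {
            token->previous = last;
            last->next = token;
        }
        if (!first) {
            first = token;
            current = token;
        }
        last = token;
        ++count;
    }

    // Same walk as the destructor: remember the successor, then delete.
    ~SketchTokenList() {
        SketchToken* tok = first;
        while (tok) {
            SketchToken* next = tok->next;
            delete tok;
            tok = next;
        }
    }
};
```

The list takes ownership of every token passed to `add()`, which is why the caller never deletes a token itself.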

PRBool txExprLexer::hasMoreTokens ( ) [inline]

Definition at line 192 of file ExprLexer.h.

    {
        return (mCurrentItem->mType != Token::END);
    }

static PRBool txExprLexer::isXPathDigit ( PRUnichar ch ) [inline, static, private]

Returns true if the given character represents a numeric letter (digit). Implemented in ExprLexerChars.cpp.

Definition at line 250 of file ExprLexer.h.

    {
        return (ch >= '0' && ch <= '9');
    }
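
The digit test is what drives number scanning in parse(). As a rough illustration, the sketch below measures an XPath 1.0 Number, `Digits ('.' Digits?)? | '.' Digits`, using the same ASCII-only predicate. `isDigit` and `numberLength` are hypothetical helper names, and `std::string_view` stands in for the Mozilla string iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Same ASCII-only range test as isXPathDigit().
bool isDigit(char ch) { return ch >= '0' && ch <= '9'; }

// Returns the length of the Number that starts at `pos`,
// or 0 if no Number starts there.
std::size_t numberLength(std::string_view text, std::size_t pos) {
    assert(pos <= text.size());
    std::size_t i = pos;
    if (i < text.size() && isDigit(text[i])) {
        // Digits, optionally followed by '.' and more digits.
        while (i < text.size() && isDigit(text[i])) ++i;
        if (i < text.size() && text[i] == '.') {
            ++i;
            while (i < text.size() && isDigit(text[i])) ++i;
        }
        return i - pos;
    }
    if (i + 1 < text.size() && text[i] == '.' && isDigit(text[i + 1])) {
        // Leading '.', which needs at least one digit after it.
        i += 2;
        while (i < text.size() && isDigit(text[i])) ++i;
        return i - pos;
    }
    return 0;
}
```

A lone `.` or `..` yields 0 here, which matches the way parse() turns those into SELF_NODE and PARENT_NODE tokens rather than numbers.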

PRBool txExprLexer::nextIsOperatorToken ( Token * aToken ) [private]

Returns true if the following Token should be an operator.

This is a helper for the first bullet of [XPath 3.7] Lexical Structure.

Definition at line 111 of file ExprLexer.cpp.

{
  if (!aToken || aToken->mType == Token::NULL_TOKEN) {
    return PR_FALSE;
  }
  /* This relies on the tokens having the right order in ExprLexer.h */
  return aToken->mType < Token::COMMA ||
    aToken->mType > Token::UNION_OP;

}
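
The range comparison works only because the token types are declared in a deliberate order. The sketch below shows the idea with a shortened, hypothetical enumeration; `Kind`, `nextIsOperator` and `classifyStar` are illustrative names, and the real Token::Type list in ExprLexer.h is longer and may be ordered differently in detail.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, shortened ordering. Everything from Comma through UnionOp
// is a token after which an operand, not an operator, is expected.
enum class Kind {
    Null,                                              // no preceding token
    Literal, Number, Name, RParen, RBracket,           // an operator may follow
    Comma, AtSign, LParen, LBracket, AxisIdentifier,   // an operand must follow
    AndOp, OrOp, MultiplyOp, UnionOp,                  // operators: operand follows
    End
};

// Mirrors nextIsOperatorToken(): outside the Comma..UnionOp range,
// the next ambiguous token is read as an operator.
constexpr bool nextIsOperator(Kind previous) {
    return previous != Kind::Null &&
           (previous < Kind::Comma || previous > Kind::UnionOp);
}

// How a lexer built on this rule would read '*' after `previous`.
constexpr std::string_view classifyStar(Kind previous) {
    return nextIsOperator(previous) ? "multiply" : "wildcard";
}
```

With this ordering, the `*` in `2 * 3` follows a Number and is read as multiplication, while the `*` in `a[*]` follows `[` and is read as a name test.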

Token* txExprLexer::nextToken ( )

Functions for iterating over the TokenList.

Definition at line 76 of file ExprLexer.cpp.

{
  NS_ASSERTION(mCurrentItem, "nextToken called beyoned the end");
  Token* token = mCurrentItem;
  mCurrentItem = mCurrentItem->mNext;
  return token;
}

nsresult txExprLexer::parse ( const nsASingleFragmentString & aPattern )

Parse the given string.

Parses the given string into a sequence of Tokens.

Returns an error result if lexing failed. The given string must outlive the use of the lexer, because the generated Tokens point to substrings of it. In case of an error, mPosition points to the offending location.
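
The lifetime requirement and the error position can be illustrated with a much smaller lexer. The sketch below is an assumption-laden stand-in, not the Mozilla code: `NameLexer` is a hypothetical type that accepts only ASCII letters and spaces, `std::string_view` replaces the substring-based Token, and an index replaces the mPosition iterator.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical reduced lexer: names made of ASCII letters, separated by spaces.
struct NameLexer {
    std::vector<std::string_view> tokens;  // views into the caller's buffer, not copies
    std::size_t position = 0;              // plays the role of mPosition

    // Returns false on an unexpected character and leaves `position` on it.
    bool parse(std::string_view input) {
        tokens.clear();
        position = 0;
        while (position < input.size()) {
            const unsigned char ch = static_cast<unsigned char>(input[position]);
            if (ch == ' ') {
                ++position;  // whitespace produces no token
                continue;
            }
            if (!std::isalpha(ch)) {
                return false;  // position stays on the offending character
            }
            const std::size_t start = position;
            while (position < input.size() &&
                   std::isalpha(static_cast<unsigned char>(input[position]))) {
                ++position;
            }
            tokens.push_back(input.substr(start, position - start));
        }
        return true;
    }
};
```

Because each token is only a view, destroying or modifying the source string while the tokens are still in use would leave them dangling, which is the same constraint the paragraph above places on aPattern.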

Definition at line 126 of file ExprLexer.cpp.

{
  iterator start, end;
  start = aPattern.BeginReading(mPosition);
  aPattern.EndReading(end);

  //-- initialize previous token, this will automatically get
  //-- deleted when it goes out of scope
  Token nullToken(nsnull, nsnull, Token::NULL_TOKEN);

  Token::Type defType;
  Token* newToken = nsnull;
  Token* prevToken = &nullToken;
  PRBool isToken;

  while (mPosition < end) {

    defType = Token::CNAME;
    isToken = PR_TRUE;

    if (*mPosition == DOLLAR_SIGN) {
      if (++mPosition == end || !XMLUtils::isLetter(*mPosition)) {
        return NS_ERROR_XPATH_INVALID_VAR_NAME;
      }
      defType = Token::VAR_REFERENCE;
    }
    // just reuse the QName parsing, which will use defType
    // as the type of the token to construct

    if (XMLUtils::isLetter(*mPosition)) {
      // NCName, can get QName or OperatorName;
      //  FunctionName, NodeName, and AxisSpecifier may want whitespace,
      //  and are dealt with below
      start = mPosition;
      while (++mPosition < end && XMLUtils::isNCNameChar(*mPosition)) {
        /* just go */
      }
      if (mPosition < end && *mPosition == COLON) {
        // try QName or wildcard, might need to step back for axis
        if (++mPosition == end) {
          return NS_ERROR_XPATH_UNEXPECTED_END;
        }
        if (XMLUtils::isLetter(*mPosition)) {
          while (++mPosition < end && XMLUtils::isNCNameChar(*mPosition)) {
            /* just go */
          }
        }
        else if (*mPosition == '*' && defType != Token::VAR_REFERENCE) {
          // eat wildcard for NameTest, bail for var ref at COLON
          ++mPosition;
        }
        else {
          --mPosition; // step back
        }
      }
      if (nextIsOperatorToken(prevToken)) {
        NS_ConvertUTF16toUTF8 opUTF8(Substring(start, mPosition));
        if (txXPathAtoms::_and->EqualsUTF8(opUTF8)) {
          defType = Token::AND_OP;
        }
        else if (txXPathAtoms::_or->EqualsUTF8(opUTF8)) {
          defType = Token::OR_OP;
        }
        else if (txXPathAtoms::mod->EqualsUTF8(opUTF8)) {
          defType = Token::MODULUS_OP;
        }
        else if (txXPathAtoms::div->EqualsUTF8(opUTF8)) {
          defType = Token::DIVIDE_OP;
        }
        else {
          // XXX QUESTION: spec is not too precise
          // badops is sure an error, but is bad:ops, too? We say yes!
          return NS_ERROR_XPATH_OPERATOR_EXPECTED;
        }
      }
      newToken = new Token(start, mPosition, defType);
    }
    else if (isXPathDigit(*mPosition)) {
      start = mPosition;
      while (++mPosition < end && isXPathDigit(*mPosition)) {
        /* just go */
      }
      if (mPosition < end && *mPosition == '.') {
        while (++mPosition < end && isXPathDigit(*mPosition)) {
          /* just go */
        }
      }
      newToken = new Token(start, mPosition, Token::NUMBER);
    }
    else {
      switch (*mPosition) {
        //-- ignore whitespace
      case SPACE:
      case TX_TAB:
      case TX_CR:
      case TX_LF:
        ++mPosition;
        isToken = PR_FALSE;
        break;
      case S_QUOTE :
      case D_QUOTE :
        start = mPosition;
        while (++mPosition < end && *mPosition != *start) {
          // eat literal
        }
        if (mPosition == end) {
          mPosition = start;
          return NS_ERROR_XPATH_UNCLOSED_LITERAL;
        }
        newToken = new Token(start + 1, mPosition, Token::LITERAL);
        ++mPosition;
        break;
      case PERIOD:
        // period can be .., .(DIGITS)+ or ., check next
        if (++mPosition == end) {
          newToken = new Token(mPosition - 1, Token::SELF_NODE);
        }
        else if (isXPathDigit(*mPosition)) {
          start = mPosition - 1;
          while (++mPosition < end && isXPathDigit(*mPosition)) {
            /* just go */
          }
          newToken = new Token(start, mPosition, Token::NUMBER);
        }
        else if (*mPosition == PERIOD) {
          ++mPosition;
          newToken = new Token(mPosition - 2, mPosition, Token::PARENT_NODE);
        }
        else {
          newToken = new Token(mPosition - 1, Token::SELF_NODE);
        }
        break;
      case COLON: // QNames are dealt above, must be axis ident
        if (++mPosition >= end || *mPosition != COLON ||
            prevToken->mType != Token::CNAME) {
          return NS_ERROR_XPATH_BAD_COLON;
        }
        prevToken->mType = Token::AXIS_IDENTIFIER;
        ++mPosition;
        isToken = PR_FALSE;
        break;
      case FORWARD_SLASH :
        if (++mPosition < end && *mPosition == FORWARD_SLASH) {
          ++mPosition;
          newToken = new Token(mPosition - 2, mPosition, Token::ANCESTOR_OP);
        }
        else {
          newToken = new Token(mPosition - 1, Token::PARENT_OP);
        }
        break;
      case BANG : // can only be !=
        if (++mPosition < end && *mPosition == EQUAL) {
          ++mPosition;
          newToken = new Token(mPosition - 2, mPosition, Token::NOT_EQUAL_OP);
          break;
        }
        // Error ! is not not()
        return NS_ERROR_XPATH_BAD_BANG;
      case EQUAL:
        newToken = new Token(mPosition, Token::EQUAL_OP);
        ++mPosition;
        break;
      case L_ANGLE:
        if (++mPosition == end) {
          return NS_ERROR_XPATH_UNEXPECTED_END;
        }
        if (*mPosition == EQUAL) {
          ++mPosition;
          newToken = new Token(mPosition - 2, mPosition,
                               Token::LESS_OR_EQUAL_OP);
        }
        else {
          newToken = new Token(mPosition - 1, Token::LESS_THAN_OP);
        }
        break;
      case R_ANGLE:
        if (++mPosition == end) {
          return NS_ERROR_XPATH_UNEXPECTED_END;
        }
        if (*mPosition == EQUAL) {
          ++mPosition;
          newToken = new Token(mPosition - 2, mPosition,
                               Token::GREATER_OR_EQUAL_OP);
        }
        else {
          newToken = new Token(mPosition - 1, Token::GREATER_THAN_OP);
        }
        break;
      case HYPHEN :
        newToken = new Token(mPosition, Token::SUBTRACTION_OP);
        ++mPosition;
        break;
      case ASTERIX:
        if (nextIsOperatorToken(prevToken)) {
          newToken = new Token(mPosition, Token::MULTIPLY_OP);
        }
        else {
          newToken = new Token(mPosition, Token::CNAME);
        }
        ++mPosition;
        break;
      case L_PAREN:
        if (prevToken->mType == Token::CNAME) {
          NS_ConvertUTF16toUTF8 utf8Value(prevToken->Value());
          if (txXPathAtoms::comment->EqualsUTF8(utf8Value)) {
            prevToken->mType = Token::COMMENT;
          }
          else if (txXPathAtoms::node->EqualsUTF8(utf8Value)) {
            prevToken->mType = Token::NODE;
          }
          else if (txXPathAtoms::processingInstruction->EqualsUTF8(utf8Value)) {
            prevToken->mType = Token::PROC_INST;
          }
          else if (txXPathAtoms::text->EqualsUTF8(utf8Value)) {
            prevToken->mType = Token::TEXT;
          }
          else {
            prevToken->mType = Token::FUNCTION_NAME;
          }
        }
        newToken = new Token(mPosition, Token::L_PAREN);
        ++mPosition;
        break;
      case R_PAREN:
        newToken = new Token(mPosition, Token::R_PAREN);
        ++mPosition;
        break;
      case L_BRACKET:
        newToken = new Token(mPosition, Token::L_BRACKET);
        ++mPosition;
        break;
      case R_BRACKET:
        newToken = new Token(mPosition, Token::R_BRACKET);
        ++mPosition;
        break;
      case COMMA:
        newToken = new Token(mPosition, Token::COMMA);
        ++mPosition;
        break;
      case AT_SIGN :
        newToken = new Token(mPosition, Token::AT_SIGN);
        ++mPosition;
        break;
      case PLUS:
        newToken = new Token(mPosition, Token::ADDITION_OP);
        ++mPosition;
        break;
      case VERT_BAR:
        newToken = new Token(mPosition, Token::UNION_OP);
        ++mPosition;
        break;
      default:
        // Error, don't grok character :-(
        return NS_ERROR_XPATH_ILLEGAL_CHAR;
      }
    }
    if (isToken) {
      NS_ENSURE_TRUE(newToken, NS_ERROR_OUT_OF_MEMORY);
      NS_ENSURE_TRUE(newToken != mLastItem, NS_ERROR_FAILURE);
      prevToken = newToken;
      addToken(newToken);
    }
  }

  // add a endToken to the list
  newToken = new Token(end, end, Token::END);
  if (!newToken) {
    return NS_ERROR_OUT_OF_MEMORY;
  }
  addToken(newToken);

  return NS_OK;
}
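
Several branches of the switch above share one pattern: consume a character, then look at the next one to decide between a one-character and a two-character operator. The sketch below isolates that pattern. `Op` and `scanOperator` are hypothetical names, and the behaviour is simplified: unlike parse(), it accepts a trailing `<` or `>` instead of reporting NS_ERROR_XPATH_UNEXPECTED_END.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

enum class Op {
    Slash, DoubleSlash,        // PARENT_OP and ANCESTOR_OP in the listing above
    Less, LessOrEqual,
    Greater, GreaterOrEqual,
    NotEqual,
    Invalid
};

// Reads one operator starting at `pos` and advances `pos` past it.
Op scanOperator(std::string_view text, std::size_t& pos) {
    assert(pos < text.size());
    const char first = text[pos++];
    // One character of lookahead, guarded against the end of input.
    const bool nextIs = [&](char expected) {
        return pos < text.size() && text[pos] == expected;
    }(first == '/' ? '/' : '=');
    switch (first) {
    case '/':
        if (nextIs) { ++pos; return Op::DoubleSlash; }
        return Op::Slash;
    case '<':
        if (nextIs) { ++pos; return Op::LessOrEqual; }
        return Op::Less;
    case '>':
        if (nextIs) { ++pos; return Op::GreaterOrEqual; }
        return Op::Greater;
    case '!':
        if (nextIs) { ++pos; return Op::NotEqual; }
        return Op::Invalid;  // a lone '!' is not an XPath operator
    default:
        return Op::Invalid;
    }
}
```

Checking the bound before dereferencing the lookahead position is the part that is easy to get wrong; parse() does the same with its `++mPosition < end` tests.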

Token* txExprLexer::peek ( ) [inline]

Definition at line 187 of file ExprLexer.h.

    {
        return mCurrentItem;
    }

void txExprLexer::pushBack ( )

Definition at line 85 of file ExprLexer.cpp.
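
Taken together, peek(), nextToken(), pushBack() and hasMoreTokens() form a cursor over the token list. The sketch below shows how a parser loop might use such a cursor. It is a hypothetical stand-in: `TokenCursor`, `Tok`, `Kind` and `drain` are illustrative names, a vector replaces the linked list, and because the body of pushBack() is not listed on this page, the version here (step back one token) is an assumption. Unlike the original nextToken(), the cursor also stays on the End token instead of moving past it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Name, Operator, End };

struct Tok {
    Kind kind;
    std::string text;
};

// Hypothetical cursor over an already-lexed sequence that ends in Kind::End.
class TokenCursor {
public:
    explicit TokenCursor(std::vector<Tok> tokens) : mTokens(std::move(tokens)) {
        assert(!mTokens.empty() && mTokens.back().kind == Kind::End);
    }

    // Looks at the current token without consuming it.
    const Tok& peek() const { return mTokens[mIndex]; }

    // Same test as hasMoreTokens(): the End token marks the end.
    bool hasMoreTokens() const { return peek().kind != Kind::End; }

    // Returns the current token and advances, stopping on End.
    const Tok& nextToken() {
        const Tok& tok = mTokens[mIndex];
        if (mIndex + 1 < mTokens.size()) ++mIndex;
        return tok;
    }

    // Assumed behaviour: undo the most recent nextToken().
    void pushBack() {
        if (mIndex > 0) --mIndex;
    }

private:
    std::vector<Tok> mTokens;
    std::size_t mIndex = 0;
};

// Collects the text of every token before End, as a parser loop would.
std::vector<std::string> drain(TokenCursor& cursor) {
    std::vector<std::string> out;
    while (cursor.hasMoreTokens()) {
        out.push_back(cursor.nextToken().text);
    }
    return out;
}
```

A one-token pushBack() is enough for a recursive-descent parser that occasionally reads one token too far before deciding which rule applies.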


Member Data Documentation

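Token* txExprLexer::mCurrentItem [private]

Definition at line 231 of file ExprLexer.h.

Token* txExprLexer::mFirstItem [private]

Definition at line 232 of file ExprLexer.h.

Token* txExprLexer::mLastItem [private]

Definition at line 233 of file ExprLexer.h.

iterator txExprLexer::mPosition

Definition at line 180 of file ExprLexer.h.

int txExprLexer::mTokenCount [private]

Definition at line 235 of file ExprLexer.h.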


The documentation for this class was generated from the following files: