lightning-sunbird  0.9+nobinonly
CalculateUTF8Length Class Reference

A character sink (see |copy_string| in nsAlgorithm.h) for computing the length of the UTF-16 string equivalent to a UTF-8 string.

#include <nsUTF8Utils.h>

Public Types

typedef nsACString::char_type value_type

Public Member Functions

 CalculateUTF8Length ()
size_t Length () const
PRUint32 NS_ALWAYS_INLINE write (const value_type *start, PRUint32 N)

Private Attributes

size_t mLength
PRBool mErrorEncountered

Detailed Description

A character sink (see |copy_string| in nsAlgorithm.h) for computing the length of the UTF-16 string equivalent to a UTF-8 string.

Definition at line 217 of file nsUTF8Utils.h.
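
The sink idea described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `Utf16LengthSink` and `Utf16LengthOfFragments` are illustration names, not part of nsUTF8Utils.h, and the driver only imitates what a |copy_string|-style caller is assumed to do (hand the sink one fragment at a time). The sketch also swaps in a simpler counting technique than the documented class: it looks only at lead bytes, assumes the input is valid UTF-8, and does no error reporting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the documented class, not the Mozilla code.
// It counts UTF-16 code units by inspecting lead bytes only, so a UTF-8
// sequence that is split across two fragments is still counted once.
class Utf16LengthSink
{
public:
    typedef char value_type;

    Utf16LengthSink() : mLength(0) {}

    std::size_t Length() const { return mLength; }

    // Consumes N bytes and returns the number consumed (always N here).
    std::size_t write(const value_type* start, std::size_t N)
    {
        for (std::size_t i = 0; i < N; ++i) {
            const unsigned char c = static_cast<unsigned char>(start[i]);
            if ((c & 0xC0) != 0x80)
                ++mLength; // not a continuation byte: starts a new character
            if ((c & 0xF8) == 0xF0)
                ++mLength; // 4-byte lead: the character needs a surrogate pair
        }
        return N;
    }

private:
    std::size_t mLength;
};

// Hypothetical driver: feeds each fragment to the sink in order.
inline std::size_t Utf16LengthOfFragments(const std::vector<std::string>& fragments)
{
    Utf16LengthSink sink;
    for (const std::string& fragment : fragments)
        sink.write(fragment.data(), fragment.size());
    return sink.Length();
}
```

Because this variant never skips ahead by a sequence length, it does not share the documented assumption that UTF-8 units are not spread across fragments; the price is that malformed input goes undetected.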


Member Typedef Documentation

typedef nsACString::char_type CalculateUTF8Length::value_type

Definition at line 220 of file nsUTF8Utils.h.


Constructor & Destructor Documentation

CalculateUTF8Length::CalculateUTF8Length ( ) [inline]

Definition at line 222 of file nsUTF8Utils.h.


Member Function Documentation

size_t CalculateUTF8Length::Length ( ) const [inline]

Definition at line 224 of file nsUTF8Utils.h.

{ return mLength; }

PRUint32 NS_ALWAYS_INLINE CalculateUTF8Length::write ( const value_type * start, PRUint32 N ) [inline]

Definition at line 226 of file nsUTF8Utils.h.

      {
          // ignore any further requests
        if ( mErrorEncountered )
            return N;

        // algorithm assumes utf8 units won't
        // be spread across fragments
        const value_type* p = start;
        const value_type* end = start + N;
        for ( ; p < end /* && *p */; ++mLength )
          {
            if ( UTF8traits::isASCII(*p) )
                p += 1;
            else if ( UTF8traits::is2byte(*p) )
                p += 2;
            else if ( UTF8traits::is3byte(*p) )
                p += 3;
            else if ( UTF8traits::is4byte(*p) ) {
                p += 4;
                // Because a UTF-8 sequence of 4 bytes represents a codepoint
                // greater than 0xFFFF, it will become a surrogate pair in the
                // UTF-16 string, so add 1 more to mLength.
                // This doesn't happen with is5byte and is6byte because they
                // are illegal UTF-8 sequences (greater than 0x10FFFF) so get
                // converted to a single replacement character.
                //
                // XXX: if the 4-byte sequence is an illegal non-shortest form,
                //      it also gets converted to a replacement character, so
                //      mLength will be off by one in this case.
                ++mLength;
            }
            else if ( UTF8traits::is5byte(*p) )
                p += 5;
            else if ( UTF8traits::is6byte(*p) )
                p += 6;
            else
              {
                break;
              }
          }
        if ( p != end )
          {
            NS_ERROR("Not a UTF-8 string. This code should only be used for converting from known UTF-8 strings.");
            mErrorEncountered = PR_TRUE;
            return N;
          }
        return p - start;
      }
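
The counting rule in the listing above can be tried outside the Mozilla tree with a minimal sketch. It assumes that the UTF8traits predicates test the usual lead-byte bit patterns; `SequenceLength` and `Utf16LengthOfUtf8` are hypothetical names introduced for illustration only. Unlike the listing, the sketch checks for a truncated trailing sequence before advancing, and it reports failure through an `ok` flag instead of NS_ERROR.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the sequence length announced by a lead byte,
// or 0 if the byte cannot start a sequence.
inline std::size_t SequenceLength(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
    if ((lead & 0xFC) == 0xF8) return 5; // 111110xx: illegal, one replacement character
    if ((lead & 0xFE) == 0xFC) return 6; // 1111110x: illegal, one replacement character
    return 0;                            // continuation byte, 0xFE or 0xFF
}

// Returns the number of UTF-16 code units counted before the end of the
// input or the first error. 'ok' is set to false on a bad lead byte or on
// a sequence that runs past the end of the input.
inline std::size_t Utf16LengthOfUtf8(const std::string& utf8, bool& ok)
{
    std::size_t length = 0;
    std::size_t i = 0;
    const std::size_t n = utf8.size();
    ok = true;
    while (i < n) {
        const std::size_t step =
            SequenceLength(static_cast<unsigned char>(utf8[i]));
        if (step == 0 || step > n - i) {
            ok = false;
            break;
        }
        i += step;
        // A 4-byte sequence becomes a surrogate pair, everything else
        // becomes a single UTF-16 code unit.
        length += (step == 4) ? 2 : 1;
    }
    return length;
}
```

Like the listing, this sketch does not validate continuation bytes or shortest-form encoding, so the off-by-one noted in the XXX comment applies to it as well.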


Member Data Documentation

PRBool CalculateUTF8Length::mErrorEncountered [private]

Definition at line 278 of file nsUTF8Utils.h.

size_t CalculateUTF8Length::mLength [private]

Definition at line 277 of file nsUTF8Utils.h.


The documentation for this class was generated from the following file:

nsUTF8Utils.h