nux  3.0.0
nux::NUTF16 Class Reference

Convert UTF-8 to UTF-16.

#include <NUTF.h>

Public Member Functions

 NUTF16 (const char *Source)
 NUTF16 (const std::string &Source)
 ~NUTF16 ()
 operator const UNICHAR * ()

Private Member Functions

void Convert (const char *)

Private Attributes

UNICHAR * unicode

Detailed Description

Convert UTF-8 to UTF-16.

Definition at line 78 of file NUTF.h.


Constructor & Destructor Documentation

nux::NUTF16::NUTF16 ( const char *  Source) [explicit]

Definition at line 222 of file NUTF.cpp.

  {
    Convert (Source);
  }

nux::NUTF16::NUTF16 ( const std::string &  Source) [explicit]

Definition at line 227 of file NUTF.cpp.

  {
    Convert (Source.c_str() );
  }

nux::NUTF16::~NUTF16 ( )

Definition at line 308 of file NUTF.cpp.

  {
    delete [] unicode;
  }

Member Function Documentation

void nux::NUTF16::Convert ( const char *  Source) [private]

Definition at line 232 of file NUTF.cpp.

  {
    //     U-00000000  U-0000007F:       0xxxxxxx
    //     U-00000080  U-000007FF:       110xxxxx 10xxxxxx
    //     U-00000800  U-0000FFFF:       1110xxxx 10xxxxxx 10xxxxxx
    //     U-00010000  U-001FFFFF:       11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    //     U-00200000  U-03FFFFFF:       111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
    //     U-04000000  U-7FFFFFFF:       1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

    unsigned char MSB;
    int temp = 0;
    int numbytes = 0; // Number of bytes used to represent the unicode char
    int pos = 0;

    size_t len = strlen (Source) + 1; // +1 for NULL char
    unicode = new UNICHAR[len*6];

    // Loop through the characters in the string and decode them
    for (size_t n = 0; n < len; ++n)
    {
      // Read the lead byte of the next UTF-8 sequence
      MSB = Source[n];

      if (MSB <= 0x7F)
      {
        unicode[pos++] = (UNICHAR) MSB;
      }
      else
      {
        // 2 bytes
        if (MSB >= 0xC0 && MSB <= 0xDF)
        {
          temp = (MSB - 0xC0) << 6;
          numbytes = 2;
        }
        // 3 bytes
        else if (MSB >= 0xE0 && MSB <= 0xEF)
        {
          temp = (MSB - 0xE0) << 12;
          numbytes = 3;
        }
        // 4 bytes
        else if (MSB >= 0xF0 && MSB <= 0xF7)
        {
          temp = (MSB - 0xF0) << 18;
          numbytes = 4;
        }
        // 5 bytes
        else if (MSB >= 0xF8 && MSB <= 0xFB)
        {
          temp = (MSB - 0xF8) << 24;
          numbytes = 5;
        }
        // 6 bytes
        else if (MSB >= 0xFC && MSB <= 0xFD)
        {
          temp = (MSB - 0xFC) << 30;
          numbytes = 6;
        }

        // Accumulate the payload bits of the continuation bytes that follow the lead byte
        for (int i = 0, shift = (numbytes - 2) * 6; shift >= 0; i++, shift -= 6)
        {
          int nVal = ( ( (unsigned char) Source[n+1+i]) - 0x80 ) << shift;
          temp += nVal;
        }

        // Add the unicode character to the final string
        unicode[pos++] = (UNICHAR) temp;

        // Move the character index in the source to the next unicode character
        n += (numbytes - 1);
      }
    }
  }
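
The listing above stores each decoded value in a single UNICHAR and does not reject lead bytes outside the tested ranges or sequences cut short by the end of the string. Assuming UNICHAR is a 16-bit type, a code point above U+FFFF would therefore be truncated instead of being written as a surrogate pair. The following is a minimal sketch of one way to handle those cases; it is not the nux implementation. The function name Utf8ToUtf16Sketch, the use of std::u16string, and the choice to emit U+FFFD for malformed input are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of nux: decode UTF-8 and emit UTF-16,
// writing a surrogate pair for code points above U+FFFF.
// Malformed bytes are replaced with U+FFFD (an assumption of this sketch).
// Overlong encodings are not detected here.
std::u16string Utf8ToUtf16Sketch(const std::string& src)
{
  std::u16string out;
  const std::size_t len = src.size();
  std::size_t n = 0;

  while (n < len)
  {
    const unsigned char lead = static_cast<unsigned char>(src[n]);
    std::uint32_t cp = 0;
    std::size_t numbytes = 0; // stays 0 for an invalid lead byte

    if (lead <= 0x7F)                      { cp = lead;        numbytes = 1; }
    else if (lead >= 0xC0 && lead <= 0xDF) { cp = lead & 0x1F; numbytes = 2; }
    else if (lead >= 0xE0 && lead <= 0xEF) { cp = lead & 0x0F; numbytes = 3; }
    else if (lead >= 0xF0 && lead <= 0xF7) { cp = lead & 0x07; numbytes = 4; }

    // The whole sequence must fit inside the input.
    bool ok = (numbytes != 0) && (n + numbytes <= len);

    // Each continuation byte must look like 10xxxxxx and adds 6 payload bits.
    for (std::size_t i = 1; ok && i < numbytes; ++i)
    {
      const unsigned char c = static_cast<unsigned char>(src[n + i]);
      if ((c & 0xC0) != 0x80) ok = false;
      else cp = (cp << 6) | (c & 0x3F);
    }

    // Values beyond U+10FFFF and the surrogate range cannot be encoded.
    if (ok && (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))) ok = false;

    if (!ok)
    {
      out.push_back(u'\uFFFD');
      ++n; // resynchronise on the next byte
      continue;
    }

    if (cp <= 0xFFFF)
    {
      out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
      cp -= 0x10000; // 20 bits remain: high 10 and low 10
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    n += numbytes;
  }

  // UTF-16 never needs more code units than the input has UTF-8 bytes.
  assert(out.size() <= len);
  return out;
}
```

For example, the four bytes F0 9F 98 80 (U+1F600) come out as the two code units D83D DE00. The closing assertion also shows that a buffer of one code unit per input byte is always large enough for the output.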

nux::NUTF16::operator const UNICHAR * ( )

Definition at line 313 of file NUTF.cpp.

  {
    return unicode;
  }
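
Because the operator returns the internal buffer and the destructor frees it with delete [], the returned pointer is only valid while the NUTF16 object is alive. The sketch below illustrates that lifetime rule with a self-contained stand-in class. Utf16TempSketch, UNICHAR_SKETCH, CountUnitsSketch and LengthOfTemporarySketch are hypothetical names, the stand-in only widens ASCII, and its operator is const-qualified, unlike the one documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: a 16-bit code unit type standing in for UNICHAR.
typedef char16_t UNICHAR_SKETCH;

// Hypothetical stand-in for nux::NUTF16 so the example compiles on its own.
// It only widens ASCII; the documented class decodes multi-byte UTF-8.
class Utf16TempSketch
{
public:
  explicit Utf16TempSketch(const char* source)
  {
    for (const char* p = source; *p != '\0'; ++p)
      buffer_.push_back(static_cast<UNICHAR_SKETCH>(static_cast<unsigned char>(*p)));
  }

  // Hands out a pointer into the buffer owned by this object.
  operator const UNICHAR_SKETCH* () const { return buffer_.c_str(); }

private:
  std::u16string buffer_;
};

// Counts code units up to the terminating zero.
std::size_t CountUnitsSketch(const UNICHAR_SKETCH* s)
{
  assert(s != nullptr);
  std::size_t n = 0;
  while (s[n] != 0) ++n;
  return n;
}

std::size_t LengthOfTemporarySketch(const char* text)
{
  // Safe: the temporary lives until the end of this full expression,
  // so the pointer is still valid while CountUnitsSketch runs.
  return CountUnitsSketch(Utf16TempSketch(text));

  // Unsafe pattern to avoid: storing the pointer first, as in
  //   const UNICHAR_SKETCH* p = Utf16TempSketch(text);
  // leaves p dangling once that statement ends.
}
```

Passing the temporary directly as a function argument is the pattern that keeps the pointer valid; when the converted string has to outlive one expression, keeping a named converter object in scope is one option.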

Member Data Documentation

UNICHAR* nux::NUTF16::unicode [private]

Definition at line 89 of file NUTF.h.


The documentation for this class was generated from the following files: