python3.2  3.2.2
codecs Namespace Reference

Classes

class  CodecInfo
 Codec base classes (defining the API)
class  Codec
class  IncrementalEncoder
class  BufferedIncrementalEncoder
class  IncrementalDecoder
class  BufferedIncrementalDecoder
class  StreamWriter
class  StreamReader
class  StreamReaderWriter
class  StreamRecoder

Functions

def open
def EncodedFile
def getencoder
 Helpers for codec lookup.
def getdecoder
def getincrementalencoder
def getincrementaldecoder
def getreader
def getwriter
def iterencode
def iterdecode
def make_identity_dict
 Helpers for charmap-based codecs.
def make_encoding_map

Variables

list __all__
string BOM_UTF8 = '\xef\xbb\xbf'
 Constants.
string BOM_LE = '\xff\xfe'
string BOM_BE = '\xfe\xff'
string BOM_UTF32_LE = '\xff\xfe\x00\x00'
string BOM_UTF32_BE = '\x00\x00\xfe\xff'
 BOM = BOM_UTF16 = BOM_UTF16_LE
 BOM_UTF32 = BOM_UTF32_LE
 BOM32_LE = BOM_UTF16_LE
 BOM32_BE = BOM_UTF16_BE
 BOM64_LE = BOM_UTF32_LE
 BOM64_BE = BOM_UTF32_BE
tuple strict_errors = lookup_error("strict")
 error handlers
tuple ignore_errors = lookup_error("ignore")
tuple replace_errors = lookup_error("replace")
tuple xmlcharrefreplace_errors = lookup_error("xmlcharrefreplace")
tuple backslashreplace_errors = lookup_error("backslashreplace")
int _false = 0

Detailed Description

codecs -- Python Codec Registry, API and helpers.


Written by Marc-Andre Lemburg (mal@lemburg.com).

(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.

Function Documentation

def codecs.EncodedFile (   file,
  data_encoding,
  file_encoding = None,
  errors = 'strict' 
)
Return a wrapped version of file which provides transparent
    encoding translation.

    Strings written to the wrapped file are interpreted according
    to the given data_encoding and then written to the original
    file as string using file_encoding. The intermediate encoding
    will usually be Unicode but depends on the specified codecs.

    Strings are read from the file using file_encoding and then
    passed back to the caller as string using data_encoding.

    If file_encoding is not given, it defaults to data_encoding.

    errors may be given to define the error handling. It defaults
    to 'strict' which causes ValueErrors to be raised in case an
    encoding error occurs.

    The returned wrapped file object provides two extra attributes
    .data_encoding and .file_encoding which reflect the given
    parameters of the same name. The attributes can be used for
    introspection by Python programs.

Definition at line 893 of file codecs.py.

def EncodedFile(file, data_encoding, file_encoding=None, errors='strict'):

    """ Return a wrapped version of file which provides transparent
        encoding translation.

        Strings written to the wrapped file are interpreted according
        to the given data_encoding and then written to the original
        file as string using file_encoding. The intermediate encoding
        will usually be Unicode but depends on the specified codecs.

        Strings are read from the file using file_encoding and then
        passed back to the caller as string using data_encoding.

        If file_encoding is not given, it defaults to data_encoding.

        errors may be given to define the error handling. It defaults
        to 'strict' which causes ValueErrors to be raised in case an
        encoding error occurs.

        The returned wrapped file object provides two extra attributes
        .data_encoding and .file_encoding which reflect the given
        parameters of the same name. The attributes can be used for
        introspection by Python programs.

    """
    if file_encoding is None:
        file_encoding = data_encoding
    data_info = lookup(data_encoding)
    file_info = lookup(file_encoding)
    sr = StreamRecoder(file, data_info.encode, data_info.decode,
                       file_info.streamreader, file_info.streamwriter, errors)
    # Add attributes to simplify introspection
    sr.data_encoding = data_encoding
    sr.file_encoding = file_encoding
    return sr
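
The translation described above can be sketched as follows. This is a minimal illustration, assuming Python 3 (where the data passed in and out are bytes objects) and using an in-memory io.BytesIO as a stand-in for a real file; the sample text and the choice of Latin-1 and UTF-8 are arbitrary.

```python
import codecs
import io

# In-memory stand-in for a real binary file (an assumption for this sketch).
raw = io.BytesIO()
wrapped = codecs.EncodedFile(raw, data_encoding="latin-1", file_encoding="utf-8")

# The caller hands over Latin-1 bytes; the underlying file receives UTF-8.
wrapped.write("café".encode("latin-1"))
stored = raw.getvalue()

# Reading goes the other way: UTF-8 in the file, Latin-1 back to the caller.
reader = codecs.EncodedFile(io.BytesIO(stored), "latin-1", "utf-8")
round_trip = reader.read()

print(stored, round_trip, wrapped.data_encoding, wrapped.file_encoding)
```

A fresh wrapper is used for reading so that the sketch does not depend on how the recoder handles seeking.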


def codecs.getdecoder (   encoding)
Look up the codec for the given encoding and return
    its decoder function.

    Raises a LookupError in case the encoding cannot be found.

Definition at line 941 of file codecs.py.

def getdecoder(encoding):

    """ Look up the codec for the given encoding and return
        its decoder function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding).decode
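
As a rough illustration of what the returned decoder function does, the sketch below decodes a UTF-8 byte string and shows the effect of the optional error-handler argument. The sample bytes are arbitrary.

```python
import codecs

decode = codecs.getdecoder("utf-8")

# A stateless decoder returns (text, number_of_bytes_consumed).
text, consumed = decode(b"h\xc3\xa9llo")

# The optional second argument selects the error handler.
replaced, _ = decode(b"\xff", "replace")

# Under the default "strict" handler, invalid input raises
# UnicodeDecodeError, which is a subclass of ValueError.
try:
    decode(b"\xff")
    strict_failed = False
except ValueError:
    strict_failed = True

print(text, consumed, repr(replaced), strict_failed)
```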


def codecs.getencoder (   encoding)

Helpers for codec lookup.

Look up the codec for the given encoding and return
    its encoder function.

    Raises a LookupError in case the encoding cannot be found.

Definition at line 931 of file codecs.py.

def getencoder(encoding):

    """ Look up the codec for the given encoding and return
        its encoder function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding).encode
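
A short sketch of the encoder side, including the LookupError case mentioned in the docstring. The encoding name "no-such-encoding" is deliberately bogus and exists only for this example.

```python
import codecs

encode = codecs.getencoder("utf-8")

# A stateless encoder returns (bytes, number_of_characters_consumed).
data, consumed = encode("héllo")

# An unknown encoding name is reported as LookupError.
try:
    codecs.getencoder("no-such-encoding")
    found = True
except LookupError:
    found = False

print(data, consumed, found)
```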


def codecs.getincrementaldecoder (   encoding)
Look up the codec for the given encoding and return
    its IncrementalDecoder class or factory function.

    Raises a LookupError in case the encoding cannot be found
    or the codec doesn't provide an incremental decoder.

Definition at line 965 of file codecs.py.

def getincrementaldecoder(encoding):

    """ Look up the codec for the given encoding and return
        its IncrementalDecoder class or factory function.

        Raises a LookupError in case the encoding cannot be found
        or the codec doesn't provide an incremental decoder.

    """
    decoder = lookup(encoding).incrementaldecoder
    if decoder is None:
        raise LookupError(encoding)
    return decoder
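
The point of an incremental decoder is that it keeps state between calls. The sketch below, with an arbitrarily chosen two-byte character, feeds the bytes one at a time and shows that nothing is emitted until the character is complete.

```python
import codecs

# The lookup returns a class (or factory); call it to get a decoder object.
decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")

data = "é".encode("utf-8")  # two bytes: b'\xc3\xa9'

part1 = decoder.decode(data[:1])         # incomplete sequence is buffered
part2 = decoder.decode(data[1:])         # sequence completed, character emitted
tail = decoder.decode(b"", final=True)   # flush; nothing is left over here

print(repr(part1), repr(part2), repr(tail))
```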


def codecs.getincrementalencoder (   encoding)
Look up the codec for the given encoding and return
    its IncrementalEncoder class or factory function.

    Raises a LookupError in case the encoding cannot be found
    or the codec doesn't provide an incremental encoder.

Definition at line 951 of file codecs.py.

def getincrementalencoder(encoding):

    """ Look up the codec for the given encoding and return
        its IncrementalEncoder class or factory function.

        Raises a LookupError in case the encoding cannot be found
        or the codec doesn't provide an incremental encoder.

    """
    encoder = lookup(encoding).incrementalencoder
    if encoder is None:
        raise LookupError(encoding)
    return encoder
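
One place where encoder state is visible is the "utf-16" codec, which writes a byte order mark once at the start of the output. The following sketch encodes two one-character strings in separate calls; the sample input is arbitrary.

```python
import codecs

# The lookup returns a class (or factory); call it to get an encoder object.
encoder = codecs.getincrementalencoder("utf-16")()

first = encoder.encode("a")               # BOM plus the first character
second = encoder.encode("b", final=True)  # no second BOM

joined = first + second
print(first, second)
```

Encoding "a" and "b" with two independent `str.encode("utf-16")` calls would instead produce a BOM in each result.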


def codecs.getreader (   encoding)
Look up the codec for the given encoding and return
    its StreamReader class or factory function.

    Raises a LookupError in case the encoding cannot be found.

Definition at line 979 of file codecs.py.

def getreader(encoding):

    """ Look up the codec for the given encoding and return
        its StreamReader class or factory function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding).streamreader
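
A minimal sketch of using the returned StreamReader class: it is instantiated around a binary stream and then yields decoded text. The io.BytesIO object and its contents are stand-ins chosen for the example.

```python
import codecs
import io

Reader = codecs.getreader("utf-8")

# Wrap a binary stream; the reader decodes on the way out.
stream = Reader(io.BytesIO("één\ntwee\n".encode("utf-8")))
lines = stream.readlines()

print(lines)
```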


def codecs.getwriter (   encoding)
Look up the codec for the given encoding and return
    its StreamWriter class or factory function.

    Raises a LookupError in case the encoding cannot be found.

Definition at line 989 of file codecs.py.

def getwriter(encoding):

    """ Look up the codec for the given encoding and return
        its StreamWriter class or factory function.

        Raises a LookupError in case the encoding cannot be found.

    """
    return lookup(encoding).streamwriter
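
The writer counterpart can be sketched the same way: the returned StreamWriter class wraps a binary stream and encodes text written to it. The in-memory sink and the "utf-16-le" encoding are arbitrary choices for the example.

```python
import codecs
import io

Writer = codecs.getwriter("utf-16-le")

sink = io.BytesIO()   # stand-in for a real binary file
writer = Writer(sink)

writer.write("hi")
writer.writelines(["!", "?"])

encoded = sink.getvalue()
print(encoded)
```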


def codecs.iterdecode (   iterator,
  encoding,
  errors = 'strict',
  **kwargs 
)
Decoding iterator.

Decodes the input strings from the iterator using an IncrementalDecoder.

errors and kwargs are passed through to the IncrementalDecoder
constructor.

Definition at line 1017 of file codecs.py.

def iterdecode(iterator, encoding, errors='strict', **kwargs):
    """
    Decoding iterator.

    Decodes the input strings from the iterator using an IncrementalDecoder.

    errors and kwargs are passed through to the IncrementalDecoder
    constructor.
    """
    decoder = getincrementaldecoder(encoding)(errors, **kwargs)
    for input in iterator:
        output = decoder.decode(input)
        if output:
            yield output
    output = decoder.decode(b"", True)
    if output:
        yield output
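
Because empty outputs are skipped, the number of yielded pieces can be smaller than the number of input chunks. A small sketch with a multi-byte character split across two chunks (the chunk boundaries are chosen only for illustration):

```python
import codecs

# "café" in UTF-8, with the two bytes of "é" arriving in separate chunks.
chunks = [b"caf", b"\xc3", b"\xa9"]

pieces = list(codecs.iterdecode(chunks, "utf-8"))
text = "".join(pieces)

print(pieces, text)
```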


def codecs.iterencode (   iterator,
  encoding,
  errors = 'strict',
  **kwargs 
)
Encoding iterator.

Encodes the input strings from the iterator using an IncrementalEncoder.

errors and kwargs are passed through to the IncrementalEncoder
constructor.

Definition at line 999 of file codecs.py.

def iterencode(iterator, encoding, errors='strict', **kwargs):
    """
    Encoding iterator.

    Encodes the input strings from the iterator using an IncrementalEncoder.

    errors and kwargs are passed through to the IncrementalEncoder
    constructor.
    """
    encoder = getincrementalencoder(encoding)(errors, **kwargs)
    for input in iterator:
        output = encoder.encode(input)
        if output:
            yield output
    output = encoder.encode("", True)
    if output:
        yield output
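
A brief usage sketch: iterencode is a generator, so it can encode a sequence of strings lazily. The input list here is arbitrary.

```python
import codecs

lines = ["α\n", "β\n"]

# Each string is encoded as it is pulled from the generator.
encoded_chunks = list(codecs.iterencode(lines, "utf-8"))
encoded = b"".join(encoded_chunks)

print(encoded_chunks)
```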


def codecs.make_encoding_map (   decoding_map)
Creates an encoding map from a decoding map.

    If a target mapping in the decoding map occurs multiple
    times, then that target is mapped to None (undefined mapping),
    causing an exception when encountered by the charmap codec
    during translation.

    One example where this happens is cp875.py which decodes
    multiple characters to \u001a.

Definition at line 1050 of file codecs.py.

def make_encoding_map(decoding_map):

    """ Creates an encoding map from a decoding map.

        If a target mapping in the decoding map occurs multiple
        times, then that target is mapped to None (undefined mapping),
        causing an exception when encountered by the charmap codec
        during translation.

        One example where this happens is cp875.py which decodes
        multiple characters to \u001a.

    """
    m = {}
    for k, v in decoding_map.items():
        if v not in m:
            m[v] = k
        else:
            m[v] = None
    return m
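
The handling of duplicate targets can be seen with a tiny, made-up decoding map in which two byte values decode to the same code point:

```python
import codecs

# Hypothetical decoding map: bytes 0x01 and 0x03 both decode to U+0041.
decoding_map = {0x01: 0x41, 0x02: 0x42, 0x03: 0x41}

encoding_map = codecs.make_encoding_map(decoding_map)

# U+0041 is ambiguous, so it maps to None; U+0042 maps back to 0x02.
print(encoding_map)
```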


def codecs.make_identity_dict (   rng)

Helpers for charmap-based codecs.

make_identity_dict(rng) -> dict

    Return a dictionary where elements of the rng sequence are
    mapped to themselves.

Definition at line 1037 of file codecs.py.

def make_identity_dict(rng):

    """ make_identity_dict(rng) -> dict

        Return a dictionary where elements of the rng sequence are
        mapped to themselves.

    """
    res = {}
    for i in rng:
        res[i] = i
    return res
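
A typical use, sketched here under the assumption of a made-up four-entry character set, is to start from an identity map and then override the few positions that differ:

```python
import codecs

# Start with every value mapped to itself.
decoding_map = codecs.make_identity_dict(range(4))

# Hypothetical override: position 0x03 decodes to the euro sign.
decoding_map.update({0x03: 0x20AC})

print(decoding_map)
```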

def codecs.open (   filename,
  mode = 'rb',
  encoding = None,
  errors = 'strict',
  buffering = 1 
)
Open an encoded file using the given mode and return
    a wrapped version providing transparent encoding/decoding.

    Note: The wrapped version will only accept the object format
    defined by the codecs, i.e. Unicode objects for most builtin
    codecs. Output is also codec dependent and will usually be
    Unicode as well.

    Files are always opened in binary mode, even if no binary mode
    was specified. This is done to avoid data loss due to encodings
    using 8-bit values. The default file mode is 'rb' meaning to
    open the file in binary read mode.

    encoding specifies the encoding which is to be used for the
    file.

    errors may be given to define the error handling. It defaults
    to 'strict' which causes ValueErrors to be raised in case an
    encoding error occurs.

    buffering has the same meaning as for the builtin open() API.
    It defaults to line buffered.

    The returned wrapped file object provides an extra attribute
    .encoding which allows querying the used encoding. This
    attribute is only available if an encoding was specified as
    parameter.

Definition at line 849 of file codecs.py.

def open(filename, mode='rb', encoding=None, errors='strict', buffering=1):

    """ Open an encoded file using the given mode and return
        a wrapped version providing transparent encoding/decoding.

        Note: The wrapped version will only accept the object format
        defined by the codecs, i.e. Unicode objects for most builtin
        codecs. Output is also codec dependent and will usually be
        Unicode as well.

        Files are always opened in binary mode, even if no binary mode
        was specified. This is done to avoid data loss due to encodings
        using 8-bit values. The default file mode is 'rb' meaning to
        open the file in binary read mode.

        encoding specifies the encoding which is to be used for the
        file.

        errors may be given to define the error handling. It defaults
        to 'strict' which causes ValueErrors to be raised in case an
        encoding error occurs.

        buffering has the same meaning as for the builtin open() API.
        It defaults to line buffered.

        The returned wrapped file object provides an extra attribute
        .encoding which allows querying the used encoding. This
        attribute is only available if an encoding was specified as
        parameter.

    """
    if encoding is not None and \
       'b' not in mode:
        # Force opening of the file in binary mode
        mode = mode + 'b'
    file = builtins.open(filename, mode, buffering)
    if encoding is None:
        return file
    info = lookup(encoding)
    srw = StreamReaderWriter(file, info.streamreader, info.streamwriter, errors)
    # Add attributes to simplify introspection
    srw.encoding = encoding
    return srw
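
A round trip through codecs.open might look like the sketch below. The temporary directory and the file name sample.txt are assumptions made for the example; the final binary read is only there to show what ended up on disk.

```python
import codecs
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # Text goes in; UTF-8 bytes are written to the file.
    with codecs.open(path, "w", encoding="utf-8") as f:
        f.write("naïve\n")
        used_encoding = f.encoding  # extra attribute set by codecs.open

    # UTF-8 bytes are read from the file; text comes out.
    with codecs.open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # Inspect the raw bytes on disk with the builtin open().
    with open(path, "rb") as f:
        raw = f.read()

print(repr(text), raw, used_encoding)
```

Since the file is opened in binary mode whenever an encoding is given, no newline translation takes place.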



Variable Documentation

list codecs.__all__

Initial value:
["register", "lookup", "open", "EncodedFile", "BOM", "BOM_BE",
 "BOM_LE", "BOM32_BE", "BOM32_LE", "BOM64_BE", "BOM64_LE",
 "BOM_UTF8", "BOM_UTF16", "BOM_UTF16_LE", "BOM_UTF16_BE",
 "BOM_UTF32", "BOM_UTF32_LE", "BOM_UTF32_BE",
 "strict_errors", "ignore_errors", "replace_errors",
 "xmlcharrefreplace_errors",
 "register_error", "lookup_error"]

Definition at line 19 of file codecs.py.

int codecs._false = 0

Definition at line 1089 of file codecs.py.

tuple codecs.backslashreplace_errors = lookup_error("backslashreplace")

Definition at line 1078 of file codecs.py.

codecs.BOM = BOM_UTF16 = BOM_UTF16_LE

Definition at line 53 of file codecs.py.

codecs.BOM32_BE = BOM_UTF16_BE

Definition at line 68 of file codecs.py.

codecs.BOM32_LE = BOM_UTF16_LE

Definition at line 67 of file codecs.py.

codecs.BOM64_BE = BOM_UTF32_BE

Definition at line 70 of file codecs.py.

codecs.BOM64_LE = BOM_UTF32_LE

Definition at line 69 of file codecs.py.

string codecs.BOM_BE = '\xfe\xff'

Definition at line 42 of file codecs.py.

string codecs.BOM_LE = '\xff\xfe'

Definition at line 39 of file codecs.py.

codecs.BOM_UTF32 = BOM_UTF32_LE

Definition at line 56 of file codecs.py.

string codecs.BOM_UTF32_BE = '\x00\x00\xfe\xff'

Definition at line 48 of file codecs.py.

string codecs.BOM_UTF32_LE = '\xff\xfe\x00\x00'

Definition at line 45 of file codecs.py.

string codecs.BOM_UTF8 = '\xef\xbb\xbf'

Constants.

Definition at line 36 of file codecs.py.
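
As an illustration of how the BOM constants are typically used, the sketch below strips a leading UTF-8 byte order mark before decoding. It assumes the constants are bytes objects, as they are in Python 3; the payload is made up.

```python
import codecs

# Made-up payload that starts with a UTF-8 byte order mark.
payload = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Drop the BOM if present, then decode the rest.
if payload.startswith(codecs.BOM_UTF8):
    payload = payload[len(codecs.BOM_UTF8):]
text = payload.decode("utf-8")

# The "utf-8-sig" codec performs the same stripping while decoding.
same_text = (codecs.BOM_UTF8 + b"hello").decode("utf-8-sig")

print(text, same_text)
```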

tuple codecs.ignore_errors = lookup_error("ignore")

Definition at line 1075 of file codecs.py.

tuple codecs.replace_errors = lookup_error("replace")

Definition at line 1076 of file codecs.py.

tuple codecs.strict_errors = lookup_error("strict")

error handlers

Definition at line 1074 of file codecs.py.

tuple codecs.xmlcharrefreplace_errors = lookup_error("xmlcharrefreplace")

Definition at line 1077 of file codecs.py.
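
The effect of the five handler names can be compared by encoding a string that contains a character ASCII cannot represent. This is an illustrative sketch; the sample text is arbitrary.

```python
import codecs

text = "a\u20acb"  # the euro sign has no ASCII representation

ignored = text.encode("ascii", "ignore")             # character dropped
replaced = text.encode("ascii", "replace")           # replaced by "?"
xml_refs = text.encode("ascii", "xmlcharrefreplace")  # numeric XML reference
escaped = text.encode("ascii", "backslashreplace")   # Python escape sequence

# "strict" raises instead of substituting anything.
try:
    text.encode("ascii", "strict")
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True

# lookup_error returns the callable registered under a handler name.
handler = codecs.lookup_error("replace")

print(ignored, replaced, xml_refs, escaped, strict_raised)
```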