python3.2  3.2.2
pickletools Namespace Reference

Classes

class  ArgumentDescriptor
class  StackObject
 Object descriptors.
class  OpcodeInfo
 Descriptors for pickle opcodes.
class  _Example

Functions

def read_uint1
def read_uint2
def read_int4
def read_stringnl
def read_stringnl_noescape
def read_stringnl_noescape_pair
def read_string4
def read_string1
def read_unicodestringnl
def read_unicodestring4
def read_decimalnl_short
def read_decimalnl_long
def read_floatnl
def read_float8
def read_long1
def read_long4
def assure_pickle_consistency
def genops
 A pickle opcode generator.
def optimize
 A pickle optimizer.
def dis
 A symbolic pickle disassembler.
def _test

Variables

list __all__ = ['dis', 'genops', 'optimize']
 bytes_types = pickle.bytes_types
int UP_TO_NEWLINE = 1
 Some pickle opcodes have an argument, following the opcode in the bytestream.
int TAKEN_FROM_ARGUMENT1 = 2
int TAKEN_FROM_ARGUMENT4 = 3
tuple uint1
tuple uint2
tuple int4
tuple stringnl
tuple stringnl_noescape
tuple stringnl_noescape_pair
tuple string4
tuple string1
tuple unicodestringnl
tuple unicodestring4
tuple decimalnl_short
tuple decimalnl_long
tuple floatnl
tuple float8
tuple long1
tuple long4
tuple pyint
tuple pylong
tuple pyinteger_or_bool
tuple pybool
tuple pyfloat
tuple pystring
tuple pybytes
tuple pyunicode
tuple pynone
tuple pytuple
tuple pylist
tuple pydict
tuple anyobject
tuple markobject
tuple stackslice
 I = OpcodeInfo
list opcodes
string code = 'R'
 arg = None,
list stack_before = [anyobject, anyobject]
list stack_after = [anyobject]
int proto = 0
string doc
dictionary name2i = {}
dictionary code2i = {}
dictionary code2op = {}
 Build a code2op dict, mapping opcode characters to OpcodeInfo records.
string _dis_test = r""
string _memo_test = r""
dictionary __test__
tuple parser
string nargs = '*'
string help = 'the file where the output should be written'
tuple args = parser.parse_args()
int annotate = 30
dictionary memo = {}
tuple preamble = args.preamble.format(name=f.name)

Detailed Description

"Executable documentation" for the pickle module.

Extensive comments about the pickle protocols and pickle-machine opcodes
can be found here.  Some functions meant for external use:

genops(pickle)
   Generate all the opcodes in a pickle, as (opcode, arg, position) triples.

dis(pickle, out=None, memo=None, indentlevel=4, annotate=0)
   Print a symbolic disassembly of a pickle.
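
As a minimal sketch of how the two functions above are typically called, the following pickles a small list and inspects it both ways. The sample data and variable names are illustrative assumptions, not part of pickletools.

```python
import io
import pickle
import pickletools

# Illustrative sample pickle; any bytes produced by pickle.dumps() works.
data = pickle.dumps([1, 2], protocol=2)

# genops() yields one (opcode, arg, pos) triple per opcode.
names = [opcode.name for opcode, arg, pos in pickletools.genops(data)]
print(names)

# dis() prints to sys.stdout by default; capture the listing instead.
out = io.StringIO()
pickletools.dis(data, out=out)
listing = out.getvalue()
print(listing)
```

Passing out=io.StringIO() is only a convenience for capturing the listing; omitting it prints the same text to standard output.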

Function Documentation

def pickletools._test ( ) [private]

Definition at line 2361 of file pickletools.py.

def _test():
    import doctest
    return doctest.testmod()

def pickletools.assure_pickle_consistency (   verbose = False)

Definition at line 1783 of file pickletools.py.

def assure_pickle_consistency(verbose=False):

    copy = code2op.copy()
    for name in pickle.__all__:
        if not re.match("[A-Z][A-Z0-9_]+$", name):
            if verbose:
                print("skipping %r: it doesn't look like an opcode name" % name)
            continue
        picklecode = getattr(pickle, name)
        if not isinstance(picklecode, bytes) or len(picklecode) != 1:
            if verbose:
                print(("skipping %r: value %r doesn't look like a pickle "
                       "code" % (name, picklecode)))
            continue
        picklecode = picklecode.decode("latin-1")
        if picklecode in copy:
            if verbose:
                print("checking name %r w/ code %r for consistency" % (
                      name, picklecode))
            d = copy[picklecode]
            if d.name != name:
                raise ValueError("for pickle code %r, pickle.py uses name %r "
                                 "but we're using name %r" % (picklecode,
                                                              name,
                                                              d.name))
            # Forget this one.  Any left over in copy at the end are a problem
            # of a different kind.
            del copy[picklecode]
        else:
            raise ValueError("pickle.py appears to have a pickle opcode with "
                             "name %r and code %r, but we don't" %
                             (name, picklecode))
    if copy:
        msg = ["we appear to have pickle opcodes that pickle.py doesn't have:"]
        for code, d in copy.items():
            msg.append("    name %r with code %r" % (d.name, code))
        raise ValueError("\n".join(msg))

assure_pickle_consistency()

def pickletools.dis (   pickle,
  out = None,
  memo = None,
  indentlevel = 4,
  annotate = 0 
)

A symbolic pickle disassembler.

Produce a symbolic disassembly of a pickle.

'pickle' is a file-like object, or a bytes object, containing at least
one pickle.  The pickle is disassembled from the current position, through
the first STOP opcode encountered.

Optional arg 'out' is a file-like object to which the disassembly is
printed.  It defaults to sys.stdout.

Optional arg 'memo' is a Python dict, used as the pickle's memo.  It
may be mutated by dis(), if the pickle contains PUT or BINPUT opcodes.
Passing the same memo object to another dis() call then allows disassembly
to proceed across multiple pickles that were all created by the same
pickler with the same memo.  Ordinarily you don't need to worry about this.

Optional arg 'indentlevel' is the number of blanks by which to indent
a new MARK level.  It defaults to 4.

Optional arg 'annotate', if nonzero, instructs dis() to add a short
description of the opcode to each line of disassembled output.
The value given to 'annotate' must be an integer and is used as a
hint for the column where the annotation should start.  The default
value is 0, meaning no annotations.

In addition to printing the disassembly, some sanity checks are made:

+ All embedded opcode arguments "make sense".

+ Explicit and implicit pop operations have enough items on the stack.

+ When an opcode implicitly refers to a markobject, a markobject is
  actually on the stack.

+ A memo entry isn't referenced before it's defined.

+ The markobject isn't stored in the memo.

+ A memo entry isn't redefined.
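
The memo sharing and the sanity checks described above can be sketched as follows. The sample objects and the hand-written pickle b"g0\n." are illustrative assumptions chosen to trigger the behaviour; they are not taken from pickletools itself.

```python
import io
import pickle
import pickletools

# One Pickler, two dumps of the same object: the second pickle only
# contains a BINGET that refers to the memo entry made by the first.
buf = io.BytesIO()
pickler = pickle.Pickler(buf, 2)
shared = [1, 2, 3]
pickler.dump(shared)
pickler.dump(shared)
buf.seek(0)

memo = {}
out = io.StringIO()
pickletools.dis(buf, out=out, memo=memo, annotate=30)  # fills memo
pickletools.dis(buf, out=out, memo=memo)               # reuses memo
print(out.getvalue())

# A hand-written pickle that reads memo entry 0 before storing it
# trips the "referenced before it's defined" check.
try:
    pickletools.dis(b"g0\n.", out=io.StringIO())
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

Without the shared memo dict, the second dis() call would fail the same check, because the BINGET refers to an entry that only the first pickle defines.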

Definition at line 1910 of file pickletools.py.

def dis(pickle, out=None, memo=None, indentlevel=4, annotate=0):
    """Produce a symbolic disassembly of a pickle.

    'pickle' is a file-like object, or string, containing a (at least one)
    pickle.  The pickle is disassembled from the current position, through
    the first STOP opcode encountered.

    Optional arg 'out' is a file-like object to which the disassembly is
    printed.  It defaults to sys.stdout.

    Optional arg 'memo' is a Python dict, used as the pickle's memo.  It
    may be mutated by dis(), if the pickle contains PUT or BINPUT opcodes.
    Passing the same memo object to another dis() call then allows disassembly
    to proceed across multiple pickles that were all created by the same
    pickler with the same memo.  Ordinarily you don't need to worry about this.

    Optional arg 'indentlevel' is the number of blanks by which to indent
    a new MARK level.  It defaults to 4.

    Optional arg 'annotate' if nonzero instructs dis() to add short
    description of the opcode on each line of disassembled output.
    The value given to 'annotate' must be an integer and is used as a
    hint for the column where annotation should start.  The default
    value is 0, meaning no annotations.

    In addition to printing the disassembly, some sanity checks are made:

    + All embedded opcode arguments "make sense".

    + Explicit and implicit pop operations have enough items on the stack.

    + When an opcode implicitly refers to a markobject, a markobject is
      actually on the stack.

    + A memo entry isn't referenced before it's defined.

    + The markobject isn't stored in the memo.

    + A memo entry isn't redefined.
    """

    # Most of the hair here is for sanity checks, but most of it is needed
    # anyway to detect when a protocol 0 POP takes a MARK off the stack
    # (which in turn is needed to indent MARK blocks correctly).

    stack = []          # crude emulation of unpickler stack
    if memo is None:
        memo = {}       # crude emulation of unpickler memo
    maxproto = -1       # max protocol number seen
    markstack = []      # bytecode positions of MARK opcodes
    indentchunk = ' ' * indentlevel
    errormsg = None
    annocol = annotate  # column hint for annotations
    for opcode, arg, pos in genops(pickle):
        if pos is not None:
            print("%5d:" % pos, end=' ', file=out)

        line = "%-4s %s%s" % (repr(opcode.code)[1:-1],
                              indentchunk * len(markstack),
                              opcode.name)

        maxproto = max(maxproto, opcode.proto)
        before = opcode.stack_before    # don't mutate
        after = opcode.stack_after      # don't mutate
        numtopop = len(before)

        # See whether a MARK should be popped.
        markmsg = None
        if markobject in before or (opcode.name == "POP" and
                                    stack and
                                    stack[-1] is markobject):
            assert markobject not in after
            if __debug__:
                if markobject in before:
                    assert before[-1] is stackslice
            if markstack:
                markpos = markstack.pop()
                if markpos is None:
                    markmsg = "(MARK at unknown opcode offset)"
                else:
                    markmsg = "(MARK at %d)" % markpos
                # Pop everything at and after the topmost markobject.
                while stack[-1] is not markobject:
                    stack.pop()
                stack.pop()
                # Stop later code from popping too much.
                try:
                    numtopop = before.index(markobject)
                except ValueError:
                    assert opcode.name == "POP"
                    numtopop = 0
            else:
                errormsg = markmsg = "no MARK exists on stack"

        # Check for correct memo usage.
        if opcode.name in ("PUT", "BINPUT", "LONG_BINPUT"):
            assert arg is not None
            if arg in memo:
                errormsg = "memo key %r already defined" % arg
            elif not stack:
                errormsg = "stack is empty -- can't store into memo"
            elif stack[-1] is markobject:
                errormsg = "can't store markobject in the memo"
            else:
                memo[arg] = stack[-1]

        elif opcode.name in ("GET", "BINGET", "LONG_BINGET"):
            if arg in memo:
                assert len(after) == 1
                after = [memo[arg]]     # for better stack emulation
            else:
                errormsg = "memo key %r has never been stored into" % arg

        if arg is not None or markmsg:
            # make a mild effort to align arguments
            line += ' ' * (10 - len(opcode.name))
            if arg is not None:
                line += ' ' + repr(arg)
            if markmsg:
                line += ' ' + markmsg
        if annotate:
            line += ' ' * (annocol - len(line))
            # make a mild effort to align annotations
            annocol = len(line)
            if annocol > 50:
                annocol = annotate
            line += ' ' + opcode.doc.split('\n', 1)[0]
        print(line, file=out)

        if errormsg:
            # Note that we delayed complaining until the offending opcode
            # was printed.
            raise ValueError(errormsg)

        # Emulate the stack effects.
        if len(stack) < numtopop:
            raise ValueError("tries to pop %d items from stack with "
                             "only %d items" % (numtopop, len(stack)))
        if numtopop:
            del stack[-numtopop:]
        if markobject in after:
            assert markobject not in before
            markstack.append(pos)

        stack.extend(after)

    print("highest protocol among opcodes =", maxproto, file=out)
    if stack:
        raise ValueError("stack not empty after STOP: %r" % stack)

def pickletools.genops (   pickle)

A pickle opcode generator.

Generate all the opcodes in a pickle.

'pickle' is a file-like object, or a bytes object, containing the pickle.

Each opcode in the pickle is generated, from the current pickle position,
stopping after a STOP opcode is delivered.  A triple is generated for
each opcode:

    opcode, arg, pos

opcode is an OpcodeInfo record, describing the current opcode.

If the opcode has an argument embedded in the pickle, arg is its decoded
value, as a Python object.  If the opcode doesn't have an argument, arg
is None.

If the pickle has a tell() method, pos was the value of pickle.tell()
before reading the current opcode.  If the pickle is a bytes object,
it's wrapped in a BytesIO object, and the latter's tell() result is
used.  Else (the pickle doesn't have a tell(), and it's not obvious how
to query its current position) pos is None.
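
The three cases for pos can be illustrated with a short sketch. NoTell is a hypothetical wrapper, introduced here only to show what happens when the input has no tell() method; the sample string is likewise an assumption.

```python
import io
import pickle
import pickletools

data = pickle.dumps("hi", protocol=0)

# A bytes object is wrapped in BytesIO, so every triple carries an offset.
triples = list(pickletools.genops(data))
for opcode, arg, pos in triples:
    print(pos, opcode.name, repr(arg))


class NoTell:
    """Hypothetical file-like object exposing read() and readline() only."""

    def __init__(self, raw):
        self._f = io.BytesIO(raw)

    def read(self, n):
        return self._f.read(n)

    def readline(self):
        return self._f.readline()


# Without tell(), genops() cannot report positions and yields pos=None.
positions = [pos for opcode, arg, pos in pickletools.genops(NoTell(data))]
print(positions)
```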

Definition at line 1827 of file pickletools.py.

def genops(pickle):
    """Generate all the opcodes in a pickle.

    'pickle' is a file-like object, or string, containing the pickle.

    Each opcode in the pickle is generated, from the current pickle position,
    stopping after a STOP opcode is delivered.  A triple is generated for
    each opcode:

        opcode, arg, pos

    opcode is an OpcodeInfo record, describing the current opcode.

    If the opcode has an argument embedded in the pickle, arg is its decoded
    value, as a Python object.  If the opcode doesn't have an argument, arg
    is None.

    If the pickle has a tell() method, pos was the value of pickle.tell()
    before reading the current opcode.  If the pickle is a bytes object,
    it's wrapped in a BytesIO object, and the latter's tell() result is
    used.  Else (the pickle doesn't have a tell(), and it's not obvious how
    to query its current position) pos is None.
    """

    if isinstance(pickle, bytes_types):
        import io
        pickle = io.BytesIO(pickle)

    if hasattr(pickle, "tell"):
        getpos = pickle.tell
    else:
        getpos = lambda: None

    while True:
        pos = getpos()
        code = pickle.read(1)
        opcode = code2op.get(code.decode("latin-1"))
        if opcode is None:
            if code == b"":
                raise ValueError("pickle exhausted before seeing STOP")
            else:
                raise ValueError("at position %s, opcode %r unknown" % (
                                 pos is None and "<unknown>" or pos,
                                 code))
        if opcode.arg is None:
            arg = None
        else:
            arg = opcode.arg.reader(pickle)
        yield opcode, arg, pos
        if code == b'.':
            assert opcode.name == 'STOP'
            break

def pickletools.optimize (   p)

A pickle optimizer.
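
A minimal sketch of the optimizer's effect, assuming an illustrative nested list as input: the pickler memoizes each container and string with a PUT opcode, and since nothing is referenced twice, none of those PUTs is needed.

```python
import pickle
import pickletools

# Illustrative data with no repeated objects, so no GET opcodes are emitted.
data = pickle.dumps([[1, 2], "a", "b"], protocol=2)
slim = pickletools.optimize(data)

print(len(data), len(slim))

# The optimized pickle still loads to an equal object.
same = pickle.loads(slim) == pickle.loads(data)
remaining_puts = [op.name for op, arg, pos in pickletools.genops(slim)
                  if "PUT" in op.name]
print(same, remaining_puts)
```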

Definition at line 1883 of file pickletools.py.

def optimize(p):
    'Optimize a pickle string by removing unused PUT opcodes'
    gets = set()            # set of args used by a GET opcode
    puts = []               # (arg, startpos, stoppos) for the PUT opcodes
    prevpos = None          # set to pos if previous opcode was a PUT
    for opcode, arg, pos in genops(p):
        if prevpos is not None:
            puts.append((prevarg, prevpos, pos))
            prevpos = None
        if 'PUT' in opcode.name:
            prevarg, prevpos = arg, pos
        elif 'GET' in opcode.name:
            gets.add(arg)

    # Copy the pickle string except for PUTS without a corresponding GET
    s = []
    i = 0
    for arg, start, stop in puts:
        j = stop if (arg in gets) else start
        s.append(p[i:j])
        i = stop
    s.append(p[i:])
    return b''.join(s)

def pickletools.read_decimalnl_long (   f)

Definition at line 518 of file pickletools.py.

def read_decimalnl_long(f):
    r"""
    >>> import io

    >>> read_decimalnl_long(io.BytesIO(b"1234L\n56"))
    1234

    >>> read_decimalnl_long(io.BytesIO(b"123456789012345678901234L\n6"))
    123456789012345678901234
    """

    s = read_stringnl(f, decode=False, stripquotes=False)
    if s[-1:] == b'L':
        s = s[:-1]
    return int(s)

def pickletools.read_decimalnl_short (   f)

Definition at line 489 of file pickletools.py.

def read_decimalnl_short(f):
    r"""
    >>> import io
    >>> read_decimalnl_short(io.BytesIO(b"1234\n56"))
    1234

    >>> read_decimalnl_short(io.BytesIO(b"1234L\n56"))
    Traceback (most recent call last):
    ...
    ValueError: trailing 'L' not allowed in b'1234L'
    """

    s = read_stringnl(f, decode=False, stripquotes=False)
    if s.endswith(b"L"):
        raise ValueError("trailing 'L' not allowed in %r" % s)

    # It's not necessarily true that the result fits in a Python short int:
    # the pickle may have been written on a 64-bit box.  There's also a hack
    # for True and False here.
    if s == b"00":
        return False
    elif s == b"01":
        return True

    try:
        return int(s)
    except OverflowError:
        return int(s)

def pickletools.read_float8 (   f)

Definition at line 581 of file pickletools.py.

def read_float8(f):
    r"""
    >>> import io, struct
    >>> raw = struct.pack(">d", -1.25)
    >>> raw
    b'\xbf\xf4\x00\x00\x00\x00\x00\x00'
    >>> read_float8(io.BytesIO(raw + b"\n"))
    -1.25
    """

    data = f.read(8)
    if len(data) == 8:
        return _unpack(">d", data)[0]
    raise ValueError("not enough data in stream to read float8")

def pickletools.read_floatnl (   f)

Definition at line 559 of file pickletools.py.

def read_floatnl(f):
    r"""
    >>> import io
    >>> read_floatnl(io.BytesIO(b"-1.25\n6"))
    -1.25
    """
    s = read_stringnl(f, decode=False, stripquotes=False)
    return float(s)

def pickletools.read_int4 (   f)

Definition at line 247 of file pickletools.py.

def read_int4(f):
    r"""
    >>> import io
    >>> read_int4(io.BytesIO(b'\xff\x00\x00\x00'))
    255
    >>> read_int4(io.BytesIO(b'\x00\x00\x00\x80')) == -(2**31)
    True
    """

    data = f.read(4)
    if len(data) == 4:
        return _unpack("<i", data)[0]
    raise ValueError("not enough data in stream to read int4")

def pickletools.read_long1 (   f)

Definition at line 619 of file pickletools.py.

def read_long1(f):
    r"""
    >>> import io
    >>> read_long1(io.BytesIO(b"\x00"))
    0
    >>> read_long1(io.BytesIO(b"\x02\xff\x00"))
    255
    >>> read_long1(io.BytesIO(b"\x02\xff\x7f"))
    32767
    >>> read_long1(io.BytesIO(b"\x02\x00\xff"))
    -256
    >>> read_long1(io.BytesIO(b"\x02\x00\x80"))
    -32768
    """

    n = read_uint1(f)
    data = f.read(n)
    if len(data) != n:
        raise ValueError("not enough data in stream to read long1")
    return decode_long(data)

def pickletools.read_long4 (   f)

Definition at line 651 of file pickletools.py.

def read_long4(f):
    r"""
    >>> import io
    >>> read_long4(io.BytesIO(b"\x02\x00\x00\x00\xff\x00"))
    255
    >>> read_long4(io.BytesIO(b"\x02\x00\x00\x00\xff\x7f"))
    32767
    >>> read_long4(io.BytesIO(b"\x02\x00\x00\x00\x00\xff"))
    -256
    >>> read_long4(io.BytesIO(b"\x02\x00\x00\x00\x00\x80"))
    -32768
    >>> read_long4(io.BytesIO(b"\x00\x00\x00\x00"))
    0
    """

    n = read_int4(f)
    if n < 0:
        raise ValueError("long4 byte count < 0: %d" % n)
    data = f.read(n)
    if len(data) != n:
        raise ValueError("not enough data in stream to read long4")
    return decode_long(data)

def pickletools.read_string1 (   f)

Definition at line 395 of file pickletools.py.

def read_string1(f):
    r"""
    >>> import io
    >>> read_string1(io.BytesIO(b"\x00"))
    ''
    >>> read_string1(io.BytesIO(b"\x03abcdef"))
    'abc'
    """

    n = read_uint1(f)
    assert n >= 0
    data = f.read(n)
    if len(data) == n:
        return data.decode("latin-1")
    raise ValueError("expected %d bytes in a string1, but only %d remain" %
                     (n, len(data)))

def pickletools.read_string4 (   f)

Definition at line 361 of file pickletools.py.

def read_string4(f):
    r"""
    >>> import io
    >>> read_string4(io.BytesIO(b"\x00\x00\x00\x00abc"))
    ''
    >>> read_string4(io.BytesIO(b"\x03\x00\x00\x00abcdef"))
    'abc'
    >>> read_string4(io.BytesIO(b"\x00\x00\x00\x03abcdef"))
    Traceback (most recent call last):
    ...
    ValueError: expected 50331648 bytes in a string4, but only 6 remain
    """

    n = read_int4(f)
    if n < 0:
        raise ValueError("string4 byte count < 0: %d" % n)
    data = f.read(n)
    if len(data) == n:
        return data.decode("latin-1")
    raise ValueError("expected %d bytes in a string4, but only %d remain" %
                     (n, len(data)))

def pickletools.read_stringnl (   f,
  decode = True,
  stripquotes = True 
)

Definition at line 268 of file pickletools.py.

def read_stringnl(f, decode=True, stripquotes=True):
    r"""
    >>> import io
    >>> read_stringnl(io.BytesIO(b"'abcd'\nefg\n"))
    'abcd'

    >>> read_stringnl(io.BytesIO(b"\n"))
    Traceback (most recent call last):
    ...
    ValueError: no string quotes around b''

    >>> read_stringnl(io.BytesIO(b"\n"), stripquotes=False)
    ''

    >>> read_stringnl(io.BytesIO(b"''\n"))
    ''

    >>> read_stringnl(io.BytesIO(b'"abcd"'))
    Traceback (most recent call last):
    ...
    ValueError: no newline found when trying to read stringnl

    Embedded escapes are undone in the result.
    >>> read_stringnl(io.BytesIO(br"'a\n\\b\x00c\td'" + b"\n'e'"))
    'a\n\\b\x00c\td'
    """

    data = f.readline()
    if not data.endswith(b'\n'):
        raise ValueError("no newline found when trying to read stringnl")
    data = data[:-1]    # lose the newline

    if stripquotes:
        for q in (b'"', b"'"):
            if data.startswith(q):
                if not data.endswith(q):
                    raise ValueError("string quote %r not found at both "
                                     "ends of %r" % (q, data))
                data = data[1:-1]
                break
        else:
            raise ValueError("no string quotes around %r" % data)

    if decode:
        data = codecs.escape_decode(data)[0].decode("ascii")
    return data

def pickletools.read_stringnl_noescape (   f)

Definition at line 325 of file pickletools.py.

def read_stringnl_noescape(f):
    return read_stringnl(f, stripquotes=False)

def pickletools.read_stringnl_noescape_pair (   f)

Definition at line 339 of file pickletools.py.

def read_stringnl_noescape_pair(f):
    r"""
    >>> import io
    >>> read_stringnl_noescape_pair(io.BytesIO(b"Queue\nEmpty\njunk"))
    'Queue Empty'
    """

    return "%s %s" % (read_stringnl_noescape(f), read_stringnl_noescape(f))

def pickletools.read_uint1 (   f)

Definition at line 207 of file pickletools.py.

def read_uint1(f):
    r"""
    >>> import io
    >>> read_uint1(io.BytesIO(b'\xff'))
    255
    """

    data = f.read(1)
    if data:
        return data[0]
    raise ValueError("not enough data in stream to read uint1")

def pickletools.read_uint2 (   f)

Definition at line 226 of file pickletools.py.

def read_uint2(f):
    r"""
    >>> import io
    >>> read_uint2(io.BytesIO(b'\xff\x00'))
    255
    >>> read_uint2(io.BytesIO(b'\xff\xff'))
    65535
    """

    data = f.read(2)
    if len(data) == 2:
        return _unpack("<H", data)[0]
    raise ValueError("not enough data in stream to read uint2")

def pickletools.read_unicodestring4 (   f)

Definition at line 449 of file pickletools.py.

def read_unicodestring4(f):
    r"""
    >>> import io
    >>> s = 'abcd\uabcd'
    >>> enc = s.encode('utf-8')
    >>> enc
    b'abcd\xea\xaf\x8d'
    >>> n = bytes([len(enc), 0, 0, 0])  # little-endian 4-byte length
    >>> t = read_unicodestring4(io.BytesIO(n + enc + b'junk'))
    >>> s == t
    True

    >>> read_unicodestring4(io.BytesIO(n + enc[:-1]))
    Traceback (most recent call last):
    ...
    ValueError: expected 7 bytes in a unicodestring4, but only 6 remain
    """

    n = read_int4(f)
    if n < 0:
        raise ValueError("unicodestring4 byte count < 0: %d" % n)
    data = f.read(n)
    if len(data) == n:
        return str(data, 'utf-8', 'surrogatepass')
    raise ValueError("expected %d bytes in a unicodestring4, but only %d "
                     "remain" % (n, len(data)))

def pickletools.read_unicodestringnl (   f)

Definition at line 424 of file pickletools.py.

def read_unicodestringnl(f):
    r"""
    >>> import io
    >>> read_unicodestringnl(io.BytesIO(b"abc\\uabcd\njunk")) == 'abc\uabcd'
    True
    """

    data = f.readline()
    if not data.endswith(b'\n'):
        raise ValueError("no newline found when trying to read "
                         "unicodestringnl")
    data = data[:-1]    # lose the newline
    return str(data, 'raw-unicode-escape')


Variable Documentation

Definition at line 17 of file pickletools.py.

Initial value:
{'disassembler_test': _dis_test,
 'disassembler_memo_test': _memo_test,
}

Definition at line 2357 of file pickletools.py.

Definition at line 2065 of file pickletools.py.

Definition at line 2328 of file pickletools.py.

Definition at line 2398 of file pickletools.py.

Initial value:
StackObject(
    name='any',
    obtype=object,
    doc="Any kind of object whatsoever.")

Definition at line 785 of file pickletools.py.

pickletools.arg = None,

Definition at line 1547 of file pickletools.py.

tuple pickletools.args = parser.parse_args()

Definition at line 2394 of file pickletools.py.

Definition at line 19 of file pickletools.py.

Definition at line 1546 of file pickletools.py.

Definition at line 1758 of file pickletools.py.

Build a code2op dict, mapping opcode characters to OpcodeInfo records.

Also ensure we've got the same stuff as pickle.py, although the introspection here is dicey.

Definition at line 1778 of file pickletools.py.
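
As a hedged illustration of the mapping described above, the sketch below looks up one opcode character in code2op and compares it with the constant of the same name in pickle. Note that code2op is a module-level name that is not listed in __all__, so treating it as a stable interface is an assumption.

```python
import pickle
import pickletools

# Keys are one-character strings; values are OpcodeInfo records.
info = pickletools.code2op["R"]
print(info.name, info.proto)
print([obj.name for obj in info.stack_before], "->",
      [obj.name for obj in info.stack_after])

# The same opcode is exposed by pickle as a one-byte constant.
same_code = getattr(pickle, info.name) == info.code.encode("latin-1")
print(same_code)
```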

tuple pickletools.decimalnl_long

Initial value:
ArgumentDescriptor(
                     name='decimalnl_long',
                     n=UP_TO_NEWLINE,
                     reader=read_decimalnl_long,
                     doc="""A newline-terminated decimal integer literal.
                         This has a trailing 'L', and can represent integers
                         of any size.
                         """)

Definition at line 548 of file pickletools.py.

tuple pickletools.decimalnl_short

Initial value:
ArgumentDescriptor(
                      name='decimalnl_short',
                      n=UP_TO_NEWLINE,
                      reader=read_decimalnl_short,
                      doc="""A newline-terminated decimal integer literal.
                          This never has a trailing 'L', and the integer fit
                          in a short Python int on the box where the pickle
                          was written -- but there's no guarantee it will fit
                          in a short Python int on the box where the pickle
                          is read.
                          """)

Definition at line 535 of file pickletools.py.
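
The two decimal argument formats can be seen by decoding small hand-written protocol 0 pickles. This is a sketch: first_arg is a hypothetical helper, and the byte strings are illustrative inputs rather than output captured from a pickler.

```python
import pickletools


def first_arg(data):
    """Return (opcode name, decoded argument) for the first opcode in data."""
    opcode, arg, pos = next(pickletools.genops(data))
    return opcode.name, arg


# INT takes a decimalnl_short argument: no trailing 'L' allowed.
print(first_arg(b"I42\n."))

# "00" and "01" are the special spellings of False and True.
print(first_arg(b"I01\n."))

# LONG takes a decimalnl_long argument: the trailing 'L' is stripped.
print(first_arg(b"L1234L\n."))
```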

Initial value:
"""Push an object built from a callable and an argument tuple.

      The opcode is named to remind of the __reduce__() method.

      Stack before: ... callable pytuple
      Stack after:  ... callable(*pytuple)

      The callable and the argument tuple are the first two items returned
      by a __reduce__ method.  Applying the callable to the argtuple is
      supposed to reproduce the original object, or at least get it started.
      If the __reduce__ method returns a 3-tuple, the last component is an
      argument to be passed to the object's __setstate__, and then the REDUCE
      opcode is followed by code to create setstate's argument, and then a
      BUILD opcode to apply  __setstate__ to that argument.

      If not isinstance(callable, type), REDUCE complains unless the
      callable has been registered with the copyreg module's
      safe_constructors dict, or the callable has a magic
      '__safe_for_unpickling__' attribute with a true value.  I'm not sure
      why it does this, but I've sure seen this complaint often enough when
      I didn't want to <wink>.
      """

Definition at line 1551 of file pickletools.py.
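To make the "callable plus argument tuple" idea concrete, here is a short sketch using `fractions.Fraction`, chosen only because it defines its own `__reduce__` in the standard library. It applies the callable to the tuple by hand, then checks that the pickle stream for the same object contains a REDUCE opcode. The exact contents of the argument tuple may differ between Python versions, so the example does not depend on them.

```python
import pickle
import pickletools
from fractions import Fraction

original = Fraction(3, 4)

# __reduce__ returns (callable, argument tuple, ...); applying the
# callable to the tuple rebuilds an equal object.
reduced = original.__reduce__()
func, args = reduced[0], reduced[1]
rebuilt = func(*args)
print(rebuilt == original)

# The pickle for the same object pushes the callable and the tuple,
# then uses REDUCE to call one on the other.
data = pickle.dumps(original, protocol=2)
opcode_names = [op.name for op, arg, pos in pickletools.genops(data)]
print(opcode_names)

roundtrip = pickle.loads(data)
```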

Initial value:
ArgumentDescriptor(
    name='float8',
    n=8,
    reader=read_float8,
    doc="""An 8-byte binary representation of a float, big-endian.

    The format is unique to Python, and shared with the struct
    module (format string '>d') "in theory" (the struct and pickle
    implementations don't share the code -- they should).  It's
    strongly related to the IEEE-754 double format, and, in normal
    cases, is in fact identical to the big-endian 754 double format.
    On other boxes the dynamic range is limited to that of a 754
    double, and "add a half and chop" rounding is used to reduce
    the precision to 53 bits.  However, even on a 754 box,
    infinities, NaNs, and minus zero may not be handled correctly
    (may not survive roundtrip pickling intact).
    """)

Definition at line 597 of file pickletools.py.
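The relationship with the struct format string '>d' can be checked on an ordinary machine. This sketch assumes a binary pickle protocol (2 here), under which floats are written with the BINFLOAT opcode, and it uses the position reported by `genops` to slice out the 8 argument bytes.

```python
import pickle
import pickletools
import struct

value = 1.5
data = pickle.dumps(value, protocol=2)

payload = None
for opcode, arg, pos in pickletools.genops(data):
    if opcode.name == "BINFLOAT":
        # The 8 argument bytes follow the 1-byte opcode at position pos.
        payload = data[pos + 1:pos + 9]

# On an IEEE-754 machine the payload matches struct's big-endian double.
print(payload == struct.pack(">d", value))
decoded = struct.unpack(">d", payload)[0]
```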

Initial value:
ArgumentDescriptor(
    name='floatnl',
    n=UP_TO_NEWLINE,
    reader=read_floatnl,
    doc="""A newline-terminated decimal floating literal.

    In general this requires 17 significant digits for roundtrip
    identity, and pickling then unpickling infinities, NaNs, and
    minus zero doesn't work across boxes, or on some boxes even
    on itself (e.g., Windows can't read the strings it produces
    for infinities or NaNs).
    """)

Definition at line 568 of file pickletools.py.
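The 17-digit requirement is easy to observe with plain string formatting. The sketch below is independent of pickle itself and assumes IEEE-754 doubles. It formats one value with 15 and with 17 significant digits and checks which text converts back to the same float.

```python
x = 0.1 + 0.2  # stored as the double nearest to 0.30000000000000004

short = "%.15g" % x   # 15 significant digits
full = "%.17g" % x    # 17 significant digits

# Only the 17-digit form is guaranteed to convert back to the same double.
print(short, float(short) == x)
print(full, float(full) == x)
```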

string pickletools.help = 'the file where the output should be written'

Definition at line 2374 of file pickletools.py.

Definition at line 884 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name='int4',
    n=4,
    reader=read_int4,
    doc="Four-byte signed integer, little-endian, 2's complement.")

Definition at line 261 of file pickletools.py.
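As an illustration of the int4 encoding, the following sketch decodes a few hand-written 4-byte values with `struct` (format '<i') and cross-checks them with `int.from_bytes`. The sample byte strings are chosen for the example only.

```python
import struct

samples = [
    b"\xff\x00\x00\x00",  # small positive value
    b"\xff\xff\xff\x7f",  # largest 32-bit signed value
    b"\x00\x00\x00\x80",  # smallest 32-bit signed value
]

decoded = [struct.unpack("<i", raw)[0] for raw in samples]
print(decoded)

# int.from_bytes gives the same answer for little-endian two's complement.
checked = [int.from_bytes(raw, "little", signed=True) for raw in samples]
```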

Initial value:
ArgumentDescriptor(
    name="long1",
    n=TAKEN_FROM_ARGUMENT1,
    reader=read_long1,
    doc="""A binary long, little-endian, using 1-byte size.

    This first reads one byte as an unsigned size, then reads that
    many bytes and interprets them as a little-endian 2's-complement long.
    If the size is 0, that's taken as a shortcut for the long 0L.
    """)

Definition at line 640 of file pickletools.py.
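The long1 layout described above can be sketched in a few lines. `decode_long1` is a hypothetical helper written for this example, not part of pickletools. It takes the size byte plus payload as one bytes object and relies on `int.from_bytes` for the little-endian two's-complement conversion.

```python
def decode_long1(payload):
    """Decode a long1 argument: 1-byte unsigned size, then that many bytes."""
    size = payload[0]
    body = payload[1:1 + size]
    if len(body) != size:
        raise ValueError("expected %d bytes, found %d" % (size, len(body)))
    # An empty body decodes to 0, matching the size-0 shortcut.
    return int.from_bytes(body, "little", signed=True)


results = [
    decode_long1(b"\x00"),          # size 0
    decode_long1(b"\x02\xff\x00"),  # positive, needs a padding byte
    decode_long1(b"\x02\xff\x7f"),
    decode_long1(b"\x02\x00\xff"),  # negative values
    decode_long1(b"\x02\x00\x80"),
]
print(results)
```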

Initial value:
ArgumentDescriptor(
    name="long4",
    n=TAKEN_FROM_ARGUMENT4,
    reader=read_long4,
    doc="""A binary representation of a long, little-endian.

    This first reads four bytes as a signed size (but requires the
    size to be >= 0), then reads that many bytes and interprets them
    as a little-endian 2's-complement long.  If the size is 0, that's taken
    as a shortcut for the int 0, although LONG1 should really be used
    then instead (and in any case where # of bytes < 256).
    """)

Definition at line 674 of file pickletools.py.

Initial value:
StackObject(
    name="mark",
    obtype=StackObject,
    doc="""'The mark' is a unique object.

    Opcodes that operate on a variable number of objects
    generally don't embed the count of objects in the opcode,
    or pull it off the stack.  Instead the MARK opcode is used
    to push a special marker object on the stack, and then
    some other opcodes grab all the objects from the top of
    the stack down to (but not including) the topmost marker
    object.
    """)

Definition at line 790 of file pickletools.py.
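A quick way to see the mark in use is to list the opcodes of a pickled list. This sketch assumes protocol 2 and a list with several items, for which the pickler is expected to push a MARK, then the items, then a single APPENDS that consumes everything above the mark.

```python
import pickle
import pickletools

data = pickle.dumps([1, 2, 3], protocol=2)
names = [op.name for op, arg, pos in pickletools.genops(data)]
print(names)

# No item count is stored anywhere: APPENDS takes whatever sits above MARK.
mark_before_appends = names.index("MARK") < names.index("APPENDS")
```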

Definition at line 2405 of file pickletools.py.

Definition at line 1757 of file pickletools.py.

Definition at line 2371 of file pickletools.py.

Definition at line 885 of file pickletools.py.

Initial value:
argparse.ArgumentParser(
    description='disassemble one or more pickle files')

Definition at line 2367 of file pickletools.py.

Definition at line 2407 of file pickletools.py.

Definition at line 1550 of file pickletools.py.

Initial value:
StackObject(
    name='bool',
    obtype=(bool,),
    doc="A Python bool object.")

Definition at line 740 of file pickletools.py.

Initial value:
StackObject(
    name='bytes',
    obtype=bytes,
    doc="A Python bytes object.")

Definition at line 755 of file pickletools.py.

Initial value:
StackObject(
    name="dict",
    obtype=dict,
    doc="A Python dict object.")

Definition at line 780 of file pickletools.py.

Initial value:
StackObject(
    name='float',
    obtype=float,
    doc="A Python float object.")

Definition at line 745 of file pickletools.py.

Initial value:
StackObject(
    name='int',
    obtype=int,
    doc="A short (as opposed to long) Python integer object.")

Definition at line 724 of file pickletools.py.

Initial value:
StackObject(
    name='int_or_bool',
    obtype=(int, bool),
    doc="A Python integer object (short or long), or "
        "a Python bool.")

Definition at line 734 of file pickletools.py.

Initial value:
StackObject(
    name="list",
    obtype=list,
    doc="A Python list object.")

Definition at line 775 of file pickletools.py.

Initial value:
StackObject(
    name='long',
    obtype=int,
    doc="A long (as opposed to short) Python integer object.")

Definition at line 729 of file pickletools.py.

Initial value:
StackObject(
    name="None",
    obtype=type(None),
    doc="The Python None object.")

Definition at line 765 of file pickletools.py.

Initial value:
StackObject(
    name='string',
    obtype=bytes,
    doc="A Python (8-bit) string object.")

Definition at line 750 of file pickletools.py.

Initial value:
StackObject(
    name="tuple",
    obtype=tuple,
    doc="A Python tuple object.")

Definition at line 770 of file pickletools.py.

Initial value:
StackObject(
    name='str',
    obtype=str,
    doc="A Python (Unicode) string object.")

Definition at line 760 of file pickletools.py.

Definition at line 1549 of file pickletools.py.

Definition at line 1548 of file pickletools.py.

Initial value:
StackObject(
    name="stackslice",
    obtype=StackObject,
    doc="""An object representing a contiguous slice of the stack.

    This is used in conjunction with markobject, to represent all
    of the stack following the topmost markobject.  For example,
    the POP_MARK opcode changes the stack from

        [..., markobject, stackslice]
    to
        [...]

    No matter how many objects are on the stack after the topmost
    markobject, POP_MARK gets rid of all of them (including the
    topmost markobject too).
    """)

Definition at line 804 of file pickletools.py.
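The POP_MARK behaviour described in this docstring can be modelled with an ordinary Python list standing in for the unpickler's stack. Both `MARK` and `pop_mark` below are hypothetical names used only for this sketch, and the real unpickler is not implemented this way.

```python
MARK = object()  # hypothetical stand-in for the unique mark object


def pop_mark(stack):
    """Remove the topmost MARK and everything above it, in place."""
    # Search from the top of the stack (the end of the list) downwards.
    for index in range(len(stack) - 1, -1, -1):
        if stack[index] is MARK:
            del stack[index:]
            return stack
    raise ValueError("no mark object on the stack")


stack = ["a", MARK, "b", MARK, 1, 2, 3]
pop_mark(stack)
# Only the topmost mark and the slice above it are gone.
print(len(stack))
```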

Initial value:
ArgumentDescriptor(
    name="string1",
    n=TAKEN_FROM_ARGUMENT1,
    reader=read_string1,
    doc="""A counted string.

    The first argument is a 1-byte unsigned int giving the number
    of bytes in the string, and the second argument is that many
    bytes.
    """)

Definition at line 412 of file pickletools.py.
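A counted string of this kind can be read with two calls to `read`. The helper below, `read_counted1`, is a hypothetical illustration rather than the module's own `read_string1`. It returns the raw bytes and raises `ValueError` when the stream is too short.

```python
import io


def read_counted1(stream):
    """Read a 1-byte unsigned length, then that many bytes."""
    head = stream.read(1)
    if len(head) != 1:
        raise ValueError("missing 1-byte length")
    size = head[0]
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("expected %d bytes, found %d" % (size, len(data)))
    return data


first = read_counted1(io.BytesIO(b"\x03abcdef"))  # only 3 bytes are consumed
empty = read_counted1(io.BytesIO(b"\x00"))        # a zero-length string
print(first, empty)
```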

Initial value:
ArgumentDescriptor(
    name="string4",
    n=TAKEN_FROM_ARGUMENT4,
    reader=read_string4,
    doc="""A counted string.

    The first argument is a 4-byte little-endian signed int giving
    the number of bytes in the string, and the second argument is
    that many bytes.
    """)

Definition at line 383 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name='stringnl',
    n=UP_TO_NEWLINE,
    reader=read_stringnl,
    doc="""A newline-terminated string.

    This is a repr-style string, with embedded escapes, and
    bracketing quotes.
    """)

Definition at line 315 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name='stringnl_noescape',
    n=UP_TO_NEWLINE,
    reader=read_stringnl_noescape,
    doc="""A newline-terminated string.

    This is a str-style string, without embedded escapes,
    or bracketing quotes.  It should consist solely of
    printable ASCII characters.
    """)

Definition at line 328 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name='stringnl_noescape_pair',
    n=UP_TO_NEWLINE,
    reader=read_stringnl_noescape_pair,
    doc="""A pair of newline-terminated strings.

    These are str-style strings, without embedded
    escapes, or bracketing quotes.  They should
    consist solely of printable ASCII characters.
    The pair is returned as a single string, with
    a single blank separating the two strings.
    """)

Definition at line 348 of file pickletools.py.

Definition at line 168 of file pickletools.py.

Definition at line 169 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name='uint1',
    n=1,
    reader=read_uint1,
    doc="One-byte unsigned integer.")

Definition at line 219 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name='uint2',
    n=2,
    reader=read_uint2,
    doc="Two-byte unsigned integer, little-endian.")

Definition at line 240 of file pickletools.py.

Initial value:
ArgumentDescriptor(
    name="unicodestring4",
    n=TAKEN_FROM_ARGUMENT4,
    reader=read_unicodestring4,
    doc="""A counted Unicode string.

    The first argument is a 4-byte little-endian signed int
    giving the number of bytes in the string, and the second
    argument -- the UTF-8 encoding of the Unicode string --
    contains that many bytes.
    """)

Definition at line 476 of file pickletools.py.
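Note that the count is a number of UTF-8 bytes, not a number of characters. The following sketch builds such an argument by hand and reads it back. `read_utf8_counted4` is a hypothetical helper for illustration, and the trailing bytes are there only to show that reading stops at the counted length.

```python
import io
import struct


def read_utf8_counted4(stream):
    """Read a 4-byte little-endian signed length, then that many UTF-8 bytes."""
    head = stream.read(4)
    if len(head) != 4:
        raise ValueError("missing 4-byte length")
    (size,) = struct.unpack("<i", head)
    if size < 0:
        raise ValueError("negative length %d" % size)
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("expected %d bytes, found %d" % (size, len(data)))
    return data.decode("utf-8")


text = "abcd\uabcd"                 # 5 characters
encoded = text.encode("utf-8")      # the last character takes 3 bytes
framed = struct.pack("<i", len(encoded)) + encoded + b"trailing"

result = read_utf8_counted4(io.BytesIO(framed))
print(len(text), len(encoded), result == text)
```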

Initial value:
ArgumentDescriptor(
    name='unicodestringnl',
    n=UP_TO_NEWLINE,
    reader=read_unicodestringnl,
    doc="""A newline-terminated Unicode string.

    This is raw-unicode-escape encoded, so consists of
    printable ASCII characters, and may contain embedded
    escape sequences.
    """)

Definition at line 438 of file pickletools.py.
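Python's built-in 'raw-unicode-escape' codec shows what such an argument looks like on the wire. In this small sketch, with a sample string chosen for illustration, a character outside Latin-1 becomes a `\uXXXX` escape while ASCII text passes through unchanged.

```python
text = "price: \u20ac5"  # contains the euro sign, U+20AC

encoded = text.encode("raw-unicode-escape")
print(encoded)

# Decoding with the same codec restores the original string.
decoded = encoded.decode("raw-unicode-escape")
```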

Some pickle opcodes have an argument, following the opcode in the bytestream.

An argument is of a specific type, described by an instance of ArgumentDescriptor. These are not to be confused with arguments taken off the stack -- ArgumentDescriptor applies only to arguments embedded in the opcode stream, immediately following an opcode.

Definition at line 164 of file pickletools.py.
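The distinction between embedded arguments and stack arguments shows up when iterating over a pickle with `genops`. Each yielded opcode carries an `arg` attribute that is either an ArgumentDescriptor or None. The sketch below assumes protocol 2 and a small integer so that the opcode sequence stays short.

```python
import pickle
import pickletools

data = pickle.dumps(5, protocol=2)

rows = []
for opcode, arg, pos in pickletools.genops(data):
    # opcode.arg describes the argument embedded in the stream, if any.
    arg_type = opcode.arg.name if opcode.arg is not None else None
    rows.append((opcode.name, arg_type, arg))

for row in rows:
    print(row)
```

STOP has no embedded argument, so its descriptor is None; the object it returns comes from the stack instead.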