python3.2  3.2.2
distutils.text_file.TextFile Class Reference

Public Member Functions

def __init__
def open
def close
def gen_error
def error
def warn
def readline
def readlines
def unreadline

Public Attributes

 filename
 file
 current_line
 linebuf

Static Public Attributes

dictionary default_options

Detailed Description

Provides a file-like object that takes care of all the things you
   commonly want to do when processing a text file that has some
   line-by-line syntax: strip comments (as long as "#" is your
   comment character), skip blank lines, join adjacent lines by
   escaping the newline (i.e. a backslash at end of line), strip
   leading and/or trailing whitespace.  All of these are optional
   and independently controllable.

   Provides a 'warn()' method so you can generate warning messages that
   report physical line number, even if the logical line in question
   spans multiple physical lines.  Also provides 'unreadline()' for
   implementing line-at-a-time lookahead.

   Constructor is called as:

       TextFile (filename=None, file=None, **options)

   It raises RuntimeError if both 'filename' and 'file' are None;
   'filename' should be a string, and 'file' a file object (or
   something that provides 'readline()' and 'close()' methods).  It is
   recommended that you supply at least 'filename', so that TextFile
   can include it in warning messages.  If 'file' is not supplied,
   TextFile creates its own using 'io.open()'.

   The options are all boolean except 'errors', which is a string, and
   affect the value returned by 'readline()':
     strip_comments [default: true]
       strip from "#" to end-of-line, as well as any whitespace
       leading up to the "#" -- unless it is escaped by a backslash
     lstrip_ws [default: false]
       strip leading whitespace from each line before returning it
     rstrip_ws [default: true]
       strip trailing whitespace (including line terminator!) from
       each line before returning it
     skip_blanks [default: true]
       skip lines that are empty *after* stripping comments and
       whitespace.  (If both lstrip_ws and rstrip_ws are false,
       then some lines may consist of solely whitespace: these will
       *not* be skipped, even if 'skip_blanks' is true.)
     join_lines [default: false]
       if a backslash is the last non-newline character on a line
       after stripping comments and whitespace, join the following line
       to it to form one "logical line"; if N consecutive lines end
       with a backslash, then N+1 physical lines will be joined to
       form one logical line.
     collapse_join [default: false]
       strip leading whitespace from lines that are joined to their
       predecessor; only matters if (join_lines and not lstrip_ws)
     errors [default: 'strict']
       error handler used to decode the file content
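
   The interaction of 'join_lines' and 'collapse_join' can be sketched
   roughly as follows. This is a simplified standalone illustration, not
   the distutils code: the function name 'join_continued' is hypothetical,
   and it assumes comments and trailing whitespace (including the
   newline) have already been stripped from every physical line.

```python
def join_continued(physical_lines, collapse_join=False):
    """Join each line ending in a backslash with the line that follows it.

    Simplified illustration only: input lines are assumed to be already
    stripped of comments and trailing whitespace.
    """
    logical = []
    buildup = None                  # text accumulated from continued lines
    for line in physical_lines:
        if buildup is not None:
            if collapse_join:
                line = line.lstrip()    # drop indentation of the joined line
            line = buildup + line
            buildup = None
        if line.endswith("\\"):
            buildup = line[:-1]         # drop the backslash, keep accumulating
            continue
        logical.append(line)
    if buildup is not None:             # input ended on a continuation line
        logical.append(buildup)
    return logical


lines = ["a = 1 \\", "    + 2 \\", "    + 3", "b = 4"]
plain = join_continued(lines)
collapsed = join_continued(lines, collapse_join=True)
print(collapsed)    # ['a = 1 + 2 + 3', 'b = 4']
```

   Two consecutive lines ending in a backslash produce one logical line
   built from three physical lines, matching the N+1 rule above. Without
   'collapse_join' the indentation of the joined lines is kept.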

   Note that since 'rstrip_ws' can strip the trailing newline, the
   semantics of 'readline()' must differ from those of the builtin file
   object's 'readline()' method!  In particular, 'readline()' returns
   None for end-of-file: an empty string might just be a blank line (or
   an all-whitespace line), if 'rstrip_ws' is true but 'skip_blanks' is
   not.
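
   A reading loop therefore has to test for None rather than for an
   empty string. The sketch below uses a hypothetical stand-in function,
   'read_stripped', that only mimics 'rstrip_ws' being true and
   'skip_blanks' being false; it is not the TextFile implementation.

```python
import io


def read_stripped(fileobj):
    """Hypothetical stand-in: strip trailing whitespace, return None at EOF."""
    raw = fileobj.readline()
    if raw == "":           # the underlying file signals EOF with ''
        return None
    return raw.rstrip()     # a blank line becomes '', which is not EOF


source = io.StringIO("first\n\n   \nlast\n")
collected = []
while True:
    line = read_stripped(source)
    if line is None:        # testing 'if not line' would stop at the blank line
        break
    collected.append(line)

print(collected)            # ['first', '', '', 'last']
```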

Definition at line 10 of file text_file.py.


Constructor & Destructor Documentation

def distutils.text_file.TextFile.__init__ (   self,
  filename = None,
  file = None,
  **options 
)
Construct a new TextFile object.  At least one of 'filename'
   (a string) and 'file' (a file-like object) must be supplied.
   The keyword argument options are described above and affect
   the values returned by 'readline()'.

Definition at line 78 of file text_file.py.

00078 
00079     def __init__(self, filename=None, file=None, **options):
00080         """Construct a new TextFile object.  At least one of 'filename'
00081            (a string) and 'file' (a file-like object) must be supplied.
00082            The keyword argument options are described above and affect
00083            the values returned by 'readline()'."""
00084         if filename is None and file is None:
00085             raise RuntimeError("you must supply either or both of 'filename' and 'file'")
00086 
00087         # set values for all options -- either from client option hash
00088         # or fallback to default_options
00089         for opt in self.default_options.keys():
00090             if opt in options:
00091                 setattr(self, opt, options[opt])
00092             else:
00093                 setattr(self, opt, self.default_options[opt])
00094 
00095         # sanity check client option hash
00096         for opt in options.keys():
00097             if opt not in self.default_options:
00098                 raise KeyError("invalid TextFile option '%s'" % opt)
00099 
00100         if file is None:
00101             self.open(filename)
00102         else:
00103             self.filename = filename
00104             self.file = file
00105             self.current_line = 0       # assuming that file is at BOF!
00106 
00107         # 'linebuf' is a stack of lines that will be emptied before we
00108         # actually read from the file; it's only populated by an
00109         # 'unreadline()' operation
00110         self.linebuf = []
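
The option handling in the listing above follows a common pattern: copy every default onto the instance, let keyword arguments override it, then reject unknown names. A minimal standalone sketch of that pattern, using a hypothetical class 'OptionHolder' with a shortened option table, might look like this:

```python
class OptionHolder:
    """Hypothetical illustration of the defaults-plus-validation pattern."""

    default_options = {"strip_comments": 1, "skip_blanks": 1, "join_lines": 0}

    def __init__(self, **options):
        # Every known option becomes an attribute, overridden when supplied.
        for opt, default in self.default_options.items():
            setattr(self, opt, options.get(opt, default))

        # Unknown option names are reported instead of silently ignored.
        for opt in options:
            if opt not in self.default_options:
                raise KeyError("invalid option '%s'" % opt)


holder = OptionHolder(join_lines=1)
print(holder.join_lines, holder.skip_blanks)    # 1 1

try:
    OptionHolder(strip_comment=0)               # misspelled on purpose
except KeyError as exc:
    rejected = str(exc)
```

Because unknown names raise KeyError, a misspelled option fails at construction time rather than being silently dropped.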


Member Function Documentation

def distutils.text_file.TextFile.close (   self )
Close the current file and forget everything we know about it
   (filename, current line number).

Definition at line 118 of file text_file.py.

00118 
00119     def close(self):
00120         """Close the current file and forget everything we know about it
00121            (filename, current line number)."""
00122         self.file.close()
00123         self.file = None
00124         self.filename = None
00125         self.current_line = None

def distutils.text_file.TextFile.error (   self,
  msg,
  line = None 
)

Definition at line 138 of file text_file.py.

00138 
00139     def error(self, msg, line=None):
00140         raise ValueError("error: " + self.gen_error(msg, line))

def distutils.text_file.TextFile.gen_error (   self,
  msg,
  line = None 
)

Definition at line 126 of file text_file.py.

00126 
00127     def gen_error(self, msg, line=None):
00128         outmsg = []
00129         if line is None:
00130             line = self.current_line
00131         outmsg.append(self.filename + ", ")
00132         if isinstance(line, (list, tuple)):
00133             outmsg.append("lines %d-%d: " % tuple(line))
00134         else:
00135             outmsg.append("line %d: " % line)
00136         outmsg.append(str(msg))
00137         return "".join(outmsg)
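
The message layout produced by the listing above can be reproduced in isolation. The function below is a standalone sketch with a hypothetical name, 'format_location'; it takes the filename and line explicitly instead of reading them from an instance.

```python
def format_location(filename, line, msg):
    """Build 'FILE, line N: MSG' or 'FILE, lines M-N: MSG'."""
    parts = [filename + ", "]
    if isinstance(line, (list, tuple)):
        # A two-element sequence is a range of physical lines.
        parts.append("lines %d-%d: " % tuple(line))
    else:
        parts.append("line %d: " % line)
    parts.append(str(msg))
    return "".join(parts)


print(format_location("setup.cfg", 7, "unknown key"))
# setup.cfg, line 7: unknown key
print(format_location("setup.cfg", [3, 5], "unterminated value"))
# setup.cfg, lines 3-5: unterminated value
```

Note that the listing concatenates 'self.filename' directly, so it appears to need a string filename; a TextFile built from 'file' alone would presumably fail at that point with a TypeError.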

def distutils.text_file.TextFile.open (   self,
  filename 
)
Open a new file named 'filename'.  This overrides both the
   'filename' and 'file' arguments to the constructor.

Definition at line 111 of file text_file.py.

00111 
00112     def open(self, filename):
00113         """Open a new file named 'filename'.  This overrides both the
00114            'filename' and 'file' arguments to the constructor."""
00115         self.filename = filename
00116         self.file = io.open(self.filename, 'r', errors=self.errors)
00117         self.current_line = 0
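
The 'errors' option is passed straight through to 'io.open()', so it selects how undecodable bytes are handled. The sketch below shows the difference between 'strict' and 'replace' on a temporary file. The explicit 'encoding' argument is an assumption added to make the example deterministic; the listing above relies on the default encoding.

```python
import io
import os
import tempfile

# Write one line containing a byte (0xff) that is not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"caf\xff\n")
    path = tmp.name

try:
    # 'replace' substitutes U+FFFD for the undecodable byte.
    with io.open(path, "r", encoding="utf-8", errors="replace") as f:
        replaced_text = f.read()

    # 'strict' (the TextFile default) raises instead.
    try:
        with io.open(path, "r", encoding="utf-8", errors="strict") as f:
            f.read()
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True
finally:
    os.remove(path)
```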

def distutils.text_file.TextFile.readline (   self )
Read and return a single logical line from the current file (or
   from an internal buffer if lines have previously been "unread"
   with 'unreadline()').  If the 'join_lines' option is true, this
   may involve reading multiple physical lines concatenated into a
   single string.  Updates the current line number, so calling
   'warn()' after 'readline()' emits a warning about the physical
   line(s) just read.  Returns None on end-of-file, since the empty
   string can occur if 'rstrip_ws' is true but 'skip_blanks' is
   not.

Definition at line 151 of file text_file.py.

00151 
00152     def readline(self):
00153         """Read and return a single logical line from the current file (or
00154            from an internal buffer if lines have previously been "unread"
00155            with 'unreadline()').  If the 'join_lines' option is true, this
00156            may involve reading multiple physical lines concatenated into a
00157            single string.  Updates the current line number, so calling
00158            'warn()' after 'readline()' emits a warning about the physical
00159            line(s) just read.  Returns None on end-of-file, since the empty
00160            string can occur if 'rstrip_ws' is true but 'skip_blanks' is
00161            not."""
00162         # If any "unread" lines waiting in 'linebuf', return the top
00163         # one.  (We don't actually buffer read-ahead data -- lines only
00164         # get put in 'linebuf' if the client explicitly does an
00165         # 'unreadline()'.
00166         if self.linebuf:
00167             line = self.linebuf[-1]
00168             del self.linebuf[-1]
00169             return line
00170 
00171         buildup_line = ''
00172 
00173         while True:
00174             # read the line, make it None if EOF
00175             line = self.file.readline()
00176             if line == '':
00177                 line = None
00178 
00179             if self.strip_comments and line:
00180 
00181                 # Look for the first "#" in the line.  If none, never
00182                 # mind.  If we find one and it's the first character, or
00183                 # is not preceded by "\", then it starts a comment --
00184                 # strip the comment, strip whitespace before it, and
00185                 # carry on.  Otherwise, it's just an escaped "#", so
00186                 # unescape it (and any other escaped "#"'s that might be
00187                 # lurking in there) and otherwise leave the line alone.
00188 
00189                 pos = line.find("#")
00190                 if pos == -1: # no "#" -- no comments
00191                     pass
00192 
00193                 # It's definitely a comment -- either "#" is the first
00194                 # character, or it's elsewhere and unescaped.
00195                 elif pos == 0 or line[pos-1] != "\\":
00196                     # Have to preserve the trailing newline, because it's
00197                     # the job of a later step (rstrip_ws) to remove it --
00198                     # and if rstrip_ws is false, we'd better preserve it!
00199                     # (NB. this means that if the final line is all comment
00200                     # and has no trailing newline, we will think that it's
00201                     # EOF; I think that's OK.)
00202                     eol = (line[-1] == '\n') and '\n' or ''
00203                     line = line[0:pos] + eol
00204 
00205                     # If all that's left is whitespace, then skip line
00206                     # *now*, before we try to join it to 'buildup_line' --
00207                     # that way constructs like
00208                     #   hello \\
00209                     #   # comment that should be ignored
00210                     #   there
00211                     # result in "hello there".
00212                     if line.strip() == "":
00213                         continue
00214                 else: # it's an escaped "#"
00215                     line = line.replace("\\#", "#")
00216 
00217             # did previous line end with a backslash? then accumulate
00218             if self.join_lines and buildup_line:
00219                 # oops: end of file
00220                 if line is None:
00221                     self.warn("continuation line immediately precedes "
00222                               "end-of-file")
00223                     return buildup_line
00224 
00225                 if self.collapse_join:
00226                     line = line.lstrip()
00227                 line = buildup_line + line
00228 
00229                 # careful: pay attention to line number when incrementing it
00230                 if isinstance(self.current_line, list):
00231                     self.current_line[1] = self.current_line[1] + 1
00232                 else:
00233                     self.current_line = [self.current_line,
00234                                          self.current_line + 1]
00235             # just an ordinary line, read it as usual
00236             else:
00237                 if line is None: # eof
00238                     return None
00239 
00240                 # still have to be careful about incrementing the line number!
00241                 if isinstance(self.current_line, list):
00242                     self.current_line = self.current_line[1] + 1
00243                 else:
00244                     self.current_line = self.current_line + 1
00245 
00246             # strip whitespace however the client wants (leading and
00247             # trailing, or one or the other, or neither)
00248             if self.lstrip_ws and self.rstrip_ws:
00249                 line = line.strip()
00250             elif self.lstrip_ws:
00251                 line = line.lstrip()
00252             elif self.rstrip_ws:
00253                 line = line.rstrip()
00254 
00255             # blank line (whether we rstrip'ed or not)? skip to next line
00256             # if appropriate
00257             if (line == '' or line == '\n') and self.skip_blanks:
00258                 continue
00259 
00260             if self.join_lines:
00261                 if line[-1] == '\\':
00262                     buildup_line = line[:-1]
00263                     continue
00264 
00265                 if line[-2:] == '\\\n':
00266                     buildup_line = line[0:-2] + '\n'
00267                     continue
00268 
00269             # well, I guess there's some actual content there: return it
00270             return line
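
The comment-handling branch of the listing above can be isolated into a small function. This is a sketch under the assumption that it mirrors lines 189-215 of the listing; the name 'strip_comment' is hypothetical. As in the listing, only the first "#" is examined, and whitespace in front of a stripped comment is left for the later 'rstrip_ws' step.

```python
def strip_comment(line):
    """Remove a trailing '#' comment, or unescape a backslash-escaped '#'."""
    pos = line.find("#")
    if pos == -1:                       # no "#" at all: nothing to do
        return line
    if pos == 0 or line[pos - 1] != "\\":
        # Unescaped "#": cut the comment but keep the line terminator.
        eol = "\n" if line.endswith("\n") else ""
        return line[:pos] + eol
    # The first "#" is escaped: unescape every "\#" and keep the line.
    return line.replace("\\#", "#")


print(repr(strip_comment("name = x  # note\n")))   # 'name = x  \n'
print(repr(strip_comment("color = \\#fff\n")))     # 'color = #fff\n'
print(repr(strip_comment("# only a comment\n")))   # '\n'
```

One consequence of checking only the first "#": a line whose first "#" is escaped keeps any later unescaped comment, because the escaped branch never strips anything.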

def distutils.text_file.TextFile.readlines (   self )
Read and return the list of all logical lines remaining in the
   current file.

Definition at line 271 of file text_file.py.

00271 
00272     def readlines(self):
00273         """Read and return the list of all logical lines remaining in the
00274            current file."""
00275         lines = []
00276         while True:
00277             line = self.readline()
00278             if line is None:
00279                 return lines
00280             lines.append(line)

def distutils.text_file.TextFile.unreadline (   self,
  line 
)
Push 'line' (a string) onto an internal buffer that will be
   checked by future 'readline()' calls.  Handy for implementing
   a parser with line-at-a-time lookahead.
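
A line-at-a-time lookahead parser built on this idea might look like the sketch below. 'LookaheadReader' and 'read_section' are hypothetical stand-ins written for this illustration; the reader keeps the same stack discipline as 'linebuf' (the last line pushed is the first one returned) but omits all option handling.

```python
import io


class LookaheadReader:
    """Hypothetical stand-in with only readline() and unreadline()."""

    def __init__(self, fileobj):
        self.file = fileobj
        self.linebuf = []               # stack of "unread" lines

    def readline(self):
        if self.linebuf:
            return self.linebuf.pop()   # most recently unread line first
        raw = self.file.readline()
        return None if raw == "" else raw.rstrip()

    def unreadline(self, line):
        self.linebuf.append(line)


def read_section(reader):
    """Collect lines up to the next '[header]', leaving the header unread."""
    body = []
    while True:
        line = reader.readline()
        if line is None:
            break
        if line.startswith("["):
            reader.unreadline(line)     # looked one line too far: push it back
            break
        body.append(line)
    return body


reader = LookaheadReader(io.StringIO("a\nb\n[next]\nc\n"))
first_section = read_section(reader)
header = reader.readline()
print(first_section, header)            # ['a', 'b'] [next]
```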

Definition at line 281 of file text_file.py.

00281 
00282     def unreadline(self, line):
00283         """Push 'line' (a string) onto an internal buffer that will be
00284            checked by future 'readline()' calls.  Handy for implementing
00285            a parser with line-at-a-time lookahead."""
00286         self.linebuf.append(line)

def distutils.text_file.TextFile.warn (   self,
  msg,
  line = None 
)
Print (to stderr) a warning message tied to the current logical
   line in the current file.  If the current logical line in the
   file spans multiple physical lines, the warning refers to the
   whole range, e.g. "lines 3-5".  If 'line' is supplied, it overrides
   the current line number; it may be a list or tuple to indicate a
   range of physical lines, or an integer for a single physical
   line.

Definition at line 141 of file text_file.py.

00141 
00142     def warn(self, msg, line=None):
00143         """Print (to stderr) a warning message tied to the current logical
00144            line in the current file.  If the current logical line in the
00145            file spans multiple physical lines, the warning refers to the
00146            whole range, eg. "lines 3-5".  If 'line' supplied, it overrides
00147            the current line number; it may be a list or tuple to indicate a
00148            range of physical lines, or an integer for a single physical
00149            line."""
00150         sys.stderr.write("warning: " + self.gen_error(msg, line) + "\n")


Member Data Documentation

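distutils.text_file.TextFile.current_line

Definition at line 104 of file text_file.py.

dictionary distutils.text_file.TextFile.default_options [static]

Initial value:
{ 'strip_comments': 1,
                        'skip_blanks':    1,
                        'lstrip_ws':      0,
                        'rstrip_ws':      1,
                        'join_lines':     0,
                        'collapse_join':  0,
                        'errors':         'strict',
                      }

Definition at line 69 of file text_file.py.

distutils.text_file.TextFile.file

Definition at line 103 of file text_file.py.

distutils.text_file.TextFile.filename

Definition at line 102 of file text_file.py.

distutils.text_file.TextFile.linebuf

Definition at line 109 of file text_file.py.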


The documentation for this class was generated from the following file:

text_file.py