python3.2  3.2.2
urllib.request.URLopener Class Reference

Public Member Functions

def __init__
def __del__
def close
def cleanup
def addheader
def open
def open_unknown
def open_unknown_proxy
def retrieve
def open_http
def http_error
def http_error_default
def open_https
def open_file
def open_local_file
def open_ftp
def open_data

Public Attributes

 proxies
 key_file
 cert_file
 addheaders
 tempcache
 ftpcache
 type

Static Public Attributes

string version = "Python-urllib/%s"

Private Member Functions

def _open_generic_http
def _https_connection

Private Attributes

 __unlink

Static Private Attributes

 __tempfiles = None

Detailed Description

Class to open URLs.
This is a class rather than a plain function because a program may need
more than one set of protocol-specific options at the same time.
Note: this is a base class for callers who do not want automatic
handling of HTTP 302 (redirect) and 401 (authorization required)
responses.

Definition at line 1448 of file request.py.


Constructor & Destructor Documentation

def urllib.request.URLopener.__init__ (   self,
  proxies = None,
  x509 
)

Reimplemented in urllib.request.FancyURLopener.

Definition at line 1461 of file request.py.

    def __init__(self, proxies=None, **x509):
        if proxies is None:
            proxies = getproxies()
        assert hasattr(proxies, 'keys'), "proxies must be a mapping"
        self.proxies = proxies
        self.key_file = x509.get('key_file')
        self.cert_file = x509.get('cert_file')
        self.addheaders = [('User-Agent', self.version)]
        self.__tempfiles = []
        self.__unlink = os.unlink # See cleanup()
        self.tempcache = None
        # Undocumented feature: if you assign {} to tempcache,
        # it is used to cache files retrieved with
        # self.retrieve().  This is not enabled by default
        # since it does not work for changing documents (and I
        # haven't got the logic to check expiration headers
        # yet).
        self.ftpcache = ftpcache
        # Undocumented feature: you can use a different
        # ftp cache by assigning to the .ftpcache member;
        # in case you want logically independent URL openers
        # XXX This is not threadsafe.  Bah.

def urllib.request.URLopener.__del__ (   self )

Definition at line 1484 of file request.py.

    def __del__(self):
        self.close()


Member Function Documentation

def urllib.request.URLopener._https_connection (   self,
  host 
) [private]

Definition at line 1741 of file request.py.

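        # The extra indentation reflects request.py, where this method is
        # defined inside a conditional block (only when SSL support is
        # available).
        def _https_connection(self, host):
            return http.client.HTTPSConnection(host,
                                           key_file=self.key_file,
                                           cert_file=self.cert_file)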

def urllib.request.URLopener._open_generic_http (   self,
  connection_factory,
  url,
  data 
) [private]
Make an HTTP connection using connection_factory.

This is an internal method that should be called from
open_http() or open_https().

Arguments:
- connection_factory should take a host name and return an
  HTTPConnection instance.
- url is the URL to retrieve, or a (host, relative-path) pair.
- data is the payload for a POST request, or None.

Definition at line 1621 of file request.py.

    def _open_generic_http(self, connection_factory, url, data):
        """Make an HTTP connection using connection_factory.

        This is an internal method that should be called from
        open_http() or open_https().

        Arguments:
        - connection_factory should take a host name and return an
          HTTPConnection instance.
        - url is the URL to retrieve, or a (host, relative-path) pair.
        - data is the payload for a POST request, or None.
        """

        user_passwd = None
        proxy_passwd= None
        if isinstance(url, str):
            host, selector = splithost(url)
            if host:
                user_passwd, host = splituser(host)
                host = unquote(host)
            realhost = host
        else:
            host, selector = url
            # check whether the proxy contains authorization information
            proxy_passwd, host = splituser(host)
            # now we proceed with the url we want to obtain
            urltype, rest = splittype(selector)
            url = rest
            user_passwd = None
            if urltype.lower() != 'http':
                realhost = None
            else:
                realhost, rest = splithost(rest)
                if realhost:
                    user_passwd, realhost = splituser(realhost)
                if user_passwd:
                    selector = "%s://%s%s" % (urltype, realhost, rest)
                if proxy_bypass(realhost):
                    host = realhost

            #print "proxy via http:", host, selector
        if not host: raise IOError('http error', 'no host given')

        if proxy_passwd:
            import base64
            proxy_auth = base64.b64encode(proxy_passwd.encode()).decode('ascii')
        else:
            proxy_auth = None

        if user_passwd:
            import base64
            auth = base64.b64encode(user_passwd.encode()).decode('ascii')
        else:
            auth = None
        http_conn = connection_factory(host)
        headers = {}
        if proxy_auth:
            headers["Proxy-Authorization"] = "Basic %s" % proxy_auth
        if auth:
            headers["Authorization"] =  "Basic %s" % auth
        if realhost:
            headers["Host"] = realhost

        # Add Connection:close as we don't support persistent connections yet.
        # This helps in closing the socket and avoiding ResourceWarning

        headers["Connection"] = "close"

        for header, value in self.addheaders:
            headers[header] = value

        if data is not None:
            headers["Content-Type"] = "application/x-www-form-urlencoded"
            http_conn.request("POST", selector, data, headers)
        else:
            http_conn.request("GET", selector, headers=headers)

        try:
            response = http_conn.getresponse()
        except http.client.BadStatusLine:
            # something went wrong with the HTTP status line
            raise URLError("http protocol error: bad status line")

        # According to RFC 2616, "2xx" code indicates that the client's
        # request was successfully received, understood, and accepted.
        if 200 <= response.status < 300:
            return addinfourl(response, response.msg, "http:" + url,
                              response.status)
        else:
            return self.http_error(
                url, response.fp,
                response.status, response.reason, response.msg, data)
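
As a rough illustration of the credential handling in the listing above, the following standalone sketch builds the Authorization and Proxy-Authorization values the same way (Base64 over the raw "user:password" text). The helper names basic_auth_value and build_headers and the sample credentials are assumptions made for this example; they are not part of request.py.

```python
import base64


def basic_auth_value(user_passwd):
    """Return a Basic auth header value for a 'user:password' string."""
    # Same steps as the listing: encode to bytes, Base64, decode to ASCII text.
    token = base64.b64encode(user_passwd.encode()).decode("ascii")
    return "Basic %s" % token


def build_headers(user_passwd=None, proxy_passwd=None, realhost=None):
    """Assemble the fixed request headers in the same order as the listing."""
    headers = {}
    if proxy_passwd:
        headers["Proxy-Authorization"] = basic_auth_value(proxy_passwd)
    if user_passwd:
        headers["Authorization"] = basic_auth_value(user_passwd)
    if realhost:
        headers["Host"] = realhost
    # Persistent connections are not supported, so always ask for a close.
    headers["Connection"] = "close"
    return headers


if __name__ == "__main__":
    print(build_headers(user_passwd="user:pass", realhost="example.com"))
```

Base64 only encodes the credentials; it does not encrypt them, so this scheme is only as private as the transport underneath it.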

def urllib.request.URLopener.addheader (   self,
  args 
)
Add a header to be used by the HTTP interface only
e.g. u.addheader('Accept', 'sound/basic')

Definition at line 1504 of file request.py.

    def addheader(self, *args):
        """Add a header to be used by the HTTP interface only
        e.g. u.addheader('Accept', 'sound/basic')"""
        self.addheaders.append(args)
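
The sketch below is one way to picture how the stored headers behave: each addheader() call appends a tuple, and the HTTP code later copies the list into a dict, so a later entry with exactly the same name replaces an earlier one. The class HeaderList, the method request_headers and the version string "Python-urllib/3.2" are hypothetical names and values chosen for this example.

```python
class HeaderList:
    """Hypothetical holder for the addheaders list used in this sketch."""

    def __init__(self, version="Python-urllib/3.2"):
        # Assumed sample version string; the real value comes from the class.
        self.addheaders = [("User-Agent", version)]

    def addheader(self, *args):
        # Each call stores its arguments as one tuple,
        # e.g. ('Accept', 'sound/basic').
        self.addheaders.append(args)

    def request_headers(self):
        # Built-in headers come first; entries from addheaders then
        # overwrite any header with exactly the same name.
        headers = {"Connection": "close"}
        for header, value in self.addheaders:
            headers[header] = value
        return headers


if __name__ == "__main__":
    u = HeaderList()
    u.addheader("Accept", "sound/basic")
    print(u.request_headers())
```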

def urllib.request.URLopener.cleanup (   self )

Definition at line 1490 of file request.py.

    def cleanup(self):
        # This code sometimes runs when the rest of this module
        # has already been deleted, so it can't use any globals
        # or import anything.
        if self.__tempfiles:
            for file in self.__tempfiles:
                try:
                    self.__unlink(file)
                except OSError:
                    pass
            del self.__tempfiles[:]
        if self.tempcache:
            self.tempcache.clear()
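
A minimal sketch of the same bookkeeping, assuming a hypothetical TempFileTracker class: temporary file names are recorded as they are created, and cleanup() removes them through a reference to os.unlink captured at construction time, so it does not rely on module globals while the interpreter is shutting down.

```python
import os
import tempfile


class TempFileTracker:
    """Hypothetical stand-in for the temp-file bookkeeping in URLopener."""

    def __init__(self):
        self._tempfiles = []
        # Keep a direct reference so cleanup() does not need module globals.
        self._unlink = os.unlink

    def new_tempfile(self, suffix=""):
        # mkstemp returns an open descriptor plus the file name.
        fd, name = tempfile.mkstemp(suffix)
        os.close(fd)
        self._tempfiles.append(name)
        return name

    def cleanup(self):
        for name in self._tempfiles:
            try:
                self._unlink(name)
            except OSError:
                pass  # the file is already gone; nothing else to do
        del self._tempfiles[:]


if __name__ == "__main__":
    tracker = TempFileTracker()
    path = tracker.new_tempfile(".txt")
    tracker.cleanup()
    print(os.path.exists(path))
```

Calling cleanup() a second time is harmless because the list is emptied after the first pass.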

def urllib.request.URLopener.close (   self )

Definition at line 1487 of file request.py.

    def close(self):
        self.cleanup()

def urllib.request.URLopener.http_error (   self,
  url,
  fp,
  errcode,
  errmsg,
  headers,
  data = None 
)
Handle http errors.

Derived class can override this, or provide specific handlers
named http_error_DDD where DDD is the 3-digit error code.

Definition at line 1718 of file request.py.

    def http_error(self, url, fp, errcode, errmsg, headers, data=None):
        """Handle http errors.

        Derived class can override this, or provide specific handlers
        named http_error_DDD where DDD is the 3-digit error code."""
        # First check if there's a specific handler for this error
        name = 'http_error_%d' % errcode
        if hasattr(self, name):
            method = getattr(self, name)
            if data is None:
                result = method(url, fp, errcode, errmsg, headers)
            else:
                result = method(url, fp, errcode, errmsg, headers, data)
            if result: return result
        return self.http_error_default(url, fp, errcode, errmsg, headers)
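
The http_error_DDD lookup can be sketched in isolation as follows. ErrorDispatcher and NotFoundTolerant are hypothetical classes with a shortened argument list (no fp, headers or data); they only illustrate the name-based dispatch and the fallback to the default handler.

```python
class ErrorDispatcher:
    """Hypothetical class mirroring the http_error_DDD lookup."""

    def http_error(self, url, errcode, errmsg):
        # Look for a handler named after the numeric status code.
        name = "http_error_%d" % errcode
        if hasattr(self, name):
            result = getattr(self, name)(url, errcode, errmsg)
            # A falsy result means "not handled": fall through to the default.
            if result:
                return result
        return self.http_error_default(url, errcode, errmsg)

    def http_error_default(self, url, errcode, errmsg):
        raise IOError("http error", errcode, errmsg)


class NotFoundTolerant(ErrorDispatcher):
    def http_error_404(self, url, errcode, errmsg):
        # A subclass handles one specific status code.
        return "fallback for %s" % url


if __name__ == "__main__":
    print(NotFoundTolerant().http_error("http://x/", 404, "Not Found"))
```

The design keeps the base class small: supporting another status code means adding one method to a subclass, with no change to the dispatcher.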

def urllib.request.URLopener.http_error_default (   self,
  url,
  fp,
  errcode,
  errmsg,
  headers 
)
Default error handler: close the connection and raise IOError.

Reimplemented in urllib.request.FancyURLopener.

Definition at line 1734 of file request.py.

    def http_error_default(self, url, fp, errcode, errmsg, headers):
        """Default error handler: close the connection and raise IOError."""
        void = fp.read()
        fp.close()
        raise HTTPError(url, errcode, errmsg, headers, None)

def urllib.request.URLopener.open (   self,
  fullurl,
  data = None 
)
Use URLopener().open(file) instead of open(file, 'r').

Definition at line 1510 of file request.py.

    def open(self, fullurl, data=None):
        """Use URLopener().open(file) instead of open(file, 'r')."""
        fullurl = unwrap(to_bytes(fullurl))
        fullurl = quote(fullurl, safe="%/:=&?~#+!$,;'@()*[]|")
        if self.tempcache and fullurl in self.tempcache:
            filename, headers = self.tempcache[fullurl]
            fp = open(filename, 'rb')
            return addinfourl(fp, headers, fullurl)
        urltype, url = splittype(fullurl)
        if not urltype:
            urltype = 'file'
        if urltype in self.proxies:
            proxy = self.proxies[urltype]
            urltype, proxyhost = splittype(proxy)
            host, selector = splithost(proxyhost)
            url = (host, fullurl) # Signal special case to open_*()
        else:
            proxy = None
        name = 'open_' + urltype
        self.type = urltype
        name = name.replace('-', '_')
        if not hasattr(self, name):
            if proxy:
                return self.open_unknown_proxy(proxy, fullurl, data)
            else:
                return self.open_unknown(fullurl, data)
        try:
            if data is None:
                return getattr(self, name)(url)
            else:
                return getattr(self, name)(url, data)
        except socket.error as msg:
            raise IOError('socket error', msg).with_traceback(sys.exc_info()[2])
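
The scheme-based dispatch in open() can be approximated with the standalone sketch below. split_scheme is a simplified stand-in for splittype(), and SchemeDispatcher with its open_x_demo handler is hypothetical; proxies, caching and POST data are left out on purpose.

```python
import re

# Scheme = leading run of characters other than '/' and ':', then a colon.
_SCHEME = re.compile(r"([^/:]+):(.*)", re.DOTALL)


def split_scheme(url):
    """Simplified stand-in for splittype(): 'http://h/p' -> ('http', '//h/p')."""
    match = _SCHEME.match(url)
    if match:
        return match.group(1).lower(), match.group(2)
    return None, url


class SchemeDispatcher:
    def open(self, fullurl):
        urltype, rest = split_scheme(fullurl)
        if not urltype:
            urltype = "file"  # no scheme: treat the URL as a local file
        # Method names cannot contain '-', so 'x-demo' maps to open_x_demo.
        name = ("open_" + urltype).replace("-", "_")
        if not hasattr(self, name):
            return self.open_unknown(fullurl)
        return getattr(self, name)(rest)

    def open_unknown(self, fullurl):
        raise IOError("url error", "unknown url type", split_scheme(fullurl)[0])

    def open_file(self, rest):
        return ("file", rest)

    def open_x_demo(self, rest):
        return ("x-demo", rest)


if __name__ == "__main__":
    print(SchemeDispatcher().open("x-demo:payload"))
```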

def urllib.request.URLopener.open_data (   self,
  url,
  data = None 
)
Use "data" URL.

Definition at line 1848 of file request.py.

    def open_data(self, url, data=None):
        """Use "data" URL."""
        if not isinstance(url, str):
            raise URLError('data error', 'proxy support for data protocol currently not implemented')
        # ignore POSTed data
        #
        # syntax of data URLs:
        # dataurl   := "data:" [ mediatype ] [ ";base64" ] "," data
        # mediatype := [ type "/" subtype ] *( ";" parameter )
        # data      := *urlchar
        # parameter := attribute "=" value
        try:
            [type, data] = url.split(',', 1)
        except ValueError:
            raise IOError('data error', 'bad data URL')
        if not type:
            type = 'text/plain;charset=US-ASCII'
        semi = type.rfind(';')
        if semi >= 0 and '=' not in type[semi:]:
            encoding = type[semi+1:]
            type = type[:semi]
        else:
            encoding = ''
        msg = []
        msg.append('Date: %s'%time.strftime('%a, %d %b %Y %H:%M:%S GMT',
                                            time.gmtime(time.time())))
        msg.append('Content-type: %s' % type)
        if encoding == 'base64':
            import base64
            # XXX is this encoding/decoding ok?
            data = base64.decodebytes(data.encode('ascii')).decode('latin1')
        else:
            data = unquote(data)
        msg.append('Content-Length: %d' % len(data))
        msg.append('')
        msg.append(data)
        msg = '\n'.join(msg)
        headers = email.message_from_string(msg)
        f = io.StringIO(msg)
        #f.fileno = None     # needed for addinfourl
        return addinfourl(f, headers, url)
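
To make the data URL grammar in the comments above more concrete, here is a small sketch that applies the same splitting rules to the part of the URL after "data:". The function name parse_data_url is an assumption for this example, and the header construction from the listing is omitted.

```python
import base64
from urllib.parse import unquote


def parse_data_url(rest):
    """Parse the part of a data URL after 'data:' into (mediatype, text)."""
    # Everything before the first comma describes the payload.
    mediatype, data = rest.split(",", 1)  # ValueError if there is no comma
    if not mediatype:
        mediatype = "text/plain;charset=US-ASCII"
    # A trailing ';token' without '=' names the encoding (e.g. ';base64');
    # a trailing ';attr=value' is an ordinary media type parameter.
    semi = mediatype.rfind(";")
    if semi >= 0 and "=" not in mediatype[semi:]:
        encoding = mediatype[semi + 1:]
        mediatype = mediatype[:semi]
    else:
        encoding = ""
    if encoding == "base64":
        data = base64.decodebytes(data.encode("ascii")).decode("latin-1")
    else:
        data = unquote(data)
    return mediatype, data


if __name__ == "__main__":
    print(parse_data_url("text/plain;base64,SGVsbG8="))
```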

def urllib.request.URLopener.open_file (   self,
  url 
)
Use local file or FTP depending on form of URL.

Definition at line 1750 of file request.py.

    def open_file(self, url):
        """Use local file or FTP depending on form of URL."""
        if not isinstance(url, str):
            raise URLError('file error', 'proxy support for file protocol currently not implemented')
        if url[:2] == '//' and url[2:3] != '/' and url[2:12].lower() != 'localhost/':
            raise ValueError("file:// scheme is supported only on localhost")
        else:
            return self.open_local_file(url)
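
The three-part condition in open_file() is easier to follow when written as a predicate. The sketch below restates it under a hypothetical name, is_supported_file_url; the argument is the URL with the "file:" prefix already removed, as in the listing.

```python
def is_supported_file_url(rest):
    """Return True when the check above would pass the URL to open_local_file()."""
    names_remote_host = (
        rest[:2] == "//"                         # an authority part is present
        and rest[2:3] != "/"                     # ...and it is not empty
        and rest[2:12].lower() != "localhost/"   # ...and it is not localhost
    )
    return not names_remote_host


if __name__ == "__main__":
    for candidate in ("/etc/hosts", "///etc/hosts",
                      "//localhost/etc/hosts", "//example.com/x"):
        print(candidate, is_supported_file_url(candidate))
```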

def urllib.request.URLopener.open_ftp (   self,
  url 
)
Use FTP protocol.

Definition at line 1789 of file request.py.

    def open_ftp(self, url):
        """Use FTP protocol."""
        if not isinstance(url, str):
            raise URLError('ftp error', 'proxy support for ftp protocol currently not implemented')
        import mimetypes
        from io import StringIO
        host, path = splithost(url)
        if not host: raise URLError('ftp error', 'no host given')
        host, port = splitport(host)
        user, host = splituser(host)
        if user: user, passwd = splitpasswd(user)
        else: passwd = None
        host = unquote(host)
        user = unquote(user or '')
        passwd = unquote(passwd or '')
        host = socket.gethostbyname(host)
        if not port:
            import ftplib
            port = ftplib.FTP_PORT
        else:
            port = int(port)
        path, attrs = splitattr(path)
        path = unquote(path)
        dirs = path.split('/')
        dirs, file = dirs[:-1], dirs[-1]
        if dirs and not dirs[0]: dirs = dirs[1:]
        if dirs and not dirs[0]: dirs[0] = '/'
        key = user, host, port, '/'.join(dirs)
        # XXX thread unsafe!
        if len(self.ftpcache) > MAXFTPCACHE:
            # Prune the cache, rather arbitrarily.  Iterate over a copy of
            # the keys: deleting from a dict while iterating over it
            # raises RuntimeError.
            for k in list(self.ftpcache.keys()):
                if k != key:
                    v = self.ftpcache[k]
                    del self.ftpcache[k]
                    v.close()
        try:
            if key not in self.ftpcache:
                self.ftpcache[key] = \
                    ftpwrapper(user, passwd, host, port, dirs)
            if not file: type = 'D'
            else: type = 'I'
            for attr in attrs:
                attr, value = splitvalue(attr)
                if attr.lower() == 'type' and \
                   value in ('a', 'A', 'i', 'I', 'd', 'D'):
                    type = value.upper()
            (fp, retrlen) = self.ftpcache[key].retrfile(file, type)
            mtype = mimetypes.guess_type("ftp:" + url)[0]
            headers = ""
            if mtype:
                headers += "Content-Type: %s\n" % mtype
            if retrlen is not None and retrlen >= 0:
                headers += "Content-Length: %d\n" % retrlen
            headers = email.message_from_string(headers)
            return addinfourl(fp, headers, "ftp:" + url)
        except ftperrors() as msg:
            raise URLError('ftp error', msg).with_traceback(sys.exc_info()[2])
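
The path handling in open_ftp() is compact, so the following sketch isolates it. split_ftp_path is a hypothetical helper that takes an already-unquoted path and returns the directory list, the file name and the default transfer type; the ";type=" attribute override from the listing is not modelled here.

```python
def split_ftp_path(path):
    """Split an already-unquoted FTP path into (dirs, file, transfer_type)."""
    dirs = path.split("/")
    dirs, file = dirs[:-1], dirs[-1]
    if dirs and not dirs[0]:
        dirs = dirs[1:]   # drop the empty entry produced by the leading '/'
    if dirs and not dirs[0]:
        dirs[0] = "/"     # a second leading '/' marks an absolute directory
    # 'I' requests a binary (image) transfer, 'D' a directory listing.
    transfer_type = "I" if file else "D"
    return dirs, file, transfer_type


if __name__ == "__main__":
    for sample in ("/pub/docs/readme.txt", "/pub/", "//abs/file"):
        print(sample, split_ftp_path(sample))
```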

def urllib.request.URLopener.open_http (   self,
  url,
  data = None 
)
Use HTTP protocol.

Definition at line 1714 of file request.py.

    def open_http(self, url, data=None):
        """Use HTTP protocol."""
        return self._open_generic_http(http.client.HTTPConnection, url, data)

def urllib.request.URLopener.open_https (   self,
  url,
  data = None 
)
Use HTTPS protocol.

Definition at line 1746 of file request.py.

        def open_https(self, url, data=None):
            """Use HTTPS protocol."""
            return self._open_generic_http(self._https_connection, url, data)

def urllib.request.URLopener.open_local_file (   self,
  url 
)
Use local file.

Definition at line 1759 of file request.py.

    def open_local_file(self, url):
        """Use local file."""
        import mimetypes, email.utils
        from io import StringIO
        host, file = splithost(url)
        localname = url2pathname(file)
        try:
            stats = os.stat(localname)
        except OSError as e:
            # URLError accepts (reason, filename); passing errno as a third
            # positional argument would raise TypeError.
            raise URLError(e.strerror, e.filename)
        size = stats.st_size
        modified = email.utils.formatdate(stats.st_mtime, usegmt=True)
        mtype = mimetypes.guess_type(url)[0]
        headers = email.message_from_string(
            'Content-Type: %s\nContent-Length: %d\nLast-modified: %s\n' %
            (mtype or 'text/plain', size, modified))
        if not host:
            urlfile = file
            if file[:1] == '/':
                urlfile = 'file://' + file
            return addinfourl(open(localname, 'rb'), headers, urlfile)
        host, port = splitport(host)
        if (not port
           and socket.gethostbyname(host) in (localhost() + thishost())):
            urlfile = file
            if file[:1] == '/':
                urlfile = 'file://' + file
            return addinfourl(open(localname, 'rb'), headers, urlfile)
        raise URLError('local file error', 'not on local host')
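
The synthetic headers that open_local_file() attaches to a local file can be reproduced with standard-library calls alone. The sketch below uses a hypothetical helper, local_file_headers, and leaves out the host checks and the file-opening step.

```python
import email
import email.utils
import mimetypes
import os


def local_file_headers(path, url):
    """Build HTTP-style headers for a local file, as in the listing above."""
    stats = os.stat(path)
    # RFC 2822 date in GMT, derived from the file's modification time.
    modified = email.utils.formatdate(stats.st_mtime, usegmt=True)
    # The MIME type is guessed from the URL's extension; None if unknown.
    mtype = mimetypes.guess_type(url)[0]
    return email.message_from_string(
        "Content-Type: %s\nContent-Length: %d\nLast-modified: %s\n"
        % (mtype or "text/plain", stats.st_size, modified))


if __name__ == "__main__":
    headers = local_file_headers(__file__, "file://" + __file__)
    print(headers["Content-Type"], headers["Content-Length"])
```

Because the result is an email message object, header lookups are case-insensitive, which is why the listing's "Last-modified" spelling can still be read back as "Last-Modified".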

def urllib.request.URLopener.open_unknown (   self,
  fullurl,
  data = None 
)
Overridable interface to open unknown URL type.

Definition at line 1544 of file request.py.

    def open_unknown(self, fullurl, data=None):
        """Overridable interface to open unknown URL type."""
        type, url = splittype(fullurl)
        raise IOError('url error', 'unknown url type', type)

def urllib.request.URLopener.open_unknown_proxy (   self,
  proxy,
  fullurl,
  data = None 
)
Overridable interface to open unknown URL type.

Definition at line 1549 of file request.py.

    def open_unknown_proxy(self, proxy, fullurl, data=None):
        """Overridable interface to open unknown URL type."""
        type, url = splittype(fullurl)
        raise IOError('url error', 'invalid proxy for %s' % type, proxy)

def urllib.request.URLopener.retrieve (   self,
  url,
  filename = None,
  reporthook = None,
  data = None 
)
retrieve(url) returns (filename, headers) for a local object
or (tempfilename, headers) for a remote object.

Definition at line 1555 of file request.py.

    def retrieve(self, url, filename=None, reporthook=None, data=None):
        """retrieve(url) returns (filename, headers) for a local object
        or (tempfilename, headers) for a remote object."""
        url = unwrap(to_bytes(url))
        if self.tempcache and url in self.tempcache:
            return self.tempcache[url]
        type, url1 = splittype(url)
        if filename is None and (not type or type == 'file'):
            try:
                fp = self.open_local_file(url1)
                hdrs = fp.info()
                fp.close()
                return url2pathname(splithost(url1)[1]), hdrs
            except IOError as msg:
                pass
        fp = self.open(url, data)
        try:
            headers = fp.info()
            if filename:
                tfp = open(filename, 'wb')
            else:
                import tempfile
                garbage, path = splittype(url)
                garbage, path = splithost(path or "")
                path, garbage = splitquery(path or "")
                path, garbage = splitattr(path or "")
                suffix = os.path.splitext(path)[1]
                (fd, filename) = tempfile.mkstemp(suffix)
                self.__tempfiles.append(filename)
                tfp = os.fdopen(fd, 'wb')
            try:
                result = filename, headers
                if self.tempcache is not None:
                    self.tempcache[url] = result
                bs = 1024*8
                size = -1
                read = 0
                blocknum = 0
                # Read Content-Length whether or not a reporthook is given,
                # so the size check below also works without one.
                if "content-length" in headers:
                    size = int(headers["Content-Length"])
                if reporthook:
                    reporthook(blocknum, bs, size)
                while True:
                    block = fp.read(bs)
                    if not block:
                        break
                    read += len(block)
                    tfp.write(block)
                    blocknum += 1
                    if reporthook:
                        reporthook(blocknum, bs, size)
            finally:
                tfp.close()
        finally:
            fp.close()

        # raise exception if actual size does not match content-length header
        if size >= 0 and read < size:
            raise ContentTooShortError(
                "retrieval incomplete: got only %i out of %i bytes"
                % (read, size), result)

        return result
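
The reporthook protocol used by retrieve() can be tried without any network access. In this sketch, copy_with_progress is a hypothetical function that copies between two file-like objects and calls reporthook(blocknum, bs, size) once before the first read and once after every block, as the listing does; a plain IOError stands in for ContentTooShortError.

```python
import io


def copy_with_progress(src, dst, reporthook=None, size=-1, bs=1024 * 8):
    """Copy src to dst in blocks, calling reporthook(blocknum, bs, size)."""
    read = 0
    blocknum = 0
    if reporthook:
        reporthook(blocknum, bs, size)  # initial call, before any data
    while True:
        block = src.read(bs)
        if not block:
            break
        read += len(block)
        dst.write(block)
        blocknum += 1
        if reporthook:
            reporthook(blocknum, bs, size)
    # A known size that was not reached means the transfer was cut short.
    if size >= 0 and read < size:
        raise IOError("retrieval incomplete: got only %i out of %i bytes"
                      % (read, size))
    return read


if __name__ == "__main__":
    calls = []
    copy_with_progress(io.BytesIO(b"x" * 10), io.BytesIO(),
                       reporthook=lambda *args: calls.append(args),
                       size=10, bs=4)
    print(calls)
```

A size of -1 means the length is unknown, so a progress display built on the hook should handle that case instead of dividing by it.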


Member Data Documentation

urllib.request.URLopener.__tempfiles = None [static, private]

Definition at line 1456 of file request.py.

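urllib.request.URLopener.__unlink [private]

Definition at line 1470 of file request.py.

urllib.request.URLopener.addheaders

Definition at line 1468 of file request.py.

urllib.request.URLopener.cert_file

Definition at line 1467 of file request.py.

urllib.request.URLopener.ftpcache

Definition at line 1478 of file request.py.

urllib.request.URLopener.key_file

Definition at line 1466 of file request.py.

urllib.request.URLopener.proxies

Definition at line 1465 of file request.py.

urllib.request.URLopener.tempcache

Definition at line 1471 of file request.py.

urllib.request.URLopener.type

Definition at line 1529 of file request.py.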

string urllib.request.URLopener.version = "Python-urllib/%s" [static]

Definition at line 1458 of file request.py.


The documentation for this class was generated from the following file: