
Detailed Usage of urllib in Python 3 (光环大数据 Python Training)

N ways to fetch web resources in Python 3

The host names of several URLs were lost in the source text (for example '/' and "https:// /"); substitute a complete URL before running an example.

1. The simplest form

```python
import urllib.request

response = urllib.request.urlopen('/')
html = response.read()  # raw bytes of the response body
```

2. Using Request

```python
import urllib.request

req = urllib.request.Request('/')
response = urllib.request.urlopen(req)
the_page = response.read()
```

3. Sending data

```python
#!/usr/bin/env python3
import urllib.parse
import urllib.request

url = 'http://localhost/login.php'
values = {
    'act': 'login',
    'login[email]': 'yzhang@',
    'login[password]': '123456',
}

# In Python 3 the request body must be bytes, so encode the
# URL-encoded string before passing it to Request.
data = urllib.parse.urlencode(values).encode('utf8')
req = urllib.request.Request(url, data)  # supplying data makes this a POST
req.add_header('Referer', '/')
response = urllib.request.urlopen(req)
the_page = response.read()
print(the_page.decode('utf8'))
```

4. Sending data and headers

```python
#!/usr/bin/env python3
import urllib.parse
import urllib.request

url = 'http://localhost/login.php'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {
    'act': 'login',
    'login[email]': 'yzhang@',
    'login[password]': '123456',
}
headers = {'User-Agent': user_agent}

data = urllib.parse.urlencode(values).encode('utf8')  # body must be bytes
req = urllib.request.Request(url, data, headers)
response = urllib.request.urlopen(req)
the_page = response.read()
print(the_page.decode('utf8'))
```

5. HTTP errors

```python
#!/usr/bin/env python3
import urllib.error
import urllib.request

req = urllib.request.Request(' ')
try:
    urllib.request.urlopen(req)
except urllib.error.HTTPError as e:
    print(e.code)                   # HTTP status code, e.g. 404
    print(e.read().decode('utf8'))  # body of the error page
```

6. Exception handling, approach 1

```python
#!/usr/bin/env python3
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request(" /")
try:
    response = urlopen(req)
except HTTPError as e:
    # HTTPError is a subclass of URLError, so it must be caught first.
    print("The server couldn't fulfill the request.")
    print('Error code: ', e.code)
except URLError as e:
    print('We failed to reach a server.')
    print('Reason: ', e.reason)
else:
    print("good!")
    print(response.read().decode("utf8"))
```

7. Exception handling, approach 2

```python
#!/usr/bin/env python3
from urllib.request import Request, urlopen
from urllib.error import URLError

req = Request(" /")
try:
    response = urlopen(req)
except URLError as e:
    # Check 'code' first: in Python 3 an HTTPError also has a 'reason'
    # attribute, so testing 'reason' first would hide every HTTP error.
    if hasattr(e, 'code'):
        print("The server couldn't fulfill the request.")
        print('Error code: ', e.code)
    elif hasattr(e, 'reason'):
        print('We failed to reach a server.')
        print('Reason: ', e.reason)
else:
    print("good!")
    print(response.read().decode("utf8"))
```

8. HTTP authentication

```python
#!/usr/bin/env python3
import urllib.request

# Create a password manager.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()

# Add the username and password.
# If we knew the realm, we could use it instead of None.
top_level_url = "https:// /"
password_mgr.add_password(None, top_level_url, 'rekfan', 'xxxxxx')

handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

# Create an "opener" (OpenerDirector instance).
opener = urllib.request.build_opener(handler)

# Use the opener to fetch a URL.
a_url = "https:// /"
x = opener.open(a_url)
print(x.read())

# Install the opener.
# Now all calls to urllib.request.urlopen use our opener.
urllib.request.install_opener(opener)
a = urllib.request.urlopen(a_url).read().decode('utf8')
print(a)
```

9. Using a proxy

```python
#!/usr/bin/env python3
import urllib.request

# ProxyHandler maps the URL scheme of a request to a proxy URL, and urllib
# has no built-in SOCKS support. The source used the key 'sock5', which
# matches no request scheme and so has no effect; this version assumes an
# HTTP proxy listening on localhost:1080.
proxy_support = urllib.request.ProxyHandler({'http': 'http://localhost:1080'})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

a = urllib.request.urlopen(" ").read().decode("utf8")
print(a)
```

10. Timeouts

```python
#!/usr/bin/env python3
import socket
import urllib.request

# Timeout in seconds.
timeout = 2
socket.setdefaulttimeout(timeout)

# This call to urllib.request.urlopen now uses the default timeout
# we have set in the socket module.
req = urllib.request.Request(' /')
a = urllib.request.urlopen(req).read()
print(a)
```
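The patterns above (form data, custom headers, error handling and timeouts) can be combined into two small helpers. The following is only a sketch under stated assumptions: the function names `build_post_request` and `fetch`, the user-agent string `example-client/0.1`, and the tuple return convention are all hypothetical and do not come from the original article. It also assumes a per-call `timeout` argument to `urlopen` is acceptable in place of the global `socket.setdefaulttimeout`.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_post_request(url, fields, user_agent="example-client/0.1"):
    """Build a POST Request carrying `fields` as URL-encoded form data.

    Hypothetical helper: the name and default user agent are illustrative.
    """
    # urlencode returns str; the request body must be bytes in Python 3.
    data = urllib.parse.urlencode(fields).encode("utf-8")
    headers = {"User-Agent": user_agent}
    return urllib.request.Request(url, data=data, headers=headers)


def fetch(req, timeout=2.0):
    """Open `req` and return a (status, text) tuple.

    HTTP errors return (status code, error body); failures to reach
    the server return (None, reason). Hypothetical helper.
    """
    try:
        # The timeout applies to this call only, not to every socket.
        with urllib.request.urlopen(req, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as e:
        # Must come before URLError, because HTTPError subclasses it.
        return e.code, e.read().decode("utf-8", errors="replace")
    except urllib.error.URLError as e:
        return None, str(e.reason)


if __name__ == "__main__":
    req = build_post_request("http://localhost/login.php", {"act": "login"})
    print(req.get_method(), req.data)
    print(fetch(req))
```

Passing `timeout` to `urlopen` keeps the limit local to one request, whereas `socket.setdefaulttimeout` changes the default for every socket created afterwards in the process.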
