Python Web Scraping: Advanced Requests
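Eager to get started? This page gives a good introduction to getting started with Requests. It assumes you already have Requests installed. If you do not, head over to the Installation section.
First, make sure that Requests is installed and up to date.
Let's get started with some simple examples.
Making a Request
Making a network request with Requests is very simple.
Begin by importing the Requests module: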
>>> import requests
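Now, let's try to get a web page. For this example, we'll fetch GitHub's public timeline:
>>> r = requests.get('https://github.com/timeline.json')
Now we have a Response object called r. We can get all the information we need from this object.
Requests' simple API means that every type of HTTP request is obvious. For example, this is how you make an HTTP POST request:
>>> r = requests.post("http://httpbin.org/post")
Nice, right? What about the other HTTP request types: PUT, DELETE, HEAD, and OPTIONS? They are all just as simple: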
>>> r = requests.put("http://httpbin.org/put")
>>> r = requests.delete("http://httpbin.org/delete")
>>> r = requests.head("http://httpbin.org/get")
>>> r = requests.options("http://httpbin.org/get")
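That's all well and good, but it's only the tip of the iceberg of what Requests can do.
Passing Parameters in URLs
You often want to send some data in the URL's query string. If you were constructing the URL by hand, this data would be given as key/value pairs in the URL after a question mark, e.g. httpbin.org/get?key=val. Requests lets you provide these arguments as a dictionary, using the params keyword argument. For example, if you wanted to pass key1=value1 and key2=value2 to httpbin.org/get, you would use the following code:
payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get("http://httpbin.org/get", params=payload)
You can see that the URL has been correctly encoded by printing it: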
print(r.url)
http://httpbin.org/get?key1=value1&key2=value2
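To make the encoding step concrete, here is a minimal sketch using only the standard library. It is not how Requests is implemented; build_url is a hypothetical helper introduced for illustration, and the second payload is made up to show what happens to characters that would otherwise be read as delimiters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base, params):
    """Hypothetical helper: append params to base as an encoded query string."""
    return base + "?" + urlencode(params)


# Plain values pass through unchanged, as in the printed URL above.
simple_url = build_url("http://httpbin.org/get", {"key1": "value1", "key2": "value2"})
print(simple_url)

# A space and an ampersand must be escaped so they are not read as delimiters.
tricky_url = build_url("http://httpbin.org/get", {"q": "a b&c"})
print(tricky_url)

# Parsing the query string recovers the original value.
print(parse_qs(urlsplit(tricky_url).query))
```

Passing a dictionary rather than concatenating strings by hand means values containing &, = or spaces cannot silently corrupt the query string.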
Response Content
We can read the content of the server's response. Consider the GitHub timeline again:
import requests
r = requests.get('https://github.com/timeline.json')
print(r.text)
{"message":"Hello there, wayfaring stranger. If you’re reading this then you probably didn’t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.","documentation_url":"https://developer.github.com/v3/activity/events/#list-public-events"}
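Requests automatically decodes content from the server. Most Unicode charsets are decoded seamlessly.
When you make a request, Requests makes an educated guess about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text. You can find out which encoding Requests is using, and change it, with the r.encoding property:
>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'
If you change the encoding, Requests will use the new value of r.encoding whenever you access r.text.
Requests can also use custom encodings if you need them. If you have created your own encoding and registered it with the codecs module, you can simply use the codec name as the value of r.encoding, and Requests will handle the decoding for you.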
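Why the choice of encoding matters can be seen without any network access. The sketch below decodes the same bytes twice with Python's built-in codecs; decode_body is an illustrative helper and the sample text is made up, so treat it as an analogy for what changing r.encoding does to r.text rather than as Requests' own code.

```python
def decode_body(body, encoding):
    """Illustrative helper: decode raw bytes, replacing undecodable bytes."""
    return body.decode(encoding, errors="replace")


# Stand-in for the raw bytes a server might send (UTF-8 encoded text).
raw_body = "café".encode("utf-8")

as_utf8 = decode_body(raw_body, "utf-8")
as_latin1 = decode_body(raw_body, "ISO-8859-1")

# ascii() keeps the output readable on terminals without Unicode support.
print(ascii(as_utf8))
print(ascii(as_latin1))
```

The bytes never change; only their interpretation does. Decoding with the wrong charset turns one accented letter into two unrelated characters, which is the usual symptom of a mis-guessed encoding.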
Binary Response Content
You can also access the response body as bytes, for non-text requests:
import requests
r = requests.get('https://images.unsplash.com/photo-1583008440670-ddb4c98d6966?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=679&q=80')
print(r.content)
b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x00H\x00H\x00\x00\xff\xe2\x0cXICC_PROFILE\x00\x01\x01\x00\x00\x0cHLino\x02\x10\x00\x00mntrRGB XYZ……….'
Requests automatically decodes the gzip and deflate transfer-encodings for you.
For example, to create an image file from the binary data returned by a request, you can use the following code:
import requests
r = requests.get('https://images.unsplash.com/photo-1583008440670-ddb4c98d6966?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=679&q=80')
# The body shown above starts with a JFIF header (JPEG data), so use a .jpg extension.
# The with block closes the file even if the write fails.
with open("demo04.jpg", "wb") as f:
    f.write(r.content)
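The bytes printed earlier begin with ff d8 ff followed by JFIF, which is a JPEG signature. A small sketch, assuming you want the file extension to follow the actual data: guess_image_extension is a hypothetical helper that checks only two well-known signatures, and the sample bytes stand in for r.content.

```python
import os
import tempfile


def guess_image_extension(data):
    """Hypothetical helper: pick an extension from the leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    return ".bin"  # unknown format: keep the bytes, do not pretend


# Stand-in for r.content; a real response body would be much longer.
sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01"

target = os.path.join(tempfile.mkdtemp(), "demo04" + guess_image_extension(sample))
with open(target, "wb") as f:
    f.write(sample)

with open(target, "rb") as f:
    saved = f.read()

print(os.path.basename(target))
```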
JSON Response Content
There is also a built-in JSON decoder, in case you are dealing with JSON data:
import requests
r = requests.get('https://api.jinse.com/v6/www/information/list?catelogue_key=news&limit=23&information_id=18787201&flag=down&version=9.9.9&_source=www')
print(r.json())
{'top_id': 18786657, 'list': [{'id': 18786657, 'short_title': '爆仓归零 翻盘暴富 这三位期货大户你认识吗?', 'title': '爆仓归零 翻盘暴富 这三位期货大户你认识吗?', 'type': 1, 'type_name': '', 'is_top': False, 'extra': {...}, 'word_blocks': [], 'order': 0}, ...], 'total': 0, 'news': 23, 'count': 23, 'bottom_id': 18771777}
(output truncated: the nested 'extra' fields and the remaining entries in 'list' are omitted here)
If JSON decoding fails, r.json() raises an exception.
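A defensive pattern for this case can be sketched with the standard json module. parse_json_body is a hypothetical helper, and catching ValueError is an assumption chosen because json.JSONDecodeError is a subclass of it; check which exception your installed Requests version raises before relying on this.

```python
import json


def parse_json_body(text):
    """Hypothetical helper: return the decoded JSON, or None if text is not JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


good = parse_json_body('{"count": 23, "total": 0}')
bad = parse_json_body("<html>Service unavailable</html>")

print(good)
print(bad)
```

A server that normally returns JSON may answer with an HTML error page when it is overloaded, so a scraper should not assume decoding always succeeds.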
Raw Response Content
In the rare case that you'd like to get the raw socket response from the server, you can access r.raw. If you want to do this, make sure you set stream=True in your initial request. Here is how:
import requests
r = requests.get('https://api.jinse.com/v6/www/information/list?catelogue_key=news&limit=23&information_id=18787201&flag=down&version=9.9.9&_source=www', stream=True)
print(r.raw)
print(r.raw.read(10))
<urllib3.response.HTTPResponse object at 0x000001DCD7D72240>
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
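The ten bytes printed above begin with 1f 8b 08, the gzip signature followed by the deflate method byte, which suggests that r.raw hands back the body before any content decoding. The sketch below reproduces that signature offline with the standard gzip module; the payload is a made-up stand-in, not the API's real response.

```python
import gzip
import json

# Made-up stand-in for a JSON response body.
body = json.dumps({"count": 23}).encode("utf-8")

# What a server sending Content-Encoding: gzip would put on the wire.
wire_bytes = gzip.compress(body)
print(wire_bytes[:3])

# Decompressing recovers the original JSON document.
decoded = gzip.decompress(wire_bytes)
print(json.loads(decoded))
```

This is why raw bytes are rarely what you want: r.content and r.text give you the body after decompression, while the raw stream leaves that work to you.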
Custom Headers
If you'd like to add HTTP headers to a request, simply pass a dict to the headers parameter.
For example, we didn't specify our content-type in the previous example:
import json
import requests
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}
r = requests.post(url, data=json.dumps(payload), headers=headers)
print(r.text)
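More Complicated POST Requests
Typically, you want to send some form-encoded data, much like an HTML form. To do this, simply pass a dictionary to the data argument. Your dictionary of data will automatically be form-encoded when the request is made:
import requests
payload = {'key1': 'value1', 'key2': 'value2'}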
r = requests.post("http://httpbin.org/post", data=payload)
print(r.text)
{
"args": {},
"data": "",
"files": {},
"form": {
"key1": "value1",
"key2": "value2"
},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "23",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.23.0",
"X-Amzn-Trace-Id": "Root=1-5e5c6d2a-45b01589e2cab7702ccfc542"
},
"json": null,
"origin": "125.70.79.231",
"url": "http://httpbin.org/post"
}
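The Content-Length of 23 in the echoed headers above can be checked by hand. This sketch form-encodes the same payload with the standard library's urlencode, used here as a stand-in for the encoding step rather than as a claim about Requests' internals.

```python
from urllib.parse import urlencode

payload = {"key1": "value1", "key2": "value2"}

# application/x-www-form-urlencoded body: key=value pairs joined by "&".
form_body = urlencode(payload)

print(form_body)
print(len(form_body))
```

The encoded body is 23 characters long, which matches the Content-Length that httpbin reported.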
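There are many times when you want to send data that is not form-encoded. If you pass in a string instead of a dict, that data will be posted directly.
For example, the GitHub API v3 accepts JSON-encoded POST/PATCH data:
import json
import requests
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}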
r = requests.post(url, data=json.dumps(payload))
print(r.text)
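When the body is a pre-serialized string, the server still needs to be told what it contains. The sketch below only builds the body and a matching headers dict, which you would then pass as data and headers; nothing is sent, and the header values are illustrative. Requests also accepts a json keyword argument that serializes the payload for you, which may be simpler where available.

```python
import json

payload = {"some": "data"}

# Serialize once so the body and its declared length stay consistent.
body = json.dumps(payload)

headers = {
    "Content-Type": "application/json",
    "Content-Length": str(len(body.encode("utf-8"))),
}

print(body)
print(headers)
```

Compare this with the form-encoded case: the same dictionary becomes a different byte sequence, so the Content-Type header is what lets the server pick the right parser.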
POST a Multipart-Encoded File
Requests makes it simple to upload Multipart-encoded files:
import requests
url = 'http://httpbin.org/post'
files = {'file': open('demo11.xls', 'rb')}
r = requests.post(url, files=files)
print(r.text)
{
"args": {},
"data": "",
"files": {
"file": ""
},
"form": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "146",
"Content-Type": "multipart/form-data; boundary=200745988988d41d305427fc13c7d0a6",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.23.0",
"X-Amzn-Trace-Id": "Root=1-5e5c7414-921faf7a86e17fc4793e90b4"
},
"json": null,
"origin": "125.70.79.231",
"url": "http://httpbin.org/post"
}
You can set the filename, content type, and headers explicitly:
import requests
url = 'http://httpbin.org/post'
files = {'file': ('demo11.xls', open('demo11.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
r = requests.post(url, files=files)
print(r.text)
{
"args": {},
"data": "",
"files": {
"file": ""
},
"form": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "198",
"Content-Type": "multipart/form-data; boundary=c60ca4feba0f8d6d7fa20e2997eb1782",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.23.0",
"X-Amzn-Trace-Id": "Root=1-5e5c74d2-17c10890c01ec4b9235048ab"
},
"json": null,
"origin": "125.70.79.231",
"url": "http://httpbin.org/post"
}
If you want, you can also send strings to be received as files:
import requests
url = 'http://httpbin.org/post'
files = {'file': ('demo11.xls', 'some,data,to,send\nanother,row,to,send\n')}
r = requests.post(url, files=files)
print(r.text)
{
"args": {},
"data": "",
"files": {
"file": "some,data,to,send\nanother,row,to,send\n"
},
"form": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "184",
"Content-Type": "multipart/form-data; boundary=6c2ccd2ad93f8edf09c53e5e087e1aea",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.23.0",
"X-Amzn-Trace-Id": "Root=1-5e5c754d-7aed9fae117f661053c2c1a0"
},
"json": null,
"origin": "125.70.79.231",
"url": "http://httpbin.org/post"
}
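To see what a multipart upload looks like on the wire, the request can be prepared without being sent. This is a minimal sketch: the filename report.csv and the text/csv content type are made up for illustration, and the URL is only a placeholder.

```python
import requests

url = 'http://httpbin.org/post'  # placeholder; nothing is sent
# Hypothetical upload: (filename, content, content type)
files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n', 'text/csv')}

prepared = requests.Request('POST', url, files=files).prepare()

# Requests generates the boundary and computes the length for us
print(prepared.headers['Content-Type'])
print(prepared.headers['Content-Length'])

# The body carries the filename and content type in the part headers
print(prepared.body.decode('utf-8'))
```

The boundary is random on every run, which is why the Content-Type values in the outputs above differ from one example to the next.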
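Response Status Codes
We can check the response status code:
import requests
# Keep the response in r so its status can be inspected
r = requests.get('http://httpbin.org/get')
print(r.status_code)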
For easy reference, Requests also comes with a built-in status code lookup object:
import requests
r = requests.get('http://httpbin.org/get')
print(r.status_code == requests.codes.ok)
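The lookup object can be explored without making a request. A short sketch showing that names resolve to plain integers, either as attributes or as keys:

```python
import requests

# Attribute access returns the numeric status code
print(requests.codes.ok)         # 200
print(requests.codes.not_found)  # 404

# Key access works as well
print(requests.codes['temporary_redirect'])  # 307

# The values are ordinary ints, so normal comparisons apply
print(requests.codes.ok == 200)  # True
```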
If we made a bad request (a 4XX client error or 5XX server error response), we can raise an exception with Response.raise_for_status():
import requests
bad_r = requests.get('http://httpbin.org/status/404')
print(bad_r.status_code)
print(bad_r.raise_for_status())
404
Traceback (most recent call last):
  File "D:/projects/学习项目目录/python相关/python 爬虫/python 爬虫进阶/requests_demo/demo12.py", line 4, in <module>
    print(bad_r.raise_for_status())
  File "D:\projects\学习项目目录\python相关\python 爬虫\python 爬虫进阶\venv\lib\site-packages\requests\models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: NOT FOUND for url: http://httpbin.org/status/404
However, since the status_code of r in our example is 200, calling raise_for_status() gives:
>>> r.raise_for_status()
None
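In practice raise_for_status() is usually wrapped in try/except. The sketch below is one possible pattern, not the only one: describe is a hypothetical helper, and the Response objects are built by hand (an assumption made so the example runs offline) instead of coming from a real request.

```python
import requests
from requests.exceptions import HTTPError


def describe(response):
    """Hypothetical helper: return 'ok', or a short message on an HTTP error."""
    try:
        response.raise_for_status()
    except HTTPError as exc:
        # The failing response is attached to the exception
        return 'failed: %s' % exc.response.status_code
    return 'ok'


# Hand-built responses stand in for real ones
ok = requests.Response()
ok.status_code = 200

missing = requests.Response()
missing.status_code = 404

print(describe(ok))       # ok
print(describe(missing))  # failed: 404
```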
Response Headers
We can view the server's response headers as a Python dictionary:
import requests
r = requests.get('http://httpbin.org/get')
print(r.headers)
{'Access-Control-Allow-Credentials': 'true',
 'Connection': 'keep-alive',
 'Date': 'Mon, 02 Mar 2020 03:07:37 GMT',
 'Access-Control-Allow-Origin': '*',
 'Server': 'gunicorn/19.9.0',
 'Content-Type': 'application/json',
 'Content-Length': '306'}
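This dictionary is special, though: it is made just for HTTP headers. According to RFC 2616, HTTP header names are case-insensitive.
So we can access the response header fields using any capitalization we like:
import requests
r = requests.get('http://httpbin.org/get')
print(r.headers)
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))
# If a header field is absent, get() returns None (indexing with [] would raise KeyError)
print(r.headers.get('X-Random'))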
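The headers object is an instance of requests.structures.CaseInsensitiveDict, which can also be used on its own. A small offline sketch with a made-up header set:

```python
from requests.structures import CaseInsensitiveDict

# Illustrative header set; real ones come from r.headers
headers = CaseInsensitiveDict({'Content-Type': 'application/json'})

# Lookups ignore case
print(headers['content-type'])   # application/json
print(headers['CONTENT-TYPE'])   # application/json
print('content-type' in headers)  # True

# get() returns None for a missing field instead of raising KeyError
print(headers.get('X-Random'))   # None

# Iteration keeps the casing the key was stored with
print(list(headers))             # ['Content-Type']
```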
Cookies
If a response contains cookies, you can access them quickly:
import requests
url = 'https://www.baidu.com/'
r = requests.get(url)
print(r.cookies)
print(r.cookies['BDORZ'])
To send your own cookies to the server, use the cookies parameter:
import requests
url = 'http://httpbin.org/cookies'
cookies = dict(cookies_are='xingxingnbsp')
r = requests.get(url, cookies=cookies)
print(r.text)
"""
{
  "cookies": {
    "cookies_are": "xingxingnbsp"
  }
}
"""
Cookies are returned in a RequestsCookieJar, which behaves like a dict but offers a more complete interface, suitable for use across multiple domains or paths. You can also pass a cookie jar to Requests:
import requests.cookies
jar = requests.cookies.RequestsCookieJar()
# Each cookie is scoped to a domain and path
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')
url = 'http://httpbin.org/cookies'
# Only the cookie whose path matches the URL is sent
r = requests.get(url, cookies=jar)
print(r.text)
"""
{
  "cookies": {
    "tasty_cookie": "yum"
  }
}
"""
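The domain and path scoping can also be checked locally, without a request. A minimal sketch using the same two cookies:

```python
from requests.cookies import RequestsCookieJar

jar = RequestsCookieJar()
jar.set('tasty_cookie', 'yum', domain='httpbin.org', path='/cookies')
jar.set('gross_cookie', 'blech', domain='httpbin.org', path='/elsewhere')

# Filter the jar by domain and path, getting back a plain dict
print(jar.get_dict(domain='httpbin.org', path='/cookies'))
# {'tasty_cookie': 'yum'}

# Look up a single cookie within a specific scope
print(jar.get('gross_cookie', domain='httpbin.org', path='/elsewhere'))
# blech

# Without a filter, every cookie in the jar is visible
print(sorted(jar.keys()))
# ['gross_cookie', 'tasty_cookie']
```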
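Redirection and History
By default, Requests automatically follows redirects for every method except HEAD.
You can use the history attribute of the response object to trace redirects.
Response.history is a list of the Response objects that were created in order to complete the request. The list is sorted from the oldest to the most recent response.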
For example, GitHub redirects all HTTP requests to HTTPS:
import requests
r = requests.get('http://github.com')
print(r.url)
print(r.status_code)
print(r.history)
"""
https://github.com/
200
[<Response [301]>]
"""
If you are using GET, OPTIONS, POST, PUT, PATCH, or DELETE, you can disable redirect handling with the allow_redirects parameter:
import requests
r = requests.get('http://github.com', allow_redirects=False)
print(r.status_code)
print(r.history)
"""
301
[]
"""
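The two behaviours can be reproduced without relying on GitHub. The sketch below is an assumption-laden illustration: it starts a throwaway local server (RedirectHandler and the /old and /new paths are invented for the demo) and then requests it with and without redirect handling.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old answers 301 to /new, other paths answer 200."""

    def do_GET(self):
        if self.path == '/old':
            self.send_response(301)
            self.send_header('Location', '/new')
            self.send_header('Content-Length', '0')
            self.end_headers()
        else:
            body = b'arrived'
            self.send_response(200)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


server = HTTPServer(('127.0.0.1', 0), RedirectHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = 'http://127.0.0.1:%d' % server.server_port

session = requests.Session()
session.trust_env = False  # ignore any proxy settings from the environment

# Default behaviour: the 301 is followed and recorded in history
followed = session.get(base + '/old', timeout=5)
chain = [resp.status_code for resp in followed.history] + [followed.status_code]
print(chain)  # [301, 200]

# With allow_redirects=False the 301 itself is returned
unfollowed = session.get(base + '/old', allow_redirects=False, timeout=5)
print(unfollowed.status_code, unfollowed.headers['Location'], unfollowed.history)

session.close()
server.shutdown()
server.server_close()
```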
If you are using HEAD, you can enable redirection as well:
import requests
r = requests.head('http://github.com', allow_redirects=True)
print(r.url)
print(r.history)
"""
https://github.com/
[<Response [301]>]
"""
Timeouts
You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter; without it, your program may hang indefinitely:
import requests
r = requests.get('http://github.com', timeout=1)
"""
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='github.com', port=443): Read timed out. (read timeout=1)
"""
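Note that timeout is not a time limit on the entire response download. Rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds).
If no timeout is specified explicitly, requests do not time out.
Errors and Exceptions
In the event of a network problem (e.g. DNS failure, refused connection), Requests raises a ConnectionError exception. If an HTTP request returns an unsuccessful status code, Response.raise_for_status() raises an HTTPError exception.
If a request times out, a Timeout exception is raised. If a request exceeds the configured maximum number of redirects, a TooManyRedirects exception is raised. All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.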
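The exception classes above can be combined into a single wrapper. This is only a sketch of one reasonable layout: fetch_text is a hypothetical helper, and the (connect, read) timeout values are illustrative assumptions, not recommendations from the original text.

```python
import requests
from requests import exceptions as rex


def fetch_text(url, timeout=(3.05, 10)):
    """Hypothetical helper: return the body text, or None if the request fails.

    timeout is a (connect, read) pair in seconds; the values are placeholders.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()
        return r.text
    except rex.Timeout:
        print('timed out:', url)
    except rex.TooManyRedirects:
        print('too many redirects:', url)
    except rex.HTTPError as exc:
        print('bad status:', exc.response.status_code)
    except rex.ConnectionError:
        print('network problem:', url)
    except rex.RequestException as exc:
        # Base class: anything else Requests raises, e.g. a malformed URL
        print('request failed:', exc)
    return None


# A URL without a scheme is rejected before any network access happens
print(fetch_text('not-a-url'))  # prints the failure message, then None
```

Because every class listed here derives from RequestException, the final except clause acts as a catch-all, so the more specific handlers have to come first.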