Today on OSChina I saw that someone had written a small program for fetching proxy IP addresses. It used BeautifulSoup.

I rewrote it myself using a regular expression instead.

#!/usr/bin/python

import re

import requests

# Five digit groups separated by single non-digit characters:
# four IP octets followed by the port.
pattern = re.compile(r'(\d+)\D(\d+)\D(\d+)\D(\d+)\D(\d+)')

# Browser-like request headers.
headers = {
    'Host': "www.ip-adress.com",
    'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0",
    'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    'Accept-Language': "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3",
    'Accept-Encoding': "gzip, deflate",
    'Referer': "http://www.ip-adress.com/Proxy_Checker/",
}

url = "http://www.ip-adress.com/proxy_list/"
req = requests.get(url, headers=headers)
# Use the decoded text so the str pattern works on both Python 2 and 3.
content = req.text
proxy_ip = re.findall(pattern, content)
for ip in proxy_ip:
    # Join the four octets with dots and append the port after a colon.
    print('.'.join(ip[:4]) + ':' + ip[4])
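
The pattern in the script is quite loose: any five numbers separated by single non-digit characters will match. Below is a minimal sketch of a stricter variant that can be tried offline. It assumes the page lists proxies as `ip:port`, which may not be how the real site lays them out. The names `PROXY_RE` and `extract_proxies` and the sample markup are made up for illustration and are not part of the original program.

```python
import re

# Hypothetical stricter pattern: four dot-separated octets, a colon, then a port.
PROXY_RE = re.compile(r'\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}):(\d{1,5})\b')


def extract_proxies(html):
    """Return 'ip:port' strings found in html, skipping out-of-range values."""
    proxies = []
    for match in PROXY_RE.finditer(html):
        octets = match.groups()[:4]
        port = match.group(5)
        # Keep only octets in 0..255 and ports in 1..65535.
        if all(int(o) <= 255 for o in octets) and 0 < int(port) <= 65535:
            proxies.append('.'.join(octets) + ':' + port)
    return proxies


# Made-up sample markup, not the real page layout of any site.
sample = '<td>192.168.1.10:8080</td><td>999.1.1.1:80</td><td>10.0.0.5:3128</td>'
for proxy in extract_proxies(sample):
    print(proxy)
```

Running this on the sample prints the two valid entries and drops `999.1.1.1:80`, since 999 is not a valid octet. To use it with the script, the `content` string could be passed to `extract_proxies` in place of the `re.findall` call.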

 

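Copyright notice: This is an original article by tmyyss, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://www.cnblogs.com/tmyyss/p/4204882.html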