You only know the joy of programming at the moment the code finally runs successfully QAQ

  Target site: https://www.kuaidaili.com/free/inha/  (if this infringes on your rights, please contact me)

  All the proxies listed on that page are HTTP, so the script does not check the proxy type.

  The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
import time
import urllib.request

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}


def web(url):
    # Pass the Request object (not the bare URL) so the User-Agent header is actually sent.
    req = urllib.request.Request(url=url, headers=headers)
    response = urllib.request.urlopen(req)
    html = response.read().decode('UTF-8', 'ignore')
    ip = r'[0-9]+(?:\.[0-9]+){3}'
    port = r'"PORT">(\d{1,5})<'
    out = re.findall(ip, html)
    out1 = re.findall(port, html)
    # Pair each IP with its port. zip stops at the shorter list, so a page
    # with fewer than 15 rows does not raise IndexError.
    for pair in zip(out, out1):
        store(pair)
    print(out, '\n', out1)


def store(pair):
    # Append one "ip:<ip>\tport:<port>" line to ip.txt.
    with open('ip.txt', 'a') as f:
        c = 'ip:' + pair[0] + '\tport:' + pair[1] + '\n'
        f.write(c)
        print('store successfully')


# Crawl pages 1 through 3313, pausing 5 seconds between requests.
n = 1
while n <= 3313:
    url1 = "https://www.kuaidaili.com/free/inha/"
    url = url1 + str(n) + '/'
    web(url)
    time.sleep(5)
    n += 1
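
The extraction step inside web() can be tried offline before hitting the site. The sketch below is only an illustration: SAMPLE_HTML is a hand-written fragment, and its data-title attributes are an assumption modeled on what the port regex expects ("PORT">...<), not a copy of the live page. The helper name extract_proxies is also hypothetical.

```python
import re

# Same patterns as in web(): a dotted IPv4-like string, and a 1-5 digit port
# that appears right after the text "PORT"> in the markup.
IP_PATTERN = re.compile(r'[0-9]+(?:\.[0-9]+){3}')
PORT_PATTERN = re.compile(r'"PORT">(\d{1,5})<')

# Assumed markup for illustration only; the real page may differ.
SAMPLE_HTML = '''
<tr><td data-title="IP">1.2.3.4</td><td data-title="PORT">8080</td></tr>
<tr><td data-title="IP">10.20.30.40</td><td data-title="PORT">3128</td></tr>
'''


def extract_proxies(html):
    """Return a list of (ip, port) string pairs found in the given HTML."""
    ips = IP_PATTERN.findall(html)
    ports = PORT_PATTERN.findall(html)
    # zip pairs the n-th IP with the n-th port and stops at the shorter list.
    return list(zip(ips, ports))


if __name__ == "__main__":
    for ip, port in extract_proxies(SAMPLE_HTML):
        print('ip:' + ip + '\tport:' + port)
```

Keeping the parsing in its own function makes it possible to check the regexes without waiting on the 5-second delay between page requests.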

 

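Copyright notice: This is an original article by vhhi, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://www.cnblogs.com/vhhi/p/12380560.html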