
Learning Python Step by Step, Part 2: Batch-Scraping Web Page Titles with a Python Crawler

 

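A Python 2 script that reads the URLs listed in url.txt (one per line), fetches each page, extracts its title, and writes the titles to 1.txt: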

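# -*- coding: utf-8 -*-
# Python 2 only: uses urllib2, reload(sys) and the print statement.
import re
import sys
import urllib2

reload(sys)
sys.setdefaultencoding("utf-8")  # lets unicode titles be written to a byte-mode file

out = open('C:/Users/panda/Desktop/1.txt', 'w')
for url in open("C:/Users/panda/Desktop/url.txt"):
	url = url.strip()
	if not url:
		continue  # skip blank lines
	print url
	html = urllib2.urlopen(url).read()
	# Capture the text between <title> and </title>; re.I and re.S cover upper-case tags and multi-line titles.
	match = re.search(r'<title[^>]*>(.*?)</title>', html, re.I | re.S)
	if match:
		# Assumes the page is UTF-8 encoded; other encodings will raise UnicodeDecodeError.
		out.write(match.group(1).strip().decode('utf-8') + "\n")
out.close()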
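
For readers on Python 3, where urllib2 no longer exists, the same idea can be sketched roughly as follows. This is only an illustration under a few assumptions, not the original author's code: the helper names extract_title, fetch_title and main are hypothetical, the relative file names url.txt and 1.txt stand in for the desktop paths above, and the page encoding is taken from the HTTP Content-Type header with UTF-8 as a fallback.

```python
import re
import urllib.request

# Matches the first <title>...</title> pair, case-insensitively and across line breaks.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or None if there is none."""
    match = TITLE_RE.search(html)
    if match is None:
        return None
    # Collapse newlines and repeated spaces inside the title into single spaces.
    return " ".join(match.group(1).split())


def fetch_title(url, timeout=10):
    """Download one page and return its title (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # Assumption: trust the charset from the response headers, else fall back to UTF-8.
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    return extract_title(html)


def main(url_file="url.txt", out_file="1.txt"):
    """Read URLs from url_file and write 'url<TAB>title' lines to out_file."""
    with open(url_file, encoding="utf-8") as src, \
            open(out_file, "w", encoding="utf-8") as dst:
        for line in src:
            url = line.strip()
            if not url:
                continue  # skip blank lines
            try:
                title = fetch_title(url)
            except (OSError, ValueError, LookupError) as exc:
                # Network errors, malformed URLs, or unknown charsets: report and move on.
                print("failed: {} ({})".format(url, exc))
                continue
            dst.write("{}\t{}\n".format(url, title or ""))
```

Calling main() runs the whole batch. Unlike the script above, one unreachable URL does not abort the run, and each output line keeps the URL next to its title so the results can be matched back to the input.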

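Original link: Learning Python Step by Step, Part 2: Batch-Scraping Web Page Titles with a Python Crawler. Please credit the source when reposting!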
