Crawler: A First Look at Web Scraping
Published: 2020-09-24

A simple crawler that scrapes Baidu Images search results and saves the pictures to disk.

Crawler: Scraping Baidu Images

import urllib
from urllib import request, parse
import re

# keyword = "阿狸"
# keyword = urllib.parse.quote(keyword)
# print(keyword) #%E9%98%BF%E7%8B%B8

url = "https://image.baidu.com/search/index?tn=baiduimage&ipn=r&ct=201326592&cl=2&lm=-1&st=-1&fm=result&fr=&sf=1&fmq=1600852001400_R&pv=&ic=0&nc=1&z=&hd=&latest=&copyright=&se=1&showtab=0&fb=0&width=&height=&face=0&istype=2&ie=utf-8&sid=&word=%E9%98%BF%E7%8B%B8"
result = request.urlopen(url=url).read().decode('utf-8')
# print(result)
# Optionally write the fetched page to an HTML file for inspection
# with open("picture.html", 'w', encoding='utf-8') as w:
#     w.write(result)

# Regex rule: collect every matching image link into a list
rule = '"thumbURL":"(.*?)"'
pic_list = re.findall(rule, result)

# Iterate over the list and save each image as 1.jpg, 2.jpg, ...
a = 1
for i in pic_list:
    print(i)
    res = request.urlopen(i).read()
    string = str(a) + '.jpg'
    a += 1
    with open(string, 'wb') as w:
        w.write(res)
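
The script above hardcodes the percent-encoded keyword in the URL and runs the regex directly on the live page. As a rough sketch, the two steps can be pulled into small functions and tried on a sample string without touching the network. The names `build_search_url` and `extract_thumb_urls` are hypothetical, the sample data is made up, and the shortened parameter set (`tn`, `ie`, `word`) is an assumption; Baidu may require more of the parameters from the original URL.

```python
import re
from urllib import parse

# Base address taken from the URL used in the script above.
BASE_URL = "https://image.baidu.com/search/index"

# Same rule as the script: capture the value of every "thumbURL" field.
THUMB_RULE = re.compile(r'"thumbURL":"(.*?)"')


def build_search_url(keyword):
    """Hypothetical helper: percent-encode the keyword as the `word` parameter.

    Assumption: only a subset of the original query parameters is kept here.
    """
    query = parse.urlencode({"tn": "baiduimage", "ie": "utf-8", "word": keyword})
    return BASE_URL + "?" + query


def extract_thumb_urls(page_source):
    """Hypothetical helper: return all thumbnail links found in the page text."""
    return THUMB_RULE.findall(page_source)


# Made-up sample that imitates the shape of the data the regex targets.
sample = (
    '{"thumbURL":"https://example.com/a.jpg","x":1},'
    '{"thumbURL":"https://example.com/b.jpg"}'
)

print(build_search_url("阿狸"))
for index, link in enumerate(extract_thumb_urls(sample), start=1):
    # enumerate replaces the manual counter used for the file names
    print(str(index) + ".jpg", link)
```

Keeping the extraction separate from the download makes it easy to check the regex on a saved copy of the page (for example the `picture.html` file from the commented-out lines) before requesting any images.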

