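First you need to learn two libraries, requests and bs4; after that you need to understand how to use proxies and multithreading. Below is a simple example that scrapes one entry from the Longman English-Spanish dictionary. If you really want to scrape the whole thing, there is still a lot more work to do.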
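import requests
from bs4 import BeautifulSoup as bs

# Read the cookie string saved from your browser; strip the trailing newline,
# which is not allowed inside an HTTP header value
with open('cookie.txt', encoding='utf-8-sig') as f:
    cookie = f.read().strip()
headers = {'cookie': cookie, 'User-Agent': 'Edge/120.0.0.0'}
# Only the page title of the link survived here; put the entry's real URL in its place
url = 'A (REAL) TROOPER - Spanish translation - Longman'
r = requests.get(url, headers=headers, timeout=10)
soup = bs(r.text, 'lxml')  # the lxml parser has to be installed separately
data = soup.select_one('div[class=dictionary]')
if data is not None:  # select_one returns None when nothing matches
    print(data.get_text())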
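As for the proxies and multithreading mentioned at the top, here is a rough sketch of how they might fit together. It assumes a proxy listening on 127.0.0.1:8080, which is a made-up address, and the helper names make_session, fetch and fetch_all are my own, not part of requests. Treat it as a starting point rather than a finished crawler.

```python
from concurrent.futures import ThreadPoolExecutor

import requests

# Hypothetical proxy address (an assumption); replace with a proxy you control.
PROXIES = {
    'http': 'http://127.0.0.1:8080',
    'https': 'http://127.0.0.1:8080',
}


def make_session(headers, proxies=None):
    """Build a requests.Session that carries the headers and, optionally, proxies."""
    session = requests.Session()
    session.headers.update(headers)
    if proxies:
        session.proxies.update(proxies)
    return session


def fetch(session, url):
    """Download one page and return its HTML text."""
    r = session.get(url, timeout=10)
    r.raise_for_status()
    return r.text


def fetch_all(urls, fetch_one, max_workers=4):
    """Call fetch_one on every URL in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, urls))
```

You would use it roughly as session = make_session(headers, PROXIES) followed by pages = fetch_all(urls, lambda u: fetch(session, u)), where urls is your own list of entry URLs. Sharing one Session between threads usually works for simple GET requests, but requests does not promise it is thread-safe, so creating one session per thread is the more cautious choice. Keep max_workers small so you do not hammer the site.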