Scraping Class Information from JASPAR

import requests
from bs4 import BeautifulSoup

# Read the JASPAR matrix IDs, one per line, skipping blank lines
with open("D:/群晖NAS/Desktop/MEME网站结果/JASPAR.id.txt", "r") as file_id:
    matrix_ids = [line.strip() for line in file_id if line.strip()]

with open("D:/群晖NAS/Desktop/MEME网站结果/JASPAR.爬虫结果.txt", "w") as file_out:
    for matrix_id in matrix_ids:
        # Build the URL of the page to scrape
        url = 'https://jaspar.elixir.no/matrix/' + matrix_id + '/'
        print(url)

        # Send a GET request and fetch the page content
        response = requests.get(url, timeout=30)
        response.raise_for_status()

        soup = BeautifulSoup(response.text, 'html.parser')

        # Locate the profile summary table; its third row holds the Class field
        profile = soup.find('table', class_='table table-hover')
        tr_tags = profile.find_all("tr")
        class_txt = tr_tags[2].find_all('td')[1].get_text().strip()

        # Write one tab-separated line per matrix: ID, then Class
        file_out.write(matrix_id + "\t" + class_txt + "\n")
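The script above picks the Class value by position (the third row of the table), so it would silently return the wrong field if the page layout changed. As a rough sketch of an alternative, the row can be looked up by its label instead. This example uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra packages. The names RowCollector, find_field and SAMPLE are hypothetical, and the sample markup (including the "Class:" label text) is an assumption for illustration, not copied from a JASPAR page.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect every table row as a list of whitespace-normalized cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being parsed, or None
        self._cell = None   # text fragments of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def find_field(html, label):
    """Return the second cell of the row whose first cell matches label, else None."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    wanted = label.rstrip(":").lower()
    for row in parser.rows:
        if len(row) >= 2 and row[0].rstrip(":").lower() == wanted:
            return row[1]
    return None


# Hypothetical sample markup, used only to exercise the helper
SAMPLE = """
<table class="table table-hover">
  <tr><td>Name:</td><td>ExampleTF</td></tr>
  <tr><td>Matrix ID:</td><td>MA0000.1</td></tr>
  <tr><td>Class:</td><td> Example class </td></tr>
</table>
"""

print(find_field(SAMPLE, "Class"))
```

In the scraping loop, find_field(response.text, "Class") could stand in for the tr_tags[2] lookup, provided the real page labels the row that way. A return value of None then signals that the field was not found, which can be logged rather than written to the output file.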

https://lixiang117423.github.io/article/jaspar/
Author: 李详【Xiang LI】
Published on: March 22, 2024
License