Python Web Crawler: A Beginner's Example

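This is a simple web crawler written in Python 3 with the requests and beautifulsoup4 libraries. It can save the scraped data to a database, and the code comments explain how. To run it, extract the archive, change into the main directory, and make sure both libraries are installed (pip install requests and pip install beautifulsoup4). Then open getData.py, set url to the address of the site you want to crawl, and run python getData.py from the command line.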
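The fetch, parse, and store flow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of getData.py: the function names (fetch_html, extract_links, save_links, crawl), the links table, and the crawl.sqlite3 file name are all assumptions made for the example, and it collects page links only because the kind of data the project actually scrapes is not stated here.

```python
import sqlite3
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_links(html, base_url):
    """Return (absolute_url, link_text) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        absolute_url = urljoin(base_url, anchor["href"])
        links.append((absolute_url, anchor.get_text(strip=True)))
    return links


def save_links(links, db_path="crawl.sqlite3"):
    """Store the pairs in a hypothetical `links` table and return its row count."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS links (url TEXT PRIMARY KEY, text TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO links (url, text) VALUES (?, ?)", links
            )
        return conn.execute("SELECT COUNT(*) FROM links").fetchone()[0]
    finally:
        conn.close()


def crawl(url, db_path="crawl.sqlite3"):
    """Fetch one page, extract its links, and save them to SQLite."""
    return save_links(extract_links(fetch_html(url), url), db_path)
```

Calling crawl with a page address would download that page once and write its links to the SQLite file. Keeping the three steps in separate functions means the parsing and storage parts can be tried on a saved HTML string without touching the network.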

web-crawler-master.zip (estimated size: 69 files)
web-crawler-master folder
first_project folder
db.sqlite3 128KB
data_crawler folder
admin.py 505B
migrations folder
__init__.py folder
0001_initial.py 1KB
__pycache__ folder
0001_initial.cpython-38.pyc 880B
__init__.cpython-38.pyc 172B
models.py 999B
urls.py 110B
__pycache__ folder
models.cpython-38.pyc 2KB
admin.cpython-38.pyc 786B
first_project folder
__init__.py folder
wsgi.py 403B
urls.py 850B
settings.py 3KB
__pycache__ folder
wsgi.cpython-38.pyc 577B
urls.cpython-38.pyc 1KB
settings.cpython-38.pyc 2KB
__init__.cpython-38.pyc 162B
asgi.py 403B
manage.py 637B
news folder
admin.py 516B
migrations folder
__init__.py folder
0001_initial.py 1023B
__pycache__ folder
0001_initial.cpython-38.pyc 962B
__init__.cpython-38.pyc 164B
models.py 432B
templates folder
news folder
article_list.html 440B
user_list.html 153B
article_detail.html 41B
year_archive.html 366B
month_archive.html 40B
base.html 198B
urls.py 343B
__pycache__ folder
models.cpython-38.pyc 959B
urls.cpython-38.pyc 488B
admin.cpython-38.pyc 809B
views.cpython-38.pyc 1KB
views.py 1KB
polls folder
__init__.py folder
tests.py 60B
admin.py 413B
migrations folder
__init__.py folder
0001_initial.py 1KB
__pycache__ folder
0001_initial.cpython-38.pyc 1018B
__init__.cpython-38.pyc 165B
apps.py 85B
models.py 760B
templates folder
polls folder
detail.html 547B
index.html 392B
results.html 323B
urls.py 509B
__pycache__ folder
models.cpython-38.pyc 1KB
urls.cpython-38.pyc 553B
admin.cpython-38.pyc 571B
apps.cpython-38.pyc 372B
__init__.cpython-38.pyc 154B
views.cpython-38.pyc 2KB
static folder
polls folder
style.css 167B
views.py 2KB
crawler folder
getData.py 2KB
connection.py 581B
function.py 5KB
__pycache__ folder
function.cpython-38.pyc 3KB
connection.cpython-38.pyc 678B
bash.exe.stackdump 1KB
README.md 1008B
SonNhaDep folder
getData.py 719B
function.py 475B
__pycache__ folder
function.cpython-38.pyc 660B
...
zip file size: 50.38KB