4chan-crawlerJS
Preamble
tomcat-bit/4chan-crawler provides an easy-to-use crawler for the live site, which I wanted to rework to crawl the archive and collect text as well as media. The DDoS protection on archive.4plebs.org blocks requests from python-requests but not from Node's https module, so I built a JS version.
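As a rough illustration of the approach, the sketch below fetches a page with Node's built-in https module. It is not taken from this package's source: the helper name fetchText is hypothetical, and the URL to request is left to the caller.

```javascript
const https = require('node:https');

// Hypothetical helper (not part of 4chan-crawler's documented API):
// fetch a URL with Node's built-in https module and resolve with the
// response body as a string.
function fetchText(url) {
  return new Promise((resolve, reject) => {
    https
      .get(url, (res) => {
        if (res.statusCode !== 200) {
          res.resume(); // drain the response so the socket is released
          reject(new Error(`Request failed with status ${res.statusCode}`));
          return;
        }
        res.setEncoding('utf8');
        let body = '';
        res.on('data', (chunk) => {
          body += chunk;
        });
        res.on('end', () => resolve(body));
      })
      .on('error', reject);
  });
}
```

A caller would use it as fetchText(someArchiveUrl).then((html) => { ... }), where someArchiveUrl is whatever page you want to collect.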
Installation
npm i 4chan-crawler
Setup
npm i
Usage
Update the desired boards and output directory in config.js, then run:
npm start