4chan-crawler
v1.0.1
Crawl 4Chan's archives from most recent to oldest, saving contents to disk.
4chan-crawlerJS
Preamble
tomcat-bit/4chan-crawler provides an easy-to-use crawler for the live site, which I wanted to rework to crawl the archive and collect text as well as media. The DDoS protection on archive.4plebs.org blocks requests from python-requests, but not from Node's https, so I built a JS version.
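As a rough illustration of the approach described above, the sketch below fetches a single page with Node's built-in https module. It is not the package's actual code: archivePageUrl and fetchText are hypothetical helper names, and the /<board>/page/<n>/ URL layout is an assumption about the archive.

```javascript
// Minimal sketch, not the package's implementation.
const https = require('https');

// Hypothetical helper: the /<board>/page/<n>/ layout is an assumption.
function archivePageUrl(board, page) {
  return `https://archive.4plebs.org/${encodeURIComponent(board)}/page/${page}/`;
}

// Hypothetical helper: resolves with the response body as a string,
// rejects on network errors or any non-2xx status code.
function fetchText(url) {
  return new Promise((resolve, reject) => {
    https.get(url, (res) => {
      if (res.statusCode < 200 || res.statusCode >= 300) {
        res.resume(); // discard the body so the socket is released
        reject(new Error(`Request failed with status ${res.statusCode}`));
        return;
      }
      res.setEncoding('utf8');
      let body = '';
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}
```

Calling fetchText(archivePageUrl('pol', 1)) returns a Promise for the page HTML, which can then be parsed and written to disk.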
Installation
npm i 4chan-crawler
Setup
npm i
Usage
Set the desired boards and output directory in config.js, then run:
npm start