cendertron
v0.0.4
Published
Renders webpages using headless Chrome for usage by bots
Cendertron
Cendertron = Crawler + Rendertron
Cendertron crawls AJAX-heavy, client-side Single Page Applications (SPAs). It is deployed with Docker and focuses on scraping requests (page URLs, API calls, etc.), which can then be handed to pentest tools (sqlmap, etc.).
Use Cendertron to extract requests (page URLs, API calls, etc.) from your Web 2.0 pages; see the demo page or the result page.
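To illustrate the hand-off to a pentest tool, here is a minimal sketch that turns a list of extracted requests into sqlmap command-line arguments. The `CapturedRequest` shape, the `toSqlmapArgs` and `injectable` helpers, and the sample URLs are all hypothetical and are not part of Cendertron's API; only the sqlmap flags (`-u`, `--method`, `--data`, `--batch`) are real.

```typescript
// Hypothetical shape of one extracted request; Cendertron's real output may differ.
interface CapturedRequest {
  method: string;
  url: string;
  postData?: string;
}

// Build the argument list for one sqlmap run (illustrative helper, not a Cendertron API).
function toSqlmapArgs(req: CapturedRequest): string[] {
  const args = ["-u", req.url, "--batch"];
  const method = req.method.toUpperCase();
  if (method !== "GET") {
    args.push("--method", method);
  }
  if (req.postData !== undefined && req.postData.length > 0) {
    args.push("--data", req.postData);
  }
  return args;
}

// Keep only requests that carry parameters (a query string or a body).
function injectable(reqs: CapturedRequest[]): CapturedRequest[] {
  return reqs.filter(
    (r) => r.url.indexOf("?") !== -1 || (r.postData !== undefined && r.postData.length > 0)
  );
}

// Made-up sample data for illustration only.
const captured: CapturedRequest[] = [
  { method: "GET", url: "http://example.com/item?id=1" },
  { method: "GET", url: "http://example.com/styles.css" },
  { method: "POST", url: "http://example.com/login", postData: "user=a&pass=b" },
];

// One argument list per request worth testing.
const sqlmapRuns: string[][] = injectable(captured).map(toSqlmapArgs);
```

Passing each argument list to a process spawner as an array, rather than joining it into one shell string, avoids quoting problems with URLs that contain `&` or `?`.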
Deploy
- Run locally
$ git clone ...
$ yarn install
$ npm run dev
- Deploy in Docker
# build image
$ docker build -t cendertron .
# run as container
$ docker run -it --rm -p 3000:3000 --name cendertron-instance cendertron
# run as container, using Jessie Frazelle's seccomp profile for Chrome
$ wget https://raw.githubusercontent.com/jfrazelle/dotfiles/master/etc/docker/seccomp/chrome.json -O ~/chrome.json
$ docker run -it -p 3000:3000 --security-opt seccomp=$HOME/chrome.json --name cendertron-instance cendertron
# or
$ docker run -it -p 3000:3000 --cap-add=SYS_ADMIN --name cendertron-instance cendertron
# run detached on a Docker network, with host port 5000 mapped to container port 3000
$ docker run -d -p 5000:3000 --cap-add=SYS_ADMIN --name cendertron-instance --network wsat-network cendertron
Test Urls
- http://testphp.vulnweb.com/AJAX/#
- http://demo.aisec.cn/demo/
- https://jsonplaceholder.typicode.com/
Use as a Library
Install cendertron from NPM:
# skip downloading Chromium during install
$ export PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
$ yarn add cendertron
# or
$ npm install cendertron -S
Import Crawler and use it in your code:
About
Roadmap
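- [x] Route all crawls with custom parameters through POST; POST requests have their body stored and matched
- [x] Introduce a custom BrowserEventEmitter so that only a single Browser listener is registered globally
- [x] Add https://github.com/winstonjs/winston as the logger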
- https://123.125.98.210/essframe
- [ ] Add monitoring at both the scheduler level and the crawler level
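The first roadmap item, storing and matching POST bodies, can be pictured as a de-duplication key that includes the body for POST requests. The following is only a sketch of that idea: `SeenRequest`, `requestKey`, and `dedupe` are hypothetical names, and Cendertron's actual matching logic may differ.

```typescript
// Hypothetical request record; not Cendertron's real type.
interface SeenRequest {
  method: string;
  url: string;
  body?: string;
}

// Non-POST requests are identified by method and URL. POST requests also
// include the body, so two POSTs to the same URL with different bodies
// count as distinct requests.
function requestKey(req: SeenRequest): string {
  const method = req.method.toUpperCase();
  const body = method === "POST" && req.body !== undefined ? req.body : "";
  return JSON.stringify([method, req.url, body]);
}

// Remove duplicates while preserving first-seen order.
function dedupe(reqs: SeenRequest[]): SeenRequest[] {
  const seen: { [key: string]: boolean } = {};
  const unique: SeenRequest[] = [];
  for (const req of reqs) {
    const key = requestKey(req);
    if (!seen[key]) {
      seen[key] = true;
      unique.push(req);
    }
  }
  return unique;
}
```

A real crawler would likely normalize the body first (for example, by sorting form fields) so that trivially reordered payloads match; that step is left out here.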
Motivation & Credits
- gremlins.js: Monkey testing library for web apps and Node.js