site-pages-sampler
A simple crawler that collects sample page URLs from a website.
To sample 10 page URLs from a website, starting from a given URL:
site-pages-sampler -l 10 -s 'https://www.ideamans.com/'
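The sampled URLs can also feed other command-line tools. A minimal sketch, assuming the default text format writes one URL per line to stdout (an assumption, not confirmed by the help output), that checks the HTTP status of each sampled page with curl:
site-pages-sampler -l 10 -s 'https://www.ideamans.com/' | xargs -n 1 curl -s -o /dev/null -w '%{http_code} %{url_effective}\n'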
Usage
site-pages-sampler <url>
Starts crawling sample pages from the given URL.
Options:
--help Show help [boolean]
--version Show version number [boolean]
--user-agent-type, -u User agent type. (mobile|desktop) [default: "mobile"]
--limit, -l Limit of total sample pages. [default: 100]
--limit-each Sample page links from each page. [default: 100]
--concurrency, -c Concurrency of requests. [default: 8]
--timeout-each Timeout for each page request. (seconds)
--timeout Total timeout. (seconds) [default: 30]
--url-hash Recognizes a URL with a hash fragment as unique. [boolean] [default: false]
--verify, -v Verifies that each URL can be fetched. [boolean] [default: false]
--shuffle Shuffles link order. [boolean] [default: false]
--debug, -d Outputs debug logs to stderr. [boolean] [default: false]
--page-extnames Comma-separated extension names of pages. [default: ",.html,.htm,.php,.asp,.aspx,.jsp,.cgi"]
--directory-index Comma-separated directory index minimatch patterns. [default: "index.*,Default.*"]
--ignore-param Comma-separated search param minimatch patterns to ignore. [default: "index.*,Default.*"]
--format, -f Output format. (text|json) [default: "text"]
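The options above can be combined in a single invocation. For example, following the usage form shown above, the command below samples up to 50 pages with a desktop user agent, lower concurrency, verification of each URL, and JSON output. All flags are taken from the list above; the structure of the JSON output is not documented here, so inspect it before relying on specific fields.
site-pages-sampler -u desktop -l 50 -c 4 -v -f json 'https://www.ideamans.com/'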