@tanaloua/links-scraper
v1.0.4
Web scraper for links
Links Scraper
Links Scraper is a Node.js package for crawling web pages and extracting links recursively. It provides a simple and efficient way to collect links from a given website, allowing you to build applications such as web crawlers, site mapping tools, or link analysis tools.
Disclaimer: You should always respect the robots.txt file of a website and avoid crawling websites that prohibit web scraping. This package is intended for educational purposes and should be used responsibly.
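As a rough illustration of what respecting robots.txt can involve, the sketch below checks a URL against the Disallow rules that apply to all user agents. It is not part of Links Scraper: the function names `disallowedPrefixes` and `isAllowed` are hypothetical, and the parser is deliberately simplified (it ignores Allow rules, wildcards, and agent-specific groups), so treat it as a starting point rather than a complete implementation.

```javascript
// Simplified robots.txt check (assumption: only "User-agent: *" groups and
// plain-prefix "Disallow" rules matter; no "Allow", wildcards, or "$").
function disallowedPrefixes(robotsTxt) {
  const prefixes = [];
  let appliesToAll = false; // true while inside a group that names "*"
  let lastWasAgent = false; // consecutive User-agent lines share one group
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue; // blank or malformed line
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      if (!lastWasAgent) appliesToAll = false; // a new group starts here
      if (value === '*') appliesToAll = true;
      lastWasAgent = true;
    } else {
      lastWasAgent = false;
      if (field === 'disallow' && appliesToAll && value !== '') {
        prefixes.push(value);
      }
    }
  }
  return prefixes;
}

// Returns false when the URL's path starts with a disallowed prefix.
function isAllowed(robotsTxt, url) {
  const { pathname } = new URL(url);
  return !disallowedPrefixes(robotsTxt).some((p) => pathname.startsWith(p));
}
```

One way to use this is to download the site's /robots.txt yourself and call `isAllowed` on the start URL before passing it to `crawl`.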
Installation
You can install Links Scraper via npm:
npm install @tanaloua/links-scraper
Usage
const LinksScraper = require('@tanaloua/links-scraper');
const linksScraper = new LinksScraper();

// Crawl a website and extract links
linksScraper.crawl('https://www.scrapethissite.com').then((links) => {
  console.log(links);
}).catch((error) => {
  console.error('An error occurred:', error);
});
API
LinksScraper
constructor(progressiveRetrieval = false, onProgress)
Creates a new instance of the LinksScraper class.
progressiveRetrieval: Indicates whether to use progressive retrieval (default: false).
onProgress: Callback function for progressive retrieval.
crawl(url, ignore)
Crawls the provided URL and extracts links recursively.
url (String): The URL to crawl.
ignore (String): Optional URL pattern to ignore while crawling.
Returns a Promise that resolves to an array of links found on the website.
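Since `crawl` resolves to an array of links, a common next step for site mapping or link analysis is to deduplicate the results and separate internal links from external ones. The sketch below assumes the array contains absolute URL strings, which this README does not explicitly state; `summarizeLinks` is a hypothetical helper, not part of the package.

```javascript
// Assumption: `links` is an array of absolute URL strings, such as the
// value a crawl() Promise might resolve to.
function summarizeLinks(links, siteUrl) {
  const siteHost = new URL(siteUrl).hostname;
  const internal = new Set();
  const external = new Set();
  for (const link of links) {
    let parsed;
    try {
      parsed = new URL(link);
    } catch {
      continue; // skip anything that is not a valid absolute URL
    }
    parsed.hash = ''; // treat "#fragment" variants as the same page
    (parsed.hostname === siteHost ? internal : external).add(parsed.href);
  }
  return { internal: [...internal], external: [...external] };
}
```

It could be called inside the `.then` handler, for example `summarizeLinks(links, 'https://www.scrapethissite.com')`.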
Example 1
const linksScraper = new LinksScraper();

linksScraper.crawl('https://www.scrapethissite.com').then((links) => {
  console.log(links);
}).catch((error) => {
  console.error('An error occurred:', error);
});
Example 2
With progressive retrieval.
const onProgress = (url) => {
  console.log('Crawling:', url);
};

const linksScraper = new LinksScraper(true, onProgress);

linksScraper.crawl('https://www.scrapethissite.com').then((links) => {
  console.log(links);
}).catch((error) => {
  console.error('An error occurred:', error);
});
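If you want more than a log line per URL, the progress callback can also keep a running record. The sketch below assumes, as Example 2 suggests, that `onProgress` is called with a single URL string each time; `createProgressTracker` is a hypothetical helper and not part of the package.

```javascript
// Hypothetical helper: builds an onProgress callback that records each URL
// it receives and logs a numbered message for it.
function createProgressTracker(log = console.log) {
  const seen = []; // URLs reported so far, in order
  const onProgress = (url) => {
    seen.push(url);
    log(`[${seen.length}] Crawling: ${url}`);
  };
  return { onProgress, seen };
}
```

The callback would then be passed in the same position as in Example 2, for instance `new LinksScraper(true, tracker.onProgress)` where `tracker = createProgressTracker()`, and `tracker.seen` can be inspected while the crawl is running.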
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please read the CONTRIBUTING.md file for details on how to contribute to this project.
Issues
Please report any issues or feature requests on the issues page.