scraplink
v1.0.5
scraplink
Scraplink is a library for scraping links and asset URLs from a web page.
Install
npm install scraplink
Usage
const { Scrapper } = require('scraplink');
(async () => {
  // Scrapper resolves to an object holding the page's asset URLs and links
  const { assets, links } = await Scrapper('http://kaspat.com');
  console.log(assets);
  console.log(links);
})();
// Asset URLs
// 'http://www.theie6countdown.com/images/upgrade.jpg',
// 'http://kaspat.com/images/img1.jpg',
// 'http://kaspat.com/images/img2.jpg',
// 'http://kaspat.com/images/img3.jpg',
// 'http://kaspat.com/images/img4.jpg',
// 'http://kaspat.com/images/page1_img1.jpg',
// 'http://kaspat.com/images/icon1.jpg',
// 'http://kaspat.com/images/icon2.jpg',
// 'http://kaspat.com/images/icon3.jpg',
// 'http://kaspat.com/images/icon4.jpg',
// 'http://www.e-zeeinternet.com/count.php?page=986859&style=odometer&nbdigits=8&reloads=1'
// Links
// 'http://www.microsoft.com/windows/internet-explorer/default.aspx?ocid=ie6_countdown_bannercode',
// 'http://kaspat.com/index.php',
// 'http://kaspat.com/index.php',
// 'http://kaspat.com/News.php',
// 'http://kaspat.com/Services.php',
// 'http://kaspat.com/Kaspat.php',
// 'http://kaspat.com/Clients.php',
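The sample output above lists `http://kaspat.com/index.php` twice, which suggests that results may contain repeated URLs. If unique values are needed, a small post-processing step is one option. The `uniqueUrls` helper below is hypothetical and not part of scraplink; it is only a sketch of how the returned arrays could be deduplicated.

```javascript
// Hypothetical helper (not part of scraplink): drop repeated URLs
// while keeping the order in which they were first seen.
function uniqueUrls(urls) {
  return [...new Set(urls)];
}

// Stand-in for the `links` array returned by Scrapper.
const sampleLinks = [
  'http://kaspat.com/index.php',
  'http://kaspat.com/index.php',
  'http://kaspat.com/News.php',
];

console.log(uniqueUrls(sampleLinks));
```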
API
Scrapper
- Takes a URL as input and scrapes asset URLs and links from the page; resolves to an object with assets and links
Parse
Parse exposes two functions:
assets
- Extracts all asset URLs from the HTML data
links
- Extracts all links from the HTML data
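To make the idea of extracting links and assets from HTML concrete, here is a minimal sketch. It is not scraplink's implementation, and the names `extractAttr`, `extractLinks`, and `extractAssets` are hypothetical. It assumes that "links" means `href` values on `<a>` tags and "assets" means `src` values on `<img>` tags, and it uses a simple regular expression, which is not a robust substitute for a real HTML parser.

```javascript
// Hypothetical helpers for illustration only; scraplink's Parse may work differently.
// Collects the values of `attr` on every `<tag ...>` found in the HTML string.
function extractAttr(html, tag, attr) {
  const pattern = new RegExp(
    `<${tag}\\b[^>]*?\\s${attr}\\s*=\\s*["']([^"']*)["']`,
    'gi'
  );
  return Array.from(html.matchAll(pattern), (match) => match[1]);
}

// Assumption: links are <a href="...">, assets are <img src="...">.
const extractLinks = (html) => extractAttr(html, 'a', 'href');
const extractAssets = (html) => extractAttr(html, 'img', 'src');

const sampleHtml =
  '<a href="/News.php">News</a><img alt="" src="images/img1.jpg">';

console.log(extractLinks(sampleHtml));
console.log(extractAssets(sampleHtml));
```

The extracted values here are still relative, so they would need to be resolved against the page URL before use.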
ScrapperUtil
formatRelativeUrls
- Converts relative URLs to absolute URLs (takes a rootUrl and an array of URLs as input)
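The conversion from relative to absolute URLs can be sketched with the standard WHATWG `URL` constructor, which is built into Node.js. The `toAbsoluteUrls` function below is a hypothetical stand-in that mirrors the described inputs (a root URL and an array of URLs); the actual behavior of `formatRelativeUrls` may differ.

```javascript
// Hypothetical re-implementation for illustration; not scraplink's own code.
// Resolves each entry against rootUrl. Entries that are already absolute
// are returned in normalized form with their own origin preserved.
function toAbsoluteUrls(rootUrl, urls) {
  return urls.map((url) => new URL(url, rootUrl).href);
}

console.log(
  toAbsoluteUrls('http://kaspat.com', [
    'images/img1.jpg',
    '/News.php',
    'http://www.microsoft.com/',
  ])
);
```

Note that `new URL` throws a `TypeError` when the root URL itself is not a valid absolute URL, so callers may want to wrap the call in a `try`/`catch`.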
Contributing
Interested in contributing to this project? You can log any issues or suggestions related to this library here.
Read our contributing guide to get started with contributing to the codebase.
Contributors
Thank you to all the contributors who have helped make this project better :raised_hands: