espatorrla
v1.0.2
Avoid torrent-page ads, coin miners and bloatware with this torrent scraper
Espatorrla - Spanish Torrent Scraping
This tool extracts the torrents you want from Spanish torrent pages that share a certain template. You can then store them in a safe place and use them however you like.
Requirements
- Node.js 8.11 or later.
Installation
npm install espatorrla --save
Usage
This package comes with three methods.
getWorkingPage
Because torrent pages often have connectivity problems, the getWorkingPage method returns a URL that is working at the moment of the call. By default, it checks Descargas2020 and TorrentRapid, but you can pass any list of URLs to test.
const { getWorkingPage } = require('espatorrla')

getWorkingPage(['url1', 'url2'])
  .then(url => {
    console.info(`Working URL is ${url}`);
  })
  .catch(err => {
    console.error('No working URLs found at this moment :(')
  })
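Since the check depends on remote sites that may hang, it can help to bound how long you wait for an answer. The following is a minimal sketch, assuming nothing about whether espatorrla applies its own timeout; withTimeout is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper, not part of espatorrla: rejects if `promise`
// has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms)
  })
  const clear = () => clearTimeout(timer)
  // Whichever promise settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).then(
    value => { clear(); return value },
    err => { clear(); throw err }
  )
}

// Usage with the package installed (commented out so the sketch runs standalone):
// const { getWorkingPage } = require('espatorrla')
// withTimeout(getWorkingPage(['url1', 'url2']), 10000)
//   .then(url => console.info(`Working URL is ${url}`))
//   .catch(err => console.error(err.message))
```

Note that the timeout only stops you from waiting; it does not cancel any request the underlying call has already started.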
getItemsForCategory
Searches for torrents in a category. The search can be customized to stop at a certain page, item or date. It takes an options object with the following properties:
{
  "url": "http://someurl.com", // URL to be scraped. Can be obtained from the previous method. Required.
  "category": {
    "id": 429, // Id of the category. Some of them are available in the categories.json file. Required.
    "description": "HD Movies" // Description of the category. Required.
  },
  "limitPage": 3, // Maximum number of pages to scrape. Optional. If not provided, all pages are scraped.
  "limitItem": "http://someurl.com/myfilm", // URL of the item at which to stop. Optional. Useful for scraping only content added since the last execution.
  "date": "Mes", // Age of the torrents. Optional. One of: Hoy, Ayer, Semana, Mes, Siempre.
  "initPage": 2 // Page at which to start the search. Optional. Must be lower than limitPage.
}
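The constraints listed above can be checked before any request is made. The following is a minimal sketch of such a pre-flight check; checkOptions is a hypothetical helper, not part of espatorrla, and the package may validate its input differently or not at all.

```javascript
// Allowed values for `date`, as listed in the option description.
const DATES = ['Hoy', 'Ayer', 'Semana', 'Mes', 'Siempre']

// Hypothetical pre-flight check, not part of espatorrla.
// Returns a list of problems; an empty list means the options look usable.
function checkOptions(options) {
  const problems = []
  const { url, category, limitPage, initPage, date } = options || {}
  if (typeof url !== 'string' || url === '') {
    problems.push('url is required')
  }
  if (!category || category.id === undefined) {
    problems.push('category.id is required')
  }
  if (!category || !category.description) {
    problems.push('category.description is required')
  }
  if (date !== undefined && !DATES.includes(date)) {
    problems.push(`date must be one of: ${DATES.join(', ')}`)
  }
  // Only comparable when both page bounds are given.
  if (initPage !== undefined && limitPage !== undefined && initPage >= limitPage) {
    problems.push('initPage must be lower than limitPage')
  }
  return problems
}
```

Returning a list rather than throwing on the first problem lets the caller report every issue at once.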
The method returns a promise that resolves to an array of objects, one per item. Each object provides the following information:
[
  {
    "url": "http://someurl.com/myfilm", // Item's details URL
    "title": "My favourite movie", // Title of the item
    "image": "http://someurl.com/myfilm-cover.jpg", // URL of the image attached to the item (usually the cover)
    "quality": "MicroHD", // Image quality of the item
    "category": {
      "id": 429,
      "description": "HD Movies"
    }
  }
]
Usage example:
const { getItemsForCategory } = require('espatorrla')

const options = {
  url: 'http://torrentrapid.com/ultimas-descargas',
  category: {
    id: 1027,
    description: "HD Movies"
  },
  limitPage: 3,
  limitItem: 'http://torrentrapid.com/myfilm',
  date: 'Siempre'
}

// Pass the options object defined above.
getItemsForCategory(options)
  .then(items => {
    console.log(items)
  })
  .catch(err => {
    console.error('Cannot scrape torrents for given parameters at this moment :(')
  })
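The limitItem option is described as useful for picking up only content added since the last execution. One way to wire that up is sketched below. optionsForNextRun is a hypothetical helper, not part of espatorrla, and it assumes that items come back newest first, which this README does not state.

```javascript
// Hypothetical helper, not part of espatorrla.
// Assumption: items are returned newest first, so the first item of the
// previous run marks where the next run can stop.
function optionsForNextRun(baseOptions, previousItems) {
  if (!previousItems || previousItems.length === 0) {
    // Nothing stored yet: scrape with the base options unchanged.
    return Object.assign({}, baseOptions)
  }
  // Copy the options so the caller's object is not mutated.
  return Object.assign({}, baseOptions, { limitItem: previousItems[0].url })
}

// Usage with the package installed (commented out so the sketch runs standalone):
// const { getItemsForCategory } = require('espatorrla')
// getItemsForCategory(optionsForNextRun(options, itemsFromLastRun))
//   .then(newItems => console.log(newItems))
```

If the ordering assumption does not hold for a given page, the stop item would need to be chosen another way, for example by storing every URL already seen.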
getTorrentLink
Extracts the .torrent file URL from a given item URL.
const { getTorrentLink } = require('espatorrla')

getTorrentLink('http://torrentrapid.com/descargar/peliculas-castellano/contrato-mortal/')
  .then(torrent => {
    console.info(`Torrent URL: ${torrent}`);
  })
  .catch(err => {
    console.error('Could not find any torrent on given URL :(')
  })
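To resolve the .torrent URL for every item returned by getItemsForCategory, the lookups can be run over the whole array. The following is a minimal sketch; attachTorrentLinks is a hypothetical helper, not part of espatorrla, and the torrent property it adds is an illustrative name.

```javascript
// Hypothetical helper, not part of espatorrla.
// `resolveLink` is any function that takes an item URL and returns a
// promise for its .torrent URL, for example getTorrentLink.
function attachTorrentLinks(items, resolveLink) {
  return Promise.all(items.map(item =>
    resolveLink(item.url)
      .then(torrent => Object.assign({}, item, { torrent }))
      // A failed lookup yields torrent: null instead of rejecting the batch.
      .catch(() => Object.assign({}, item, { torrent: null }))
  ))
}

// Usage with the package installed (commented out so the sketch runs standalone):
// const { getItemsForCategory, getTorrentLink } = require('espatorrla')
// getItemsForCategory(options)
//   .then(items => attachTorrentLinks(items, getTorrentLink))
//   .then(withLinks => console.log(withLinks))
```

This version starts all lookups at once. For long item lists, running them one after another may be kinder to the remote site.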