espatorrla

v1.0.2

Published

3 years ago

Avoid torrent pages ads, coinminers and bloatware with this torrent scraper


dionakra

scraping torrent

Espatorrla - Spanish Torrent Scraping

This tool extracts the torrents you want from Spanish torrent pages that share a certain template. You can then store them in a safe place and use them for whatever you want.

Requirements

Node.js 8.11 or later.

Installation

npm install espatorrla --save

Usage

This package comes with three methods.

getWorkingPage

Because torrent sites often suffer from connectivity problems, the getWorkingPage method returns the URL that is working at the moment of the call. By default, it checks Descargas2020 and TorrentRapid, but you can test any URL.

const { getWorkingPage } = require('espatorrla')

getWorkingPage(['url1', 'url2'])
  .then(url => {
    console.info(`Working URL is ${url}`); 
  })
  .catch(err => {
    console.error('No working URLs found at this moment :(')
  })
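
To make the idea behind this method concrete, here is a minimal sketch of a "first working URL" check. It is not espatorrla's actual implementation, which this README does not describe. The names firstWorkingUrl and probe are hypothetical; probe stands for any function that takes a URL and returns a promise that resolves when the page responds and rejects otherwise.

```javascript
// Hypothetical sketch, not espatorrla's implementation.
// `probe(url)` must return a promise that resolves when the page responds
// and rejects otherwise.
function firstWorkingUrl (urls, probe) {
  // Try the candidates one by one, moving to the next only on failure.
  return urls.reduce(
    (chain, url) => chain.catch(() => probe(url).then(() => url)),
    Promise.reject(new Error('No URLs to try'))
  )
}
```

The returned promise resolves with the first URL whose probe succeeds and rejects if none does, which mirrors the then/catch usage shown above.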

getItemsForCategory

Searches for torrents in a category. The search can be customized to stop at a certain page, item or date. The options object accepts the following parameters:

{
  "url": "http://someurl.com", // URL to be scraped. Can be obtained from the previous method. Required.
  "category": {
    "id": 429, // Id of the category. Some of them are available at the categories.json file. Required.
    "description": "HD Movies" // Description of the category. Required.
  }, 
  "limitPage": 3, // Max pages to be scraped. Optional. If not provided, it will scrape all pages.
  "limitItem": "http://someurl.com/myfilm", // URL of the item to stop at. Optional. Useful for fetching only the content added since the last execution.
  "date": "Mes", // Age of the torrents. Optional. It can be one of these: Hoy, Ayer, Semana, Mes, Siempre.
  "initPage": 2 // Initial page to search. Optional. It must be lower than limitPage.
}
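
The constraints listed in the comments above can be checked before calling the library. The following is a minimal sketch of such a check. validateOptions is a hypothetical helper, not part of espatorrla, and the library's own validation (if any) may behave differently.

```javascript
// Allowed values for `date`, as listed in the options description.
const ALLOWED_DATES = ['Hoy', 'Ayer', 'Semana', 'Mes', 'Siempre']

// Hypothetical helper (not part of espatorrla): returns a list of problems
// found in an options object, or an empty array when none are found.
function validateOptions (options) {
  const errors = []
  const { url, category, limitPage, date, initPage } = options || {}

  if (typeof url !== 'string' || url.length === 0) {
    errors.push('url is required')
  }
  if (!category || category.id === undefined || !category.description) {
    errors.push('category.id and category.description are required')
  }
  if (date !== undefined && !ALLOWED_DATES.includes(date)) {
    errors.push(`date must be one of: ${ALLOWED_DATES.join(', ')}`)
  }
  // Only comparable when both page bounds are given
  if (initPage !== undefined && limitPage !== undefined && initPage >= limitPage) {
    errors.push('initPage must be lower than limitPage')
  }
  return errors
}
```

Returning a list instead of throwing lets the caller report every problem at once.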

The method returns an array of objects, one per item. Each object contains the following information:

[
  {
   "url": "http://someurl.com/myfilm", // Item's details URL
   "title": "My favourite movie", // Title of the item
   "image": "http://someurl.com/myfilm-cover.jpg", // URL of the image attached to the item (usually the cover)
   "quality": "MicroHD", // Image quality of the item
   "category": {
      "id": 429, 
      "description": "HD Movies"
   }
  }
]

A typical call looks like this:

const { getItemsForCategory } = require('espatorrla')

const options = {
  url: 'http://torrentrapid.com/ultimas-descargas',
  category: {
    id: 1027,
    description: "HD Movies"
  },
  limitPage: 3,
  limitItem: 'http://torrentrapid.com/myfilm',
  date: 'Siempre'
}

getItemsForCategory(options)
  .then(items => {
    console.log(items)
  })
  .catch(err => {
    console.error('Cannot scrape torrents for given parameters at this moment :(')
  })
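
The limitItem option lends itself to incremental runs, where each execution stops at the newest item seen by the previous one. Below is a minimal sketch of that pattern. buildIncrementalOptions is a hypothetical helper, not part of espatorrla, and it assumes that results are returned newest first, which this README does not state.

```javascript
// Hypothetical helper (not part of espatorrla): builds the options for the
// next run so that scraping stops at the newest item of the previous run.
function buildIncrementalOptions (baseOptions, previousItems) {
  // Copy so the caller's object is left untouched
  const options = Object.assign({}, baseOptions)
  // Assumption: items are ordered newest first, so the first item of the
  // previous run marks where the new content ends.
  if (previousItems && previousItems.length > 0) {
    options.limitItem = previousItems[0].url
  }
  return options
}
```

The items from the previous run could be read from wherever they were stored and the result passed straight to getItemsForCategory.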

getTorrentLink

Extracts the .torrent file URL from a given item URL.

const { getTorrentLink } = require('espatorrla')

getTorrentLink('http://torrentrapid.com/descargar/peliculas-castellano/contrato-mortal/')
  .then(torrent => {
    console.info(`Torrent URL: ${torrent}`); 
  })
  .catch(err => {
    console.error('Could not find any torrent on given URL :(')
  })
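
The three methods can be chained: find a working page, list the items of a category, then resolve each item's torrent link. The sketch below shows one way to do that. collectTorrentLinks is a hypothetical helper, not part of espatorrla. It receives the library as its api argument so the flow can also be exercised with stubs, and it assumes the URL returned by getWorkingPage can be used directly as the url option.

```javascript
// Hypothetical helper (not part of espatorrla). `api` must expose the three
// methods described in this README.
function collectTorrentLinks (api, urls, options) {
  return api.getWorkingPage(urls)
    // Assumption: the working URL can be used as-is for the `url` option
    .then(url => api.getItemsForCategory(Object.assign({}, options, { url })))
    .then(items => Promise.all(items.map(item =>
      api.getTorrentLink(item.url)
        .then(torrent => ({ title: item.title, torrent }))
        // Skip items whose torrent link cannot be resolved
        .catch(() => null)
    )))
    .then(results => results.filter(Boolean))
}
```

With the real package, it could be called as collectTorrentLinks(require('espatorrla'), ['url1', 'url2'], options). The promise resolves with an array of { title, torrent } objects.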
