event_scraper

v0.0.2

Event based asynchronous proxied scraper.

Event Scraper

Asynchronous scraping as fast as possible!

Installing:

npm install event_scraper

Starting example:

import { Event, EventListener, EventHandler, IContext } from 'event_scraper';

class AsyncScrapeEvent extends Event {
  context: IContext = {
    callerEvent: this,
    proxied: false,
    numberOfTry: 1
  }

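  public async event(proxy?: string) {
    // do something asynchronous, e.g. fetch a page (through the proxy if one is given)
    const result = await Promise.resolve('scraped data'); // placeholder for the real work
    return result;
  }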
}

const event = new AsyncScrapeEvent();

class MyEventListener extends EventListener {
  eventToListenTo = AsyncScrapeEvent.name;

  onSuccess(context: IContext, result: any): void {
    // do something with result
    
    // start new events like this (newEvent stands for any Event instance you create):
    // EventHandler.scheduleEvent(newEvent);
  }
}
const eventListener = new MyEventListener();

// set up everything:
const eventHandler = new EventHandler(5);


const proxies = []; /** if you have proxies... */
eventHandler.setEventListener(eventListener);
eventHandler.setProxies(proxies);
eventHandler.startEvent(event);

Description

This module operates with a basic EventEmitter under the hood. The goal of the project is to let you create events, each of which can do anything asynchronous, while the relationships between them are defined by the event listeners you set. What makes it more than a regular event emitter is that you can feed in a set of proxies, and every time an event requires a proxy, one is assigned to it at random. The number of proxied asynchronous events running at once matches the number of proxies you feed in.

The other extra is that you can set how many times it should try each event. This way, if any event fails, it is automatically rescheduled.
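The two ideas described above, random proxy assignment and automatic rescheduling, can be illustrated with a small standalone sketch. This is not the module's actual implementation: the names Task, pickRandomProxy and runWithRetries are made up for this illustration, and the sketch runs one task in a plain loop instead of going through an EventEmitter.

```typescript
// Standalone sketch; none of these names come from event_scraper.
type Task<T> = (proxy?: string) => Promise<T>;

// Pick one proxy uniformly at random from the pool (undefined if the pool is empty).
function pickRandomProxy(proxies: string[]): string | undefined {
  if (proxies.length === 0) return undefined;
  return proxies[Math.floor(Math.random() * proxies.length)];
}

// Run the task, trying again after each failure until it succeeds
// or maxTries attempts have been used up.
async function runWithRetries<T>(
  task: Task<T>,
  proxies: string[],
  maxTries: number
): Promise<T> {
  if (maxTries < 1) throw new RangeError('maxTries must be at least 1');
  let lastError: unknown;
  for (let numberOfTry = 1; numberOfTry <= maxTries; numberOfTry++) {
    try {
      // Each attempt gets a freshly chosen proxy.
      return await task(pickRandomProxy(proxies));
    } catch (error) {
      lastError = error; // remember the reason and try again
    }
  }
  throw lastError; // every attempt failed
}
```

With maxTries set to 5, a task that fails twice and then succeeds resolves on the third attempt. A task that fails five times rejects with the last error.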


Documentation

Event class

Every time you create a new event, you should extend the Event class. You'll be required to define a context property, which has three required parameters:

| Parameter name | Type | Description |
| -------------- | ---- |:-----------:|
| callerEvent | Event | Set this to the event itself (this). It is used to reschedule the same event. |
| proxied | boolean | Indicates whether the event should get a proxy or not. |
| numberOfTry | number | You should always set this to 1. |

You should also define the asynchronous event function. It takes one optional parameter, the proxy as a string, which is passed in when you set the proxied context parameter to true.

If you get a proxy it'll be a string in the following format:

http://proxy.ipAddress:proxy.port // http://141.12.16.59:4400
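As a rough illustration of that format, the sketch below builds such a string from an address and a port. The ProxyEntry shape (ipAddress and port fields) is an assumption inferred from the format line above, not the documented IProxies type, and toProxyUrl is a hypothetical helper.

```typescript
// Hypothetical shape, guessed from the "proxy.ipAddress:proxy.port" format above.
interface ProxyEntry {
  ipAddress: string;
  port: number;
}

// Build the proxy string in the documented format.
function toProxyUrl(proxy: ProxyEntry): string {
  return `http://${proxy.ipAddress}:${proxy.port}`;
}

const proxyUrl = toProxyUrl({ ipAddress: '141.12.16.59', port: 4400 });
console.log(proxyUrl); // http://141.12.16.59:4400
```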

You can put anything you'd like in the context parameter. It can be used later in the event listeners, since the functions you can override there all receive your extended context. Therefore, if you'd like to transfer data from the event to the appropriate event listener, you should put it into the context.

EventListener class

Every event listener you'd like to set up has to extend the EventListener class. Here you have six methods to override:

| Function name | Parameter types | Return types | Description |
| ------------- | --------------- | ------------ |:-----------:|
| onSuccess | context: IContext, result: any | void | This function is going to get executed when the appropriate event was successful. It'll also get the event's context and the result of the event. |
| onFailure | context: IContext, result: any | void | This function gets executed when the appropriate event was unsuccessful. The result will contain the error, therefore the reason of failure. |
| onReschedule | context: IContext, result: any | void | This function gets executed on rescheduling the event. The result contains the error, therefore the reason of the reschedule. |
| logSuccess | context: IContext | string | The return string is going to be logged on the console on successful execution in green. |
| logFailure | context: IContext | string | The return string is going to be logged on the console on unsuccessful execution in red. |
| logReschedule | context: IContext | string | The return string is going to be logged on the console on rescheduling of the event. |

When setting up the EventListener class, you have to define a string property called eventToListenTo. The value of this property should be the name property of the Event class to listen to. See the example above.

If the logger functions don't get overridden, nothing will be logged.
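To show how matching a listener to an event by class name can work, here is a minimal standalone sketch. It uses local stand-ins (SketchEvent, SketchListener, register, dispatchSuccess) rather than the real event_scraper classes, and a plain Map in place of the module's internals, so treat it as an illustration of the idea only.

```typescript
// Local stand-ins for illustration only; the real classes come from event_scraper.
abstract class SketchEvent {}

interface SketchListener {
  eventToListenTo: string;
  onSuccess(result: unknown): void;
}

class PageScrapeEvent extends SketchEvent {}
class ImageScrapeEvent extends SketchEvent {}

// Listeners are stored under the class name of the event they listen to.
const listeners = new Map<string, SketchListener>();

function register(listener: SketchListener): void {
  listeners.set(listener.eventToListenTo, listener);
}

// Look up the listener by the finished event's class name and notify it.
function dispatchSuccess(event: SketchEvent, result: unknown): boolean {
  const listener = listeners.get(event.constructor.name);
  if (!listener) return false; // nobody listens to this event type
  listener.onSuccess(result);
  return true;
}

const received: unknown[] = [];
register({
  eventToListenTo: PageScrapeEvent.name, // the string "PageScrapeEvent"
  onSuccess: (result) => {
    received.push(result);
  },
});

dispatchSuccess(new PageScrapeEvent(), '<html>...</html>'); // handled by the listener
dispatchSuccess(new ImageScrapeEvent(), 'binary data'); // no listener registered
```

One thing to keep in mind with name-based matching in general: a minifier that renames classes also changes the value of the name property, so both sides should read it from the class instead of hard-coding the string.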

EventHandler class

This is the interface through which you set up the listeners, events and proxies. It also gives you a handy function to schedule new events into the pool. When instantiated, it takes one parameter: the total number of times it will try each event in case of failures.

| Function name | Parameter types | Return types | Description |
| ------------- | --------------- | ------------ |:-----------:|
| setEventListener | eventListener: EventListener | void | Use this function to set up an event listener. |
| setEventListeners | eventListeners: EventListener[] | void | Use this function to set up multiple event listeners at once. |
| startEvent | event: Event | void | This function schedules one single event instantly. Use this to start the whole event-chain. |
| startEvents | events: Event[] | void | This function schedules an array of events instantly. Use this to start the whole event-chain if there are multiple starter events. |
| setProxies | proxies: IProxies[] | void | Use this function to set up available proxies for the events. |

And there is a static method defined as well, called on the EventHandler class itself as in the example above:

| Function name | Parameter types | Return types | Description |
| ------------- | --------------- | ------------ |:-----------:|
| scheduleEvent | event: Event | void | Use this function to schedule new events while the script is already running (like at the onSuccess method of a listener). |

IContext

This is the interface of an event's context. If you'd like to add anything to the context, extend this interface and use your extended context everywhere the type IContext is required.
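Extending the context could look roughly like the sketch below. To keep it self-contained, it declares a local stand-in for IContext with only the three documented fields (and a loosely typed callerEvent). In real code you would import IContext from event_scraper instead. PageContext, its url and depth fields, and describeContext are made-up names for illustration.

```typescript
// Stand-in for the library's IContext, reduced to the three documented fields.
interface IContext {
  callerEvent: object;
  proxied: boolean;
  numberOfTry: number;
}

// Extended context carrying extra data from the event to its listener.
interface PageContext extends IContext {
  url: string;
  depth: number;
}

// A listener-side helper can now rely on the extra fields being present.
function describeContext(context: PageContext): string {
  return `${context.url} (depth ${context.depth}, try ${context.numberOfTry})`;
}

const pageContext: PageContext = {
  callerEvent: {}, // in a real event this would be the event itself
  proxied: false,
  numberOfTry: 1,
  url: 'https://example.com',
  depth: 2,
};

console.log(describeContext(pageContext)); // https://example.com (depth 2, try 1)
```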

TODO: Get rid of the numberOfTry... Testing, testing, testing...


License

MIT