@openactive/harvesting-utils

v0.1.3

Utils library for harvesting RPDE feeds

harvesting-utils

Utils library for harvesting RPDE feeds.

Version 0.X.X

This library is currently in version 0.X.X, which means that the API will not be stable until 1.0.0.

Install

This library can be installed as an npm package using the following command:

$ npm install git://github.com/openactive/harvesting-utils.git

Usage

const { harvestRPDE } = require('@openactive/harvesting-utils')

harvestRPDE({
  baseUrl: '...',
  /* ...relevant parameters here */
});

Examples

A very simple example of harvestRPDE can be found in examples/simple-rpde-harvester.js. For more information on this script see here.

API Reference

harvestRPDE

Indefinitely harvests an RPDE feed, following the "expected consumer behaviour" described in the RPDE spec.

N.B. This function will run indefinitely, and only return if a fatal error occurs. For this reason, you will generally not want to run await harvestRPDE(..).
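One way to act on this note is to start the harvest without awaiting it and attach handlers for when it eventually stops. The following is a minimal sketch, not part of the library: `startHarvestInBackground` and `onStopped` are hypothetical names, and `fakeHarvest` is a stand-in for harvestRPDE so that the snippet runs on its own. Whether a fatal error surfaces as a rejection or as a normal return is not stated above, so the sketch handles both.

```javascript
/**
 * Starts a long-running harvest without blocking the caller.
 * `harvestFn` stands in for harvestRPDE; this helper is hypothetical.
 * `onStopped` receives the error if the harvest rejected, or undefined
 * if it simply returned.
 */
function startHarvestInBackground(harvestFn, params, onStopped) {
  // Deliberately not awaited by the caller: the harvest is expected to
  // run indefinitely and only settle when something fatal happens.
  return Promise.resolve()
    .then(() => harvestFn(params))
    .then(() => onStopped(undefined), (error) => onStopped(error));
}

// Stand-in for harvestRPDE so the sketch is self-contained.
const fakeHarvest = async () => {
  throw new Error('feed unreachable');
};

startHarvestInBackground(fakeHarvest, { baseUrl: '...' }, (error) => {
  console.error('Harvest stopped:', error ? error.message : 'returned without error');
});
console.log('Caller continues immediately');
```

In real use, `fakeHarvest` would be replaced by harvestRPDE and `params` by the parameters described in this section.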

Required Parameters

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `baseUrl` | `string` | Feed URL to harvest |
| `feedContextIdentifier` | `string` | Unique identifier for feed within the dataset, e.g. ScheduledSession |
| `headers` | `() => Promise<Object.<string,string>>` | Function that returns headers needed to make a request to the feed URL |
| `processPage` | `(rpdePage: any, feedIdentifier: string, isInitialHarvestComplete: () => boolean) => Promise` | Function that processes items in each page of the feed |
| `onFeedEnd` | `() => Promise` | Function that is called when the last page of the feed is reached. This function may be called multiple times if new items are added after the first time harvestRPDE() reaches the last page |
| `onError` | `() => Promise` | Function that is called if the harvest errors |
| `isOrdersFeed` | `boolean` | Is the feed an Orders feed? |
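To make the required parameters concrete, here is a minimal sketch of an object that supplies all of them. `buildRequiredParams` and `handleItem` are hypothetical names introduced for illustration. The sketch assumes each page follows the RPDE page shape with an `items` array, and that the feed needs no authentication headers.

```javascript
/**
 * Builds the required harvestRPDE parameters for one feed.
 * `buildRequiredParams` and `handleItem` are hypothetical names for this sketch.
 */
function buildRequiredParams(baseUrl, feedContextIdentifier, handleItem) {
  return {
    baseUrl,
    feedContextIdentifier,
    // Assumption: no authentication is needed; return whatever the feed requires.
    headers: async () => ({ Accept: 'application/json' }),
    // Assumption: the page follows the RPDE shape { next, items: [...] }.
    processPage: async (rpdePage, feedIdentifier, isInitialHarvestComplete) => {
      for (const item of rpdePage.items) {
        handleItem(item, feedIdentifier, isInitialHarvestComplete());
      }
    },
    // May be called more than once if new items arrive after the feed end.
    onFeedEnd: async () => {
      console.log(`${feedContextIdentifier}: reached end of feed`);
    },
    onError: async () => {
      console.error(`${feedContextIdentifier}: harvest failed`);
    },
    isOrdersFeed: false,
  };
}

const params = buildRequiredParams('...', 'ScheduledSession', (item, feedIdentifier) => {
  console.log(`${feedIdentifier}: ${item.state} ${item.id}`);
});
```

The resulting `params` object could then be passed as `harvestRPDE(params)`, with `'...'` replaced by the real feed URL.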

Optional Parameters

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `state` | `object` | Existing state can be passed in and manipulated within harvestRPDE() |
| `state.context` | `FeedContext` | Context about the feed. Default: `new FeedContext(feedContextIdentifier, baseUrl, multibar)` |
| `state.feedContextMap` | `Map<string, FeedContext>` | Map containing FeedContexts about this and other feeds within the dataset. Default: `new Map()` |
| `state.startTime` | `Date` | Start time of the harvest. Default: `new Date()` |
| `loggingFns` | `object` | Logging functions for different cases |
| `loggingFns.log` | `(message?: any, ...optionalParams: any[]) => void` | Normal logging. Default: `console.log` |
| `loggingFns.logError` | `(message?: any, ...optionalParams: any[]) => void` | Error logging. Default: `console.error` |
| `loggingFns.logErrorDuringHarvest` | `(message?: any, ...optionalParams: any[]) => void` | Error logging during the harvest. Default: `console.error` |
| `config` | `object` | Configuration options |
| `config.howLongToSleepAtFeedEnd` | `() => number` | How long to wait, in milliseconds, before re-polling a feed after fetching the last page (RPDE spec). Default: `() => 500` |
| `config.WAIT_FOR_HARVEST` | `boolean` | Whether to wait for harvest to complete and run onFeedEnd() function. Default: `true` |
| `config.VALIDATE_ONLY` | `boolean` | TODO. Default: `false` |
| `config.VERBOSE` | `boolean` | Verbose logging. Default: `false` |
| `config.ORDER_PROPOSALS_FEED_IDENTIFIER` | `string` | TODO. Default: `null` |
| `config.REQUEST_LOGGING_ENABLED` | `boolean` | Extra logging around the request. Default: `false` |
| `options` | `object` | Optional features |
| `options.multibar` | `import('cli-progress').MultiBar` | If using cli-progress.Multibar, this can be supplied and harvesting updates will be provided to the multibar. Default: `null` |
| `options.pauseResume` | `{waitIfPaused: () => Promise}` | Function, if implemented, that can be used to pause harvesting. Default: `null` |
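As an illustration of two of the optional parameters, the sketch below builds an object exposing `waitIfPaused` (the shape documented for `options.pauseResume`) and a custom `config.howLongToSleepAtFeedEnd`. `createPauseResume` and its `pause`, `resume` and `isPaused` members are hypothetical helpers, not library exports. How harvestRPDE calls `waitIfPaused` internally is not described here, so the sketch only assumes that the harvest awaits the returned promise before continuing.

```javascript
/**
 * Minimal pause/resume controller exposing `waitIfPaused`.
 * `createPauseResume` is a hypothetical helper, not a library export.
 */
function createPauseResume() {
  let resumed = Promise.resolve(); // already resolved while not paused
  let release = null;              // resolver for the current pause, if any
  return {
    pause() {
      if (release) return; // already paused
      resumed = new Promise((resolve) => {
        release = resolve;
      });
    },
    resume() {
      if (!release) return; // not paused
      release();
      release = null;
    },
    isPaused: () => release !== null,
    // Resolves immediately when running, or on resume() when paused.
    waitIfPaused: () => resumed,
  };
}

const pauseResume = createPauseResume();

// Optional parameters, using only names listed under Optional Parameters.
const optionalParams = {
  config: {
    // Poll less aggressively than the 500 ms default after the feed end.
    howLongToSleepAtFeedEnd: () => 10000,
    VERBOSE: true,
  },
  options: { pauseResume },
};
```

These would be merged with the required parameters, for example `harvestRPDE({ ...requiredParams, ...optionalParams })`, where `requiredParams` is whatever object holds the required values.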

createFeedContext

Function that creates a FeedContext object

Required Parameters

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `feedContextIdentifier` | `string` | Unique identifier for feed within the dataset, e.g. ScheduledSession |
| `baseUrl` | `string` | Feed URL to harvest |

Optional Parameters

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `multibar` | `import('cli-progress').MultiBar` | If using cli-progress.Multibar, this can be supplied and context values will be provided to the multibar. Default: `null` |

progressFromContext

Function that returns harvesting progress values from a FeedContext object

Required Parameters

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `context` | `FeedContext` | FeedContext object to get progress values from |

harvestRpdeLossless

harvestRpdeLossless has the same function signature as harvestRPDE. However, it can handle modified values that are too large to be represented exactly as JavaScript numbers, i.e. greater than 2^53. It does this by storing them as strings in memory.

For more guidance on how to handle these values, see here.
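The precision problem can be seen with plain JavaScript, independently of this library. The sketch below shows a value just above 2^53 being rounded by `JSON.parse`, and one way to compare such values when they are held as strings. `compareModified` is a hypothetical helper, and it assumes the strings contain plain integers.

```javascript
// 2^53 + 1 cannot be represented exactly as a JavaScript number.
const rawPage = '{"items":[{"id":"a","modified":9007199254740993}]}';

// Plain JSON.parse silently rounds the value to 2^53.
const lossy = JSON.parse(rawPage).items[0].modified;
console.log(lossy); // 9007199254740992

/**
 * Compares two `modified` values held as integer strings.
 * Returns -1, 0 or 1. `compareModified` is a hypothetical helper.
 */
function compareModified(a, b) {
  const left = BigInt(a);
  const right = BigInt(b);
  if (left < right) return -1;
  if (left > right) return 1;
  return 0;
}

// As strings converted to BigInt, the two values remain distinct.
console.log(compareModified('9007199254740993', '9007199254740992')); // 1

// As numbers, they collapse to the same value.
console.log(Number('9007199254740993') === Number('9007199254740992')); // true
```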

Developing

TypeScript

The code is written in native JS, but uses TypeScript to check for type errors. TypeScript determines types from the JSDoc annotations in our native .js files (see: Type Checking JavaScript Files).
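For readers unfamiliar with this approach, here is a small, generic sketch of JSDoc-typed JavaScript. It is not taken from this codebase: `FeedSummary` and `formatFeedSummary` are hypothetical names. The file runs as ordinary JavaScript, while TypeScript can check it using the annotations.

```javascript
// @ts-check

/**
 * @typedef {object} FeedSummary
 * @property {string} feedContextIdentifier
 * @property {number} pageCount
 */

/**
 * Formats a one-line summary for a feed.
 * @param {FeedSummary} summary
 * @returns {string}
 */
function formatFeedSummary(summary) {
  return `${summary.feedContextIdentifier}: ${summary.pageCount} page(s)`;
}

console.log(formatFeedSummary({ feedContextIdentifier: 'ScheduledSession', pageCount: 3 }));
// prints "ScheduledSession: 3 page(s)"

// When checked by TypeScript (for example `tsc --allowJs --checkJs --noEmit`),
// passing pageCount: '3' would be reported as a type error.
```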

In order for these types to be used by other projects, they must be saved to TypeScript declaration files. This is enabled by our tsconfig.json, which specifies that declaration files are to be generated and saved to built-types/.

As an aside, the package's types must be saved to .d.ts files because TypeScript does not automatically use JS-defined types from libraries. There is a good reason for this, and there are proposals to allow it to work at least for certain packages. See some of the discussion here: https://github.com/microsoft/TypeScript/issues/33136.

For this reason, TypeScript types should be regenerated after code changes so that consumers of this library can use the new types. The openactive-test-suite project does this automatically in its pre-commit hook, which calls npm run gen-types.

TypeScript-related scripts:

  • check-types: This uses the tsconfig.check.json config, which does not emit any TS declaration files; it only checks that there are no type errors. This is used for code tests.

  • gen-types: This uses the tsconfig.gen.json config, which emits TS declaration files into built-types/.

    Additionally, it copies programmer-created .d.ts files from our source code (e.g. src/types/Criteria.d.ts) into built-types/. This is because our code references these types, so they must be in the built-types/ directory so that the relative paths match (e.g. so that import('../types/Criteria').Criteria works).