ldes-client

v0.0.16

The LDES client

This package provides common tooling to work with LDESes.

Its main function is to replicate an LDES and keep in sync with it. It pipes the result to another processor that complies with the connector architecture.

Install the package using npm install -g ldes-client

Replication and synchronization

The LDES client has two modes: sync and replicate. Both are accessible via the ldes-client command.

ldes-client <url> [-f] [--save <path>] [--pollInterval <number>] [--shape <shapefile>]  

Flags

  • -f --follow: follow the LDES, the client stays in sync
  • -s --state: filepath to the save state file to use, used both to resume and to update
  • --pollInterval: time to wait between polling the LDES when the client is following the LDES.
  • --shape: shape file to which LDES members should conform (overrides the shape configured by the LDES)
  • -t --default-timezone: default timezone to use for dates in tree:InBetweenRelation. Accepted values: AoE, Z or ±HH:mm. Default: AoE.
  • --condition <condition_file>: filter the LDES stream to only emit members that adhere to this condition
  • -m, --metadata: include metadata in the emitted members. Notifies the LDES server that the client is interested in metadata.
  • Others coming soon
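
When the command is started from a script, the flags above can be assembled programmatically. The following is a minimal sketch, not part of the package: the CliOptions type, its field names, and buildArgs are hypothetical helpers, and only the flags themselves are taken from the list above.

```typescript
// Hypothetical options type: the field names are illustrative, only the
// flags they map to come from the list above.
interface CliOptions {
  url: string;
  follow?: boolean; // -f
  statePath?: string; // -s
  pollInterval?: number; // --pollInterval
  shapeFile?: string; // --shape
  defaultTimezone?: string; // -t, e.g. "AoE", "Z" or "+02:00"
  conditionFile?: string; // --condition
  metadata?: boolean; // -m
}

// Turn the options into an argument list for the ldes-client command.
function buildArgs(o: CliOptions): string[] {
  const args: string[] = [o.url];
  if (o.follow) args.push("-f");
  if (o.statePath !== undefined) args.push("-s", o.statePath);
  if (o.pollInterval !== undefined) {
    args.push("--pollInterval", String(o.pollInterval));
  }
  if (o.shapeFile !== undefined) args.push("--shape", o.shapeFile);
  if (o.defaultTimezone !== undefined) args.push("-t", o.defaultTimezone);
  if (o.conditionFile !== undefined) {
    args.push("--condition", o.conditionFile);
  }
  if (o.metadata) args.push("-m");
  return args;
}

// Example URL and file name are placeholders.
const cliArgs = buildArgs({
  url: "https://example.org/ldes",
  follow: true,
  conditionFile: "./condition.ttl",
});
console.log(["ldes-client", ...cliArgs].join(" "));
```

The resulting array can be handed to whatever process-spawning facility your runtime offers.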

You can also use this as a library in your TS/JS projects. See the client.ts file for documentation.

Conditions

When passing a condition file to the ldes-client CLI, the expected content is a condition whose subject is the file itself (written as <> in the examples below).

Simple condition

@prefix csp: <http://vocab.deri.ie/csp#>.
@prefix tree: <https://w3id.org/tree#>.
<> a csp:Condition;
    # Type of relation to filter on
    tree:relationType tree:GreaterThanOrEqualToRelation;
    # Path to extract values for the filter
    tree:path <http://def.isotc211.org/iso19156/2011/Observation#OM_Observation.resultTime>;
    # Alpha value for comparison
    tree:value "2024-01-14T08:35:35.720Z";
    tree:compareType "date".
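
To make the semantics of such a condition concrete, here is a minimal sketch of how a single condition could be checked against a value that has already been extracted from a member via tree:path. The LeafCondition type and the matches function are hypothetical and are not the package's own API; treating a missing compareType as a plain string comparison is an assumption.

```typescript
// Hypothetical in-memory form of a csp:Condition (not the library's types).
type RelationType =
  | "GreaterThanRelation"
  | "GreaterThanOrEqualToRelation"
  | "LessThanRelation"
  | "LessThanOrEqualToRelation"
  | "EqualToRelation";

interface LeafCondition {
  relationType: RelationType;
  value: string;
  compareType?: "date"; // assumption: absent means plain string comparison
}

// Negative if a < b, zero if equal, positive if a > b.
function compareValues(a: string, b: string, asDate: boolean): number {
  if (asDate) return Date.parse(a) - Date.parse(b);
  return a < b ? -1 : a > b ? 1 : 0;
}

// Check one extracted member value against the condition.
function matches(cond: LeafCondition, memberValue: string): boolean {
  const c = compareValues(memberValue, cond.value, cond.compareType === "date");
  switch (cond.relationType) {
    case "GreaterThanRelation":
      return c > 0;
    case "GreaterThanOrEqualToRelation":
      return c >= 0;
    case "LessThanRelation":
      return c < 0;
    case "LessThanOrEqualToRelation":
      return c <= 0;
    case "EqualToRelation":
      return c === 0;
  }
}

// The simple condition above, expressed in the hypothetical type.
const simple: LeafCondition = {
  relationType: "GreaterThanOrEqualToRelation",
  value: "2024-01-14T08:35:35.720Z",
  compareType: "date",
};
console.log(matches(simple, "2024-02-01T00:00:00.000Z"));
```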

And condition

@prefix csp: <http://vocab.deri.ie/csp#>.
@prefix tree: <https://w3id.org/tree#>.
<> a csp:And;
  csp:and [
    a csp:Condition;
    tree:relationType tree:GreaterThanOrEqualToRelation;
    tree:path <http://def.isotc211.org/iso19156/2011/Observation#OM_Observation.resultTime>;
    tree:value "2024-01-14T08:35:35.720Z";
    tree:compareType "date";
  ];
  csp:and [
    a csp:Condition;
    tree:relationType tree:LessThanRelation;
    tree:path <http://def.isotc211.org/iso19156/2011/Observation#OM_Observation.resultTime>;
    tree:value "2024-01-15T08:35:35.720Z";
    tree:compareType "date";
  ].

Or condition

@prefix sh: <http://www.w3.org/ns/shacl#>.
@prefix sosa: <http://www.w3.org/ns/sosa/>.
@prefix as: <https://www.w3.org/ns/activitystreams#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix csp: <http://vocab.deri.ie/csp#>.
@prefix tree: <https://w3id.org/tree#>.
# Placeholder namespace for the example sensors (assumption; replace with your own)
@prefix ex: <http://example.org/>.

<> a csp:Or;
  csp:or [
    a csp:Condition;
    tree:relationType tree:EqualToRelation;
    tree:value ex:sensor1;
    tree:path ( sosa:madeBySensor [ sh:inversePath sosa:hosts ] );
  ];
  csp:or [
    a csp:Condition;
    tree:relationType tree:EqualToRelation;
    tree:value ex:sensor2;
    tree:path ( sosa:madeBySensor [ sh:inversePath sosa:hosts ] );
  ].
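
The And and Or conditions above nest simple conditions. A minimal sketch of how such a tree could be evaluated follows. The Cond type and evaluate function are hypothetical, and each leaf is reduced to a plain predicate so that the example stays self-contained.

```typescript
// Hypothetical tree mirroring csp:And, csp:Or and csp:Condition.
type Cond<M> =
  | { kind: "leaf"; test: (member: M) => boolean }
  | { kind: "and"; items: Cond<M>[] }
  | { kind: "or"; items: Cond<M>[] };

// Recursively evaluate the tree for one member.
function evaluate<M>(cond: Cond<M>, member: M): boolean {
  switch (cond.kind) {
    case "leaf":
      return cond.test(member);
    case "and":
      return cond.items.every((c) => evaluate(c, member));
    case "or":
      return cond.items.some((c) => evaluate(c, member));
  }
}

// Illustrative member shape: the sensor reached through the tree:path.
interface Observation {
  sensor: string;
}

// Counterpart of the Or condition above: sensor1 or sensor2.
const sensorFilter: Cond<Observation> = {
  kind: "or",
  items: [
    { kind: "leaf", test: (m) => m.sensor === "sensor1" },
    { kind: "leaf", test: (m) => m.sensor === "sensor2" },
  ],
};

console.log(evaluate(sensorFilter, { sensor: "sensor2" }));
```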

Use cases

  • I want to replicate all data as fast as possible. This takes a long time so I want options to resume, and later stay in sync.
  • I want to stay up to date with all entities that are present in the data. I don't care about old data; only the newest data per entity is required. Afterwards, I want to stay in sync.
    • It is acceptable that some data is emitted twice, as long as the timestamps are in the correct order.
    • It is not acceptable that some data is emitted twice. If an entity is re-emitted, the entity has changed at that instant.
  • I want to do time series analysis. For this I want all data in order, according to their timestamp path. I want options to resume, and later stay in sync.
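
The second use case, in its strict variant where nothing may be emitted twice, can be sketched as a filter that remembers the newest timestamp seen per entity. This is an illustration only: the Member shape and the LatestPerEntity class are hypothetical and do not reflect the client's internals.

```typescript
// Illustrative member shape: one version of an entity.
interface Member {
  id: string;
  entity: string; // identifier of the entity this member is a version of
  timestamp: number; // epoch milliseconds
}

// Lets a member through only if it is newer than everything seen so far
// for the same entity, so a re-emitted entity always means a change.
class LatestPerEntity {
  private latest = new Map<string, number>();

  accept(m: Member): boolean {
    const seen = this.latest.get(m.entity);
    if (seen !== undefined && m.timestamp <= seen) return false;
    this.latest.set(m.entity, m.timestamp);
    return true;
  }
}

const filter = new LatestPerEntity();
const incoming: Member[] = [
  { id: "a#2", entity: "a", timestamp: 200 },
  { id: "a#1", entity: "a", timestamp: 100 }, // older version, dropped
  { id: "b#1", entity: "b", timestamp: 150 },
  { id: "a#2", entity: "a", timestamp: 200 }, // duplicate, dropped
];
const emitted = incoming.filter((m) => filter.accept(m)).map((m) => m.id);
console.log(emitted);
```

Note that the map grows with the number of entities, which is one instance of the unbounded state problem.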

Software architecture

The client contains two parts: fetching the fragments and emitting the members. Use cases influence these parts differently.

For example:

  • time series analysis wants to fetch the pages that contain older members first; the member emitter only wants to emit a member once the oldest page has been found.
  • emitting the latest versions is the inverse: fetch the youngest pages first, so that some members can already be emitted.

Difficulties:

  • the member emitter needs to know what the fragment fetcher is doing (does it know whether the oldest page has been fetched?).
  • how do we handle unbounded size?
    • Keeping state of emitted members is unbounded
    • Keeping state of visited pages is unbounded

These are also influenced by the configuration.
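
One common way to cap such state is to remember only the most recently added identifiers. The sketch below illustrates that idea with a hypothetical BoundedSet class. It is not how the client stores state, and evicting an identifier means the corresponding page or member could be processed again later.

```typescript
// Remembers at most `capacity` values and evicts the oldest one first.
class BoundedSet<T> {
  private items = new Set<T>();
  private capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be at least 1");
    this.capacity = capacity;
  }

  add(value: T): void {
    if (this.items.has(value)) return;
    if (this.items.size >= this.capacity) {
      // A Set iterates in insertion order, so the first value is the oldest.
      const oldest = this.items.values().next().value as T;
      this.items.delete(oldest);
    }
    this.items.add(value);
  }

  has(value: T): boolean {
    return this.items.has(value);
  }

  get size(): number {
    return this.items.size;
  }
}

// Placeholder page URLs.
const seenPages = new BoundedSet<string>(2);
seenPages.add("https://example.org/page/1");
seenPages.add("https://example.org/page/2");
seenPages.add("https://example.org/page/3"); // evicts page/1
console.log(seenPages.has("https://example.org/page/1"));
```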

Fragment Fetcher

The fragment fetcher fetches the fragments. These fragments are targeted by relation chains, of which only two types exist: important and unimportant. For example, when emitting members in order, the important relations are the GreaterThan relations; all other relation types are treated as equivalent (unimportant). In other words, we can only emit members once all unimportant relations have been fetched and processed.

Relation chains are chains because fetching a page can reveal new relations pointing from that page. We need to distinguish between a relation that follows an important relation and one that follows an unimportant relation. Important relations squash unimportant relations; these chains should only be fetched once all unimportant relations are done. Unimportant relations squash other unimportant relations. Important relations squash other important relations, and the new value is the bigger of the two. The chains are thus ordered as follows: first unimportant relations, then important relations ordered by value.
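
The squashing and ordering rules can be sketched as two small functions. This is one reading of the paragraph above, not the package's implementation: the Chain type, extend, and compareChains are hypothetical, and relation values are assumed to be numbers such as epoch milliseconds.

```typescript
// Hypothetical chain: `value` is set only when the chain is important.
interface Chain {
  target: string; // page the chain points to
  value?: number;
}

// Follow relation `rel`, found on the page that `chain` led to.
// Any important part makes the result important; two important values
// squash into the bigger one.
function extend(chain: Chain, rel: Chain): Chain {
  const values = [chain.value, rel.value].filter(
    (v): v is number => v !== undefined,
  );
  if (values.length === 0) return { target: rel.target };
  return { target: rel.target, value: Math.max(...values) };
}

// Unimportant chains sort first, important chains follow by value.
function compareChains(a: Chain, b: Chain): number {
  if (a.value === undefined && b.value === undefined) return 0;
  if (a.value === undefined) return -1;
  if (b.value === undefined) return 1;
  return a.value - b.value;
}

// Placeholder page names.
const queue: Chain[] = [
  { target: "page-c", value: 5 },
  { target: "page-a" },
  { target: "page-b", value: 2 },
];
queue.sort(compareChains);
console.log(queue.map((c) => c.target));
```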

These chains dictate the order in which pages should be fetched. Because fetching is asynchronous, we can only interpret a page if no pages that came from a smaller relation are in flight. In code this is denoted by the heaps readyPage and inFlightPages, which both contain relation chains. Note that relations can be interpreted at any time.
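
The rule for when a fetched page may be interpreted can be sketched as follows. Plain arrays stand in for the two heaps to keep the example short. The Chain type, orderKey, and nextInterpretable are hypothetical names, and mapping unimportant chains to negative infinity is an assumption that places them before every important chain.

```typescript
// Hypothetical chain: `value` is set only when the chain is important.
interface Chain {
  target: string;
  value?: number;
}

// Unimportant chains sort before every important chain.
const orderKey = (c: Chain): number => c.value ?? Number.NEGATIVE_INFINITY;

// Return the ready page that may be interpreted now, if any: the smallest
// ready chain, provided no in-flight chain is smaller than it.
function nextInterpretable(
  ready: Chain[],
  inFlight: Chain[],
): Chain | undefined {
  if (ready.length === 0) return undefined;
  const smallest = ready.reduce((a, b) => (orderKey(b) < orderKey(a) ? b : a));
  const blocked = inFlight.some((c) => orderKey(c) < orderKey(smallest));
  return blocked ? undefined : smallest;
}

// page-2 has arrived, but page-1 (smaller value) is still being fetched.
const waiting = nextInterpretable(
  [{ target: "page-2", value: 10 }],
  [{ target: "page-1", value: 5 }],
);
console.log(waiting);
```

A heap gives the same answer while finding the smallest chain in logarithmic rather than linear time.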

When a page is ready to be interpreted, the helper is asked to interpret the page. A special value called the marker is derived from the value of the incoming chain if the chain was important. For example, when emitting members in order, the member manager can always extract the members that are found, but can only emit them when a marker is issued, and then only the members that are smaller than that marker.
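
The effect of a marker on ordered emission can be illustrated with a small buffer. This is a sketch under assumptions: the OrderedBuffer class and Member shape are hypothetical, and timestamps are taken to be numbers comparable with the marker.

```typescript
// Illustrative member shape.
interface Member {
  id: string;
  timestamp: number;
}

// Holds extracted members until a marker allows them to be emitted.
class OrderedBuffer {
  private pending: Member[] = [];

  // Members can be added at any time, in any order.
  add(member: Member): void {
    this.pending.push(member);
  }

  // Return, in timestamp order, all members smaller than the marker
  // and keep the rest for a later marker.
  release(marker: number): Member[] {
    const ready = this.pending
      .filter((m) => m.timestamp < marker)
      .sort((a, b) => a.timestamp - b.timestamp);
    this.pending = this.pending.filter((m) => m.timestamp >= marker);
    return ready;
  }
}

const buffer = new OrderedBuffer();
buffer.add({ id: "m3", timestamp: 30 });
buffer.add({ id: "m1", timestamp: 10 });
buffer.add({ id: "m2", timestamp: 20 });
const firstBatch = buffer.release(25).map((m) => m.id);
console.log(firstBatch);
```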

Fault tolerance

The fetcher tries to be fault tolerant. HTTP codes that indicate that the server is overloaded or that something else is going wrong are caught, and the request is retried. This is the default behaviour when the provided config does not provide a fetch function.

Caught HTTP codes:

  • 408: Request timeout
  • 425: Too Early
  • 429: Too Many Requests
  • 500: Internal Server Error
  • 502: Bad Gateway
  • 503: Service Unavailable
  • 504: Gateway Timeout

// Provide your own codes with a custom retry function
config.fetch = retry_fetch(fetch, [408, 425, 429, 500, 502, 503, 504], 500, 5);
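
For readers who want to see what a retrying fetch wrapper can look like, here is a minimal, self-contained sketch. It is not the package's retry_fetch: the retryFetch function, its exponential backoff, and the meaning of its last two parameters (base delay in milliseconds, maximum number of retries) are assumptions made for illustration and need not match the arguments in the call above.

```typescript
// Minimal response shape, so the sketch works with any fetch-like function.
interface StatusResponse {
  status: number;
}
type FetchLike<R extends StatusResponse> = (url: string) => Promise<R>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Wrap fetchFn so that responses with a status in retryCodes are retried,
// waiting baseDelayMs, then twice that, and so on (assumed backoff scheme).
function retryFetch<R extends StatusResponse>(
  fetchFn: FetchLike<R>,
  retryCodes: number[],
  baseDelayMs: number,
  maxRetries: number,
): FetchLike<R> {
  return async (url: string): Promise<R> => {
    for (let attempt = 0; ; attempt++) {
      const response = await fetchFn(url);
      const retryable = retryCodes.includes(response.status);
      // Give up after maxRetries and hand the last response to the caller.
      if (!retryable || attempt >= maxRetries) return response;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  };
}

// Usage with a fake fetch that fails twice before succeeding.
let calls = 0;
const flaky: FetchLike<StatusResponse> = async () => ({
  status: ++calls < 3 ? 503 : 200,
});
const resilient = retryFetch(flaky, [503], 1, 5);
const done = resilient("https://example.org/ldes").then((r) => r.status);
```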

Member Manager

The member manager extracts members and emits them when they are ready. Extracting members is asynchronous, because sometimes out-of-band requests are made. This results in currentPromises that are awaited before sorting and emitting members.

The streaming API comes with a requirement to always emit at least one member per poll. To achieve this, the memberManager has a function called reset(), which returns a promise that resolves when a member is emitted.
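
The idea of a promise that settles on the next emitted member can be sketched as below. This is only an illustration of the pattern: the EmitSignal class is hypothetical and does not mirror the memberManager's actual code.

```typescript
// Lets a caller wait until the next member has been emitted.
class EmitSignal<M> {
  private resolveNext: ((member: M) => void) | undefined;

  // Arm the signal; the promise resolves with the next emitted member.
  reset(): Promise<M> {
    return new Promise<M>((resolve) => {
      this.resolveNext = resolve;
    });
  }

  // Called whenever a member is emitted.
  emit(member: M): void {
    this.resolveNext?.(member);
    this.resolveNext = undefined;
  }
}

const signal = new EmitSignal<string>();
const nextMember = signal.reset();
signal.emit("member-1"); // resolves the pending promise
signal.emit("member-2"); // nobody is waiting, so this is ignored
nextMember.then((m) => console.log("poll saw", m));
```

A poll loop could await such a promise before starting the next poll, which guarantees at least one member per poll.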

Expected Features

  • Use view that is indicated as EventSource
  • Unit tests, performance tests, integration tests and conformance tests.
  • Use the client as a library
  • Use the client to pipe members to stdout for testing purposes
  • Use the client with the connector architecture
  • Provide a json object (or file) denoting the expected connector architecture channel to write to.
  • Maybe add a small feature flag to indicate that you only want to follow greaterThan relations.
  • Add a 'keep all state' flag; by default only some seen page IDs would be kept to conserve memory
  • Flag to indicate that emitted members should not be stored; this is the default only if an EventSource is found
  • Use a custom SHACL shape instead of the LDES-provided shape

Other tooling available from this repository

  • A TREE/LDES hypermedia extractor
  • A class comparing and ordering links based on the tree:Relation spec

Authors and license

  • Tests and design: Pieter Colpaert
  • Actual implementation: Arthur Vercruysse

© 2023 -- Ghent University - IMEC. MIT license