@peterwmwong/gto

v0.0.7

GTO: Gremlin TypeScript ORM

WARNING: This project is an experiment and has not been put into production yet.

Developer Getting Started

npm ci
npm run test-db-build-docker-image
npm run test-db-start
npm run test

Enforced consistency and correctness

Currently, our repository methods are ad-hoc groupings of raw read/write DB queries, which makes it hard to enforce consistency or correctness.

Example: IngestionRow's rowNumber

The Excel import path added a rowNumber property, but the IRI/CSV import path did not. If repositories were object-oriented, with a consistent read/write view of properties, this would not have happened.

Example: IngestionRow's rowNumber PART 2.

Mike and I attempted to set rowNumber in the IRI/CSV import path, but only found out later that it was incorrectly set as a string instead of a number, which effectively caused ingestion row chunking to take forever or blow up.
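As a minimal sketch of how a single typed property definition could catch both rowNumber problems, assume a hypothetical IngestionRowProps interface and helper functions (none of these names come from GTO itself). Both import paths go through the same typed constructor, and a runtime guard covers values that arrive untyped:

```typescript
// Hypothetical property definition for an IngestionRow node (illustrative only).
interface IngestionRowProps {
  rowNumber: number;
  importPath: "excel" | "csv";
}

// Every import path builds rows through this one function, so omitting
// rowNumber or passing it as a string is a compile-time error.
function makeIngestionRow(props: IngestionRowProps): IngestionRowProps {
  // Runtime guard for values that slip past the type checker (e.g. parsed input).
  if (typeof props.rowNumber !== "number" || Number.isNaN(props.rowNumber)) {
    throw new TypeError("rowNumber must be a number");
  }
  return { ...props };
}

// A CSV cell is always a string, so the conversion has to be explicit.
function rowFromCsv(cell: string): IngestionRowProps {
  return makeIngestionRow({ rowNumber: Number.parseInt(cell, 10), importPath: "csv" });
}

const csvRow = rowFromCsv("42");
console.log(typeof csvRow.rowNumber);
```

Passing the raw cell straight through, as in makeIngestionRow({ rowNumber: "42", importPath: "csv" }), would not compile.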

Benefits from GTO

  • Nodes and Edges are created and filtered with the correct properties with the correct types
  • Traversals between Nodes and Edges are always correct
    • ex. Prevent accidentally going from Ingestion to IngestionRow through the wrong edge (HAS???)
    • ex. Prevent accidentally using the wrong direction (in? out?)
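One way such traversal guarantees could be expressed is to encode the start and end node of every edge in its type. The sketch below is an assumption about the general technique, not GTO's actual implementation; the HAS_ROW label, EdgeDef, and the At class are all hypothetical names:

```typescript
// Hypothetical node labels and edge definition (not GTO's real schema).
type NodeLabel = "Ingestion" | "IngestionRow";

interface EdgeDef<From extends NodeLabel, To extends NodeLabel> {
  label: string;
  from: From;
  to: To;
}

// The only edge declared from Ingestion to IngestionRow (label assumed).
const HAS_ROW: EdgeDef<"Ingestion", "IngestionRow"> = {
  label: "HAS_ROW",
  from: "Ingestion",
  to: "IngestionRow",
};

// A traversal position that remembers which node type it is currently on.
class At<L extends NodeLabel> {
  readonly label: L;
  readonly steps: string[];

  constructor(label: L, steps: string[] = []) {
    this.label = label;
    this.steps = steps;
  }

  // Forwards: only edges that start at the current label type-check.
  out<To extends NodeLabel>(edge: EdgeDef<L, To>): At<To> {
    return new At(edge.to, [...this.steps, `out('${edge.label}')`]);
  }

  // Backwards: only edges that end at the current label type-check.
  in_<From extends NodeLabel>(edge: EdgeDef<From, L>): At<From> {
    return new At(edge.from, [...this.steps, `in('${edge.label}')`]);
  }
}

const toRows = new At("Ingestion").out(HAS_ROW);
const backToIngestion = toRows.in_(HAS_ROW);
console.log(backToIngestion.steps.join("."));
```

With these types, new At("IngestionRow").out(HAS_ROW) is rejected by the compiler because HAS_ROW does not start at IngestionRow, which covers both the wrong-edge and wrong-direction cases.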

FUTURE IDEA: DB/Query Metrics/Statistics

  • Individual
  • Aggregate
    • What are the longest taking queries?
    • What are the most frequent queries?
    • What are the biggest queries?
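A rough sketch of how individual and aggregate timings might be collected, assuming every query runs through one wrapper. QueryMetrics and its method names are hypothetical; result size for the "biggest queries" question could be recorded the same way:

```typescript
interface QueryStat {
  count: number;
  totalMs: number;
  maxMs: number;
}

class QueryMetrics {
  private readonly stats = new Map<string, QueryStat>();

  // Individual: record one execution of a named query.
  record(name: string, elapsedMs: number): void {
    const stat = this.stats.get(name) ?? { count: 0, totalMs: 0, maxMs: 0 };
    stat.count += 1;
    stat.totalMs += elapsedMs;
    stat.maxMs = Math.max(stat.maxMs, elapsedMs);
    this.stats.set(name, stat);
  }

  // Wrap an async query so its wall-clock time is recorded, even if it throws.
  async time<T>(name: string, run: () => Promise<T>): Promise<T> {
    const start = Date.now();
    try {
      return await run();
    } finally {
      this.record(name, Date.now() - start);
    }
  }

  // Aggregate: name of the query with the highest score, if any were recorded.
  private top(score: (stat: QueryStat) => number): string | undefined {
    let best: string | undefined;
    let bestScore = -Infinity;
    this.stats.forEach((stat, name) => {
      if (score(stat) > bestScore) {
        best = name;
        bestScore = score(stat);
      }
    });
    return best;
  }

  mostFrequent(): string | undefined {
    return this.top((stat) => stat.count);
  }

  slowest(): string | undefined {
    return this.top((stat) => stat.maxMs);
  }
}
```

A call site would then look like metrics.time("rowsForIngestion", () => runQuery()), where runQuery stands in for whatever executes the traversal.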

FUTURE IDEA: Automated DB validation

It is still possible for the database's structure to be tampered with outside of the application (JupyterHub, direct '/gremlin' access).

As GTO provides a single source of truth/schema for the DB, we could easily build a script that runs through each GTO Node, Edge, and property and makes sure the DB is still in sync/valid.

  • ex. Using Node.name and new Node(g).properties, query for nodes that are missing required properties, have mis-typed properties, have extra properties, etc.
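The per-node check could look roughly like the sketch below. It assumes the schema has already been extracted into a plain property-name-to-type map (in GTO this would presumably come from Node.name and new Node(g).properties) and that the stored node has been fetched as a plain object. NodeSchema, Violation, and validateNode are hypothetical names:

```typescript
type PropType = "string" | "number" | "boolean";
type NodeSchema = Record<string, PropType>;

interface Violation {
  property: string;
  problem: "missing" | "mistyped" | "extra";
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Compare one stored node against the schema and list every mismatch.
function validateNode(schema: NodeSchema, stored: Record<string, unknown>): Violation[] {
  const violations: Violation[] = [];
  // Required properties: must exist and have the declared primitive type.
  for (const property of Object.keys(schema)) {
    if (!hasOwn(stored, property)) {
      violations.push({ property, problem: "missing" });
    } else if (typeof stored[property] !== schema[property]) {
      violations.push({ property, problem: "mistyped" });
    }
  }
  // Anything stored that the schema does not declare is reported as extra.
  for (const property of Object.keys(stored)) {
    if (!hasOwn(schema, property)) {
      violations.push({ property, problem: "extra" });
    }
  }
  return violations;
}

const ingestionRowSchema: NodeSchema = { rowNumber: "number" };
console.log(validateNode(ingestionRowSchema, { rowNumber: "7", legacy: true }));
```

For the sample call, the string-valued rowNumber is reported as mistyped and legacy as extra.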

A more accessible Graph DB

Currently, the learning curve for Product/QA/Developers to access data in the DB is steep for a number of reasons:

  • Gremlin Querying
    • Not as widely known as other DB query languages (ex. SQL)
    • Fewer Stack Overflow answers
    • Less documentation
    • Little-to-no tooling support (is this gremlin query syntactically correct?)
  • No Schema
    • Unlike SQL DBs, where out-of-the-box tooling can surface tables, columns (name, type), and relationships between tables, Neptune offers nothing comparable.
    • This makes it hard to even know where to begin when trying to access data:
      • What nodes/edges are available?
      • What properties do nodes/edges have, and what are their types (number? string?)
      • Which direction is the edge? (inE? outE? in_? out?)
    • Currently, the structure of the Graph DB is enforced by our code.
    • Even worse, the code currently has no single source of truth for which nodes/edges exist or which properties (name/type) they carry.
  • Constants
    • Labels for Nodes/Edges and property names are mostly in flat "lists" of constants
    • It is incredibly easy to use the wrong constant in the wrong place. Nothing stops you from trying to use P_VERTEX_TYPE when querying against an Edge.
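To illustrate how the flat-constants problem could be closed off at compile time, here is a sketch using branded string types. P_VERTEX_TYPE is the constant named above, but its value here is assumed. P_EDGE_WEIGHT and both helper functions are hypothetical:

```typescript
// Brand property names by the kind of graph element they belong to.
type VertexProp = string & { readonly __kind: "vertex" };
type EdgeProp = string & { readonly __kind: "edge" };

// The value "vertexType" is an assumption; only the constant's name is known.
const P_VERTEX_TYPE = "vertexType" as VertexProp;
// Hypothetical edge property, for contrast.
const P_EDGE_WEIGHT = "weight" as EdgeProp;

// Builds the text of a vertex filter; only vertex property names are accepted.
function hasOnVertex(prop: VertexProp, value: string): string {
  return `V().has('${prop}', '${value}')`;
}

// Builds the text of an edge filter; only edge property names are accepted.
function hasOnEdge(prop: EdgeProp, value: string): string {
  return `E().has('${prop}', '${value}')`;
}

console.log(hasOnVertex(P_VERTEX_TYPE, "Ingestion"));
console.log(hasOnEdge(P_EDGE_WEIGHT, "1"));
```

Calling hasOnEdge(P_VERTEX_TYPE, "Ingestion") fails to compile, because a VertexProp is not assignable to an EdgeProp even though both are strings at runtime.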

Benefits from GTO

  • Single source of truth for a Node/Edge and relationship between Nodes and Edges
  • Type/Editor driven querying
    • Type information provides users accurate hints on what's possible and valid
      • ex. After typing Ingestion. the editor suggests: all, byId
      • ex. After typing Ingestion.all(g, { the editor suggests: source (property)
      • ex. After typing Ingestion.all(g, {source: 'Annotator'}). the editor suggests: having, count, fetchOne, fetchAll, IngestionRows
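The editor hints come from ordinary TypeScript types. The sketch below imitates the shape of the calls listed here with an in-memory stand-in. It is not GTO's implementation: FakeGraph replaces the Gremlin g, the methods are synchronous where real queries would be async, and having and IngestionRows are left out:

```typescript
// Hypothetical property set for an Ingestion node.
interface IngestionProps {
  id: string;
  source: string;
}

// In-memory stand-in for a Gremlin traversal source (assumption for this sketch).
interface FakeGraph {
  ingestions: IngestionProps[];
}

// The object returned by a query; its methods are what the editor suggests next.
class IngestionQuery {
  private readonly matches: IngestionProps[];

  constructor(matches: IngestionProps[]) {
    this.matches = matches;
  }

  count(): number {
    return this.matches.length;
  }

  fetchOne(): IngestionProps | undefined {
    return this.matches[0];
  }

  fetchAll(): IngestionProps[] {
    return [...this.matches];
  }
}

const Ingestion = {
  // Partial<IngestionProps> means only real property names can be used as filters.
  all(g: FakeGraph, filter: Partial<IngestionProps> = {}): IngestionQuery {
    const keys = Object.keys(filter) as (keyof IngestionProps)[];
    return new IngestionQuery(
      g.ingestions.filter((node) => keys.every((key) => node[key] === filter[key])),
    );
  },

  byId(g: FakeGraph, id: string): IngestionQuery {
    return new IngestionQuery(g.ingestions.filter((node) => node.id === id));
  },
};

const g: FakeGraph = {
  ingestions: [
    { id: "a", source: "Annotator" },
    { id: "b", source: "Excel" },
  ],
};
console.log(Ingestion.all(g, { source: "Annotator" }).count());
```

A misspelled filter such as { sorce: 'Annotator' } is a compile-time error here, which is the same mechanism that lets the editor offer source as a completion.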

Discoveries

Gremlin: GraphTraversalSource, GraphTraversal, and Statics (anonymous traversals) expose different sets of steps. In the table, ✖ marks a step that is available.

| Step    | GraphTraversalSource | GraphTraversal | Statics |
|---------|----------------------|----------------|---------|
| E       | ✖                    |                |         |
| V       | ✖                    | ✖              | ✖       |
| addE    | ✖                    | ✖              | ✖       |
| addV    | ✖                    | ✖              | ✖       |
| toList  |                      | ✖              |         |
| iterate |                      | ✖              |         |
| next    |                      | ✖              |         |