dracoql

v0.4.1

DracoQL is a TypeScript-based DSL for web scraping and data manipulation, with tooling to pull data from files, the web, or databases and pipe it to different outputs.

DracoQL 🐉

DracoQL is an embeddable query language for processing and transforming data from web resources and writing it to files and databases.

The language is in active development; please report any bugs under issues.

Install

npm install dracoql

Usage

import * as draco from "dracoql";

draco.eval(`PIPE "Hello world!" TO STDOUT`);

Additionally, the caller can read runtime variables through the context passed to a callback:

import * as draco from "dracoql";

draco.eval(`VAR data = FETCH "https://jsonplaceholder.typicode.com/todos/" AS JSON`, (ctx) => {
  console.log(ctx.getVar("data"));
});
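
If the surrounding program is promise-based, the callback form above can be wrapped once. The following is only a sketch: `EvalFn`, `EvalContext`, and `readVar` are names introduced here, the types merely mirror the call shape shown above rather than DracoQL's own declarations, and it assumes the callback fires once after the script has finished.

```typescript
// Hypothetical types mirroring the call shape shown above; they are not
// taken from DracoQL's own type declarations.
interface EvalContext {
  getVar(name: string): unknown;
}
type EvalFn = (source: string, done?: (ctx: EvalContext) => void) => unknown;

// Resolve with the value of one runtime variable once the script has run.
function readVar(evalFn: EvalFn, source: string, name: string): Promise<unknown> {
  return new Promise((resolve, reject) => {
    try {
      const result = evalFn(source, (ctx) => resolve(ctx.getVar(name)));
      // If eval happens to return a promise, forward its rejection.
      if (result instanceof Promise) {
        result.catch(reject);
      }
    } catch (err) {
      reject(err);
    }
  });
}
```

With the real module this might be called as `readVar(draco.eval, script, "data")`, provided `draco.eval` really does accept a script and a callback in that order.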

Syntax

Variables

A variable can hold an INT_LITERAL, a STRING_LITERAL, or an expression. Draco does not support escape sequences in strings; to include double quotes in a string, delimit it with single quotes ('...') instead.

VAR foo = 1
VAR bar = "hello world!"
VAR baz = FETCH "https://example.org"
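
Because there are no escape sequences, host code that builds scripts has to choose a delimiter that does not occur in the text. The helper below is a hypothetical sketch (`toDracoString` is not part of the package) and assumes single-quoted literals are accepted wherever double-quoted ones are.

```typescript
// Hypothetical helper: pick a delimiter that does not occur in the text,
// since the language has no escape sequences.
function toDracoString(text: string): string {
  const hasDouble = text.includes('"');
  const hasSingle = text.includes("'");
  if (hasDouble && hasSingle) {
    throw new Error("cannot quote text containing both ' and \" without escaping");
  }
  return hasDouble ? `'${text}'` : `"${text}"`;
}

const script = `VAR greeting = ${toDracoString('say "hi"')}`;
console.log(script); // VAR greeting = 'say "hi"'
```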

Networking

Draco provides FETCH as the primary method for interacting with a URL.

Fetch Response

VAR data = FETCH "https://example.org"
      HEADER "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0"
      HEADER "Content-type: application/json"
      METHOD "GET"

Here the data variable holds a response object, which looks like this:

{
  headers: any,
  status: number,
  redirected: boolean,
  url: string
}
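
For TypeScript callers, the shape above can be transcribed as an interface. `FetchResponse` and `isSuccess` are names introduced for this sketch and are not exports of the package; only the four field names come from the listing above.

```typescript
// Hypothetical interface transcribing the fields listed above.
interface FetchResponse {
  headers: unknown; // listed as `any` above; narrow before use
  status: number;
  redirected: boolean;
  url: string;
}

// True for 2xx status codes.
function isSuccess(res: FetchResponse): boolean {
  return res.status >= 200 && res.status < 300;
}

// Made-up sample value for illustration.
const sample: FetchResponse = {
  headers: { "content-type": "text/html" },
  status: 200,
  redirected: false,
  url: "https://example.org",
};
console.log(isSuccess(sample)); // true
```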

Additionally, you can make POST requests:

VAR data = FETCH "https://reqres.in/api/users" METHOD "POST"
  HEADER "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0"
  HEADER "Content-type: application/json"
  BODY JSON '{"name": "morpheus", "job": "leader"}'
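
When such requests are assembled from host code, a small builder keeps the clauses consistent. This is a sketch under assumptions: `FetchOptions`, `plain`, and `buildFetch` are hypothetical names, and the clause order simply copies the POST example above, since the source does not say whether the order matters.

```typescript
// Hypothetical options type; field names are chosen for this sketch.
interface FetchOptions {
  url: string;
  method?: "GET" | "POST";
  headers?: Record<string, string>;
  jsonBody?: unknown;
}

// Reject text that a double-quoted literal cannot hold (no escapes exist).
function plain(text: string): string {
  if (text.includes('"')) throw new Error(`double quote not allowed in: ${text}`);
  return text;
}

// Build a FETCH statement in the clause order used by the POST example.
function buildFetch(name: string, opts: FetchOptions): string {
  const method = opts.method ? ` METHOD "${opts.method}"` : "";
  const lines = [`VAR ${name} = FETCH "${plain(opts.url)}"${method}`];
  for (const [key, value] of Object.entries(opts.headers ?? {})) {
    lines.push(`  HEADER "${plain(key)}: ${plain(value)}"`);
  }
  if (opts.jsonBody !== undefined) {
    const json = JSON.stringify(opts.jsonBody);
    if (json.includes("'")) throw new Error("JSON body must not contain a single quote");
    lines.push(`  BODY JSON '${json}'`);
  }
  return lines.join("\n");
}

console.log(
  buildFetch("data", {
    url: "https://reqres.in/api/users",
    method: "POST",
    headers: { "Content-type": "application/json" },
    jsonBody: { name: "morpheus", job: "leader" },
  })
);
```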

Fetch JSON

VAR data = FETCH "https://reqres.in/api/users" 
  HEADER "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0"
  AS JSON

Here data will hold the parsed JSON object.

Fetch HTML

VAR data = FETCH "https://reqres.in" 
  HEADER "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0"
  AS HTML

Here data will hold the parsed HTML object, which looks like this:

{
  tag: string,
  attributes: any,
  children: [...]
}
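
A tree of this shape can be searched with an ordinary depth-first walk. The sketch below is illustrative only: `HtmlNode` and `findByTag` are hypothetical names, and it assumes every entry in `children` is itself a node of the same shape, which the listing above does not confirm.

```typescript
// Hypothetical type transcribing the shape above; it assumes every child
// is itself a node of the same shape.
interface HtmlNode {
  tag: string;
  attributes: Record<string, unknown>;
  children: HtmlNode[];
}

// Depth-first search collecting every node with the given tag name
// (case-sensitive comparison).
function findByTag(root: HtmlNode, tag: string): HtmlNode[] {
  const matches: HtmlNode[] = [];
  const stack: HtmlNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop() as HtmlNode;
    if (node.tag === tag) matches.push(node);
    // Push children in reverse so they are visited in document order.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
  return matches;
}

// Made-up tree for illustration.
const tree: HtmlNode = {
  tag: "div",
  attributes: {},
  children: [
    { tag: "h2", attributes: { id: "a" }, children: [] },
    { tag: "p", attributes: {}, children: [{ tag: "h2", attributes: { id: "b" }, children: [] }] },
  ],
};
console.log(findByTag(tree, "h2").map((n) => n.attributes.id)); // [ 'a', 'b' ]
```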

Caching HTML

Additionally, Draco has a CACHE keyword, which takes a time in milliseconds and an optional path to the HTML cache directory.

Here is an example. NOTE: caching only works with the HTML data type.

VAR data = FETCH "https://example.org"
  CACHE 10000
  AS HTML
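
To make the millisecond value concrete, here is a sketch of a freshness check, assuming the CACHE number is a lifetime after which a stored page is considered stale. `CacheEntry` and `isFresh` are hypothetical and say nothing about how DracoQL actually stores or expires cached pages.

```typescript
// Hypothetical cache entry; DracoQL's on-disk format is not described here.
interface CacheEntry {
  storedAt: number; // epoch milliseconds when the page was saved
  html: string;
}

// An entry is fresh while its age is below the lifetime in milliseconds.
function isFresh(entry: CacheEntry, ttlMs: number, now: number = Date.now()): boolean {
  return now - entry.storedAt < ttlMs;
}

const entry: CacheEntry = { storedAt: 1000000, html: "<p>cached</p>" };
console.log(isFresh(entry, 10000, 1005000)); // true: 5 s old, 10 s allowed
console.log(isFresh(entry, 10000, 1010000)); // false: exactly 10 s old
```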

Headless HTML mode

To scrape HTML from SPAs, Draco offers an optional HEADLESS flag which, when enabled, uses Puppeteer to load and fetch the HTML page.

VAR data = FETCH "https://bloomberg.com"
  CACHE 6e5
  AS HTML HEADLESS

Piping

To get data out of the evaluator, use the PIPE keyword:

PIPE "hello world" TO STDOUT

You can also write data to a file:

PIPE "Draco was here" TO FILE "draco.txt"

Extraction

Draco has built-in support for extracting data with JSON queries and HTML selectors.

VAR res = FETCH "https://reqres.in/api/users" AS JSON
VAR data = EXTRACT "data.0.id" FROM res
PIPE data TO STDOUT

VAR res = FETCH "https://reqres.in" AS HTML
VAR headline = EXTRACT "h2.tagline:nth-child(1)" FROM res
PIPE headline TO STDOUT
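
The JSON query "data.0.id" reads as a dot-separated path of object keys and array indices. The resolver below sketches that reading in TypeScript; `resolvePath` is a hypothetical function, the sample object is made up, and DracoQL's real query semantics may differ.

```typescript
// Hypothetical resolver for dot-separated paths such as "data.0.id".
function resolvePath(value: unknown, path: string): unknown {
  let current: unknown = value;
  for (const segment of path.split(".")) {
    if (current === null || typeof current !== "object") {
      return undefined; // cannot descend into a primitive
    }
    // Arrays are indexed by their numeric string keys, so "0" works for both.
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Made-up sample data for illustration.
const sampleJson = { data: [{ id: 1, email: "user@example.org" }] };
console.log(resolvePath(sampleJson, "data.0.id")); // 1
```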

Examples

Fetch data and log it to the console

VAR data = FETCH "https://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/corpora/names/male.txt"
PIPE data TO STDOUT

Fetch data and put it to file

VAR data = FETCH "https://jsonplaceholder.typicode.com/users/1"
  AS JSON 
  OR DIE 

PIPE data TO FILE "user.json" 

Scrape data from a website

VAR data = FETCH "https://www.cnet.com/"

VAR headline = EXTRACT 
  ".c-pageHomeHightlights>div:nth-child(1)>div:nth-child(2)>div:nth-child(1)>a:nth-child(1)>div:nth-child(1)>div:nth-child(2)>div:nth-child(1)>h3:nth-child(1)>span:nth-child(1)"
  FROM data 
  AS HTML

VAR txt = EXTRACT innerText FROM headline 
  AS JSON

PIPE txt TO STDOUT

API

The draco module exports the lexer, the parser, and the interpreter.

import * as draco from "dracoql";

const lexer = new draco.lexer(`PIPE "hello world" TO STDOUT`);
const parser = new draco.parser(lexer.lex());
const interpreter = new draco.interpreter(parser.parse());

(async () => {
  await interpreter.run();
})()
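
The three stages above can be wrapped in one helper that reports which stage failed. This is a sketch under assumptions: `DracoLike` and `runScript` are hypothetical names, the structural types are inferred from the snippet above rather than from the package's declarations, and it assumes each stage signals failure by throwing or rejecting.

```typescript
// Hypothetical structural types inferred from the snippet above, not from
// the package's declarations.
interface DracoLike<Tokens, Ast> {
  lexer: new (source: string) => { lex(): Tokens };
  parser: new (tokens: Tokens) => { parse(): Ast };
  interpreter: new (ast: Ast) => { run(): Promise<unknown> };
}

// Lex, parse, and run one script, labelling which stage failed.
async function runScript<T, A>(mod: DracoLike<T, A>, source: string): Promise<void> {
  let stage = "lex";
  try {
    const tokens = new mod.lexer(source).lex();
    stage = "parse";
    const ast = new mod.parser(tokens).parse();
    stage = "run";
    await new mod.interpreter(ast).run();
  } catch (err) {
    throw new Error(`DracoQL ${stage} stage failed: ${String(err)}`);
  }
}
```

Passing the module in as an argument keeps the helper independent of the import and lets it be exercised with stand-in classes.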