@themarkup/blacklight-collector

v3.4.1

A real-time website privacy inspector.

Blacklight-Collector

NOTE: This repo contains some, but not all, of the code backing Blacklight. It may not be very useful on its own. We're thinking about ways to move more of the functionality into this package in order to make it more generally useful.

For more information about blacklight-collector, please read our methodology.

blacklight-collector is available on npm. You can add it to your own project with the following command.

npm i @themarkup/blacklight-collector

If you are interested in running it locally you can clone this repository and follow the instructions below.

Build

nvm use

npm install

npm run build

Usage

npm run example

Results are stored in demo-dir by default.

Collector configuration

collect takes the following arguments:

  • inUrl required
    • The URL you want to scrape
  • outDir
    • default: saves to tmp directory and deletes after completion
    • To save the full report provide a directory path
  • blTests
    • Array of tests to run
    • default: All
      • "behaviour_event_listeners"
      • "canvas_fingerprinters"
      • "canvas_font_fingerprinters"
      • "cookies"
      • "fb_pixel_events"
      • "key_logging"
      • "session_recorders"
      • "third_party_trackers"
  • numPages
    • default: 3
    • crawl depth
  • headless
    • Boolean flag, useful for debugging.
  • emulateDevice
    • Puppeteer makes device emulation pretty easy. Choose from this list
  • captureHar
    • default: true
    • Boolean flag to save the HTTP requests to a file in HAR (HTTP Archive) format.
    • Note: You will need to provide a path to outDir if you want to see the captured file.
  • captureLinks
    • default: false
    • Save first and third party links from the pages
  • enableAdBlock
    • default: false
  • clearCache
    • default: true
    • Clear the browser cookies and cache
  • saveBrowsingProfile
    • default: false
    • Lets you optionally save the browsing profile to the outDir
  • quiet
    • default: true
    • Don't pipe raw event data to stdout
  • title
    • default: 'Blacklight Inspection'
  • saveScreenshots
    • default: true
  • headers
    • default: {} (expects { "[HTTP header]": "[value]", ... })
  • defaultTimeout
    • default: 30000
    • Amount of time, in milliseconds, the page will wait to load
  • defaultWaitUntil
  • puppeteerExecutablePath
    • Path to Chromium executable.
    • default: uses bundled puppeteer chromium
  • extraChromiumArgs
    • Extra flags to pass to Chromium executable
    • default: []
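The documented defaults above can be gathered into a single options object. The following is a minimal sketch, not the package's own code: the `CollectorOptions` interface, the `DOCUMENTED_DEFAULTS` constant, and the `buildOptions` helper are hypothetical names introduced for illustration, and the field types are inferred from the list above rather than from the package's type definitions.

```typescript
// Hypothetical option shape inferred from the documented argument list.
// The package's real type definitions may differ.
interface CollectorOptions {
  inUrl: string;
  outDir?: string;
  blTests?: string[];
  numPages?: number;
  captureHar?: boolean;
  captureLinks?: boolean;
  enableAdBlock?: boolean;
  clearCache?: boolean;
  saveBrowsingProfile?: boolean;
  quiet?: boolean;
  title?: string;
  saveScreenshots?: boolean;
  headers?: Record<string, string>;
  defaultTimeout?: number;
  extraChromiumArgs?: string[];
}

// Only the defaults that the list above states explicitly.
const DOCUMENTED_DEFAULTS: Omit<CollectorOptions, "inUrl"> = {
  numPages: 3,
  captureHar: true,
  captureLinks: false,
  enableAdBlock: false,
  clearCache: true,
  saveBrowsingProfile: false,
  quiet: true,
  title: "Blacklight Inspection",
  saveScreenshots: true,
  headers: {},
  defaultTimeout: 30000,
  extraChromiumArgs: [],
};

// Merge caller overrides on top of the documented defaults.
function buildOptions(overrides: CollectorOptions): CollectorOptions {
  if (!overrides.inUrl) {
    throw new Error("inUrl is required");
  }
  return { ...DOCUMENTED_DEFAULTS, ...overrides };
}

// Example: a one-page inspection whose report is kept in demo-dir.
const exampleOptions = buildOptions({
  inUrl: "https://example.com",
  outDir: "demo-dir",
  numPages: 1,
});
console.log(exampleOptions.numPages, exampleOptions.captureHar);
```

An object like `exampleOptions` is what you would hand to `collect`. The exact call shape is not shown here, so check the example script in the repository before relying on it.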

Inspection Result

blacklight-collector creates several assets at the end of an inspection. These include:

  • browser-cookies.json
    • JSON file containing a list of all the cookies set on that website.
  • inspection-log.ndjson
    • This file contains all the raw events that are recorded during the inspection which are used for analysis.
  • inspection.json
    • Final inspection report that includes the following keys:
      • browser: Details of the browser version used.
      • browsing_history: List of pages that were visited.
      • config: Inspection configuration.
      • deviceEmulated: Information about the device that was emulated for this inspection.
      • end_time: When the inspection ended.
      • host: The hostname of the visited website.
      • hosts: A list of first-party and third-party hosts visited on this inspection.
      • reports: The initial results of the tests Blacklight runs. For more information please read the methodology.
      • script: Details about the Node.js version, host and this package version.
      • start_time: When the inspection began.
      • uri_ins: The URL that was entered by the user.
      • uri_dest: The final URL that was visited after any redirects.
      • uri_redirects: The redirect chain.
  • n.html
    • Nth inspected page's HTML source.
  • n.jpeg
    • Nth inspected page's screenshot.
  • requests.har
    • HTTP archive of all the network requests.
    • TIP: Firefox lets you import a HAR file and visualize it using the network tab in the developer tools.
    • You can also view it here.
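Two of these assets are easy to post-process yourself. The sketch below shows one possible way to parse the contents of inspection-log.ndjson (one JSON value per line) and to count requests per host from requests.har. The helper names `parseNdjson` and `countRequestsByHost` and the `HarLike` type are hypothetical. The event fields inside the log are not documented here, so events are left as `unknown`. The HAR shape used (`log.entries[].request.url`) follows the general HAR format and is not specific to this package.

```typescript
// Parse newline-delimited JSON: one JSON value per non-empty line.
function parseNdjson(text: string): unknown[] {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

// Minimal subset of the HAR structure needed for this example.
interface HarLike {
  log: { entries: { request: { url: string } }[] };
}

// Count how many requests in a HAR archive went to each hostname.
function countRequestsByHost(har: HarLike): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of har.log.entries) {
    const host = new URL(entry.request.url).hostname;
    counts.set(host, (counts.get(host) ?? 0) + 1);
  }
  return counts;
}

// Example with inline data standing in for the files in outDir.
const sampleEvents = parseNdjson('{"type":"a"}\n{"type":"b"}\n');
const sampleCounts = countRequestsByHost({
  log: {
    entries: [
      { request: { url: "https://example.com/" } },
      { request: { url: "https://cdn.example.net/app.js" } },
      { request: { url: "https://example.com/about" } },
    ],
  },
});
console.log(sampleEvents.length, sampleCounts.get("example.com"));
```

In practice you would read both files from the directory passed as outDir (for example with Node's `fs.readFileSync`) and pass their contents to these helpers.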

Blacklight would not be possible without the work of OpenWPM and the EU-EDPS's website evidence collector.