syphonx-core

v1.2.68

Downloads

407

Readme

What is SyphonX?

SyphonX is a template-driven solution for extracting data from HTML in a highly efficient way. It combines the power of jQuery, Regular Expressions, and JavaScript into a declarative template-driven format that extracts and reshapes HTML data into JSON.
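To make the idea of a declarative template concrete, here is a minimal sketch of template-driven extraction. The template shape below (`pattern`, `type`, `repeated`) and the `extract` function are hypothetical and exist only for illustration; they are not SyphonX's template schema or API. The sketch also uses Regular Expressions alone to stay dependency-free, whereas SyphonX templates combine them with jQuery selectors.

```javascript
// Hypothetical template shape for illustration only; NOT the SyphonX schema.
// Each key names an output field and describes how to find its value.
const template = {
  title: { pattern: /<h1[^>]*>([^<]*)<\/h1>/i },
  price: { pattern: /class="price"[^>]*>\s*\$([\d.]+)/i, type: "number" },
  tags: { pattern: /<li class="tag">([^<]*)<\/li>/gi, repeated: true },
};

// Walk the template and build a JSON-ready object from the HTML string.
function extract(html, template) {
  const result = {};
  for (const [key, rule] of Object.entries(template)) {
    if (rule.repeated) {
      // Repeated fields collect every match (the pattern must be global).
      result[key] = Array.from(html.matchAll(rule.pattern), (m) => m[1].trim());
    } else {
      const match = rule.pattern.exec(html);
      const raw = match ? match[1].trim() : null;
      result[key] = raw !== null && rule.type === "number" ? Number(raw) : raw;
    }
  }
  return result;
}

const sampleHtml = `
  <h1>Blue Widget</h1>
  <span class="price">$19.99</span>
  <ul><li class="tag">tools</li><li class="tag">sale</li></ul>
`;

const extracted = extract(sampleHtml, template);
console.log(JSON.stringify(extracted));
// {"title":"Blue Widget","price":19.99,"tags":["tools","sale"]}
```

The point of the sketch is the split of responsibilities: the template is plain data that describes what to extract, and one generic function does the work, so changing what gets extracted means editing data rather than code.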

Simplified Web Scraping

SyphonX revolutionizes web crawling and data extraction by providing a no-code, template-driven approach that simplifies the process for data engineers, analysts, and web developers alike. Whether you're extracting pricing and stock status from retailer sites, sourcing contact information from professional firms, or gathering event schedules, SyphonX streamlines the task without the need to write complex code. Users can effortlessly create templates in JSON format through a user-friendly GUI, making data extraction accessible to all skill levels. By leveraging CSS selectors, jQuery, Regular Expressions, and JavaScript, templates are highly customizable to meet diverse web scraping needs. SyphonX transforms the complexity of web data extraction into a straightforward, efficient process, saving time and resources while maximizing data accuracy and reliability.

Unparalleled Integration Simplicity

Beyond its user-friendly interface, SyphonX distinguishes itself with a pioneering inside-out architecture--designed as a single, zero-dependency JavaScript function. This groundbreaking approach allows SyphonX to operate from within the browser, pushing data outward, in contrast to conventional 'outside-in' solutions that impose restrictive architectures. Such an unopinionated design ensures seamless integration into any existing web crawling framework--just add SyphonX into the last mile of your architecture where the web browser automation sits. Whether deployed as a standalone tool to transform HTML into structured JSON or injected into browser-based automation tools like Playwright, Puppeteer, or directly within a web browser's developer console, SyphonX adapts effortlessly to enhance your data extraction processes without disrupting your established workflows.
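The practical consequence of a single, zero-dependency function is that its source text can be shipped into another JavaScript context and run there. The sketch below illustrates that property with a hypothetical `extractLinks` function (not part of SyphonX). It simulates injection with `new Function`; with Playwright or Puppeteer, `page.evaluate` would typically play that role, and the function would usually query the live DOM rather than an HTML string.

```javascript
// A self-contained extractor: it references nothing outside its own body,
// so its source text alone is enough to recreate it elsewhere.
// Hypothetical example function, not part of SyphonX.
function extractLinks(html) {
  const links = [];
  const pattern = /<a\s[^>]*href="([^"]*)"[^>]*>([^<]*)<\/a>/gi;
  for (const m of html.matchAll(pattern)) {
    links.push({ href: m[1], text: m[2].trim() });
  }
  return links;
}

// Simulate injection: serialize the function to text, then rebuild it from
// that text alone, much as an automation tool would send it to a page.
const source = extractLinks.toString();
const injected = new Function(`return (${source});`)();

const pageHtml = '<p><a href="/a">First</a> and <a class="x" href="/b"> Second </a></p>';
const links = injected(pageHtml);
console.log(JSON.stringify(links));
// [{"href":"/a","text":"First"},{"href":"/b","text":"Second"}]
```

Because the rebuilt function has no access to the original scope, the round trip only works when the function truly has no outside dependencies, which is the property the inside-out design relies on.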

No Glass Ceiling

Other solutions often limit themselves with proprietary, restrictive selector or filtering syntaxes. SyphonX uses jQuery and Regular Expressions without compromise, so it has no such glass ceiling: if you can do it with jQuery and Regular Expressions, you can do it with SyphonX. While jQuery hasn't been at the forefront of front-end development for some time, it continues to stand as the most powerful and comprehensive tool for querying the DOM. Its breadth and flexibility remain unmatched, even after nearly 20 years. Thanks to years of industry and community refinement, jQuery is mature, hardened, and reliable. SyphonX builds on this foundation, so users can tackle even the most complex data extraction tasks without hitting the barriers commonly encountered with other tools.

Dynamic Content? No Problem!

Running in-browser, the capabilities of SyphonX extend beyond mere data extraction. It can simulate user interactions, such as clicking and scrolling, enabling a comprehensive capture of dynamic content. This adaptability makes SyphonX an invaluable asset, offering a scalable and robust solution that complements and enhances existing data extraction architectures. With its non-disruptive, unopinionated nature, SyphonX stands as a testament to flexible, efficient, and effective web scraping technology, perfectly integrating into the 'last mile' of your browser pipeline to maximize data accuracy and reliability.
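As a rough illustration of simulated interaction, the sketch below clicks a "load more" control until it disappears or a click budget runs out. The `clickUntilGone` helper and the `button.load-more` selector are hypothetical and are not part of SyphonX; only `querySelector` and `click` are standard DOM calls. A small fake document stands in for a real page so the sketch runs outside a browser; in a page you would pass `document` instead.

```javascript
// Hypothetical helper, not part of SyphonX: click a control repeatedly
// until it is gone or the click budget is spent, pausing between clicks
// so the page has time to load more content.
async function clickUntilGone(doc, selector, { maxClicks = 10, pauseMs = 0 } = {}) {
  let clicks = 0;
  while (clicks < maxClicks) {
    const button = doc.querySelector(selector);
    if (!button) break; // control is gone: nothing more to load
    button.click();
    clicks += 1;
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  return clicks;
}

// Stand-in for a real page: the button vanishes once every page is loaded.
function makeFakeDocument(pagesRemaining) {
  return {
    querySelector() {
      if (pagesRemaining === 0) return null;
      return { click() { pagesRemaining -= 1; } };
    },
  };
}

const pendingClicks = clickUntilGone(makeFakeDocument(3), "button.load-more");
pendingClicks.then((clicks) => console.log(`clicked ${clicks} times`));
// clicked 3 times
```

The `maxClicks` budget is a deliberate guard: pages with endless feeds never remove the control, so an unbounded loop would never finish.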

Offline HTML Content Extraction

SyphonX is an HTML parser and web data gathering tool that combines CSS Selectors, jQuery, and Regular Expressions to solve web data extraction problems. When used online, it runs within any web browser and can take control of the browser, clicking and navigating around a site to reach the data it needs. SyphonX can also be used to extract data from offline HTML content.

Extracting from raw offline HTML means stored HTML files can be processed without a browser. This opens up analysis of historical data, offline content, and bulk processing of HTML files, with the same ease and efficiency that SyphonX brings to online data extraction. Whether you're dealing with live web data or sifting through archives of offline HTML, SyphonX provides a single, unified approach to transforming HTML content into structured JSON, which makes it useful well beyond real-time web scraping.