@arbs.io/asset-extractor-wasm

v0.1.2

This npm package offers a straightforward method to extract text content from various binary and text file formats. The package comes with a pre-built configuration that works out-of-the-box, requiring no additional setup. It is designed for use in Browse

Downloads

72

asset-extractor-wasm

Caution: This package is currently in development and should be treated as a preview release (pre-v1.0)

Welcome to @arbs.io/asset-extractor-wasm, a powerful npm package that provides a straightforward method to extract content from a wide range of binary and text file formats. This package is pre-configured to work seamlessly, requiring no additional setup. It is designed to be compatible with both Browsers and Node.js environments, including Visual Studio Code extensions, making it a versatile tool for your development needs.

Features

Supported File Types

The current version of the package supports content extraction from an extensive list of MIME types, including but not limited to:

| Text | Media | Extension | Mimetype |
| ---- | ----- | --------- | ------------------------------------------------------------------------- |
| ✅ | ⚫ | txt | text/plain |
| ✅ | ✅ | docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| ✅ | ✅ | pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation |
| 🔲 | 🔲 | xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
| ✅ | ✅ | odp | application/vnd.oasis.opendocument.presentation |
| ✅ | ✅ | ods | application/vnd.oasis.opendocument.spreadsheet |
| ✅ | ✅ | odt | application/vnd.oasis.opendocument.text |
| ✅ | 🔲 | xml | text/xml |
| ✅ | 🔲 | pdf | application/pdf |
| ✅ | 🔲 | html | text/html |
| ✅ | 🔲 | epub | application/epub+zip |
| ✅ | 🔲 | mobi | application/x-mobipocket-ebook |

Legend

  • ✅: Completed
  • 🔲: Coming soon
  • ⚫: Not applicable

  • Text: Extract Text
  • Media: Extract Image/Video/Audio
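
The table above can also be expressed as a small lookup, which is handy for rejecting unsupported files before parsing. This is only a sketch transcribed from the table: `FORMATS`, `canExtractText`, and `canExtractMedia` are hypothetical names and are not exported by the package.

```typescript
// Status values mirror the legend: done = ✅, planned = 🔲, na = ⚫.
type Status = 'done' | 'planned' | 'na'

interface FormatSupport {
  mimetype: string
  text: Status
  media: Status
}

// Hypothetical lookup transcribed from the README table; not part of the package API.
const FORMATS = new Map<string, FormatSupport>([
  ['txt', { mimetype: 'text/plain', text: 'done', media: 'na' }],
  ['docx', { mimetype: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', text: 'done', media: 'done' }],
  ['pptx', { mimetype: 'application/vnd.openxmlformats-officedocument.presentationml.presentation', text: 'done', media: 'done' }],
  ['xlsx', { mimetype: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet', text: 'planned', media: 'planned' }],
  ['odp', { mimetype: 'application/vnd.oasis.opendocument.presentation', text: 'done', media: 'done' }],
  ['ods', { mimetype: 'application/vnd.oasis.opendocument.spreadsheet', text: 'done', media: 'done' }],
  ['odt', { mimetype: 'application/vnd.oasis.opendocument.text', text: 'done', media: 'done' }],
  ['xml', { mimetype: 'text/xml', text: 'done', media: 'planned' }],
  ['pdf', { mimetype: 'application/pdf', text: 'done', media: 'planned' }],
  ['html', { mimetype: 'text/html', text: 'done', media: 'planned' }],
  ['epub', { mimetype: 'application/epub+zip', text: 'done', media: 'planned' }],
  ['mobi', { mimetype: 'application/x-mobipocket-ebook', text: 'done', media: 'planned' }],
])

// Accepts "pdf", ".pdf", or ".PDF" and returns the table key.
const normalizeExtension = (extension: string): string =>
  extension.trim().replace(/^\./, '').toLowerCase()

// True only when the table marks text extraction as completed.
const canExtractText = (extension: string): boolean =>
  FORMATS.get(normalizeExtension(extension))?.text === 'done'

// True only when the table marks media extraction as completed.
const canExtractMedia = (extension: string): boolean =>
  FORMATS.get(normalizeExtension(extension))?.media === 'done'
```

Because the package is pre-v1.0, a table like this would need updating whenever the supported formats change.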

Requesting Additional File Support

We are always looking to expand the capabilities of @arbs.io/asset-extractor-wasm. If you need support for additional file formats, please submit an enhancement issue on the project's repository. We value your feedback and contributions as they help us improve this package for the broader developer community.

Installation

To install the package, use the following npm command:

npm install @arbs.io/asset-extractor-wasm

This command will add the package to your project's dependencies.

Usage

Here's an example of how to extract text from a buffer. If the file type is binary, the mime-type is verified using file-type.

import * as fs from 'fs'
import { createDocumentParser } from '@arbs.io/asset-extractor-wasm'

export const documentParserExample = () => {
  // Read the file from disk and hand the raw bytes to the parser.
  const buf = fs.readFileSync('./data_source/microservices.docx')
  const documentParser = createDocumentParser(new Uint8Array(buf))

  // Optional chaining guards against a missing parser or missing contents.
  console.log(`mimetype: (${documentParser?.mimetype})`)
  console.log(`extension: (${documentParser?.extension})`)
  console.log(`content [text/plain]: (${documentParser?.contents?.text ?? ''})`)
}

This example demonstrates how to read a file, convert it to a Uint8Array, and then extract the assets.
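
The same flow can be wrapped in a small helper that returns a plain result instead of logging. The sketch below is an illustration, not part of the package: `extractText`, `ParseFn`, and `TextResult` are hypothetical names, and the interfaces are local copies of the shapes documented in the API section. The parser is passed in as an argument, on the assumption (suggested by the optional chaining in the example) that `createDocumentParser` may return nothing for input it cannot handle.

```typescript
// Local copies of the shapes documented in the API section.
interface ContentData {
  identity: string
  mimetype: string
  data: string
}

interface ParserContent {
  text: string | null
  media: ContentData[] | null
}

interface DocumentParser {
  mimetype: string
  extension: string
  contents: ParserContent | null
}

// Assumed signature of the parser factory; injected so the helper is easy to test.
type ParseFn = (bytes: Uint8Array) => DocumentParser | null | undefined

interface TextResult {
  mimetype: string
  extension: string
  text: string
}

// Returns null when the parser yields nothing, and an empty string
// when a parser is returned but carries no text.
function extractText(bytes: Uint8Array, parse: ParseFn): TextResult | null {
  const parser = parse(bytes)
  if (!parser) return null
  return {
    mimetype: parser.mimetype,
    extension: parser.extension,
    text: parser.contents?.text ?? '',
  }
}
```

In a real project the second argument would presumably be `createDocumentParser` itself, for example `extractText(new Uint8Array(buf), createDocumentParser)`, provided its return type matches the assumed `ParseFn`.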

API

DocumentParser

The DocumentParser object provides the following properties:

  • mimetype: The mime-type of the buffer determined by the binary signature.
  • extension: The (file) extension of the buffer determined by the binary signature.
  • contents: A ParserContent object holding the content extracted from the buffer (text, images, ...). May be null.
interface DocumentParser {
  mimetype: string
  extension: string
  contents: ParserContent | null
}

ParserContent

  • text: Text content of the buffer. There is only ever a single text content for each buffer.
  • media: Array of all media assets embedded within the buffer (images, audio, video, ...).
interface ParserContent {
  text: string | null
  media: ContentData[] | null
}
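
Since both `text` and `media` may be null, callers usually want to normalize them before use. A minimal sketch, assuming only the interface shapes shown in this README; `summarizeContent` and `ContentSummary` are hypothetical names, not package exports.

```typescript
// Local copies of the shapes documented in this README.
interface ContentData {
  identity: string
  mimetype: string
  data: string
}

interface ParserContent {
  text: string | null
  media: ContentData[] | null
}

interface ContentSummary {
  hasText: boolean
  textLength: number
  mediaCount: number
}

// Treats a null content object, null text, and null media as "nothing extracted".
function summarizeContent(content: ParserContent | null): ContentSummary {
  const text = content?.text ?? ''
  const media = content?.media ?? []
  return {
    hasText: text.length > 0,
    textLength: text.length,
    mediaCount: media.length,
  }
}
```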

ContentData

  • identity: The identity of the binary embedded object. For example: image1.png
  • mimetype: The mime-type is set to the format of the data sent to the function. For example: image/png
  • data: The raw binary data of the embedded object, encoded as a base64 string
interface ContentData {
  identity: string
  mimetype: string
  data: string
}
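
To use an embedded asset, the base64 `data` string has to be decoded or wrapped in a data URL. The sketch below assumes `data` is standard base64 with no `data:` prefix, which this README does not state explicitly. `decodeMedia` and `toDataUrl` are hypothetical helpers, not package exports; `atob` is available globally in browsers and in recent Node.js versions.

```typescript
// Local copy of the shape documented above.
interface ContentData {
  identity: string
  mimetype: string
  data: string
}

// Decode the base64 payload into raw bytes (assumes standard base64, no "data:" prefix).
function decodeMedia(item: ContentData): Uint8Array {
  const binary = atob(item.data)
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i)
  }
  return bytes
}

// Build a data URL suitable for an <img src> or similar, reusing the base64 payload as-is.
function toDataUrl(item: ContentData): string {
  return `data:${item.mimetype};base64,${item.data}`
}
```

In Node.js the decoded bytes could then be written to disk with `fs.writeFileSync(item.identity, decodeMedia(item))`, using `identity` as the file name.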

We hope you find @arbs.io/asset-extractor-wasm useful for your projects. If you have any questions, issues, or suggestions, please feel free to open an issue on our GitHub repository. We appreciate your support and are committed to making this package even better for the developer community.