@tokenizer/s3

v1.0.0

Amazon S3 tokenizer

Downloads

20,924

@tokenizer/s3

The @tokenizer/s3 module integrates with Amazon Web Services (AWS) S3, allowing you to read and tokenize data from S3 objects in a streaming fashion. It extends the strtok3 tokenizer with support for chunked S3 data access.

Features

  • Streaming Support: Efficiently read and tokenize data from Amazon S3 objects using streaming, which is ideal for handling large files without loading them entirely into memory.

  • Integration with strtok3: Works seamlessly with the strtok3 tokenizer to process S3 data streams, making it easy to handle various tokenization tasks.

  • Flexible Access: Provides options to configure S3 access, allowing for customized tokenization workflows based on your specific needs.

  • Promise-Based API: Utilizes a promise-based API for easy integration into modern asynchronous workflows.

Installation

npm install @tokenizer/s3

Sponsor

If you appreciate my work and want to support the development of open-source projects like music-metadata, file-type, and listFix(), consider becoming a sponsor or making a small contribution. Your support helps sustain ongoing development and improvements. Become a sponsor to Borewit

API Documentation

makeChunkedTokenizerFromS3

Initialize a tokenizer, with the option for random access, from an Amazon S3 client for use in extracting metadata from media files.

Function Signature

function makeChunkedTokenizerFromS3(s3: S3Client, objRequest: GetObjectRequest, options?: IS3Options): Promise<IRandomAccessTokenizer>

Reads the S3 object in chunks, which allows random access.

Parameters

  • s3 (S3Client):

    The S3 client used to make requests to Amazon S3.

    Note: To configure AWS client authentication, see Configuration and credential file settings.

  • objRequest (GetObjectRequest):

    The S3 object request containing details about the S3 object to fetch. This includes properties like the bucket name and object key.

  • options (IS3Options, optional):

Returns

  • Promise<IRandomAccessTokenizer>:

    A Promise that resolves to an instance of IRandomAccessTokenizer. This tokenizer can be used to extract metadata from the specified media file in the S3 object. It supports random access reads.
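As a rough illustration of what chunked, random-access reading means for an S3 object, the sketch below translates a read at a given position and length into the inclusive byte range carried by the `Range` field of an S3 GetObject request. This is a minimal sketch under stated assumptions, not the module's actual implementation: `RangedObjectRequest`, `withByteRange`, and the bucket and key values are hypothetical names introduced here.

```typescript
// Hypothetical stand-in for the subset of an S3 GetObject request used here.
interface RangedObjectRequest {
  Bucket: string;
  Key: string;
  Range?: string; // HTTP Range header value, e.g. "bytes=0-3"
}

// Hypothetical helper: request `length` bytes starting at byte offset `position`.
// HTTP byte ranges are inclusive on both ends, hence the `- 1`.
function withByteRange(
  objRequest: RangedObjectRequest,
  position: number,
  length: number
): RangedObjectRequest {
  if (!Number.isInteger(position) || position < 0) {
    throw new RangeError('position must be a non-negative integer');
  }
  if (!Number.isInteger(length) || length <= 0) {
    throw new RangeError('length must be a positive integer');
  }
  const lastByte = position + length - 1;
  return { ...objRequest, Range: `bytes=${position}-${lastByte}` };
}

// Ask for the first four bytes of a (hypothetical) object.
const rangedRequest = withByteRange({ Bucket: 'my-bucket', Key: 'video.mp4' }, 0, 4);
console.log(rangedRequest.Range); // bytes=0-3
```

Because each read maps to its own byte range, a parser can jump to any offset without downloading the bytes in between.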

makeStreamingTokenizerFromS3

Initialize a tokenizer from an Amazon S3 client for use in extracting metadata from media files.

Function Signature

function makeStreamingTokenizerFromS3(s3: S3Client, objRequest: GetObjectRequest): Promise<ITokenizer>

Reads the S3 object as a stream.

Parameters

  • s3 (S3Client):

    The S3 client used to make requests to Amazon S3.

    Note: To configure AWS client authentication, see Configuration and credential file settings.

  • objRequest (GetObjectRequest):

    The S3 object request containing details about the S3 object to fetch. This includes properties like the bucket name and object key.

Returns

  • Promise<ITokenizer>:

    A Promise that resolves to an instance of ITokenizer. This tokenizer can be used to extract metadata from the specified media file in the S3 object.
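To illustrate how a streaming tokenizer differs from a random-access one, the sketch below models a forward-only source: bytes can be read in order or skipped, but never revisited. `ForwardOnlyReader` is a hypothetical in-memory stand-in written for this illustration; it is not part of @tokenizer/s3 or strtok3.

```typescript
// Hypothetical in-memory stand-in for a forward-only (streaming) source.
class ForwardOnlyReader {
  private data: Uint8Array;
  private offset = 0;

  constructor(data: Uint8Array) {
    this.data = data;
  }

  // Current read position, in bytes from the start of the data.
  get position(): number {
    return this.offset;
  }

  // Read up to `length` bytes at `position` (defaults to the current position).
  // Moving forward discards the bytes in between; moving backward is impossible.
  read(length: number, position: number = this.offset): Uint8Array {
    if (position < this.offset) {
      throw new Error('Cannot seek backward in a forward-only stream');
    }
    const end = Math.min(position + length, this.data.length);
    const chunk = this.data.subarray(position, end);
    this.offset = end;
    return chunk;
  }
}

const reader = new ForwardOnlyReader(Uint8Array.from([1, 2, 3, 4, 5, 6, 7, 8]));
console.log(Array.from(reader.read(2)));    // the first two bytes: 1 and 2
console.log(Array.from(reader.read(2, 4))); // skips ahead to offset 4: 5 and 6
// reader.read(2, 0) would now throw: offsets 0 to 5 have already been consumed.
```

When a parser needs to jump around inside a file, for example to read an index stored at the end, the chunked variant with random access is the better fit.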

Compatibility

As of version 0.3.0, this module has migrated from CommonJS to a pure ECMAScript Module (ESM). The distributed JavaScript codebase is compliant with the ECMAScript 2020 (11th Edition) standard.

This module requires Node.js ≥ 16. It can also be used in a browser environment when bundled with a module bundler.

For backward compatibility with TypeScript CommonJS projects, you can use load-esm.

Examples

Determine S3 file type

Determine the file type (based on its content) of a file stored in Amazon S3:

import { fileTypeFromTokenizer } from 'file-type';
import { fromEnv } from '@aws-sdk/credential-providers';
import { S3Client } from '@aws-sdk/client-s3';
import { makeChunkedTokenizerFromS3 } from '@tokenizer/s3';

(async () => {

  // Initialize S3 client
  const s3 = new S3Client({
    region: 'eu-west-2',
    credentials: fromEnv(),
  });

  // Initialize S3 tokenizer
  const s3Tokenizer = await makeChunkedTokenizerFromS3(s3, {
    Bucket: 'affectlab',
    Key: '1min_35sec.mp4'
  });

  // Figure out what kind of file it is
  const fileType = await fileTypeFromTokenizer(s3Tokenizer);
  console.log(fileType);
})();

See also example at file-type.

Reading audio metadata from Amazon S3

Retrieve audio metadata from an S3 object using music-metadata:

import { makeChunkedTokenizerFromS3 } from '@tokenizer/s3';
import { S3Client } from '@aws-sdk/client-s3';
import { parseFromTokenizer } from 'music-metadata';

/**
 * Retrieve metadata from Amazon S3 object
 * @param s3 S3 client
 * @param objRequest S3 object request
 * @param options `music-metadata` parsing options (passed to parseFromTokenizer)
 * @return Metadata
 */
async function parseS3Object(s3, objRequest, options) {
  const s3Tokenizer = await makeChunkedTokenizerFromS3(s3, objRequest);
  return parseFromTokenizer(s3Tokenizer, options);
}

(async () => {
  const s3 = new S3Client({});

  const metadata = await parseS3Object(s3, {
    Bucket: 'standing0media',
    Key: '01 Where The Highway Takes Me.mp3'
  });

  console.log(metadata);
})();

A module implementation of this example can be found in @music-metadata/s3.
