fp16

v0.3.0

Half-precision 16-bit floating point numbers

DataView has APIs for getting and setting float64s and float32s. This library provides the analogous methods for float16s, and utilities for testing how a given float64 value will convert to float32 and float16 values. Conversion implements the IEEE 754 default rounding behavior ("Round-to-Nearest RoundTiesToEven").

NaN is always encoded as 0x7e00, which extends the pattern of how browsers serialize NaN in 32 bits and is the recommendation in the CBOR spec.

This library is TypeScript-native, ESM-only, and has zero dependencies. It works in Node, the browser, and Deno.

Install

npm i fp16

Usage

Set a 16-bit float

declare function setFloat16(
  view: DataView,
  offset: number,
  value: number,
  littleEndian?: boolean,
): void
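
To make the rounding behavior concrete, here is a minimal, self-contained sketch of how a number value could be encoded into 16 bits with round-to-nearest, ties-to-even. It is not fp16's actual implementation: `roundTiesToEven`, `encodeFloat16` and `writeFloat16` are hypothetical names used only for illustration, and the sketch assumes the same big-endian default as the built-in DataView setters.

```typescript
// Hypothetical helpers for illustration; none of these are exported by fp16.

// Round a non-negative number to the nearest integer, breaking ties toward even.
function roundTiesToEven(x: number): number {
  const floor = Math.floor(x)
  const diff = x - floor
  if (diff < 0.5) return floor
  if (diff > 0.5) return floor + 1
  return floor % 2 === 0 ? floor : floor + 1
}

// Convert a number value to its 16-bit half-precision bit pattern.
function encodeFloat16(value: number): number {
  if (Number.isNaN(value)) return 0x7e00 // every NaN becomes the canonical NaN
  const sign = value < 0 || Object.is(value, -0) ? 0x8000 : 0
  const abs = Math.abs(value)
  if (abs === Infinity) return sign | 0x7c00

  // Below 2^-14 the result is zero or subnormal: a multiple of 2^-24.
  // A result of 1024 lands exactly on the smallest normal value (0x0400).
  if (abs < 2 ** -14) return sign | roundTiesToEven(abs * 2 ** 24)

  // 65520 is halfway between the largest finite value (65504) and 2^16,
  // so it and everything above it rounds to infinity.
  if (abs >= 65520) return sign | 0x7c00

  // Find the exponent e such that 2^e <= abs < 2^(e + 1).
  let exponent = -14
  while (abs >= 2 ** (exponent + 1)) exponent++

  // Scale the significand to an integer in [1024, 2048] and round it.
  let mantissa = roundTiesToEven(abs / 2 ** (exponent - 10))
  if (mantissa === 2048) {
    // Rounding carried into the next power of two.
    mantissa = 1024
    exponent++
  }
  return sign | ((exponent + 15) << 10) | (mantissa - 1024)
}

// Write the encoded bits through the built-in 16-bit integer setter.
function writeFloat16(
  view: DataView,
  offset: number,
  value: number,
  littleEndian?: boolean,
): void {
  view.setUint16(offset, encodeFloat16(value), littleEndian)
}
```

In this sketch, `1 + 2 ** -11` sits exactly halfway between two representable values and rounds down to the even bit pattern `0x3c00`, while `1 + 3 * 2 ** -11` rounds up to `0x3c02`.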

Get a 16-bit float

declare function getFloat16(
  view: DataView,
  offset: number,
  littleEndian?: boolean,
): number
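
For readers unfamiliar with the binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits), the following sketch shows one way the decoding direction could work. It is an illustration under stated assumptions rather than the library's code: `decodeFloat16` and `readFloat16` are hypothetical names, and the bits are read with the built-in `getUint16`.

```typescript
// Hypothetical helpers for illustration; none of these are exported by fp16.

// Convert a 16-bit half-precision bit pattern to a number value.
function decodeFloat16(bits: number): number {
  const sign = bits & 0x8000 ? -1 : 1
  const exponent = (bits >> 10) & 0x1f
  const mantissa = bits & 0x03ff

  // All-ones exponent: infinity if the fraction is zero, otherwise NaN.
  if (exponent === 0x1f) return mantissa === 0 ? sign * Infinity : NaN

  // Zero exponent: signed zero or a subnormal, in units of 2^-24.
  if (exponent === 0) return sign * mantissa * 2 ** -24

  // Normal value: (1 + mantissa / 1024) * 2^(exponent - 15).
  return sign * (1024 + mantissa) * 2 ** (exponent - 25)
}

// Read two bytes through the built-in 16-bit integer getter and decode them.
function readFloat16(
  view: DataView,
  offset: number,
  littleEndian?: boolean,
): number {
  return decodeFloat16(view.getUint16(offset, littleEndian))
}
```

Every finite float16 value is exactly representable as a number value, so decoding never needs to round.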

Precision

In addition to methods for getting and setting float16s, fp16 exports two methods for testing how a given number value will convert to 32-bit and 16-bit values.

export const Precision = {
	Exact: 0,
	Inexact: 1,
	Underflow: 2,
	Overflow: 3,
} as const

export type Precision = typeof Precision[keyof typeof Precision]

declare function getFloat32Precision(value: number): Precision
declare function getFloat16Precision(value: number): Precision

  • Precision.Exact: conversion will not lose precision. The value is guaranteed to round-trip back to the same number value. Positive and negative zero, positive and negative infinity, and NaN all return exact. Values that can be represented losslessly as a subnormal value in the target format will return exact.
  • Precision.Overflow: the exponent of the given value is greater than the maximum exponent of the target size (127 for float32 or 15 for float16). Conversion is guaranteed to overflow to +/- Infinity.
  • Precision.Underflow: the exponent of the given value is less than the minimum exponent minus the number of fractional bits of the target size (-126 - 23 for float32 or -14 - 10 for float16). Conversion is guaranteed to underflow to +/- 0 or to the smallest signed subnormal value (+/- 2^-24 for float16 or +/- 2^-149 for float32).
  • Precision.Inexact: the exponent is within the target range, but precision bits will be lost during rounding. The value may round to +/- 0 but will never round to +/- Infinity.

Note that the boundaries for overflow and underflow are not what you might necessarily expect; this is because values with exponents just under the minimum exponent for a format map to subnormal values.
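
The float32 rules above can be sketched with plain comparisons and the built-in `Math.fround`. This is only a reading of the stated exponent bounds, not the library's implementation: `classifyFloat32` is a hypothetical name, the `Precision` object is copied from the definition above so the block stands alone, and values just below 2^128 that would round up to infinity are not given any special treatment here.

```typescript
// Copied from the README so that this sketch is self-contained.
const Precision = {
  Exact: 0,
  Inexact: 1,
  Underflow: 2,
  Overflow: 3,
} as const

type Precision = (typeof Precision)[keyof typeof Precision]

// Hypothetical classifier following the stated float32 bounds.
function classifyFloat32(value: number): Precision {
  // Zeros, infinities and NaN are reported as exact.
  if (Number.isNaN(value) || value === 0 || !Number.isFinite(value)) {
    return Precision.Exact
  }

  const abs = Math.abs(value)

  // Exponent greater than 127 is the same as abs >= 2^128.
  if (abs >= 2 ** 128) return Precision.Overflow

  // Exponent less than -126 - 23 = -149 is the same as abs < 2^-149.
  if (abs < 2 ** -149) return Precision.Underflow

  // Within range: exact only if rounding to float32 changes nothing.
  return Math.fround(value) === value ? Precision.Exact : Precision.Inexact
}
```

Under these assumptions, `2 ** -149` is classified as exact because it is the smallest float32 subnormal, while `2 ** -150` is classified as an underflow even though it is only one exponent step smaller.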

Also note that fp16 treats all NaN values as identical, ignoring sign and signalling bits when decoding, and encoding every NaN value as 0x7e00. This means that not all 16-bit values will round-trip through setFloat16 and getFloat16.
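
To see how many bit patterns this affects, the sketch below models only the NaN behavior described above: any pattern with an all-ones exponent and a non-zero fraction is treated as NaN and re-encoded as 0x7e00. The helper names are hypothetical, and the sketch assumes every non-NaN pattern round-trips unchanged.

```typescript
// Hypothetical helpers for illustration; none of these are exported by fp16.

const CANONICAL_NAN = 0x7e00

// A float16 bit pattern is NaN when the exponent bits are all ones
// and the fraction bits are not all zero. The sign bit is ignored.
function isFloat16NaNBits(bits: number): boolean {
  return (bits & 0x7c00) === 0x7c00 && (bits & 0x03ff) !== 0
}

// Model a decode followed by an encode, at the level of bit patterns.
function roundTripBits(bits: number): number {
  return isFloat16NaNBits(bits) ? CANONICAL_NAN : bits
}

// Count the 16-bit patterns that come back different after a round trip.
function countNonRoundTripping(): number {
  let count = 0
  for (let bits = 0; bits <= 0xffff; bits++) {
    if (roundTripBits(bits) !== bits) count++
  }
  return count
}
```

There are 2 × 1023 = 2046 NaN bit patterns, and only 0x7e00 maps to itself, so under this model `countNonRoundTripping()` returns 2045.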

Testing

Tests use AVA and live in the test directory.

npm run test

Tests cover decoding all 65536 possible 16-bit values, rounding behaviour, subnormal values, underflows, and overflows. More tests are always welcome.

Credits

This PDF was extremely helpful as a reference for understanding the float16 format, even though fp16 doesn't use the table-based approach it outlines.

The Golang github.com/x448/float16 package was used as a reference for implementing rounding. The test suite in tests/32to16.js was adapted from its test file float16_test.go.

Contributing

I don't expect to add any additional features to this library, or change any of the exported interfaces. If you encounter a bug or would like to add more tests, please open an issue to discuss it!

License

MIT © 2021 Joel Gustafson