milkid
v2.0.7
English | [中文](./README_ZH.md)
Milkid is a highly customizable distributed unique ID generator written in TypeScript.
Preface
JavaScript has many excellent distributed unique ID generators, but each targets a specific scenario and none can be customized freely.
For database primary keys, we need IDs that are sorted in lexicographical order to avoid fragmentation in the database.
For some distributed systems, we need IDs to be as evenly distributed as possible to avoid hotspots.
For news websites, we want the IDs in article URLs to be completely random so that crawlers cannot enumerate them.
For some short URL functions, we need the IDs in the URLs to be as short as possible.
Milkid can be customized to meet the requirements for IDs in each of these scenarios.
Moreover, the encoding table of Milkid is safe in common scenarios. You can even use it in URLs or in the HTML `class` attribute: a class name used in a CSS selector cannot begin with an unescaped digit, and the first character of each Milkid segment is always a letter.
Installation
```bash
npm i milkid
```
Usage
```typescript
import { defineIdGenerator } from "milkid";

const idGenerator = defineIdGenerator({
  length: 24,
  hyphen: false,
  fingerprint: false,
  timestamp: true,
  sequential: true,
  magicNumber: 12345678,
});

console.log(idGenerator.createId()); // e.g. AWdM7nLAX5XA5DJAfRD8TjtY
```
Timestamp
When generated IDs are used as database primary keys, it is better for the keys to be increasing. An unordered ID generator such as UUID v4 causes the leaf nodes of the database index to split and merge frequently.
Placing a millisecond-precision timestamp at the beginning of the ID is therefore the key to keeping IDs as ordered as possible. The timestamp also reduces the probability of ID collisions: two IDs can only collide if they are generated within the same millisecond.
To enable the timestamp, set the `timestamp` option to `true`.
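To illustrate why a timestamp prefix makes IDs sort in creation order, here is a minimal sketch of a fixed-width base-62 encoder. It is not Milkid's implementation: the alphabet order, the 7-character width, and the name `encodeTimestamp` are assumptions chosen so that plain string comparison agrees with numeric comparison.

```typescript
// Alphabet in ASCII order, so comparing encoded strings matches comparing numbers.
// Assumption: Milkid's real alphabet order and prefix encoding may differ.
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Encode a non-negative integer as a fixed-width base-62 string, left-padded with "0".
function encodeTimestamp(ms: number, width: number = 7): string {
  let value = Math.floor(ms);
  let out = "";
  for (let i = 0; i < width; i++) {
    out = ALPHABET.charAt(value % 62) + out;
    value = Math.floor(value / 62);
  }
  return out;
}

const earlier = encodeTimestamp(1_700_000_000_000);
const later = encodeTimestamp(1_700_000_000_001);
console.log(earlier < later); // true
```

Because every encoded timestamp has the same width, an ID created in a later millisecond always compares greater than one created earlier, whatever random characters follow the prefix.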
Monotonically Increasing
We cannot guarantee ordering as strictly as an auto-increment database primary key does, but we can get very close. Enabling the timestamp already orders IDs generated in different milliseconds. Setting the `sequential` option to `true` additionally makes the IDs generated within the same millisecond by the same process increase by 1 each time, which preserves the order of the results as far as possible.
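The idea of increasing within a single millisecond can be sketched with a small per-process counter. This is only an illustration under assumptions: `MonotonicClock` is a hypothetical helper, not part of Milkid's API, and Milkid may apply the increment to a different part of the ID.

```typescript
// Hypothetical helper: returns [timestamp, sequence] pairs that never go
// backwards within one process.
class MonotonicClock {
  private lastMs = -1;
  private sequence = 0;

  // `now` is injectable so the behaviour can be checked deterministically.
  next(now: number = Date.now()): [number, number] {
    if (now > this.lastMs) {
      // A new millisecond: adopt it and restart the counter.
      this.lastMs = now;
      this.sequence = 0;
    } else {
      // Same millisecond (or the clock moved backwards): keep the last
      // timestamp and bump the counter by 1.
      this.sequence += 1;
    }
    return [this.lastMs, this.sequence];
  }
}

const clock = new MonotonicClock();
console.log(clock.next(1000)); // timestamp 1000, sequence 0
console.log(clock.next(1000)); // timestamp 1000, sequence 1
console.log(clock.next(1001)); // timestamp 1001, sequence 0
```

The counter only orders IDs produced by one process; IDs generated by different processes in the same millisecond can still interleave.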
Fingerprint
One annoying thing about UUID and some other ID generation algorithms is that they require users to provide machine IDs. However, for horizontally scalable systems nowadays, the number of running machines is not determined in advance.
When you set the `fingerprint` option to `true`, you can pass a string or a Buffer as a fingerprint when generating an ID. A fingerprint reduces the probability of ID collisions, but it is not mandatory: if the ID is long enough, that alone is sufficient to avoid collisions.
Whatever the content and length of the fingerprint you pass, it is hashed and the hash is used as part of the ID. You can choose whether to pass a fingerprint according to your needs.
Ideally, we can concatenate the following contents to form a fingerprint: user ID, UserAgent, machine ID, process ID, internal network IP address, system startup time, session counter.
Of course, not all the contents need to be used as fingerprints. Some contents may be too closely coupled with the business or may be difficult to obtain in the environment. You can choose which contents to pass according to your needs. And if your ID is long enough, even without a fingerprint, the probability of collision will be low enough.
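```typescript
const idGenerator = defineIdGenerator({
  fingerprint: true,
  timestamp: true,
});

// Illustrative only: `context` and `getLocalIp()` are placeholders, and the browser
// (`navigator`, `sessionStorage`) and Node.js (`process`) values are not all available
// in a single runtime. Concatenate whichever of them your environment provides.
const fingerprint = `${context.USER_ID}-${navigator.userAgent}-${process.env.MACHINE_ID}-${process.pid}-${getLocalIp()}-${process.uptime()}-${sessionStorage.getItem('sessionCounter')}`;
console.log(idGenerator.createId(fingerprint));
```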
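To show how a fingerprint of any length can be reduced to a fixed-size segment, here is a minimal sketch that hashes the string and keeps five base-62 characters. The source does not say which hash Milkid uses, so the 32-bit FNV-1a hash, the alphabet order, and the names `fnv1a` and `fingerprintSegment` are all assumptions made for illustration.

```typescript
const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// 32-bit FNV-1a computed over the string's UTF-16 code units
// (for ASCII input this matches the standard byte-wise FNV-1a).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Reduce a fingerprint of any length to `width` base-62 characters.
function fingerprintSegment(fingerprint: string, width: number = 5): string {
  let value = fnv1a(fingerprint);
  let out = "";
  for (let i = 0; i < width; i++) {
    out = BASE62.charAt(value % 62) + out;
    value = Math.floor(value / 62);
  }
  return out;
}

console.log(fingerprintSegment("user-42|host-a|pid-1234").length); // 5
```

The same input always yields the same segment, so two generators with different fingerprints usually differ in this part of the ID even when they run in the same millisecond.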
Composition
When you enable the timestamp and fingerprint, the ID generated by Milkid consists of the following parts:
Aba3eJC - nY5EC - 2z2SXrxk09j0
Millisecond timestamp (7) | Fingerprint (5) | Random Bits (12)
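The layout above can be made concrete with a small parser. This is a sketch under the stated widths (7, 5, and 12 characters for a 24-character ID with both timestamp and fingerprint enabled); `splitId` and `IdParts` are hypothetical names, not part of Milkid.

```typescript
interface IdParts {
  timestamp: string;
  fingerprint: string;
  random: string;
}

// Split a 24-character ID (with or without hyphens) into its three segments.
// Assumption: segment widths are 7 / 5 / 12 as shown in the layout above.
function splitId(id: string): IdParts {
  // Hyphens are not part of the 0-9a-zA-Z alphabet, so they can be stripped safely.
  const compact = id.replace(/-/g, "");
  if (compact.length !== 24) {
    throw new Error(`expected 24 characters, got ${compact.length}`);
  }
  return {
    timestamp: compact.slice(0, 7),
    fingerprint: compact.slice(7, 12),
    random: compact.slice(12),
  };
}

const parts = splitId("Aba3eJC-nY5EC-2z2SXrxk09j0");
console.log(parts.timestamp, parts.fingerprint, parts.random);
// Aba3eJC nY5EC 2z2SXrxk09j0
```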
The encoding table of Milkid consists of `0-9a-zA-Z`. Each segment of a Milkid is safe in common scenarios. You can even use it in URLs or in the HTML `class` attribute: a class name used in a CSS selector cannot begin with an unescaped digit, and the first character of each Milkid segment is always a letter.
Collision Probability
By default, the length of the ID generated by Milkid is `24`. With the timestamp enabled, 243 trillion IDs need to be generated within the same millisecond to have a 1% probability of at least one collision occurring.
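The figure above can be checked with the birthday approximation $n \approx \sqrt{2N\ln\frac{1}{1-p}}$, where $N$ is the number of possible random suffixes. The sketch below assumes that a 24-character ID minus a 7-character timestamp leaves 17 characters drawn uniformly from the 62-symbol alphabet, with no fingerprint; `idsForCollisionProbability` is a hypothetical helper name.

```typescript
// Approximate number of IDs that must be drawn uniformly at random from a space of
// alphabetSize ** randomChars values before a collision occurs with probability p.
function idsForCollisionProbability(
  alphabetSize: number,
  randomChars: number,
  p: number,
): number {
  const space = Math.pow(alphabetSize, randomChars);
  return Math.sqrt(2 * space * Math.log(1 / (1 - p)));
}

// Assumption: 24 characters minus a 7-character timestamp leaves 17 random characters.
const n = idsForCollisionProbability(62, 17, 0.01);
console.log(n.toExponential(2)); // 2.44e+14, roughly 243 trillion
```

Under these assumptions the result agrees with the quoted 243 trillion. Enabling the fingerprint changes how many characters remain for the random part, so the number would differ in that configuration.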
Options
Option | Default Value | Description
---|---|---
`length` | `24` | The length of the generated ID.
`timestamp` | - | Whether to use the timestamp as the beginning of the ID, which can effectively avoid fragmentation in the database.
`hyphen` | `false` | Whether to use hyphens to separate the various parts of the ID.
`fingerprint` | `false` | Whether to use a fingerprint as part of the ID. After enabling it, a fingerprint needs to be passed when generating an ID.
`sequential` | `true` | Whether IDs generated by the same JavaScript process within the same millisecond increase by 1 each time. This keeps them ordered, which is very important for the database.
Use Cases
We have listed several scenarios and provided recommended configurations for them. Note that the fingerprint function is not enabled in the following configurations.
Database Primary Keys
For database primary keys, we enable the `timestamp` and `sequential` options to ensure that the generated IDs are as ordered as possible, which can effectively avoid fragmentation in the database.
```typescript
const idGenerator = defineIdGenerator({
  length: 24,
  timestamp: true,
  sequential: true,
});
```
Distributed Systems
In distributed systems, we disable the `timestamp` and `sequential` options to ensure that the generated IDs are as evenly distributed as possible, which can avoid hotspots.
```typescript
const idGenerator = defineIdGenerator({
  length: 24,
  timestamp: false,
  sequential: false,
});
```
Short URLs
In short URLs, we need the IDs to be as short as possible.
```typescript
const idGenerator = defineIdGenerator({
  length: 6,
  timestamp: false,
  sequential: false,
});
```
Other Languages
Python - kawaiior
See Also
nanoid - A small, fast unique string ID generator.
ulid - A time-ordered unique string ID generator.
cuid2 - A unique ID generator that takes more security into consideration.