registry-static
v2.4.0
create a static, flat file based npm registry
registry-static(1) -- a flat-file registry mirror tool
SYNOPSIS
registry-static -d my.registry.com -o /var/www/registry
DESCRIPTION
This module follows the skimdb from npm and creates a flat file copy of the
registry on your local machine. This can then be served up behind nginx to
give you a local, read-only mirror of the npm registry. It's not a private registry,
nor does it attempt to be one. It's designed to allow you to use the registry
in-network to speed up your local development.
USAGE
The most common use is the command shown in the SYNOPSIS. It starts following the registry
and copies the modules and their tarballs into /var/www/registry.
As it does, it replaces the tarball key in each package.json with the URL provided with -d
(so the downloads resolve locally).
This module also uses the sequence file, so you can kill it and it should restart from where it left off.
WARNING: This may take quite a while to complete and requires a large amount of disk space (more than 283GB at last check).
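The tarball rewrite described above can be sketched roughly as follows. This is an illustration, not registry-static's own code: the function name rewriteTarballs is hypothetical, the document shape (versions[...].dist.tarball) is the one the npm registry serves, and the http:// scheme is an assumption.

```javascript
// Sketch only: rewriteTarballs is a hypothetical helper, not part of registry-static.
// `doc` is a package document as served by the npm registry;
// `domain` is the value passed with -d (for example my.registry.com).
function rewriteTarballs(doc, domain) {
  const versions = doc.versions || {};
  for (const version of Object.keys(versions)) {
    const dist = versions[version].dist;
    if (!dist || !dist.tarball) continue; // nothing to rewrite for this version
    const file = dist.tarball.split('/').pop(); // e.g. davargs-0.0.1.tgz
    // Assumption: the mirror is served over plain http.
    dist.tarball = 'http://' + domain + '/' + doc.name + '/-/' + file;
  }
  return doc;
}
```

After the rewrite, a client that reads the mirrored metadata fetches tarballs from the mirror rather than from the upstream registry.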
DIRECTORY STRUCTURE
When it pulls the package.json for a module, it will create a directory structure similar to this:
./davargs
|-- -
|   |-- davargs-0.0.1.tgz
|   +-- davargs-0.0.2.tgz
|-- 0.0.1
|   +-- index.json
|-- 0.0.2
|   +-- index.json
|-- index.json
+-- latest
    +-- index.json
This allows for the following url styles to work:
my.registry.com/davargs
my.registry.com/davargs/0.0.1
my.registry.com/davargs/0.0.2
my.registry.com/davargs/latest
my.registry.com/davargs/-/davargs-0.0.1.tgz
my.registry.com/davargs/-/davargs-0.0.2.tgz
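As a rough illustration of the layout above, the following sketch lists the relative paths one would expect on disk for a package. The function name layoutFor and its arguments are hypothetical; the paths are derived only from the example tree shown here.

```javascript
const path = require('path');

// Sketch only: layoutFor is a hypothetical helper that mirrors the example tree.
// Returns the relative file paths expected for a package and its versions.
function layoutFor(name, versions) {
  const files = [path.posix.join(name, 'index.json')]; // the package's main document
  for (const version of versions) {
    files.push(path.posix.join(name, version, 'index.json'));            // per-version metadata
    files.push(path.posix.join(name, '-', name + '-' + version + '.tgz')); // tarball
  }
  files.push(path.posix.join(name, 'latest', 'index.json')); // the "latest" tag
  return files;
}

console.log(layoutFor('davargs', ['0.0.1', '0.0.2']).join('\n'));
```

Each returned path corresponds to one of the URL styles listed above once nginx maps directories to their index.json.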
NGINX CONFIGURATION
Since we are writing a bunch of index.json files, you need to set up nginx
to front the filesystem so that requests like myregistry.com/foo
resolve to foo/index.json.
Note also that we write 404.json and the top-level index.json
to the '-' directory under the root, in order to not
collide with packages that might have those names.
Here is the simple nginx.config that I use on my local mirror:
server {
    listen 80;
    server_name localhost;
    charset utf-8;
    root /Users/davglass/registry/;
    index index.json;

    #cache the crap out of the tarballs
    location ~* ^.+\.(?:tgz)$ {
        expires 30d;
        tcp_nodelay off;
        open_file_cache max=3000 inactive=120s;
        open_file_cache_valid 45s;
        open_file_cache_min_uses 2;
        open_file_cache_errors off;
    }

    #don't cache the main index
    location /-/index.json {
        expires -1;
    }

    #cache all json by modified time
    location / {
        expires modified +15m;
        try_files $uri $uri/-/index.json $uri/index.json $uri.json =404;
    }

    error_page 404 /-/404.json;
}
The try_files entries that start with $uri are there to keep nginx from issuing a redirect when a request omits the trailing /.
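To make the lookup order concrete, here is a small emulation of that try_files line. It is only a model for reasoning about the config, not something nginx or registry-static runs; the function names and the isFile predicate are hypothetical.

```javascript
// Sketch only: models the order in which the try_files entries are checked.
function tryFilesCandidates(uri) {
  return [uri, uri + '/-/index.json', uri + '/index.json', uri + '.json'];
}

// `isFile` is a hypothetical predicate: true when the path is a regular file under root.
// Returns the first candidate that exists, or null for the =404 fallback.
function resolveUri(uri, isFile) {
  for (const candidate of tryFilesCandidates(uri)) {
    if (isFile(candidate)) return candidate;
  }
  return null;
}

const onDisk = new Set(['/davargs/index.json', '/davargs/0.0.1/index.json']);
console.log(resolveUri('/davargs', (p) => onDisk.has(p)));
```

Because /davargs is a directory rather than a file, the first entry does not match and the lookup falls through to /davargs/index.json without a redirect.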
LOGIC
First, no files are ever deleted. The reference to the tarball may be removed from the local package.json, but
the tarball itself is not removed. This is to ensure that things like npm shrinkwrap continue to work.
Each download is verified against the shasum in the package.json. If the verification fails, the file is retried
up to 4 times. If it fails all of those, it is skipped and not stored locally.
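The verify-and-retry behaviour can be sketched like this. It is not the module's actual implementation: downloadVerified and fetchTarball are hypothetical names, and reading "retried up to 4 times" as one initial attempt plus four retries is an assumption.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-1 of a Buffer, the same format as the shasum in package metadata.
function sha1(buffer) {
  return crypto.createHash('sha1').update(buffer).digest('hex');
}

// Sketch only. `fetchTarball` is a hypothetical async function that resolves to a Buffer.
// Assumption: one initial attempt plus up to `maxRetries` retries.
async function downloadVerified(fetchTarball, expectedShasum, maxRetries = 4) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const body = await fetchTarball();
    if (sha1(body) === expectedShasum) return body; // verified, safe to store
  }
  return null; // every attempt failed verification: skip, store nothing
}
```

A caller would write the tarball to disk only when the result is not null.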
Each change request will process the entire module, not just the change alone. This is to make sure that tags
and new versions are all in sync.
HOOKS
If you provide --hooks <path>, the module at path will be required. It is expected to export an object
whose properties are hook functions. A hook function has the following signature:
function(data, callback){ /* ... */ }
data is a blob of data corresponding to the current state. Usually it's a set of useful metadata about
the package currently being processed.
this.options is the result of yargs parsing the command-line options and/or
config file. You can use this to refer to any existing options, or to introduce your own.
this.log is registry-static's instance of davlog.
Hook functions are called with the same context each time and for each hook. It's one context shared throughout the whole process. You can use this to share data between invocations or different hooks.
The callback's signature is:
function(error, shouldSave){ /* ... */ }
Where shouldSave is a boolean stating whether or not to actually perform the action that happens right
after the hook (usually writing something to disk). In most cases, you'll want to call the callback with
callback(null, true). To prevent the action from happening, you can do callback(null, false).
Note the data passed in is a reference, so modifications to it may have side effects. For example, modifying
data.tarball in the tarball callback will change the location of the tarball.
Please don't throw any errors inside a hook function. If an error occurs, pass it along as the first parameter to
the callback.
Here are the currently provided hooks:
beforeAll: Called before any data is written, at the beginning of processing a change. If the callback is called with an error or false, no more processing will be done for this change, and no files will be written.
afterAll: Called after all the data is written, at the end of processing a change. If there is no error, the callback parameters are ignored.
globalIndexJson: Called before writing an update to the top-level index.json.
indexJson: Called before writing a package's main index.json.
versionJson: Called before writing the index.json for a particular package version.
tarball: Called before downloading/verifying/writing a package tarball.
afterTarball: Called after downloading/verifying/writing a package tarball. If there is no error, the callback parameters are ignored.
startup: Called before doing anything else at start time.
shasumCheck: Called in order to check the sha1sum of a tarball. Calling back with true implies the shasum passed. Default is in lib/defaultShasumCheck.js.
Some examples are included in the examples directory.
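As a minimal sketch of what a hooks module might look like, the following uses only the pieces described above: the (data, callback) signature, the shared context, and this.options. The file name hooks.js and the metadataOnly option are made up for illustration; the exact contents of data for each hook are not assumed.

```javascript
// hooks.js (hypothetical file name), loaded with: --hooks ./hooks.js
const hooks = {
  // Runs once at start time; initialise state on the shared context.
  startup: function (data, callback) {
    this.tarballsSeen = 0;
    callback(null, true);
  },

  // Runs before each tarball is downloaded, verified and written.
  tarball: function (data, callback) {
    try {
      this.tarballsSeen = (this.tarballsSeen || 0) + 1;
      // `metadataOnly` is an invented option, read from the parsed options.
      const skip = Boolean(this.options && this.options.metadataOnly);
      callback(null, !skip); // false prevents the tarball step
    } catch (err) {
      callback(err); // never throw from a hook; pass the error along
    }
  }
};

module.exports = hooks;
```

Because every hook runs with the same context, the counter set in startup is visible to each later tarball call.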
BLOB-STORES
By default, registry-static uses fs-blob-store, meaning all the metadata and tarballs are stored
on disk, in the directory you chose with the --dir option. Alternatively, you can use a custom
blob store, as long as it implements the abstract-blob-store spec. Just create a file that exports
the blob store you want, and then pass that file to the --blobstore option.
LOGGING
Supports --log <path> to log all output to a specific file.
When doing this, you may want to rotate your logs. You can do this by sending the
process a SIGPIPE signal. This will free up the file descriptor and then reattach it to the file.
If you are using logrotate.d, configure your process monitor to start registry-static like this:
registry-static -d my.registry.com -o /var/www/ --log /var/log/registry-static/output.log
and the logrotate config file (/etc/logrotate.d/registry-static):
/var/log/registry-static/*.log {
    daily
    missingok
    rotate 52
    compress
    delaycompress
    notifempty
    sharedscripts
    postrotate
        [ -f /tmp/registry-static.pid ] && kill -PIPE `cat /tmp/registry-static.pid`
    endscript
}
(This assumes that your process monitor stores the pid in /tmp/registry-static.pid.)
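The release-and-reattach idea behind signal-driven rotation can be sketched in a few lines of Node. This is a generic illustration of the pattern, not registry-static's own logging code; createRotatableLog and installRotation are hypothetical names.

```javascript
const fs = require('fs');

// Sketch only: a log writer that can drop its file descriptor and reopen the path.
function createRotatableLog(logPath) {
  let stream = fs.createWriteStream(logPath, { flags: 'a' });
  return {
    write(line) { stream.write(line + '\n'); },
    // Close the current descriptor and open the path again. After logrotate has
    // moved the old file away, this creates a fresh file at the same path.
    reopen() {
      stream.end();
      stream = fs.createWriteStream(logPath, { flags: 'a' });
    },
    close(cb) { stream.end(cb); }
  };
}

// Wire the reopen to SIGPIPE, the signal sent by the postrotate script above.
function installRotation(log) {
  process.on('SIGPIPE', function () { log.reopen(); });
}
```

With this in place, kill -PIPE on the process makes the next write land in the new file rather than the rotated one.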
CAVEATS
Smart routes like /-/all or /-/short
These routes require processing of the files. You "could" technically do it with a cache and using the fs module
to walk the tree and build those routes.
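As a starting point for that idea, the sketch below walks the top level of the mirror with the fs module and lists the package names it finds. It is only an assumption about how one might begin: listPackages is a hypothetical helper, it does not produce the actual /-/all response format, and it ignores scoped packages.

```javascript
const fs = require('fs');
const path = require('path');

// Sketch only: list top-level directories under `root` that contain an index.json.
// The '-' directory is skipped because it holds 404.json and the top-level index.
function listPackages(root) {
  return fs.readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && entry.name !== '-')
    .filter((entry) => fs.existsSync(path.join(root, entry.name, 'index.json')))
    .map((entry) => entry.name)
    .sort();
}
```

On a full mirror the result would be worth caching, since walking the whole tree on every request would be slow.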