@wikipathways/cget
v0.1.7
Robust streaming parallel download manager with filesystem cache
cget
cget is a robust streaming parallel download manager with a filesystem cache and a simple API.
These docs describe the new version under construction on GitHub, which is not yet published on npm.
Features
- Promise-based API, returns HTTP headers and a Node.js stream with contents.
- Filesystem cache mirrors remote hosts and their directory structure.
  - Easy to bypass cget and look at cached files.
- Stores headers in separate .header.json files.
- Caches HTTP errors to avoid repeating failing requests.
- Limits concurrent downloads automatically using cwait.
- Follows and caches redirect headers.
- Built on top of request.
- Optionally allows streaming from file:// URLs, bypassing the cache.
- Adds arbitrary files to the cache with any URI (URL or URN) as the key.
- Written in TypeScript.
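The concurrency limit mentioned in the list can be illustrated with a small promise queue. This is a minimal sketch of the general idea only, not the code cwait or cget actually use; `createLimiter` and `run` are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper, not part of cget or cwait: wraps promise-returning
// tasks so that at most `limit` of them run at the same time.
function createLimiter(limit) {
  var active = 0;  // Number of tasks currently running.
  var queue = [];  // Tasks waiting for a free slot.

  function next() {
    if (active >= limit || queue.length === 0) return;
    var job = queue.shift();
    active++;
    // Start the task; synchronous throws become rejections.
    Promise.resolve()
      .then(job.task)
      .then(job.resolve, job.reject)
      .then(function() {
        // Free the slot and start the next waiting task, if any.
        active--;
        next();
      });
  }

  // Returns a promise that settles with the task's own result.
  return function run(task) {
    return new Promise(function(resolve, reject) {
      queue.push({ task: task, resolve: resolve, reject: reject });
      next();
    });
  };
}
```

With `var run = createLimiter(2)`, each call to `run(someTask)` waits until fewer than two earlier tasks are still pending before `someTask` is started.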
cget is perfect for downloading and caching various schema files, and is used in cxsd.
Usage
Cached downloads
var Cache = require('cget').Cache;
// Store files in "cache" subdirectory next to this script.
var basePath = require('path').join(__dirname, 'cache');
// Initialize the download cache.
var cache = new Cache(basePath, {
  // Allow up to 2 parallel downloads.
  concurrency: 2
});

// Download a web page and print some info.
cache.fetch('http://www.google.com/').then(function(result) {
  console.log('Remote address: ' + result.address.url);
  console.log('Local cache path: ' + result.address.path);
  console.log('HTTP status code: ' + result.status + ' ' + result.message);
  console.log('Headers:');
  console.log(result.headers);
  console.log('Content:');
  result.stream.pipe(process.stdout);
});
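Because the result carries a Node.js stream, the content can be piped anywhere, not only to standard output. The sketch below writes it to a file. `saveTo` is a hypothetical helper, not part of cget; it assumes only what the example above shows, namely that `fetch` returns a promise for an object with a readable `stream`.

```javascript
var fs = require('fs');

// Hypothetical helper: downloads `url` through a cget-style cache object
// and writes the body to `destPath`. Resolves with the fetch result once
// the file has been fully written.
function saveTo(cache, url, destPath) {
  return cache.fetch(url).then(function(result) {
    return new Promise(function(resolve, reject) {
      var out = fs.createWriteStream(destPath);
      result.stream.on('error', reject);
      out.on('error', reject);
      // 'finish' fires once all data has been flushed to the file.
      out.on('finish', function() { resolve(result); });
      result.stream.pipe(out);
    });
  });
}
```

A call such as `saveTo(cache, 'http://www.google.com/', 'page.html')` would then return a promise that can be chained like the one from `fetch` itself.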
The first run prints the downloaded file and saves it, together with its headers and any redirects, in local files. For example:
cache/www.google.com.header.json
cache/www.google.<COUNTRY>/<NONCE>
cache/www.google.<COUNTRY>/<NONCE>.header.json
The second time it prints the exact same output, but without needing a network connection.
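Since the cache is made of ordinary files, it can also be inspected without cget. The following sketch walks a cache directory and parses every `.header.json` file it finds. `readCachedHeaders` is a hypothetical helper; it assumes only that the header files contain JSON and makes no assumption about which fields they hold.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: recursively collects every *.header.json file below
// `dir` and returns an array of { file, headers } objects.
function readCachedHeaders(dir) {
  var found = [];
  fs.readdirSync(dir).forEach(function(name) {
    var full = path.join(dir, name);
    if (fs.statSync(full).isDirectory()) {
      // Descend into per-host and per-path subdirectories.
      found = found.concat(readCachedHeaders(full));
    } else if (/\.header\.json$/.test(name)) {
      found.push({
        file: full,
        headers: JSON.parse(fs.readFileSync(full, 'utf8'))
      });
    }
  });
  return found;
}
```

Calling `readCachedHeaders(basePath)` with the cache directory from the example above would list the stored header files along with their parsed contents.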
Caching arbitrary files
The store method supports caching a string with any URI (URL or URN) as the key:
var cache = new (require('cget').Cache)();
cache.store('urn:x-inspire:specification:gmlas:GeographicalNames:3.0', 'Some data');
cache.store('http://inspire.ec.europa.eu/schemas/ad/4.0', 'More data');
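When several entries need to be stored, the calls can be grouped. The sketch below is one possible way to do it. `storeAll` is a hypothetical helper, and because this sketch does not assume whether `store` returns a promise or a plain value, each call is wrapped in `Promise.resolve`.

```javascript
// Hypothetical helper: stores every key/value pair of `entries` in a
// cget-style cache and resolves once all store() calls have settled.
function storeAll(cache, entries) {
  return Promise.all(Object.keys(entries).map(function(uri) {
    // Works whether store() returns a promise or a plain value.
    return Promise.resolve(cache.store(uri, entries[uri]));
  }));
}
```

For example, `storeAll(cache, { 'urn:x-inspire:specification:gmlas:GeographicalNames:3.0': 'Some data' })` would store that entry under its URN key.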
License
Copyright (c) 2015-2016 BusFaster Ltd