@apica-io/repo-downloader
v3.6.3
Repository download utility
- Overview
- Repository types
- Decryption/Encryption
- Request caching
Overview
The repository download utility
You can download content from repositories with the repo-downloader utility. Downloaded content is stored in the working directory.
Command line
Usage: repo-downloader [options] <files...>
Options:
-u, --url <url> Url for the repository. For git based repositories the last part can be the branch
-w, --work <work> Working directory (default: "./work")
-t, --type <type> Type of repository [azureGit,bitBucket,gitHub,postman,ssh] (default: "gitHub")
-a, --auth <auth> Authentication. Password for basic auth
-pk, --privateKey <key> Private key. Currently only used for ssh
-user, --userName <userName> The username for accessing the repository
-at, --authType <authType> Authentication type. Default basic auth (default: "basic")
-dk, --decryptKey <decryptKey> Decrypt key
-dfm, --decryptFileMatch <decryptFileMatch> Decrypt file match regular expression
-l, --logLevel <logLevel> Log level in log4js (default: "info")
--proxy <proxy> Http(s) proxy
--cache <cache> Cache server:port
-V, --version output the version number
-h, --help display help for command
Exit code and stdout/stderr
Exit code 0 is returned if all content specified in the arguments is downloaded successfully.
- stdout is used for information logging
- stderr is used for error messages when exit code is not 0
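Because the exit code and the two output streams carry different information, a calling script can keep them apart. The following is a minimal sketch and not part of the utility: the function name run_and_report, the variables OUT_FILE and ERR_FILE, and the default log file names are hypothetical. The command to run is passed in as arguments, so any repo-downloader invocation can be wrapped.

```shell
#!/bin/sh
# Hypothetical wrapper: runs the given command, keeps stdout (information
# logging) and stderr (error messages) in separate files, and reports the
# outcome based on the exit code.
run_and_report() {
  out_file=${OUT_FILE:-download.out.log}   # assumed file name
  err_file=${ERR_FILE:-download.err.log}   # assumed file name

  status=0
  "$@" >"$out_file" 2>"$err_file" || status=$?

  if [ "$status" -eq 0 ]; then
    echo "download ok"
  else
    echo "download failed with exit code $status"
    # Show the error messages the command wrote to stderr.
    cat "$err_file"
  fi
  return "$status"
}

# Usage:
# run_and_report repo-downloader -u <url> <files...>
```

The wrapper returns the exit code of the wrapped command, so it can be used directly in an if statement or a CI step.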
Options
Option | Argument | Description
----------------| -------- |---------------
-u --url | Url to repository | For git repositories it can contain an additional path, which is the branch name
-w --work | Working directory | Directory where downloaded files are stored. Default ./work
-l --logLevel | Log level for log4js | See log levels for the log4js package. Default "info"
-t --type | Type of repository | Values azureGit, bitBucket, gitHub, http or ssh
-user --userName | Username for authentication | When username is used the auth option should contain the password or personal access token
-a --auth | Authentication string | For git basic auth with personal access token
-pk --privateKey | Private key used for ssh repositories | Key file or string
-at --authType | Authentication type | For git always basic in this release
-dk --decryptKey | Decryption key | Decryption key used by the cryptify library
-dfm --decryptFileMatch | Decryption file match | Regular expression for files which will be decrypted
Repository types
Git based repositories (GitHub, BitBucket and AzureGit)
The files arguments can be files or directories in the repository. If a directory is given, its content is downloaded recursively, including sub directories and their content, to the working directory. You can use / as the argument to download the complete repository, which is similar to cloning the repository without the git meta data.
GitHub repositories support
Overview
The implementation for GitHub support is based on the REST based API for repositories. API Documentation can be found here. Repositories deployed in on-premise environments are expected to be deployed with GitHub Enterprise Server. We have tested version 3.3 of the enterprise server. A repository url that does not start with http://github.com is expected to use GitHub Enterprise Server with this REST API.
- A limitation of the current API usage is that each directory can contain at most 1000 files: the API only scans the first 1000 files when searching for a file name.
- Request caching is supported for this repository type.
- Default branch is main for a GitHub repository.
- Authentication support is basic auth.
The repository url format
https://github.com/{owner}/{repository}/{branch}
Example: https://github.com/ApicaSystem/Public-CBT-Solutions
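The url format can be illustrated with a small helper that assembles the url from its parts. This is only a sketch: the function name github_repo_url is hypothetical, and leaving out the branch segment is assumed to select the default branch, as in the example url above.

```shell
#!/bin/sh
# Hypothetical helper: prints a repository url in the
# https://github.com/{owner}/{repository}/{branch} format.
# When no branch is given the branch segment is left out, which is
# assumed to select the default branch.
github_repo_url() {
  owner=$1
  repository=$2
  branch=${3:-}
  url="https://github.com/$owner/$repository"
  if [ -n "$branch" ]; then
    url="$url/$branch"
  fi
  printf '%s\n' "$url"
}

github_repo_url ApicaSystem cbt-demo-apitools master
github_repo_url ApicaSystem Public-CBT-Solutions
```

The printed value can be passed to repo-downloader with the -u option.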
Example with auth token as string
repo-downloader -t 'github' -u https://github.com/ApicaSystem/cbt-demo-apitools/master -a cb-demo-user:<pat token> README.md tm_project
This downloads the README.md file and tm_project from the given repository. The branch is master in this example.
Same example with separation of username and access token
repo-downloader -t 'github' -u https://github.com/ApicaSystem/cbt-demo-apitools/master -user cb-demo-user -a <pat token> README.md tm_project
BitBucket repositories support
Overview
The implementation for BitBucket support is based on the REST based API for repositories. API Documentation can be found here.
- Request caching is supported for this repository type.
- Default branch is master for a BitBucket repository.
- Authentication support is basic auth.
The repository url format
https://{host}/{owner}/{repository}/{branch}
Example: https://bitbucket.org/JostgrenApica/cb-demo/master
Azure DevOps git repositories support
Overview
Azure DevOps supports git as one of its repository types on an Azure DevOps server. The repository is managed inside Azure DevOps. The implementation is based on the Azure DevOps REST API. API Documentation can be found here.
- Request caching is not supported for this repository type.
- Default branch is main for an Azure DevOps repository.
- Authentication support is basic auth.
- The repository url must contain a project.
The repository url format
https://{host}/{owner}/{project}/{repository}/{branch}
Example: https://dev.azure.com/janostgren/Demo%20Project%201/cbt-demo-apitools/master
HTTP repositories support
Overview
With http(s) based repositories an http url is used to download resources. This is not really a repository with support for different versions of a resource, as in git based repositories. Downloading an entire directory with its files and sub directories is not supported.
- Authentication support is basic auth.
- Request caching is supported for this repository type only if the required headers exist in the response.
- No support of downloading content in a directory.
Example - Downloading index.html from the TicketMonster site.
$ repo-downloader -u http://ticketmonster.apicasystem.com/ticket-monster -t 'http' index.html
[2021-11-22T15:30:43.656] [INFO] @apica-io/repo-downloader - @apica-io/repo-downloader (3.5.0) started
[2021-11-22T15:30:43.659] [INFO] @apica-io/repo-downloader - files=[ 'index.html' ]
[2021-11-22T15:30:43.913] [INFO] HttpDownLoader - Downloading file index.html to /Users/janostgren/work/node/repoDownload/work/index.html
Postman workspace
We refer to a Postman workspace as a Postman repository. You can have a personal or enterprise license for the Postman platform. The platform has a REST based API for retrieving collections and environments. You need an api key to access a workspace, and you must put the api key in the auth option. The files argument can be a list of collection names and environment names. Each retrieved file is stored in the working directory under its name with a .json extension. The url option must be https://postman.com
Example
repo-downloader -u https://postman.com -a <api-key> HTTP-Bin-Requests
The postman collection HTTP-Bin-Requests will be downloaded to HTTP-Bin-Requests.json
ssh repositories
You can use Secure Shell (SSH) with an ssh server as a repository. See ssh wikipedia for details. SCP is used to copy the files from the server.
Example
repo-downloader -u ssh://ems@ec2-52-57-166-83.eu-central-1.compute.amazonaws.com:2022/home/ems -pk ~/keys/ems-key.pem -l debug data
The syntax for an ssh repository url must match ssh://username@hostname:port/rootdir. The port and rootdir parts are optional. The default port is 22 and the default directory is the user's home directory. The files arguments can be files or directories. If a directory is given, its content is downloaded recursively, including sub directories and their content, to the working directory. The private key option is mandatory. Username and password authentication is not supported.
Same example with specification of username
repo-downloader -u ssh://ec2-52-57-166-83.eu-central-1.compute.amazonaws.com:2022/home/ems -user ems -pk ~/keys/ems-key.pem -l debug data
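The url syntax and its defaults can be illustrated with a small parser. This is a sketch for illustration only and not the utility's own parsing: the function name parse_ssh_url and the output format are hypothetical, and IPv6 host literals are not handled.

```shell
#!/bin/sh
# Hypothetical helper: splits ssh://username@hostname:port/rootdir into parts.
# The port defaults to 22; an empty rootdir stands for the user's home directory.
parse_ssh_url() {
  case $1 in
    ssh://*) rest=${1#ssh://} ;;
    *) echo "not an ssh url: $1" >&2; return 1 ;;
  esac

  # Split the authority (username@hostname:port) from the optional rootdir.
  case $rest in
    */*) authority=${rest%%/*}; rootdir=/${rest#*/} ;;
    *)   authority=$rest; rootdir= ;;
  esac

  # Optional username before '@' (it can also be given with -user).
  case $authority in
    *@*) user=${authority%%@*}; hostport=${authority#*@} ;;
    *)   user=; hostport=$authority ;;
  esac

  # Optional port after ':'.
  case $hostport in
    *:*) host=${hostport%%:*}; port=${hostport#*:} ;;
    *)   host=$hostport; port=22 ;;
  esac

  printf 'user=%s host=%s port=%s rootdir=%s\n' "$user" "$host" "$port" "$rootdir"
}

parse_ssh_url ssh://ec2-52-57-166-83.eu-central-1.compute.amazonaws.com:2022/home/ems
# prints: user= host=ec2-52-57-166-83.eu-central-1.compute.amazonaws.com port=2022 rootdir=/home/ems
```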
Decryption/Encryption
Decryption of encrypted files
Repo-downloader supports decryption of files encrypted with the npm cryptify package. The process for encrypting and decrypting files is:
- Download and install the npm cryptify package with npm install -g cryptify
- Encrypt a file with cryptify encrypt <file> -p <password>
- The password must be the same as the one used in repo-downloader with the -dk or --decryptKey option
- Upload the encrypted file to the repository.
- Use the repo-downloader with -dk and -dfm options to download and decrypt the file(s).
Encryption and decryption should only be used for files containing confidential information, such as certificates or configuration files containing credentials.
Example of decrypting a file
repo-downloader -l debug -u https://github.com/ApicaSystem/cbt-demo-apitools/master -a cb-demo-user:<secret> -dk '<A secret password>' -dfm '.+\.p12' bigdata/client.p12
[2021-06-28T12:19:30.497] [INFO] @apica-io/repo-downloader - @apica-io/repo-downloader (3.0.0) started
[2021-06-28T12:19:30.502] [INFO] @apica-io/repo-downloader - files=[ 'bigdata/client.p12' ]
[2021-06-28T12:19:31.540] [INFO] GitDownLoader - Downloading file bigdata/client.p12 to /Users/janostgren/work/node/repoDownload/work/bigdata/client.p12
[2021-06-28T12:19:31.557] [INFO] GitDownLoader - Decrypting /Users/janostgren/work/node/repoDownload/work/bigdata/client.p12
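The encrypt, upload and download steps of this section can be sketched as two small shell functions. This is an illustration under stated assumptions: the function names run, encrypt_for_repo and download_and_decrypt and the DRY_RUN variable are hypothetical, the cryptify command line is assumed to take the form cryptify encrypt <file> -p <password>, and authentication options are left out for brevity. With DRY_RUN set, the commands are printed (including the password) instead of executed.

```shell
#!/bin/sh
# Hypothetical helper: prints the command when DRY_RUN is set, otherwise runs it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Encrypt a file before it is uploaded to the repository.
# The same password must later be passed to repo-downloader with -dk.
encrypt_for_repo() {
  file=$1
  password=$2
  run cryptify encrypt "$file" -p "$password"
}

# Download the given files and decrypt those whose names match the
# regular expression passed with -dfm.
download_and_decrypt() {
  url=$1
  password=$2
  match=$3
  shift 3
  run repo-downloader -u "$url" -dk "$password" -dfm "$match" "$@"
}

# Usage:
# encrypt_for_repo client.p12 '<A secret password>'
# download_and_decrypt <url> '<A secret password>' '.+\.p12' bigdata/client.p12
```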
Request caching
Overview
Repo-downloader supports caching of requests by storing the response with the latest version of the content in a cache. The purpose of the cache is to work around the request limits of some of the supported repository types. The cache works as a web content cache. With the cache enabled, conditional gets are used to get the content from a repository. A conditional request returning http status 304 (not modified) does not count against the request limit.
- If http status 304 is returned the content will be retrieved from the cache.
- If http status 200 is returned the content from response is used and the cache is updated.
Request and response headers used for caching
The following http headers are supported for conditional gets.
Response header | Request header
---------------|----------------
etag | If-None-Match
last-modified | If-Modified-Since
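The header mapping and the two status rules can be sketched as follows. This illustrates the general conditional get technique and is not the utility's implementation: the function names conditional_headers and cache_action and the action names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the request headers for a conditional get,
# built from the etag and last-modified values stored with a cached response.
conditional_headers() {
  etag=$1
  last_modified=$2
  if [ -n "$etag" ]; then
    printf 'If-None-Match: %s\n' "$etag"
  fi
  if [ -n "$last_modified" ]; then
    printf 'If-Modified-Since: %s\n' "$last_modified"
  fi
}

# Hypothetical helper: maps the http status of the conditional get
# to what happens with the cache.
cache_action() {
  case $1 in
    304) echo use-cache ;;     # not modified: content is taken from the cache
    200) echo update-cache ;;  # new content: use the response, update the cache
    *)   echo error ;;
  esac
}

conditional_headers '"abc123"' 'Mon, 22 Nov 2021 15:30:43 GMT'
cache_action 304
```

Each printed header line could, for example, be passed to curl with its -H option to try a conditional request by hand.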
Cache technical implementation
The request cache is implemented with a memcached in-memory database. See https://memcached.org/. The memcached database can be local or deployed on a shared resource. You specify the location of the memcached database with the --cache option on the command line. The syntax is host:port (for example localhost:11211). The cache can also be set with the environment variable DOWNLOADER_CACHE. If repo-downloader cannot connect to the cache, the utility works as it does without cache support. In the current version there is no way of specifying that the cache is required for downloading a resource.
Memcached can be run in a Docker container and started with the docker-compose command.
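Setting the cache location through the environment can be sketched with a small wrapper. The names download_with_cache and REPO_DOWNLOADER are hypothetical, and falling back to localhost:11211 is an assumption made for this example, not a default of the utility.

```shell
#!/bin/sh
# Hypothetical wrapper: selects a cache location and runs the downloader.
# REPO_DOWNLOADER allows the command to be swapped out; it defaults to
# repo-downloader.
download_with_cache() {
  # Keep an existing DOWNLOADER_CACHE; otherwise assume a local memcached.
  DOWNLOADER_CACHE=${DOWNLOADER_CACHE:-localhost:11211}
  export DOWNLOADER_CACHE
  "${REPO_DOWNLOADER:-repo-downloader}" "$@"
}

# Usage:
# download_with_cache -u <url> <files...>
#
# The same location given with the command line option instead:
# repo-downloader --cache localhost:11211 -u <url> <files...>
```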
Start the docker container
$ docker-compose up
Starting memcached_memcached_1 ... done
Attaching to memcached_memcached_1
memcached_1 | memcached 09:51:37.79
memcached_1 | memcached 09:51:37.86 Welcome to the Bitnami memcached container
memcached_1 | memcached 09:51:37.93 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-memcached
memcached_1 | memcached 09:51:37.94 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-memcached/issues
memcached_1 | memcached 09:51:37.95
memcached_1 | memcached 09:51:37.95 INFO ==> ** Starting Memcached setup **
memcached_1 | memcached 09:51:38.18 INFO ==> Initializing Memcached
memcached_1 |
memcached_1 | memcached 09:51:38.18 INFO ==> ** Memcached setup finished! **
memcached_1 | memcached 09:51:38.29 INFO ==> ** Starting Memcached **
The docker-compose.yml file
version: '2'
services:
  memcached:
    image: docker.io/bitnami/memcached:1
    ports:
      - '11211:11211'