@discoveryjs/scan-git
v0.1.5
A tool set for fast and efficient git scanning to capture data with focus on large repos
Usage
npm install @discoveryjs/scan-git
API
import { createGitReader } from '@discoveryjs/scan-git';
const repo = await createGitReader('path/to/.git');
const commits = await repo.log({ ref: 'my-branch', depth: 10 });
console.log(commits);
await repo.dispose();
createGitReader(gitdir, options?)
- gitdir: string – path to the git repo
- options – optional settings:
  - maxConcurrency – limit the number of file system operations (default: 50)
  - cruftPacks – defines how cruft packs are processed:
    - 'include' or true (default) – process all packs
    - 'exclude' or false – exclude cruft packs from processing
    - 'only' – process cruft packs only
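To make the three cruftPacks modes concrete, here is a minimal sketch of how a pack list could be filtered. It is not the library's implementation: selectPacks is a hypothetical helper, and it assumes a cruft pack is recognized by a .mtimes file sitting next to its .pack file, which is how Git marks cruft packs on disk.

```js
// Hypothetical helper, not part of @discoveryjs/scan-git.
// fileNames: names found in objects/pack; cruftPacks: the option value described above.
function selectPacks(fileNames, cruftPacks = 'include') {
    // Normalize the boolean aliases to their string form
    const mode = cruftPacks === true ? 'include' : cruftPacks === false ? 'exclude' : cruftPacks;

    if (!['include', 'exclude', 'only'].includes(mode)) {
        throw new TypeError(`Unknown cruftPacks value: ${String(cruftPacks)}`);
    }

    const names = new Set(fileNames);
    const packs = fileNames.filter((name) => name.endsWith('.pack'));

    if (mode === 'include') {
        return packs;
    }

    return packs.filter((name) => {
        // Assumption: a sibling .mtimes file marks the pack as a cruft pack
        const isCruft = names.has(name.replace(/\.pack$/, '.mtimes'));
        return mode === 'only' ? isCruft : !isCruft;
    });
}

const files = ['pack-a.pack', 'pack-a.idx', 'pack-b.pack', 'pack-b.idx', 'pack-b.mtimes'];
console.log(selectPacks(files, 'only'));
```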
Refs
Common parameters:
- ref: string – a reference to an object in the repository
- withOid: boolean – a flag to include the resolved oid for a reference
repo.defaultBranch()
Returns default branch name used in a repo:
const defaultBranch = await repo.defaultBranch();
// 'main'
The algorithm to identify the default branch name:
- if there is only one branch, it must be the default
- otherwise, look for specific branch names, in this order:
  - upstream/HEAD
  - origin/HEAD
  - main
  - master
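The lookup order above can be sketched as a small pure function. This is only an illustration under stated assumptions: pickDefaultBranch is a hypothetical helper, the remoteHeads map (remote HEAD name to the branch it points at) is an invented input shape, and returning null when nothing matches is an assumption rather than documented behaviour.

```js
// Hypothetical helper, not part of @discoveryjs/scan-git.
// branches: local branch names
// remoteHeads: Map such as new Map([['origin/HEAD', 'main']]) (assumed input shape)
function pickDefaultBranch(branches, remoteHeads = new Map()) {
    // A single branch must be the default
    if (branches.length === 1) {
        return branches[0];
    }

    // Remote HEADs take precedence, upstream before origin
    for (const name of ['upstream/HEAD', 'origin/HEAD']) {
        if (remoteHeads.has(name)) {
            return remoteHeads.get(name);
        }
    }

    // Fall back to the conventional branch names
    for (const name of ['main', 'master']) {
        if (branches.includes(name)) {
            return name;
        }
    }

    return null; // assumption: no candidate found
}

console.log(pickDefaultBranch(['dev', 'master', 'main']));
```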
repo.currentBranch()
Returns the current branch name along with its commit oid.
If the repository is in a detached HEAD state, name will be null.
const currentBranch = repo.currentBranch();
// { name: 'main', oid: '8bb6e23769902199e39ab70f2441841712cbdd62' }
const detachedHead = repo.currentBranch();
// { name: null, oid: '8bb6e23769902199e39ab70f2441841712cbdd62' }
repo.isRefExists(ref)
Checks if a ref exists.
const isValidRef = repo.isRefExists('main');
// true
repo.expandRef(ref)
Expands a ref into its full form, e.g. 'main' -> 'refs/heads/main'. Returns null if the ref doesn't exist. Symbolic ref names ('HEAD', 'FETCH_HEAD', 'CHERRY_PICK_HEAD', 'MERGE_HEAD' and 'ORIG_HEAD') are returned unchanged.
const fullPath = repo.expandRef('heads/main');
// 'refs/heads/main'
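A rough sketch of how such an expansion can work is shown below. It follows the prefix order that Git itself documents for resolving short ref names; whether scan-git uses exactly this order is an assumption. expandRefSketch and the knownRefs set are hypothetical and only stand in for the refs stored in a repository.

```js
// Hypothetical sketch, not the library's implementation.
const SYMBOLIC_REFS = new Set(['HEAD', 'FETCH_HEAD', 'CHERRY_PICK_HEAD', 'MERGE_HEAD', 'ORIG_HEAD']);

// Prefix order taken from Git's own ref lookup rules (assumed to apply here)
const REF_PREFIXES = ['', 'refs/', 'refs/tags/', 'refs/heads/', 'refs/remotes/'];

function expandRefSketch(ref, knownRefs) {
    // Symbolic names are returned without changes
    if (SYMBOLIC_REFS.has(ref)) {
        return ref;
    }

    for (const prefix of REF_PREFIXES) {
        const candidate = prefix + ref;
        if (candidate.startsWith('refs/') && knownRefs.has(candidate)) {
            return candidate;
        }
    }

    // A bare remote name resolves to that remote's HEAD
    const remoteHead = `refs/remotes/${ref}/HEAD`;
    return knownRefs.has(remoteHead) ? remoteHead : null;
}

const knownRefs = new Set(['refs/heads/main', 'refs/tags/v1.0.0', 'refs/remotes/origin/main']);
console.log(expandRefSketch('heads/main', knownRefs)); // 'refs/heads/main'
```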
repo.resolveRef(ref)
Resolves a ref into an oid if it exists, otherwise throws an exception.
If ref is already an oid, it is returned as is. If ref is not a full path, it is expanded first.
const oid = repo.resolveRef('main');
// '8bb6e23769902199e39ab70f2441841712cbdd62'
repo.describeRef(ref)
Returns an info object for the provided ref.
const info = repo.describeRef('HEAD');
// {
// path: 'HEAD',
// name: 'HEAD',
// symbolic: true,
// ref: 'refs/heads/test',
// oid: '2dbee47a8d4f8d39e1168fad951b703ee05614d6'
// }
const info = repo.describeRef('main');
// {
// path: 'refs/heads/main',
// name: 'main',
// symbolic: false,
// scope: 'refs/heads',
// namespace: 'refs',
// category: 'heads',
// remote: null,
// ref: null,
// oid: '7b84f676f2fbea2a3c6d83924fa63059c7bdfbe2'
// }
const info = repo.describeRef('origin/HEAD');
// {
// path: 'refs/remotes/origin/HEAD',
// name: 'HEAD',
// symbolic: false,
// scope: 'refs/remotes',
// namespace: 'refs',
// category: 'remotes',
// remote: 'origin',
// ref: 'refs/remotes/origin/main',
// oid: '7b84f676f2fbea2a3c6d83924fa63059c7bdfbe2'
// }
repo.isOid(value)
Checks if a value is a valid oid.
repo.isOid('7b84f676f2fbea2a3c6d83924fa63059c7bdfbe2'); // true
repo.isOid('main'); // false
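For intuition, a check of this kind can be approximated with a regular expression. The sketch below assumes SHA-1 object ids written as 40 lowercase hexadecimal characters; the exact rules repo.isOid() applies (for example, whether uppercase is accepted) are not stated here, so isOidSketch is only a hypothetical stand-in.

```js
// Hypothetical stand-in for repo.isOid(), assuming 40-character lowercase SHA-1 ids
function isOidSketch(value) {
    return typeof value === 'string' && /^[0-9a-f]{40}$/.test(value);
}

console.log(isOidSketch('7b84f676f2fbea2a3c6d83924fa63059c7bdfbe2')); // true
console.log(isOidSketch('main')); // false
```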
repo.listRemotes()
const remotes = repo.listRemotes();
// [
// 'origin'
// ]
repo.listRemoteBranches(remote, withOid?)
Get a list of branches for a remote.
const originBranches = await repo.listRemoteBranches('origin');
// [
// 'HEAD',
// 'main'
// ]
const originBranches = await repo.listRemoteBranches('origin', true);
// [
// { name: 'HEAD', oid: '7c2a62cdbc2ef28afaaed3b6f3aef9b581e5aa8e' },
// { name: 'main', oid: '56ea7a808e35df13e76fee92725a65a373a9835c' }
// ]
repo.listBranches(withOid?)
Get a list of local branches.
const localBranches = await repo.listBranches();
// [
// 'HEAD',
// 'main'
// ]
const localBranches = await repo.listBranches(true);
// [
// { name: 'HEAD', oid: '7c2a62cdbc2ef28afaaed3b6f3aef9b581e5aa8e' },
// { name: 'main', oid: '56ea7a808e35df13e76fee92725a65a373a9835c' }
// ]
repo.listTags(withOid?)
Get a list of tags.
const tags = await repo.listTags();
// [
// 'v1.0.0',
// 'some-feature'
// ]
const tags = await repo.listTags(true);
// [
// { name: 'v1.0.0', oid: '7c2a62cdbc2ef28afaaed3b6f3aef9b581e5aa8e' },
// { name: 'some-feature', oid: '56ea7a808e35df13e76fee92725a65a373a9835c' }
// ]
File lists
repo.treeOidFromRef(ref)
Resolve a tree oid by a commit reference.
- ref: string (default: 'HEAD') – commit reference
const treeOid = await repo.treeOidFromRef('HEAD');
// 'a1b2c3d4e5f6...'
repo.listFiles(ref, filesWithHash)
List all files in the repository at the specified commit reference.
- ref: string (default: 'HEAD') – commit reference
- filesWithHash: boolean (default: false) – specify to return blob hashes
const headFiles = repo.listFiles(); // the same as repo.listFiles('HEAD')
// [ 'file.ext', 'path/to/file.ext', ... ]
const headFilesWithHashes = repo.listFiles('HEAD', true);
// [ { path: 'file.ext', hash: 'f2e492a3049...' }, ... ]
repo.getPathEntry(path, ref)
Retrieve a tree entry (file or directory) by its path at the specified commit reference.
- path: string – the path to the file or directory
- ref: string (default: 'HEAD') – commit reference
const entry = await repo.getPathEntry('path/to/file.txt');
// { isTree: false, path: 'path/to/file.txt', hash: 'a1b2c3d4e5f6...' }
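The idea of resolving a path against a tree can be illustrated with an in-memory structure. This is a sketch under assumptions: the node shape ({ hash, entries }) and getPathEntrySketch are hypothetical, the hashes are placeholders, and returning null for a missing path is an assumption since the behaviour for unknown paths is not described here.

```js
// Hypothetical in-memory tree: each node has a hash, and trees also have an entries Map
function getPathEntrySketch(rootTree, path) {
    let node = rootTree;

    // Walk one path segment at a time, descending into subtrees
    for (const segment of path.split('/')) {
        if (!node.entries || !node.entries.has(segment)) {
            return null; // assumption: unknown paths yield null
        }
        node = node.entries.get(segment);
    }

    return { isTree: Boolean(node.entries), path, hash: node.hash };
}

// Placeholder hashes, for illustration only
const root = {
    hash: 'tree-root',
    entries: new Map([
        ['path', {
            hash: 'tree-path',
            entries: new Map([
                ['to', {
                    hash: 'tree-to',
                    entries: new Map([['file.txt', { hash: 'blob-file' }]])
                }]
            ])
        }]
    ])
};

console.log(getPathEntrySketch(root, 'path/to/file.txt'));
```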
repo.getPathsEntries(paths, ref)
Retrieve a list of tree entries (files or directories) by their paths at the specified commit reference.
- paths: string[] – an array of paths to files or directories
- ref: string (default: 'HEAD') – commit reference
const entries = await repo.getPathsEntries([
'path/to/file1.txt',
'path/to/dir1',
'path/to/file2.txt'
]);
// [
// { isTree: false, path: 'path/to/file1.txt', hash: 'a1b2c3d4e5f6...' },
// { isTree: true, path: 'path/to/dir1', hash: 'b1c2d3e4f5g6...' },
// { isTree: false, path: 'path/to/file2.txt', hash: 'c1d2e3f4g5h6...' }
// ]
repo.deltaFiles(nextRef, prevRef)
Compute the file delta (changes) between two commit references, including added, modified, and removed files.
- nextRef: string (default: 'HEAD') – commit reference for the "next" state
- prevRef: string (optional) – commit reference for the "previous" state
const fileDelta = await repo.deltaFiles('HEAD', 'branch-name');
// {
// add: [ { path: 'path/to/new/file.txt', hash: 'a1b2c3d4e5f6...' }, ... ],
// modify: [ { path: 'path/to/modified/file.txt', hash: 'f1e2d3c4b5a6...', prevHash: 'a1b2c3d4e5f6...' }, ... ],
// remove: [ { path: 'path/to/removed/file.txt', hash: 'a1b2c3d4e5f6...' }, ... ]
// }
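The shape of this result can be reproduced with a simple comparison of two file lists. The sketch below is not how the library computes the delta (it works on Git trees); deltaFilesSketch and its inputs, two Maps from path to blob hash, are hypothetical and only illustrate how add, modify and remove relate to the two states.

```js
// Hypothetical sketch: next and prev are Maps of path -> blob hash
function deltaFilesSketch(next, prev) {
    const delta = { add: [], modify: [], remove: [] };

    for (const [path, hash] of next) {
        if (!prev.has(path)) {
            // Present only in the "next" state
            delta.add.push({ path, hash });
        } else if (prev.get(path) !== hash) {
            // Present in both states with different content
            delta.modify.push({ path, hash, prevHash: prev.get(path) });
        }
    }

    for (const [path, hash] of prev) {
        if (!next.has(path)) {
            // Present only in the "previous" state
            delta.remove.push({ path, hash });
        }
    }

    return delta;
}

const prevFiles = new Map([['a.txt', 'h1'], ['b.txt', 'h2'], ['c.txt', 'h3']]);
const nextFiles = new Map([['a.txt', 'h1'], ['b.txt', 'h2x'], ['d.txt', 'h4']]);
console.log(deltaFilesSketch(nextFiles, prevFiles));
```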
Commits
repo.readCommit(ref)
repo.log(options)
Return a list of commits in topological order.
Options:
- ref – oid, hash, ref
- depth (default 50) – limits the number of commits returned
const commits = await repo.log({ ref: 'my-branch', depth: 10 });
// [
// Commit,
// Commit,
// ...
// ]
Note: Pass Infinity as the depth value to load all the commits that are reachable from ref at once.
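To show why Infinity works as a depth value, here is a simplified depth-limited walk over commit parents. It is only a sketch: logSketch and the commits Map are hypothetical, and the walk is a plain breadth-first traversal that does not reproduce the ordering repo.log() guarantees. The point is that a "collected fewer than depth" check never stops the loop when depth is Infinity.

```js
// Hypothetical sketch: commits is a Map of oid -> { parents: string[] }
function logSketch(commits, startOid, depth = 50) {
    const result = [];
    const seen = new Set([startOid]);
    const queue = [startOid];

    // With depth = Infinity the second condition is always true,
    // so the walk ends only when no reachable commits remain
    while (queue.length > 0 && result.length < depth) {
        const oid = queue.shift();
        result.push(oid);

        for (const parent of commits.get(oid).parents) {
            if (!seen.has(parent)) {
                seen.add(parent);
                queue.push(parent);
            }
        }
    }

    return result;
}

const history = new Map([
    ['c3', { parents: ['c2'] }],
    ['c2', { parents: ['c1'] }],
    ['c1', { parents: [] }]
]);

console.log(logSketch(history, 'c3', 2)); // [ 'c3', 'c2' ]
console.log(logSketch(history, 'c3', Infinity)); // [ 'c3', 'c2', 'c1' ]
```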
Statistics & info
repo.readObjectHeaderByHash(hash)
repo.readObjectByHash(hash, cache?)
repo.readObjectHeaderByOid(oid)
repo.readObjectByOid(oid, cache?)
repo.stat()
Returns statistics for a repo:
const stats = await repo.stat();
// {
// refs: { ... },
// objects: {
// loose: { ... },
// packed: { ... }
// }
// }
Utils
parseContributor()
parseTimezone()
parseAnnotatedTag()
parseCommit()
parseTree()
diffTrees()
Compare
| scan-git | isomorphic-git | Feature |
| :------: | :------------: | ------- |
| ✅ | ✅ | loose refs |
| ✅ | ✅ | packed refs |
| 🚫 | ✅ | index file: boosts fetching a file list for HEAD |
| ✅ | ✅ | loose objects |
| ✅ | ✅ | packed objects (`*.pack` + `*.idx` files) |
| ✅ | 🚫 | 2Gb+ packs support: version 2 `pack-*.idx` files support packs larger than 4 GiB by adding an optional table of 8-byte offset entries for large offsets |
| ✅ | 🚫 | On-disk reverse indexes (`*.rev` files): a reverse index speeds up operations such as seeking an object by offset or scanning objects in pack order |
| 🚫 | 🚫 | multi-pack-index (MIDX): stores a list of objects and their offsets into multiple packfiles; can provide O(log N) lookup time for any number of packfiles |
| 🚫 | 🚫 | multi-pack-index reverse indexes (RIDX): similar to the pack-based reverse index |
| ✅ | 🚫 | Cruft packs: a cruft pack eliminates the need for storing unreachable objects in a loose state by including the per-object mtimes in a separate file alongside a single pack containing all loose objects |
| 🚫 | 🚫 | Pack and multi-pack bitmaps: bitmaps store reachability information about the set of objects in a packfile or a multi-pack index |
| 🚫 (TBD) | 🚫 | commit-graph: a binary file format that creates a structured representation of Git's commit history and speeds up some operations |
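The large-offset table mentioned in the pack support row can be illustrated with a short decoding sketch. It follows the version 2 pack index layout as documented by Git: a 4-byte offset entry whose most significant bit is set holds, in its remaining 31 bits, an index into the table of 8-byte offsets. resolvePackOffset is a hypothetical helper and not the library's code.

```js
// Hypothetical helper illustrating the pack index v2 offset encoding.
// offset32: unsigned 32-bit entry from the offset table
// largeOffsets: Buffer holding the optional table of 8-byte big-endian offsets
function resolvePackOffset(offset32, largeOffsets) {
    // Most significant bit clear: the value is the pack offset itself
    if ((offset32 & 0x80000000) === 0) {
        return BigInt(offset32);
    }

    // Most significant bit set: the lower 31 bits index the 8-byte table
    const index = offset32 & 0x7fffffff;
    return largeOffsets.readBigUInt64BE(index * 8);
}

// Two-entry table; the second entry points 5 GiB into the pack
const table = Buffer.alloc(16);
table.writeBigUInt64BE(5n * 1024n ** 3n, 8);

console.log(resolvePackOffset(1234, table)); // 1234n
console.log(resolvePackOffset(0x80000001, table)); // 5368709120n
```

Returning a BigInt keeps the sketch simple; real code could convert to a number when the offset is below Number.MAX_SAFE_INTEGER.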
License
MIT