md-combine
v1.0.3
Combine markdown files from a GitHub repository into one file, useful for building context for Claude, ChatGPT, and other AI or RAG tools
A command-line tool that fetches markdown files from a GitHub repository, optionally recursing into subdirectories, and combines them into a single file. The combined file is useful as context for Claude, ChatGPT, and other LLMs, for example in a system prompt.
Features
- 🔍 Recursively fetch files from GitHub repositories
- 📂 Support for multiple file extensions
- 🔄 Automatic rate limit handling
- 📝 Generated table of contents
- 🔑 GitHub token support for higher rate limits
Installation
You can run this tool directly using npx:
npx md-combine
Or install it globally:
npm install -g md-combine
Usage
Basic usage:
npx md-combine -u https://github.com/user/repo/tree/main/docs
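To make the expected URL shape concrete, here is a minimal Python sketch of how a URL like the one above could be split into owner, repository, branch, and directory. It is an illustration only, not md-combine's actual parsing code. The function name `parse_github_url` is hypothetical, and the sketch assumes the branch name is a single path segment, because the URL alone cannot distinguish a branch such as `feature/x` from a directory.

```python
from urllib.parse import urlparse


def parse_github_url(url: str) -> dict:
    """Split a GitHub URL into owner, repo, branch, and directory path.

    Assumption: the branch is a single path segment (no slashes).
    """
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url}")

    # Drop empty segments produced by leading or trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected at least owner/repo in: {url}")

    owner, repo = parts[0], parts[1]
    branch, path = None, ""
    # URLs of the form /owner/repo/tree/<branch>/<path...>
    if len(parts) >= 4 and parts[2] == "tree":
        branch = parts[3]
        path = "/".join(parts[4:])
    return {"owner": owner, "repo": repo, "branch": branch, "path": path}


print(parse_github_url("https://github.com/user/repo/tree/main/docs"))
# {'owner': 'user', 'repo': 'repo', 'branch': 'main', 'path': 'docs'}
```

A bare repository URL such as https://github.com/user/repo yields `branch=None` and an empty path in this sketch; which branch the tool picks in that case is not stated here.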
Options
Options:
  -u, --url <url>             GitHub repository URL (required)
  -o, --output <path>         Output file path (default: "combined_output.md")
  -r, --recursive             Recursively fetch files from subdirectories
  -t, --token <token>         GitHub personal access token (optional)
  -e, --extensions <exts...>  File extensions to include (default: .md)
  --help                      Display help for command
  -V, --version               Output the version number
Examples
- Basic usage (markdown files only):
npx md-combine -u https://github.com/user/repo/tree/main/docs
- Recursive search with custom output file:
npx md-combine -u https://github.com/user/repo/tree/main/docs -r -o documentation.md
- Multiple file extensions:
npx md-combine -u https://github.com/user/repo -r -e .md .txt .json
- Using GitHub token for higher rate limits:
npx md-combine -u https://github.com/user/repo -t your_github_token
- Combining specific file types recursively:
npx md-combine -u https://github.com/user/repo -r -e .vue .js .ts
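The effect of the `-e` option can be pictured as a filter over the repository's file paths. The Python sketch below shows one way such a filter might work. The function names are hypothetical, and the case-insensitive matching and the tolerance for a missing leading dot are assumptions rather than documented behavior of md-combine.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set


def normalize_extensions(exts: Iterable[str]) -> Set[str]:
    """Accept "md", ".md", or ".MD" and return a set like {".md"}."""
    return {"." + ext.lower().lstrip(".") for ext in exts}


def filter_by_extension(paths: Iterable[str], exts: Iterable[str]) -> List[str]:
    """Keep only the paths whose final suffix is in the requested set."""
    wanted = normalize_extensions(exts)
    # Assumption: matching ignores case, so "GUIDE.MD" matches ".md".
    return [p for p in paths if PurePosixPath(p).suffix.lower() in wanted]


paths = ["README.md", "src/App.vue", "src/main.ts", "assets/logo.png"]
print(filter_by_extension(paths, [".vue", ".js", ".ts"]))
# ['src/App.vue', 'src/main.ts']
```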
GitHub Token
The tool works without a token, but the GitHub API enforces rate limits:
- Without a token: 60 requests per hour
- With a token: 5,000 requests per hour
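GitHub reports the remaining quota in the `x-ratelimit-remaining` and `x-ratelimit-reset` response headers, the latter as a UTC epoch timestamp in seconds. How md-combine reacts to these is not described here. The Python sketch below shows one common approach, waiting until the reset time once the quota is exhausted. The function name `seconds_until_reset` is hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, or 0 if quota remains."""
    # Header names are case-insensitive, so normalise them for lookup.
    lowered = {k.lower(): v for k, v in headers.items()}

    # If the header is missing, assume quota remains (sketch-level choice).
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0

    reset_at = float(lowered.get("x-ratelimit-reset", "0"))
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)


# Quota exhausted, reset 60 seconds after the (fixed) current time.
wait = seconds_until_reset(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"}, now=1000.0
)
print(wait)  # 60.0
```

A caller would typically `time.sleep(wait)` before retrying. Passing `now` explicitly keeps the function easy to test.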
To use a token:
- Create a token at https://github.com/settings/tokens
- Use it in either of two ways:
  - Pass it via command line:
    -t your_token
  - Set an environment variable:
    export GITHUB_TOKEN=your_token
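The two ways of supplying a token can be reduced to a single lookup. The Python sketch below illustrates the idea. The helper names are hypothetical, and the precedence (command-line value before `GITHUB_TOKEN`) is an assumption that this README does not confirm. The `Authorization: Bearer` and `Accept: application/vnd.github+json` headers are the ones the GitHub REST API accepts.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_token(cli_token: Optional[str],
                  env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Pick the token to use, or None for unauthenticated requests."""
    # Assumed precedence: an explicit -t value wins over GITHUB_TOKEN.
    return cli_token or env.get("GITHUB_TOKEN") or None


def build_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers, adding Authorization only when a token exists."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


token = resolve_token(None, {"GITHUB_TOKEN": "your_token"})
print(build_headers(token)["Authorization"])  # Bearer your_token
```

Setting the environment variable keeps the token out of shell history, which is one reason to prefer it over `-t` on shared machines.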
Output Format
The combined file includes:
- Generation timestamp
- Source repository information
- Table of contents with links
- Original file paths as headers
- File contents with proper separation
- Navigation-friendly anchor links
Example output structure:
# Combined Files from owner/repo
Generated on: 2024-11-10T12:00:00.000Z
Source: owner/repo/docs
Extensions: md, txt
## Table of Contents
- [docs/intro.md](#docs-intro-md)
- [docs/api/methods.md](#docs-api-methods-md)
---
<h2 id="docs-intro-md">docs/intro.md</h2>
[Content of intro.md]
---
<h2 id="docs-api-methods-md">docs/api/methods.md</h2>
[Content of methods.md]
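The anchors in the example follow a visible pattern: `docs/intro.md` becomes `docs-intro-md`. The Python sketch below reproduces that pattern and assembles a simplified version of the structure shown above. It is inferred from the example output, not taken from md-combine's source. The function names are hypothetical, the rule "each run of non-alphanumeric characters becomes one hyphen" is an assumption, and the timestamp, source, and extensions lines are omitted for brevity.

```python
import re
from typing import List, Tuple


def anchor_id(path: str) -> str:
    """Turn a file path into an anchor id, e.g. docs/intro.md -> docs-intro-md."""
    # Assumption: every run of non-alphanumeric characters becomes one hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", path).strip("-")


def combine(repo: str, files: List[Tuple[str, str]]) -> str:
    """Build a combined document from (path, content) pairs."""
    lines = [f"# Combined Files from {repo}", "", "## Table of Contents"]
    # Table of contents: one link per file, pointing at its anchor.
    for path, _ in files:
        lines.append(f"- [{path}](#{anchor_id(path)})")
    # Body: a separator, a header carrying the anchor id, then the content.
    for path, content in files:
        lines += ["", "---", "", f'<h2 id="{anchor_id(path)}">{path}</h2>', "", content]
    return "\n".join(lines)


print(anchor_id("docs/api/methods.md"))  # docs-api-methods-md
print(combine("owner/repo", [("docs/intro.md", "Hello")]))
```

Using the same `anchor_id` function for both the table of contents and the headers guarantees that every link resolves, whatever slug rule is chosen.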
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT
Author
Jithin Sha