@agentics/get
v0.2.2
A versatile command-line tool and library for making HTTP requests, scraping web content, automating web interactions, loading/saving cookies, and using Puppeteer for JavaScript evaluation.
Features
- Perform HTTP GET and POST requests.
- Scrape web pages using CSS selectors.
- Extract text, HTML, links, images, and cookies.
- Evaluate custom JavaScript on web pages using Puppeteer.
- Save and load cookies to/from files.
- Use customizable request headers and body.
- Puppeteer integration for headless or visible browser execution.
- Save responses, cookies, and evaluated JavaScript results to files.
Installation
npm install -g @agentics/get
Usage
Command-Line Usage
get [options]
Options
| Option | Alias | Type | Description | Default |
|------------------------|-------|---------|-------------------------------------------------------------------------------------------------------|---------|
| `--get` | `-g` | Boolean | Perform a GET request (default). | `true` |
| `--post` | `-p` | Boolean | Perform a POST request. | `false` |
| `--url <url>` | `-u` | String | URL to request. (Required) | |
| `--eval <script>` | `-e` | String | Evaluate JavaScript on the page using Puppeteer. | |
| `--save` | `-s` | Boolean | Save response, cookies, and evaluation result to files (`response.json`, `cookies.json`, `eval.json`). | `false` |
| `--selector <css>` | `-S` | String | CSS selector to scrape content from the page. | |
| `--links` | `-l` | Boolean | Return all links (`<a href="">`) from the page. | `false` |
| `--text` | `-t` | Boolean | Return the text content of the page (default if no other output option is specified). | `false` |
| `--html` | `-H` | Boolean | Return the HTML content of the page. | `false` |
| `--cookies` | `-c` | Boolean | Return cookies from the response headers. | `false` |
| `--header <key:value>` | `-h` | String | Custom headers for the request (can be used multiple times). | |
| `--jar <file>` | `-j` | String | Load cookies from a file (in `.json` format). | |
| `--browser` | `-b` | Boolean | Show browser window (Puppeteer headless mode is off). | `false` |
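The `--header` option takes `key:value` strings and can be repeated, while the programmatic API takes a `headers` object. As a rough illustration of how such strings could map onto an object, here is a minimal sketch. The `parseHeaders` helper is hypothetical and not part of @agentics/get, and splitting on the first colon only is an assumption about how values that contain colons are handled.

```javascript
// Hypothetical helper, not part of @agentics/get: turns repeated
// "Key: Value" strings into a plain headers object.
function parseHeaders(pairs) {
  const headers = {};
  for (const pair of pairs) {
    const idx = pair.indexOf(':');
    if (idx === -1) continue; // skip entries that have no colon at all
    const key = pair.slice(0, idx).trim();
    const value = pair.slice(idx + 1).trim(); // keep any later colons in the value
    if (key) headers[key] = value;
  }
  return headers;
}

const headers = parseHeaders([
  'Content-Type: application/json',
  'Authorization: Bearer token',
]);
console.log(headers);
```

Splitting on the first colon keeps values such as URLs with ports intact, which a plain `split(':')` would break apart.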
Examples
Basic GET Request
get -u https://example.com
POST Request with Custom Headers
get -p -u https://example.com/api -h "Content-Type: application/json" -h "Authorization: Bearer token"
Scrape Text Using a CSS Selector
get -u https://example.com -S ".article-title"
Evaluate JavaScript on the Page (Using Puppeteer)
get -u https://example.com -e "document.title"
Get All Links from a Page
get -u https://example.com -l
Save Response, Cookies, and JavaScript Evaluation Result to Files
get -u https://example.com -e "document.title" -s
Get Page HTML
get -u https://example.com -H
Get Cookies from Response Headers
get -u https://example.com -c
Load Cookies from a File and Make a Request
get -u https://example.com -j cookies.json
Combining Options
You can combine multiple options to perform complex tasks. For example:
get -u https://example.com -S ".article-title" -l -c -s -e "document.title"
This command will:
- Scrape content matching `.article-title`.
- Return all links from the page.
- Return cookies from the response headers.
- Save the response, cookies, and evaluation result to files.
- Evaluate JavaScript (`document.title`) on the page.
Programmatic Usage
You can also use `@agentics/get` as a library in your Node.js projects.
Importing the Module
const get = require('@agentics/get');
Using exportFunctions
The `exportFunctions` method (called as `get` in the examples here) lets you perform web scraping, HTTP requests, and JavaScript evaluation programmatically.
Syntax
const results = await get(url, options);
Parameters
- `url` (String): The URL to request.
- `options` (Object, optional): Configuration options.
Available Options
| Option | Type | Description |
|------------|-----------|------------------------------------------------------------------------|
| `post` | Boolean | Use POST request instead of GET. |
| `headers` | Object | Custom headers for the request. |
| `cookies` | Boolean | Return cookies from the response headers. |
| `links` | Boolean | Return all links from the page. |
| `html` | Boolean | Return the HTML content of the page. |
| `text` | Boolean | Return the text content of the page. |
| `selector` | String | CSS selector to scrape content. |
| `eval` | String | JavaScript code to evaluate on the page (using Puppeteer). |
| `save` | Boolean | Save response, cookies, and evaluation result to files. |
| `jar` | String | Load cookies from a file (for maintaining session between requests). |
| `headless` | Boolean | Run Puppeteer in headless mode (`true` by default, unless `--browser`). |
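The CLI flags and the programmatic options largely mirror each other, with one inversion: `--browser` corresponds to `headless: false`. The sketch below illustrates that relationship using only the names from the two option tables. `flagsToOptions` is a hypothetical helper, and the package's actual internal mapping may differ.

```javascript
// Hypothetical mapping from parsed CLI flags to the options object
// described in the table above. Not taken from the package source.
function flagsToOptions(flags) {
  const options = {
    post: Boolean(flags.post),
    headless: !flags.browser, // --browser shows the window, so headless is off
  };
  // Boolean options that share a name on both sides.
  for (const name of ['cookies', 'links', 'html', 'text', 'save']) {
    if (flags[name]) options[name] = true;
  }
  // String options that share a name on both sides.
  for (const name of ['selector', 'eval', 'jar']) {
    if (typeof flags[name] === 'string') options[name] = flags[name];
  }
  return options;
}

console.log(flagsToOptions({ browser: true, links: true, selector: '.article-title' }));
```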
Example: Making a Request with Cookies and JavaScript Evaluation
const get = require('@agentics/get');

(async () => {
  try {
    const url = 'https://example.com';
    const options = {
      headers: {
        'User-Agent': 'CustomUserAgent/1.0',
      },
      text: true,
      links: true,
      cookies: true,
      eval: "document.title",
    };
    const results = await get(url, options);
    console.log('Text Content:', results.text);
    console.log('Links:', results.links);
    console.log('Cookies:', results.cookies);
    console.log('Evaluated JS Result:', results.evalResult);
  } catch (error) {
    console.error('Error:', error);
  }
})();
Example: Using Puppeteer with Cookies Loaded from File
const get = require('@agentics/get');

(async () => {
  try {
    const url = 'https://example.com';
    const options = {
      jar: 'cookies.json',
      headless: false,
      text: true,
      eval: "document.querySelector('.article-title').innerText",
    };
    const results = await get(url, options);
    console.log('Text Content:', results.text);
    console.log('Evaluated JS Result:', results.evalResult);
  } catch (error) {
    console.error('Error:', error);
  }
})();
Default Options
If no options are provided, the following defaults are used:
{
  cookies: true,
  links: true,
  html: true,
  text: true,
  headless: true,
}
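One way to read the rule above is that the defaults apply only when the caller passes no options at all. The sketch below encodes that reading. `DEFAULT_OPTIONS` and `resolveOptions` are hypothetical names, and whether the library instead merges these defaults into a partial options object is not stated here.

```javascript
// Defaults as listed above, frozen so callers cannot mutate the shared copy.
const DEFAULT_OPTIONS = Object.freeze({
  cookies: true,
  links: true,
  html: true,
  text: true,
  headless: true,
});

// Hypothetical resolver: assumes defaults are used only when the
// options argument is missing or empty, not merged field by field.
function resolveOptions(options) {
  if (!options || Object.keys(options).length === 0) {
    return { ...DEFAULT_OPTIONS };
  }
  return { ...options };
}

console.log(resolveOptions());
console.log(resolveOptions({ text: true }));
```

Under this reading, passing `{ text: true }` would return only the text content, so it may be safer to set every output option you need explicitly.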
Saving Cookies and Responses
To persist cookies and responses between sessions, the following methods can be used:
- Use the `save` option to save the response, cookies, and evaluated result to `response.json`, `cookies.json`, and `eval.json`.
- Load cookies using the `jar` option to maintain the session between multiple requests.
Example: Saving and Loading Cookies
- First, save cookies from a request:
get -u https://example.com -s
- Then, load those cookies in a subsequent request:
get -u https://example.com -j cookies.json
This allows you to maintain sessions and reuse cookies for authenticated requests.
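To make the save-then-load cycle concrete, here is a minimal sketch of a cookie file round trip using only Node's standard library. The file layout (an array of cookie objects with `name`, `value`, `domain`, and `path`) is an assumption modeled on Puppeteer-style cookie objects; the actual contents of the `cookies.json` written by @agentics/get may differ. `saveCookies` and `loadCookies` are hypothetical helpers.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumed layout: an array of cookie objects. Not confirmed by the package docs.
const cookies = [
  { name: 'session', value: 'abc123', domain: 'example.com', path: '/' },
];

// Hypothetical helper: write the cookie array as pretty-printed JSON.
function saveCookies(file, jar) {
  fs.writeFileSync(file, JSON.stringify(jar, null, 2));
}

// Hypothetical helper: read cookies back, or return an empty jar if no file exists.
function loadCookies(file) {
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Use a temp file so the sketch does not overwrite a real cookies.json.
const file = path.join(os.tmpdir(), `cookies-${process.pid}.json`);
saveCookies(file, cookies);
const loaded = loadCookies(file);
console.log(loaded);
fs.unlinkSync(file); // clean up the temp file
```

Returning an empty jar for a missing file lets the first request in a session run without special-casing the absence of saved cookies.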
Contributing
Contributions are welcome! Please open an issue or submit a pull request on GitLab.
License
This project is licensed under the MIT License.
Author: Connor Etherington
Email: [email protected]
Website: https://agentics.co.za
Upstream URL: https://gitlab.com/a4to/get