jquery-scrape
v0.0.8
Published
Use jQuery as your selection engine on the result of an HTTP request.
Downloads
10
Maintainers
Readme
jquery-scrape
Use jQuery as your selection engine on an HTTP request's response.
Installation
npm i jquery-scrape -S
Usage
jquery-scrape is a convenience wrapper for request and cheerio. With almost no code, you can make an HTTP request and get back a jQuery selection engine. Here's a working example that you can run in your terminal:
require("jquery-scrape")("https://www.example.com/", $ => {
// Get the inner HTML of the DOM's body element.
const html = $("body").html();
console.log(html);
// Get the href attribute of an anchor tag.
const href = $("a").attr("href");
console.log(href);
// Traverse through each paragraph...
$("p").each((i, p) => {
// ...and get its text.
const text = $(p).text().trim();
console.log(text);
});
});
Or suppose you want to scrape a table and output its result as JSON. Suppose the table is structured like the one in this repo's test directory:
<table>
<thead>
<tr>
<th>Number</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Foo</td>
</tr>
<tr>
<td>2</td>
<td>Bar</td>
</tr>
</tbody>
</table>
Scrape it like so:
require("jquery-scrape")("https://raw.githubusercontent.com/HarryStevens/jquery-scrape/master/test/test.html", $ => {
let columns = [];
$("th").each((i, th) => {
columns.push($(th).text());
});
let data = [];
const rows = $("tbody tr");
console.log(`Found ${rows.length} rows. Scraping them...`);
rows.each((rowIndex, row) => {
let object = {};
$(row).find("td").each((cellIndex, cell) => {
object[columns[cellIndex]] = $(cell).text();
});
console.log(`${((rowIndex + 1) / rows.length * 100).toFixed(1)}%`);
data.push(object);
});
console.log("Writing file: data.json");
fs.writeFile("data.json", JSON.stringify(data), err => {
if (err) throw err;
console.log("File saved.");
process.exit();
});
});
Tests
npm test