WhatDidIBuy
This project is a set of scrapers that will let you collect your order history from electronics distributors and retailers, and save it as CSV files. Invoices in PDF format can also be downloaded and saved.
Currently supported websites include Digi-Key and Mouser; run the tool without arguments to see the full list of scrapers.
A note about scraping
This project is meant for personal use only. It was created out of a need to catalog, in one place, what parts I had already ordered in the past, so that I do not end up re-ordering the same things.
Some of these websites do not like being scraped, but that is usually easy to get past, either with some scripts to hide the fact that we are using browser automation, or by solving a CAPTCHA.
Since the scripts are not fully automated (they require the user to log in manually) and only hit each website in low volumes to retrieve a single user's order history, I believe this use of browser automation falls within the usage policies of the websites in question.
In any case, please use your own judgement and restraint when using these scripts.
Installation and usage
Create a directory where you would like to save the collected CSV files. In that directory, install WhatDidIBuy with:
npm install whatdidibuy
This will also install Puppeteer as a dependency, which will download a local copy of Chromium. If you wish to skip the Chromium download and use your own copy of Chrome or Chromium, see the Puppeteer documentation on environment variables.
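For example, Puppeteer recognizes environment variables like the following (the exact variable names depend on your Puppeteer version, so check its documentation; the binary path below is just an illustration):

```shell
# Skip the bundled Chromium download during install.
# Older Puppeteer versions use PUPPETEER_SKIP_CHROMIUM_DOWNLOAD;
# newer ones use PUPPETEER_SKIP_DOWNLOAD.
PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true npm install whatdidibuy

# Point Puppeteer at an existing Chrome/Chromium binary when running:
PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium ./node_modules/.bin/whatdidibuy digikey
```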
Next, run WhatDidIBuy with:
./node_modules/.bin/whatdidibuy
This command will show the available scrapers. For example, to grab your order history from Digi-Key, run:
./node_modules/.bin/whatdidibuy digikey
This will launch a Chromium window with the Digi-Key website.
Manual actions
The scrapers will launch the appropriate website but will not automatically log in for you. When you see the website, you will need to:
- Log in with your credentials.
- Navigate to the order history page for the website.
The scrapers will wait for the order history page to be shown, and will swing into action at that point.
If everything goes well, you will get two CSV files per website: website-orders.csv and website-items.csv. Invoices, if available, will be saved to website/.
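As an illustration (not actual output), the directory might look like this after running the digikey scraper:

```shell
$ ls
digikey/            digikey-items.csv   digikey-orders.csv
node_modules/       package.json
```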
CSV format
The *-orders.csv files have these fields:
- site - the scraper that was used to get this data
- id - a site-specific order ID
- date - order date, in YYYY-MM-DD format
- status - a site-specific order status
The *-items.csv files have these fields:
- site - the scraper that was used to get this data
- ord - the site-specific order ID
- idx - line item index in the order
- dpn - distributor part number
- mpn - manufacturer part number
- mfr - manufacturer
- qty - item quantity
- dsc - item description
- upr - unit price
- lnk - link to the item page (may be relative)
- img - image URL (may be relative)
Note that not all scrapers output all of these fields; it depends on what data is actually available on each website.
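Since the whole point is to avoid re-ordering parts, here is a small hypothetical sketch of querying a collected *-items.csv for a part number. The column names match the list above, but the sample data is made up, and the naive comma split assumes no field contains an embedded comma; a real script should use a proper CSV parser.

```javascript
// Hypothetical sketch: look up a part number in a collected *-items.csv.
// Assumes no field contains an embedded comma (use a CSV parser otherwise).
function findPart(csvText, query) {
  const [headerLine, ...rows] = csvText.trim().split("\n");
  const headers = headerLine.split(",");
  return rows
    .map((row) => {
      const cells = row.split(",");
      // Build a { field: value } record from the header row.
      return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
    })
    .filter((rec) => rec.dpn === query || rec.mpn === query);
}

// Made-up sample row in the documented column layout:
const sample =
  "site,ord,idx,dpn,mpn,mfr,qty,dsc,upr,lnk,img\n" +
  "digikey,1001,1,296-1234-ND,NE555P,Texas Instruments,10,Timer IC,0.42,/p/1,/i/1\n";
console.log(findPart(sample, "NE555P").length); // prints 1
```

In practice you would read the file with fs.readFileSync and search across all *-items.csv files in the directory.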
License
Licensed under the Apache License, version 2.0.
Credits
Created by Nikhil Dabas.