govpack
v0.0.20
fetch, filter, and download CKAN (OpenData CSV metadata) as JSONP for use in a single page html app
#govpack
govpack is a tool to help download and explore CKAN datasets
all YO data is belong to us
###made for GovHack oz 2014
Added a DEMO at http://govpack.github.io/govpack/#68 (or #1, #2, ... #100 etc). It is a copy of a.htm, added at /gh-pages/index.html, and it uses the big list of sites from http://instances.ckan.org/
The DEMO shows the hundred or so CKAN endpoints, BUT some of those are on the older v1 and v2 APIs ('api/rest/dataset', 'api/2/rest/dataset') and not the latest v3 'api/3/action/' that govpack fetches (see http://docs.ckan.org/en/ckan-1.8/apiv3.html), so some don't show up.
http://hackerspace.govhack.org/content/npm-install-g-govpack-or-github-govpackgovpack
also available on https://www.npmjs.org/package/govpack
npm install -f -g govpack
######help wanted, status and fixes required detailed below...
govpack is a command line tool (and node module) that seeks out the metadata for ALL available data sets on a given CKAN endpoint, namely.... (X=0|1|2)
0 http://demo.ckan.org/api/3/action/current_package_list_with_resources
1 https://data.qld.gov.au/api/3/action/current_package_list_with_resources
2 https://data.gov.au/api/3/action/current_package_list_with_resources
CLI Usage:
govpack {fetch:X} --> makes X.js module.exports=BigPackageList
govpack {filter:X} --> makes X.txt filtered JSONP IIII(filtered_csv_metadata)
govpack {download:X} --> downloads ./CSV/1.csv, ./CSV/2.csv,,, ./CSV/111.csv
The commands need to be run in that order because each one depends on the previous result.
Results are saved in the same folder as index.js, i.e. the folder that holds your global "./node_modules/govpack/index.js".
Downloaded "node_modules/govpack/format/1...n.format" files match up with the metadata in X.txt.
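Since X.txt is JSONP rather than plain JSON, reading it back from Node needs the wrapper stripped first. A minimal sketch, assuming the file looks like `IIII(...)` (the wrapper name shown in the CLI usage) and that the payload inside is plain JSON; `unwrapJsonp` is a made-up helper name, not part of govpack:

```javascript
// Hypothetical helper, not part of govpack.
// Assumes `text` looks like  IIII({...})  or  IIII([...]);  and that the
// payload between the outer parentheses is valid JSON.
function unwrapJsonp(text, callbackName) {
  const start = text.indexOf(callbackName + '(');
  const end = text.lastIndexOf(')');
  if (start === -1 || end < start) {
    throw new Error('Not JSONP wrapped in ' + callbackName + '(...)');
  }
  // Skip past "IIII(" and stop just before the final ")".
  return JSON.parse(text.slice(start + callbackName.length + 1, end));
}
```

In a browser none of this is needed: the page defines a global function with the wrapper's name and loads the file with a script tag.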
###Output paths will be improved
Note: result paths will change to "node_modules/govpack/X/format/1...n.format", and there will be an option to put the results in a directory of your choice, which will be tidier and better for more CKANs etc. With the X moved up to directory level, X.js and X.txt will get common names like a.txt and b.txt for each.
###From your node code:
var GP=require('govpack'); // load govpack as a module
GP({fetch:0},function(){console.log('Done!!')})
.....which returns....->
Please be patient while we fetch from API#0
Downloading:
http://demo.ckan.org/api/3/action/current_package_list_with_resources
SavingAs:
C:/A/N/node_modules/govpack/0.js
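Because the three steps depend on each other, calling govpack from Node code most likely means chaining them. The sketch below is one way to do that. It assumes every step accepts `(options, callback)` the way `{fetch:0}` does above, which this README only shows for fetch. `runInOrder` and its `run` parameter are names made up for this example.

```javascript
// Hypothetical helper: run fetch, filter and download one after another.
// `run` is whatever executes a single step, for example the function
// returned by require('govpack'). Assumption: it takes (options, callback)
// for all three steps, not just for fetch.
function runInOrder(run, x, done) {
  const steps = [{ fetch: x }, { filter: x }, { download: x }];
  let i = 0;
  function next() {
    if (i === steps.length) return done();
    // Start the next step only when the current one calls back.
    run(steps[i++], next);
  }
  next();
}
```

Usage would then look something like `runInOrder(require('govpack'), 0, function(){console.log('Done!!')})`.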
####{filter:X ,format:'XYZ'}
As an option, you may wish to set the format for the filter step to filter for some other file type
govpack {filter:0 ,format:'KML'}
txt|xlsx|jpg|json|html|png|pdf|xls|cvs|gif|xml|
rdf|hdf5|kml|pptx|docx|doc|odp|dat|jar|zip|shp|etc
are all acceptable format:'XYZ' values to try (case insensitive), but CSV is by far the most popular and is the default.
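To make the case-insensitive format match concrete, here is a rough sketch of such a filter. It is not govpack's actual filter code. It assumes the input is the `result` array from CKAN's current_package_list_with_resources, where each package carries a `resources` array and each resource has `format` and `url` fields. `filterByFormat` and the shape of its output are made up for this example.

```javascript
// Sketch only, not govpack's real filter step.
// `packages` is assumed to be the `result` array of a
// current_package_list_with_resources response.
function filterByFormat(packages, format) {
  const wanted = String(format || 'CSV').toLowerCase(); // CSV when omitted
  const hits = [];
  packages.forEach(function (pkg) {
    (pkg.resources || []).forEach(function (res) {
      // Compare in lower case so 'KML', 'kml' and 'Kml' all match.
      if (String(res.format || '').toLowerCase() === wanted) {
        hits.push({ title: pkg.title, url: res.url, format: res.format });
      }
    });
  });
  return hits;
}
```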
#a.htm
a.htm (shown in the image above) is the page that uses the JSONP 0.txt, and displays the filtered JSONP metadata generated by govpack from the CKAN records, namely...
- links to the actual CSV files, (right click and choose Save File As)
- CSV file size [where available]
- table heading/description
- field names (hashed and colourized so all of the same fields light up in the same color)
- field types
- column and row counts
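The hashed field-name colouring in the list above could work roughly like this. This is a guess at the idea, not the code in a.htm: the hash function, the HSL values and the name `fieldColour` are all assumptions made for the example.

```javascript
// Hypothetical colouring scheme; a.htm may well do this differently.
// The same field name always hashes to the same hue, so identical
// fields in different tables get the same colour.
function fieldColour(name) {
  let hash = 0;
  for (let i = 0; i < name.length; i++) {
    // Simple multiplicative string hash, kept as an unsigned 32-bit value.
    hash = (hash * 31 + name.charCodeAt(i)) >>> 0;
  }
  return 'hsl(' + (hash % 360) + ', 70%, 80%)';
}
```

Different names can still land on the same hue, since there are only 360 of them in this sketch.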
a.htm should be useful to look at as a sample of the final output. I wanted to do search and autocomplete on the field names; this is now possible :-) CKAN also has many GET verbs (including one that does SQL queries), so with our refined JSONP metadata one could generate other ajax calls from a web page to open up the data even further.
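As one example of such a follow-up call, the SQL GET verb in CKAN's v3 API is the `datastore_search_sql` action, which takes the query in a `sql` parameter. The sketch below only builds the URL. It will only get an answer from endpoints that have the DataStore enabled and for resources that have been loaded into it. `sqlQueryUrl` is a made-up helper name.

```javascript
// Builds a GET URL for CKAN's datastore_search_sql action.
// `base` is an action endpoint such as 'http://demo.ckan.org/api/3/action/'.
function sqlQueryUrl(base, sql) {
  // The SQL text must be URL-encoded to travel safely in a query string.
  return base + 'datastore_search_sql?sql=' + encodeURIComponent(sql);
}
```

In a real query the table name is the resource id in double quotes, which is one of the values the filtered metadata could supply.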
###With the power of X (a simple integer as the primary key) more CKANs can be added
govpack {filter:X,format:'XLS'}
presently in the source code they are referenced at:
CK[0]={url:'http://demo.ckan.org/api/3/action/'} // the demo data set as used by the CKAN docs
CK[1]={url:'https://data.qld.gov.au/api/3/action/'} //the state catalog of datasets
CK[2]={url:'https://data.gov.au/api/3/action/'} //the national catalog of datasets
CK[99]={url:'https://some_CKAN_action_endpoint/'} // ie ADD some more
// this CK[] array will probably end up in a separate config file,
// objectified so we can describe them further and add more
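An objectified version of that array might look something like the sketch below. The URLs are the three listed above. The `name` field and the `actionUrl` helper are made up here to show how extra description could hang off each entry; they are not in the govpack source.

```javascript
// Hypothetical config shape: same URLs as CK[0..2] above, plus an
// illustrative `name` field that govpack itself does not define.
const CK = [
  { name: 'demo', url: 'http://demo.ckan.org/api/3/action/' },
  { name: 'qld', url: 'https://data.qld.gov.au/api/3/action/' },
  { name: 'au', url: 'https://data.gov.au/api/3/action/' }
];

// Look up endpoint X and append the action name to its base URL.
function actionUrl(x, action) {
  const entry = CK[x];
  if (!entry) throw new Error('No CKAN endpoint configured for X=' + x);
  return entry.url + action;
}
```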
NOW #2 (data.gov.au) is big and FAILS as a single request.
The code has some in-progress (INCOMPLETE) calls
to fetch it as several paginated sub-requests (todo),
namely GetBiggerList(x,cb){/*conglomerate page-enated package lists*/}
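For anyone picking up that todo: CKAN's current_package_list_with_resources accepts `limit` and `offset` parameters, so the big list can be collected page by page. Below is a sketch of that loop, not the GetBiggerList in the govpack source. The `fetchPage(limit, offset, cb)` function is hypothetical and supplied by the caller; it is expected to make the HTTP request and call back with `(err, arrayOfPackages)`.

```javascript
// Sketch of a paginated fetch. `fetchPage` is a hypothetical function that
// requests current_package_list_with_resources with the given limit and
// offset, then calls cb(err, arrayOfPackages).
function getBiggerList(fetchPage, pageSize, done) {
  const all = [];
  function next(offset) {
    fetchPage(pageSize, offset, function (err, page) {
      if (err) return done(err);
      Array.prototype.push.apply(all, page);
      // A short (or empty) page means there is nothing left to fetch.
      if (page.length < pageSize) return done(null, all);
      next(offset + pageSize);
    });
  }
  next(0);
}
```

Keeping the HTTP part outside the loop makes it easy to try the paging logic against a fake list before pointing it at data.gov.au.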
At one stage npm was not making the correct govpack.cmd or shell script,
but as someone kindly pointed out, the following 2 fixes worked!!
1) "bin": {"govpack": "index.js"}, /*add to your package.json*/
2) and Add
#!/usr/bin/env node
to the top of your index.js file
funnily enough the shebang is useful on windows!!
"C:/A/N/node.exe" "C:/A/B/2/9/Ax/20/index.js" {fetch:1}
(works for me) but govpack {fetch:1} is better since your paths will vary
index.js has code that should make govpack work as both a Command Line tool AND a module
if(require.main === module){/*Use from the CommandLine*/}
else{module.exports=init/*work as a module*/}
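On the command-line side, an argument like `{fetch:1}` or `{filter:0 ,format:'KML'}` is not valid JSON, and the shell may split it at the space or strip the quotes. The sketch below shows one way such arguments could be turned into an options object. It is an assumption about the approach, not govpack's own parsing code, and `parseArgs` is a made-up name.

```javascript
// Hypothetical parser; govpack's own argument handling may differ.
// Joins the raw arguments back together, then picks out key:value pairs.
function parseArgs(argv) {
  const text = argv.join(' ');
  const opts = {};
  // key : 'single quoted' | "double quoted" | bare value
  const pair = /(\w+)\s*:\s*(?:'([^']*)'|"([^"]*)"|([^\s,{}]+))/g;
  let m;
  while ((m = pair.exec(text)) !== null) {
    const raw = m[2] !== undefined ? m[2] : m[3] !== undefined ? m[3] : m[4];
    // Bare integers such as the X in {fetch:X} become numbers.
    opts[m[1]] = m[4] !== undefined && /^\d+$/.test(raw) ? Number(raw) : raw;
  }
  return opts;
}
```

From the command-line branch this would be called as `parseArgs(process.argv.slice(2))`.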
####Finally (get me the data)
After having run govpack {fetch:0} and govpack {filter:0} you may also call
govpack {download:0}
to download the filtered CSV file set to disk
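The numbered file names are what tie each downloaded file back to its metadata entry. A small sketch of that naming, following the ./CSV/1.csv, ./CSV/2.csv layout from the CLI usage section; `downloadTargets` is a made-up helper, and the exact folder and numbering rules in govpack may differ.

```javascript
// Hypothetical naming helper mirroring the ./CSV/1.csv, ./CSV/2.csv layout.
// The n-th URL in the filtered list maps to file number n (starting at 1).
function downloadTargets(urls, format) {
  const dir = String(format || 'CSV').toUpperCase();
  const ext = dir.toLowerCase();
  return urls.map(function (url, i) {
    return { url: url, file: './' + dir + '/' + (i + 1) + '.' + ext };
  });
}
```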
###more endpoints/fixes and additions are welcome
| CSV Tables    | Are           | Cool  |
| ------------- |:-------------:| -----:|
| but what's?   | inside        | $1600 |
| col 2 is      | ????????      |   $12 |
| zebra stripes | are neat      |    $1 |
now we know
email to [email protected]