gigapr-data-csv
v1.1.1
Published
Calls GigaPR APIs and serializes the response to a csv file
gigapr-data-csv
This project calls GigaPR APIs to get the data and serializes the response to a CSV file.
The project is implemented with NodeJS on Google Cloud Functions.
Description
This cloud function performs 4 steps:
- it calls gigapr-data-api and downloads the detailed data of each application
- it transforms the data into CSV format (saved to a local temporary file)
- it uploads the file to Google Drive
- it publishes a Google Pub/Sub event to notify that it's done
The detail data for each application is retrieved with a single API call to the endpoint /allapplications/*/detail.
The mapping between the data in JSON format and the CSV format is specified in the file src/mapper/acmColumnMap.
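The idea of a column map can be illustrated with a minimal sketch. It is not the content of src/mapper/acmColumnMap: the ColumnMapping type, the toCsv and escapeCsvCell helpers, and the ApplicationDetail fields are all hypothetical names chosen for illustration, and the real API response may have a different shape.

```typescript
// Hypothetical shape of one column mapping: a CSV header plus a function
// that extracts the cell value from one application's JSON detail.
interface ColumnMapping<T> {
  header: string;
  extract: (record: T) => string | number | undefined;
}

// Quote a cell in the RFC 4180 style: wrap it in double quotes when it
// contains a comma, a quote or a line break, doubling any embedded quotes.
function escapeCsvCell(value: string | number | undefined): string {
  const text = value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize records to CSV text: one header row, then one row per record.
function toCsv<T>(records: T[], columns: ColumnMapping<T>[]): string {
  const headerRow = columns.map((c) => escapeCsvCell(c.header)).join(",");
  const dataRows = records.map((r) =>
    columns.map((c) => escapeCsvCell(c.extract(r))).join(",")
  );
  return [headerRow, ...dataRows].join("\r\n");
}

// Example with made-up fields (assumption: not the real API schema).
interface ApplicationDetail {
  id: string;
  name: string;
  owner?: { email: string };
}

const exampleColumns: ColumnMapping<ApplicationDetail>[] = [
  { header: "ID", extract: (a) => a.id },
  { header: "Name", extract: (a) => a.name },
  { header: "Owner", extract: (a) => a.owner?.email },
];

const exampleCsv = toCsv(
  [
    { id: "A1", name: "Billing, core", owner: { email: "owner@example.com" } },
    { id: "A2", name: "Portal" },
  ],
  exampleColumns
);
console.log(exampleCsv);
```

Keeping the mapping as data (a list of header/extractor pairs) means adding or reordering CSV columns does not require touching the serialization logic.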
Prerequisites
- npm installed.
  - Note: The default shell for npm must be bash. To set this up, run npm config edit and set something like script-shell=C:\Program Files\Git\bin\bash.exe.
- gcloud is needed when you want to deploy to Google Cloud. You can develop, test, and run locally without it. If you use the standard pipeline and do not deploy from your own machine, you can avoid having gcloud installed locally.
Getting Started
Once the prerequisites are in place, this is the shortest way to get started:
- clone the project to your local PC
- install node dependencies using npm i
At this point you should be able to launch a local test with npm test.
All tests should pass 👍👍👍!
Building
By default, Google Cloud Functions support JavaScript on Node 10. To support the latest JS features and provide more type safety, this project is built with TypeScript.
The TypeScript compiler tsc transpiles the code to JavaScript and puts it into the target folder.
From the target folder you can either run the code locally or deploy it to Google Cloud.
To transpile the code run:
npm run clean
npm run build
Running Locally
In order to run locally, you need to:
- set up your local env file
- set up Google authentication
- either run locally with HTTP trigger
- or run locally with Google Pub/Sub trigger
If you run from inside Credem's LAN you also need to set up the proxy.
Set up your local env file
Your local environment is configured through the .env file.
This is a file where you can specify environment variables that will be loaded into node.js when running locally.
You can find a .env.example in the git repo. Copy it to .env and adjust it to your needs. The file .env will be ignored when committing to git.
Required variables are:
# this is the path to the file with the credentials of the service account used for
# local runs
GOOGLE_APPLICATION_CREDENTIALS="credentials/privatekey.json"
# this is the typescript function that will be invoked when the function is deployed to
# Google with an http trigger (or when run locally with `npm run start_local_http`)
DEPLOY_HTTP_ENTRY_POINT='mainHttp'
# this is the typescript function that will be invoked when the function is deployed to
# Google with a Pub/Sub trigger (or when run locally with `npm run start_local_pubsub`)
DEPLOY_PUBSUB_ENTRY_POINT='mainPubsub'
# this is the url from which data are retrieved
CREDEM_API_URL="https://europe-west3-gigapr-tst.cloudfunctions.net/gigapr-data-api-test"
# this is the id of the csv file on google drive
CREDEM_DRIVE_FILE_ID=1lSFQy6Uiy6NmpDlwvP5kIco9BY4r9BgO
# this is the local path where the file will be temporarily written before
# uploading to Google drive
CREDEM_LOCAL_FILE_PATH=tmp/tempfile.csv
# The following are the parameters used to identify the Pub/Sub topic where
# the function will publish an event when it's finished.
# The Pub/Sub topic will be: projects/$CREDEM_PROJECT_ID/topics/$CREDEM_PUBSUB_TOPIC
CREDEM_PROJECT_ID="gigapr-tst"
CREDEM_PUBSUB_TOPIC="gigapr-bus-events"
A Google service-account credential file is required. See the Google authentication section below for instructions to set it up.
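Since a missing variable usually shows up only as a confusing failure later on, it can help to validate the environment once at startup. The following is a minimal sketch under that assumption, not code from this project: readSettings and pubsubTopicPath are hypothetical helpers, and the list of names is simply copied from the example file above.

```typescript
// Names of the variables listed as required in the .env example above.
const REQUIRED_VARIABLES = [
  "GOOGLE_APPLICATION_CREDENTIALS",
  "DEPLOY_HTTP_ENTRY_POINT",
  "DEPLOY_PUBSUB_ENTRY_POINT",
  "CREDEM_API_URL",
  "CREDEM_DRIVE_FILE_ID",
  "CREDEM_LOCAL_FILE_PATH",
  "CREDEM_PROJECT_ID",
  "CREDEM_PUBSUB_TOPIC",
] as const;

type RequiredVariable = typeof REQUIRED_VARIABLES[number];
type Settings = Record<RequiredVariable, string>;

// Hypothetical helper: collect the required variables from an environment
// object and fail with a single error that lists every missing name.
function readSettings(env: Record<string, string | undefined>): Settings {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const settings = {} as Settings;
  for (const name of REQUIRED_VARIABLES) {
    settings[name] = env[name] as string;
  }
  return settings;
}

// Build the fully qualified topic name in the format described above:
// projects/<project id>/topics/<topic name>.
function pubsubTopicPath(settings: Settings): string {
  return `projects/${settings.CREDEM_PROJECT_ID}/topics/${settings.CREDEM_PUBSUB_TOPIC}`;
}
```

In a Node process this would typically be called as readSettings(process.env) after the .env file has been loaded.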
Set up Google authentication
The first step is creating a Google service account, that is, a technical account on GCP, and giving your local node.js the credentials to log in with that account:
- create a Google service account
- create a key in JSON format
- download the JSON file and save it under the ./credentials folder (it will be ignored by Git)
- set the environment variable GOOGLE_APPLICATION_CREDENTIALS (possibly using .env)
The second step is to ensure that the service account is authorized to access the target Application data API. That is part of the configuration of that API.
Running locally with HTTP trigger
To run the function locally as an HTTP-triggered function use:
npm run clean
npm run build
npm run start_local_http
At this point you can invoke the function with a GET call to http://localhost:8082/serialize
Running locally with Pub/Sub trigger
To run locally as a function triggered by Pub/Sub use:
npm run clean
npm run build
npm run start_local_pubsub
At this point you can invoke the function with a POST call to http://localhost:8082/serialize. The body of the POST call must contain properly structured event data.
You can use ./scripts/simulatePubsubEvent.sh, which in turn uses the event specified in ./data/mockPubsubEvent.json:
{
  "@type": "type.googleapis.com/google.pubsub.v1.PubsubMessage",
  "attributes": {
    "eventType": "RequestProcessingEnded",
    "user": "[email protected]"
  },
  "data": ""
}
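In the JSON form of a Pub/Sub message the data field carries a base64-encoded payload, which is why the mock event above has an empty string there. If you want to craft events with a payload, a small sketch like the following may help; buildMockEvent and decodeData are hypothetical helpers, not part of this project, and the attribute values are only examples.

```typescript
// Shape of the mock event shown above; "data" carries a base64 string.
interface PubsubMessage {
  "@type": string;
  attributes: Record<string, string>;
  data: string;
}

// Hypothetical helper: build an event body with the given attributes and
// an optional UTF-8 payload, which is base64-encoded into "data".
function buildMockEvent(
  attributes: Record<string, string>,
  payload: string = ""
): PubsubMessage {
  return {
    "@type": "type.googleapis.com/google.pubsub.v1.PubsubMessage",
    attributes,
    data: Buffer.from(payload, "utf8").toString("base64"),
  };
}

// Hypothetical helper: recover the original payload from an event.
function decodeData(message: PubsubMessage): string {
  return Buffer.from(message.data, "base64").toString("utf8");
}

// Example: an event similar to the mock file, but with a small payload.
const sampleEvent = buildMockEvent(
  { eventType: "RequestProcessingEnded" },
  "hello"
);
console.log(JSON.stringify(sampleEvent, null, 2));
```

The printed JSON can be used as the body of the POST call when running locally with the Pub/Sub trigger.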
Oh, my proxy
When running inside Credem's network you may encounter several obstacles dealing with the proxy.
As of Feb 2020 the following works for me:
- For the googleapis library to work correctly you should have HTTP_PROXY and HTTPS_PROXY set to http://proxyre02.group.credem.net:8080 (possibly using .env)
- You should point Internet Explorer or another browser to the same proxy and authenticate.
- The authentication will last for 15 minutes for the whole machine. So be sure to visit a new page in IE every 15 minutes.
Testing
Testing is done with jest and ts-jest.
You can also run tests continuously within vscode using the Jest extension by Orta.
Unit tests do not use environment variables and they do not require a .env file.
See README-TESTING-STRATEGY.md for other test-related information.
Deploying to GCP
Note: deploying from your local machine to Google Cloud is deprecated. Please use the official Azure DevOps pipeline instead.
Prerequisites for deployments
As a prerequisite you need to have gcloud set up, with login done and working. You must be logged in with an account that is authorized to deploy a cloud function to the target project.
Deploy configuration
Deploy is configured through parameters set in the config section of your .env file:
# The following are the variables used to deploy on GCP
TODO
As a minimum, these values define the target for your deploy.
Executing the deploy
Note: if deploying for the first time Google will ask you whether to allow unauthenticated calls. Answer N!
Read data from Spreadsheet
To use a Google Spreadsheet as the source of data instead use:
npm run clean
npm run build
npm run deploy_with_spreadsheet_data
This requires the following to be set properly in your package.json:
# The following are the variables used to deploy on GCP
...
DEPLOY_SPREADSHEET_ID=1hNDt9vt7JdPH6AxzQhHHU9iyJ0OqGb4ZVgGeO31CiHc
DEPLOY_SHEET_NAME=Sheet1
...
Read data from local CSV file
To use a CSV file as the source of data instead use:
npm run clean
npm run build
Then be sure to copy your CSV file inside the target folder and only after that run:
npm run deploy_with_csv_data
This requires the following to be set properly in your package.json:
# The following are the variables used to deploy on GCP
...
DEPLOY_CSV_FILEPATH=./data/MegaDownload_ElencoApplicazioniAll.txt
...
In this case, you need to have your file saved as ./target/data/MegaDownload_ElencoApplicazioniAll.txt on your local machine before deploying.
There is a little helper script to copy the Mega download file directly into your target folder when running from a Credem PC. In that case you may do:
npm run deploy_with_csv_data
npm run copy_csv_file
npm run deploy_with_spreadsheet_data
Environment variables in GCP
This .env file will not be available in the deployed environment. You are expected to set the corresponding env variables via configuration of the target environment.
Google Cloud Functions allows you to set environment variables at deploy time with the command:
gcloud functions deploy ... --set-env-vars VAR1=VALUE1,VAR2=VALUE2,...
To simplify your life, I already prepared 3 scripts to deploy with different configurations:
TODO
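The VAR1=VALUE1,VAR2=VALUE2 value passed to --set-env-vars can be assembled from a map of variables. The sketch below is not one of the prepared scripts: toSetEnvVarsArg is a hypothetical helper, and it deliberately rejects values containing a comma, because those would need gcloud's own escaping rules, which this sketch does not handle.

```typescript
// Hypothetical helper: turn a map of variables into the argument expected
// by `gcloud functions deploy`, i.e. comma-separated KEY=VALUE pairs.
function toSetEnvVarsArg(vars: Record<string, string>): string {
  const pairs = Object.entries(vars).map(([key, value]) => {
    if (value.includes(",")) {
      // A comma would be read as a separator between two variables.
      throw new Error(`Value of ${key} contains a comma and is not supported here`);
    }
    return `${key}=${value}`;
  });
  return `--set-env-vars ${pairs.join(",")}`;
}

// Example using two of the variables from the .env example.
const deployArg = toSetEnvVarsArg({
  CREDEM_PROJECT_ID: "gigapr-tst",
  CREDEM_PUBSUB_TOPIC: "gigapr-bus-events",
});
console.log(deployArg);
```

Generating the argument from one map keeps the local .env values and the deployed configuration easier to compare.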
How to allow the API gateway to call the function
When deploying for the first time you want to allow the API-Gateway service account to call this function. That's easy, just type:
npm run enableapicinvoker
This will enable as invoker of the cloud function the service account specified in your .env file:
# The following are the variables used to deploy on GCP
...
DEPLOY_INVOKER_SERVICE_ACCOUNT=apiman@gigapr.iam.gserviceaccount.com
Troubleshooting
My tests are failing
- check your .env file is set properly for:
  ... TODO ...
- check your proxy settings (remember that if you set environment variables, those will prevail over your .env file)
  - HTTP_PROXY="http://proxyre02.group.credem.net:8080"
  - HTTPS_PROXY="http://proxyre02.group.credem.net:8080"
- check that your service account is authorized ...
- ensure the sheets APIs are enabled for the project where your service account is defined
When running locally the data are different from my spreadsheet
- check CREDEM_USE_LOCAL_CSV_DATA in the .env file: if set to YES, the data will be retrieved from a local CSV file instead of from the Google Spreadsheet