ibm-ia-rest
v0.3.0
Published
Library to use IBM Information Analyzer REST API
ibm-ia-rest
Re-usable functions for interacting with Information Analyzer's REST API
Examples
// runs column analysis for any objects in Automated Profiling that have not been analyzed since the moment the script is run (new Date())
var iarest = require('ibm-ia-rest');
var commons = require('ibm-iis-commons');
var restConnect = new commons.RestConnection("isadmin", "isadmin", "hostname", "9445");
iarest.setConnection(restConnect);
iarest.getStaleAnalysisResults("Automated Profiling", new Date(), function(errStale, projectRID, aStaleSources) {
  iarest.runColumnAnalysisForDataSources(projectRID, aStaleSources, function(errExec, tamsAnalyzed) {
    // Note that the API returns asynchronously; if you want to wait for completion you need to poll events on Kafka
  });
});
Meta
- license: Apache-2.0
setConnection
Set the connection for the REST API
Parameters
restConnect
RestConnection RestConnection object, from ibm-iis-commons
makeRequest
Make a request against IA's REST API
Parameters
method
string type of request, one of ['GET', 'PUT', 'POST', 'DELETE']
path
string the path to the end-point (e.g. /ibm/iis/ia/api/...)
input
string? any input for the request, i.e. for PUT, POST
inputType
string? the type of input, if any provided ['text/xml', 'application/json']
callback
requestCallback callback that handles the response
Throws
any will throw an error if connectivity details are incomplete or there is a fatal error during the request
getAllItemsToIgnore
Retrieves a list of all items that should be ignored, i.e. where they are labelled with "Information Analyzer Ignore List"
Parameters
callback
itemsToIgnoreCallback
addIADBToIgnoreList
Adds the IADB schema to a list of objects for Information Analyzer to ignore (to prevent them being added to projects or being analysed); this is accomplished by creating a label 'Information Analyzer Ignore List'
Parameters
callback
requestCallback callback that handles the response
createOrUpdateAnalysisProject
Create or update an analysis project, to include ALL objects known to IGC that were updated after the date received -- necessary before any tasks can be executed
Parameters
name
string name of the project
description
string description of the project
updatedAfter
Date? include into the project any objects in IGC last updated after this date
callback
requestCallback callback that handles the response
getProjectList
Get a list of Information Analyzer projects
Parameters
callback
listCallback callback that handles the response
getProjectDataSourceList
Get a list of all of the data sources in the specified Information Analyzer project
Parameters
projectName
string
callback
dataSourceListCallback callback that handles the response (will be entries with HOST||DB.SCHEMA.TABLE and HOST||PATH:FILE)
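The two entry formats passed to the callback can be split into their parts with a small helper. The following is a minimal sketch, not part of ibm-ia-rest: `parseDataSourceEntry` is a hypothetical name, and it assumes that only file entries contain a ':' and that database, schema and table names contain no '.'.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): splits one entry of the form
// HOST||DB.SCHEMA.TABLE or HOST||PATH:FILE into its components.
function parseDataSourceEntry(entry) {
  const sep = entry.indexOf('||');
  if (sep < 0) {
    throw new Error('Unexpected data source entry: ' + entry);
  }
  const host = entry.substring(0, sep);
  const rest = entry.substring(sep + 2);
  // Assumption: only file entries contain ':' (separating path and file name)
  const colon = rest.lastIndexOf(':');
  if (colon >= 0) {
    return {
      type: 'file',
      host: host,
      path: rest.substring(0, colon),
      file: rest.substring(colon + 1)
    };
  }
  // Assumption: database, schema and table names do not themselves contain '.'
  const parts = rest.split('.');
  if (parts.length !== 3) {
    throw new Error('Unexpected table entry: ' + entry);
  }
  return {
    type: 'table',
    host: host,
    database: parts[0],
    schema: parts[1],
    table: parts[2]
  };
}
```

Inside a dataSourceListCallback, this could be applied with `aDataSources.map(parseDataSourceEntry)`.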
runColumnAnalysisForDataSources
Run a full column analysis against the list of data sources specified (based on TAM RIDs)
Parameters
projectRID
string the RID of the project in which to execute the analysis
aDataSources
Array<Object> an array of data sources, as returned by getProjectDataSourceList
callback
columnAnalysisCallback callback that handles the response
publishResultsForDataSources
Publish analysis results for the list of data sources specified
Parameters
projectRID
string RID of the IA project
aTAMs
Array<string> an array of TAM RIDs whose analysis should be published
callback
requestCallback callback that handles the response
getStaleAnalysisResults
Retrieve the data sources whose previously published analysis results are stale, i.e. were produced before the specified time
Parameters
projectName
string name of the IA project
timeToConsiderStale
Date the time before which any analysis results should be considered stale
callback
staleAnalysisCallback callback that handles the response
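The stale-results lookup is typically chained into a column analysis run, as in the Examples section. The sketch below adds error handling to that chain. `refreshStaleAnalysis` is a hypothetical wrapper, not part of the library, and the `client` argument is assumed to be the object returned by `require('ibm-ia-rest')` after `setConnection` has been called.

```javascript
// Hypothetical wrapper (not part of ibm-ia-rest): re-runs column analysis for
// any stale sources in a project and passes errors on to `done`.
// `client` is assumed to expose the two functions documented in this README.
function refreshStaleAnalysis(client, projectName, staleBefore, done) {
  client.getStaleAnalysisResults(projectName, staleBefore, function(errStale, projectRID, aStaleSources) {
    if (errStale) {
      return done(errStale);
    }
    if (!aStaleSources || aStaleSources.length === 0) {
      return done(null, {}); // nothing is stale, so there is nothing to run
    }
    client.runColumnAnalysisForDataSources(projectRID, aStaleSources, function(errExec, tamsToSources, schedule) {
      done(errExec, tamsToSources, schedule);
    });
  });
}
```

Passing the client in as an argument keeps the wrapper easy to exercise with a stub before pointing it at a real server.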
reindexThinClient
Issues a request to reindex Solr for any results to appear appropriately in the IA Thin Client
Parameters
batchSize
int The batch size to retrieve information from the database. Increasing this size may improve performance but there is a possibility of reindex failure. The default is 25. The maximum value is 1000.
solrBatchSize
int The batch size to use for Solr indexing. Increasing this size may improve performance. The default is 100. The maximum value is 1000.
upgrade
boolean Specifies whether to upgrade the index schema from a previous version, and is a one time requirement when upgrading from one version of the thin client to another. The schema upgrade can be used to upgrade from any previous version of the thin client. The value true will upgrade the index schema. The value false is the default, and will not upgrade the index schema.
force
boolean Specifies whether to force reindexing if indexing is already in process. The value true will force a reindex even if indexing is in process. The value false is the default, and prevents a reindex if indexing is already in progress. This option should be used if a previous reindex request is aborted for any reason. For example, if InfoSphere Information Server services tier system went offline, you would use this option.
callback
reindexCallback status of the reindex ["REINDEX_SUCCESSFUL"]
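The documented defaults and maximums can be applied before calling reindexThinClient. This is a minimal sketch: `normalizeReindexOptions` is a hypothetical helper, not part of ibm-ia-rest, and falling back to the default for missing or non-positive values is an assumption about what a caller would want.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): applies the documented
// defaults (25 and 100) and the documented maximum (1000) to the batch sizes.
function normalizeReindexOptions(options) {
  const opts = options || {};
  const clamp = function(value, fallback) {
    // Assumption: anything that is not a positive integer falls back to the default
    if (!Number.isInteger(value) || value < 1) {
      return fallback;
    }
    return Math.min(value, 1000);
  };
  return {
    batchSize: clamp(opts.batchSize, 25),
    solrBatchSize: clamp(opts.solrBatchSize, 100),
    upgrade: opts.upgrade === true,
    force: opts.force === true
  };
}
```

The result can then be passed on in the parameter order listed above, e.g. `iarest.reindexThinClient(o.batchSize, o.solrBatchSize, o.upgrade, o.force, callback)`.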
getRuleExecutionFailedRecordsFromLastRun
Retrieves a listing of any records that failed a particular Data Rule or Data Rule Set (its latest execution)
Parameters
projectName
string The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
ruleOrSetName
string The name of the Data Rule or Data Rule Set
numRows
int The maximum number of rows to retrieve (if unspecified will default to 100)
callback
recordsCallback the records that failed
getRuleExecutionResults
Retrieves the statistics of the executions of a particular Data Rule or Data Rule Set
Parameters
projectName
string The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
ruleOrSetName
string The name of the Data Rule or Data Rule Set
bLatestOnly
boolean If true, returns only the statistics from the latest execution (otherwise full history)
callback
statsCallback the statistics of the historical execution(s)
listCallback
This callback is invoked as the result of an IA REST API call, providing the response of that request.
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
aResponse
Array<string> the response of the request, in the form of an array
requestCallback
This callback is invoked as the result of an IA REST API call, providing the response of that request.
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
responseXML
string the XML of the response
itemsToIgnoreCallback
This callback is invoked as the result of retrieving a list of items that Information Analyzer should ignore
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
typeToIdentities
Object dictionary keyed by object type, with each value being an array of objects of that type to ignore (as identity strings, /-delimited)
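A lookup against the typeToIdentities dictionary might look like the sketch below. `isIgnored` is a hypothetical helper, not part of ibm-ia-rest, and the object type and identity strings in the usage note are invented for illustration only.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): checks whether an identity
// string appears under the given object type in the ignore dictionary.
function isIgnored(typeToIdentities, type, identity) {
  const identities = typeToIdentities ? typeToIdentities[type] : undefined;
  return Array.isArray(identities) && identities.indexOf(identity) !== -1;
}
```

For example, with a dictionary such as `{ some_type: ['HOST/DB/SCHEMA'] }` (made-up values), `isIgnored(dict, 'some_type', 'HOST/DB/SCHEMA')` is true, while any other type or identity yields false.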
statsCallback
This callback is invoked as the result of an IA REST API call to retrieve historical statistics on Data Rule executions
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
stats
Array<Object> an array of stats, each stat being a JSON object with ???
recordsCallback
This callback is invoked as the result of an IA REST API call to retrieve records that failed Data Rules
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
records
Array<Object> an array of records, each record being a JSON object keyed by column name and with the value of the column for that row
columnMap
Object key-value pairs mapping column names to their context (e.g. full identity in the case of database columns like RecordPK)
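The records and columnMap arguments can be combined so that each failed record is keyed by the column's context rather than its short name. This is a minimal sketch: `relabelRecords` is a hypothetical helper, not part of ibm-ia-rest, and it assumes columnMap values are plain strings.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): returns copies of the records
// keyed by the context from columnMap where one exists, and by the original
// column name otherwise.
function relabelRecords(records, columnMap) {
  const map = columnMap || {};
  return records.map(function(record) {
    const relabelled = {};
    Object.keys(record).forEach(function(column) {
      const hasContext = Object.prototype.hasOwnProperty.call(map, column);
      const label = hasContext ? map[column] : column;
      relabelled[label] = record[column];
    });
    return relabelled;
  });
}
```

The input records are left unmodified, so the original column names remain available alongside the relabelled copies.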
reindexCallback
This callback is invoked as the result of an IA REST API call to re-index Solr for IATC
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
status
string the status of the reindex operation ["REINDEX_SUCCESSFUL"]
statusCallback
This callback is invoked as the result of an IA REST API call, providing the response of that request.
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
status
Object the response of the request, in the form of an object keyed by execution ID, with subkeys for executionTime, progress and status ["running", "successful", "failed", "cancelled"]
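Because the status object is keyed by execution ID, a quick tally per status value can show whether anything is still running. The following is a sketch under that documented shape; `countByStatus` is a hypothetical helper, not part of ibm-ia-rest, and status values outside the four documented ones are simply not counted.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): counts executions per
// documented status value in an object keyed by execution ID.
function countByStatus(statusByExecution) {
  const counts = { running: 0, successful: 0, failed: 0, cancelled: 0 };
  Object.keys(statusByExecution || {}).forEach(function(executionId) {
    const state = statusByExecution[executionId].status;
    if (Object.prototype.hasOwnProperty.call(counts, state)) {
      counts[state] += 1;
    }
  });
  return counts;
}
```

A caller could, for instance, treat `counts.running === 0` as the signal that all listed executions have finished.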
columnAnalysisCallback
This callback is invoked as the result of an IA REST API call to execute column analysis.
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
tamsToSources
Object a dictionary of TAM RIDs to data sources
schedule
Object an object containing 'scheduleRids', which is an array of scheduler execution IDs
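The two callback arguments can be reduced to the lists a follow-up step usually needs: the TAM RIDs (the input expected by publishResultsForDataSources) and the scheduler execution IDs. This is a minimal sketch; `summarizeAnalysisRequest` is a hypothetical helper, not part of ibm-ia-rest.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): extracts the TAM RIDs and the
// scheduler execution IDs from the columnAnalysisCallback arguments.
function summarizeAnalysisRequest(tamsToSources, schedule) {
  const hasRids = Boolean(schedule) && Array.isArray(schedule.scheduleRids);
  return {
    tamRIDs: Object.keys(tamsToSources || {}),
    scheduleRids: hasRids ? schedule.scheduleRids.slice() : []
  };
}
```

Since the analysis itself runs asynchronously, the `tamRIDs` list would only be handed to publishResultsForDataSources once the executions have completed.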
staleAnalysisCallback
This callback is invoked as the result of an IA REST API call to determine which data sources have not been refreshed within a provided time period.
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
projectRID
string the RID of the Information Analyzer project
aDataSources
Array<string> an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files, only for those that are stale
dataSourceListCallback
This callback is invoked as the result of an IA REST API call to retrieve a list of data sources within a project.
Type: Function
Parameters
errorMessage
string any error message, or null if no errors
projectRID
string the RID of the Information Analyzer project
aDataSources
Array<string> an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files
Project
Project class -- for handling Information Analyzer projects
getProjectDoc
Retrieve the Project document
setDescription
Set the description of the project
Parameters
desc
addTable
Add the specified table to the project
Parameters
datasource
string the database name
schema
string
table
string
aColumns
Array<string> array of column names
addFile
Add the specified file to the project
Parameters
datasource
string the host name?
folder
string the full path to the file
file
string the name of the file
aFields
Array<string> array of field names within the file
ColumnAnalysis
ColumnAnalysis class -- for handling Information Analyzer column analysis tasks
constructor
Parameters
project
Project the project in which to create the column analysis task
analyzeColumnProperties
boolean whether or not to analyze column properties
captureResultsType
string specifies the type of frequency distribution results that are written to the analysis database ["CAPTURE_NONE", "CAPTURE_ALL", "CAPTURE_N"]
minCaptureSize
int the minimum number of results that are written to the analysis database, including both typical and atypical values
maxCaptureSize
int the maximum number of results that are written to the analysis database
analyzeDataClasses
boolean whether or not to analyze data classes
setSampleOptions
Use to (optionally) set any sampling options for the column analysis
Parameters
type
string the sampling type ["random", "sequential", "every_nth"]
size
number if less than 1.0, the percentage of values to use in the sample; otherwise the maximum number of records in the sample. If you use the "random" type of data sample, specify the sample size that is the same number as the number of records that will be in the result, based on the value that you specify in the Percent field. Otherwise, the results might be skewed.
seed
string if type is "random", this value is used to initialize the random generators (two samplings that use the same seed value will contain the same records)
step
int if type is "every_nth", this value indicates the step to apply (one row will be kept out of every nth value rows)
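The dual meaning of size (a share of the data below 1.0, a record cap otherwise) can be made explicit before passing it to setSampleOptions. The sketch below is illustrative only: `describeSample` is a hypothetical helper, not part of ibm-ia-rest, and it assumes that a size below 1.0 is a fraction (0.25 meaning 25%) and that the total record count is known to the caller.

```javascript
// Hypothetical helper (not part of ibm-ia-rest): validates the sampling type
// and estimates how many records a given size setting would select.
const SAMPLE_TYPES = ['random', 'sequential', 'every_nth'];

function describeSample(type, size, totalRecords) {
  if (SAMPLE_TYPES.indexOf(type) === -1) {
    throw new Error('Unknown sampling type: ' + type);
  }
  if (typeof size !== 'number' || !(size > 0)) {
    throw new Error('Sample size must be a positive number');
  }
  if (size < 1.0) {
    // Assumption: values below 1.0 are fractions of the data (0.25 = 25%)
    return { type: type, mode: 'percentage', expectedRecords: Math.round(size * totalRecords) };
  }
  // Otherwise the size is a cap on the number of records in the sample
  return { type: type, mode: 'maxRecords', expectedRecords: Math.min(size, totalRecords) };
}
```

For a table of 1000 rows, `describeSample('random', 0.25, 1000)` reports 250 expected records and `describeSample('sequential', 5000, 1000)` reports 1000.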
setEngineOptions
Use to (optionally) set any engine options to use when running the column analysis
Parameters
retainOSH
boolean whether to retain the generated DataStage job or not
retainData
boolean whether to retain generated data sets (ignored when data rules are running)
config
string specifies an alternative configuration file to use with the DataStage engine during this run
gridEnabled
string whether or not the grid view will be enabled
requestedNodes
string the name of requested nodes
minNodes
string the minimum number of nodes you want in the analysis
partitionsPerNode
string the number of partitions for each node in the analysis
setJobOptions
Use to (optionally) set any job options to use when running the column analysis
Parameters
debugEnabled
boolean whether to generate a debug table containing the evaluation results of all functions and tests contained in the expression (only used for running data rules)
numDebuggedRecords
int how many rows should be debugged, if debugEnabled is "true"
arraySize
int the size of the array (?)
autoCommit
boolean
isolationLevel
int
updateExistingTables
boolean whether to update existing tables in IADB or create new ones (only used for column analysis)
addColumn
Use to add a column to the column analysis task -- both table and column can be '*' to specify all tables or all columns
Parameters
addFileField
Use to add a file field to the column analysis task -- column can be '*' to specify all fields within the file
Parameters
connection
string e.g. "HDFS"
path
string directory path, not including the filename
filename
string
column
string name of the field within the file
hostname
string?
PublishResults
PublishResults class -- for handling Information Analyzer results publishing tasks
constructor
Parameters
project
Project the project from which to publish analysis results
addTable
Use to add a table whose results should be published -- the table can be '*' to specify all tables
Parameters
addFile
Use to add a file whose results should be published -- file can be '*' to specify all files
Parameters