@riboseinc/glossarist-csv-converter
v0.1.0
Published
= Glossarist CSV conversion script in TypeScript
Downloads
4
Keywords
Readme
= Glossarist CSV conversion script in TypeScript
Initializes a Paneron repository with a Glossarist dataset from a CSV file with terminological data.
Requires Node 15, NPM.
The repository also includes scaffolding for building a static site from the glossary, and a GHA workflow for deploying the site to AWS S3 + CF infrastructure (see the Deployment section).
== Usage
=== Generating data
Assuming you have the CSV ready (see CSV structure below), invoke the script as follows:
[source,console]
% npx @riboseinc/glossarist-csv-converter -l <language_code> -i <glossary_id> -d <domain_name> </path/to/file.csv> -o </repository/container/directory>
Where:
. Paths can be relative.
. Language code is expected to be a three-letter ISO 639-2/T code (“eng” for English).
. Glossary ID can be a descriptive alphanumeric string without spaces.
. Domain name should match the domain at which the registry will eventually be made accessible
(uniformResourceIdentifier
per ISO 19135-1).
Note that the glossary can only be deployed at domain root currently.
. Repository root will be created at /repository/container/directory/glossary_id
; it must not exist
(but container directory must)
=== Finalizing repository setup
After navigating to /repository/container/directory/glossary_id
,
initialize local repository, assign Github repository as a remote, and push:
[source,console]
% git init % git commit -m "Initial migration complete" % git branch -M main % git remote add origin "" % git push
=== Building site locally
After navigating to /repository/container/directory/glossary_id
,
install dependencies and build the site, after which you can serve it locally:
[source,console]
% yarn % yarn build % cd dist % python3 -m http.server 8000
== CSV structure
NOTE: For a sample file, refer to sample/glossary.csv
.
The fields are as follows:
. Human-readable identifier (required): a string . Date accepted (default is now) . Definition (required): plain text, with possible AsciiMath . Note 1: plain text, with possible AsciiMath . Note 2: plain text, with possible AsciiMath . Note 3: plain text, with possible AsciiMath . Example 1: plain text, with possible AsciiMath . Example 2: plain text, with possible AsciiMath . Example 3: plain text, with possible AsciiMath . Authoritative source reference (to a standard, e.g., "ITU-R Recommendation 592") . Authoritative source clause (in the standard referenced, e.g., "4.4") . Authoritative source link (URL, to the standard referenced) . Term 1 designation (required): as text with possible AsciiMath . Term 1 type: expression | symbol | prefix . Term 1 part of speech (expressions only): noun | adjective | adverb | verb . Term 1 grammatical number (nouns only): plural | singular | mass . Term 1 grammatical gender (nouns only): common | feminine | masculine | neuter . Term 1 participle marker (adjectives and adverbs only): non-empty value for “true” . Term 1 abbreviation marker: non-empty value for “true” . Term 2 designation: as text with possible AsciiMath . Term 2 type: expression | symbol | prefix . Term 2 part of speech (expressions only): noun | adjective | adverb | verb . Term 2 grammatical number (nouns only): plural | singular | mass . Term 2 grammatical gender (nouns only): common | feminine | masculine | neuter . Term 2 participle marker (adjectives and adverbs only): non-empty value for “true” . Term 2 abbreviation marker: non-empty value for “true” . Term 3 designation: as text with possible AsciiMath . Term 3 type: expression | symbol | prefix . Term 3 part of speech (expressions only): noun | adjective | adverb | verb . Term 3 grammatical number (nouns only): plural | singular | mass . Term 3 grammatical gender (nouns only): common | feminine | masculine | neuter . Term 3 participle marker (adjectives and adverbs only): non-empty value for “true” . Term 3 abbreviation marker: non-empty value for “true”
== Deployment
This requires you to have an S3 bucket associated with a CloudFront distribution, which in turn is associated with a domain name.
Provided the following variables are specified in Github repository secrets, the GHA workflow provided should deploy the site automatically:
- DOMAIN_NAME
- AWS_REGION
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- CLOUDFRONT_DISTRIBUTION_ID
- S3_BUCKET_NAME
== Roadmap
. Explain how to load the created dataset in Paneron . Render static HTML site locally