@ezs/conditor
v2.12.4
ezs statements for Conditor
conditor
Presentation
This plugin provides a set of statements to process documents from the Conditor API (aligning affiliations with the RNSR) and to query them.
installation
npm install @ezs/core
npm install @ezs/conditor
Scripts
$ ./bin/affAlign.js < data/1000-notices-conditor-hal.json | ./bin/compareRnsr.js
recall: 0.7104885057471264
correct: 989
total: 1392
Warning: to use the scripts, you also need to install @ezs/basics.
Certain rules
The certain rules used by affAlign, applied to the address of the affiliation to align, are the following:
- the code_postal or the ville_postale of the structure must be present,
- and for at least one of the supervising institutions (etabAssoc.*.etab where etabAssoc.*.etab.natTutEtab equals TUTE):
  - either etabAssoc.*.etab.sigle or etabAssoc.*.etab.libelle is present,
  - or etabAssoc.*.etab.libelle starts with Université and etabAssoc.*.etab.libelle is present (but not etabAssoc.*.etab.sigle),
- and the right structure is found:
  - either etabAssoc.*.label and etabAssoc.*.numero are present, close together and in sequence (e.g. GDR2945, GDR 2945 or GDR word 2945),
  - or sigle is present,
  - or intitule is present,
- and the structure existed at the time of publication: one of the xPublicationDate values is between annee_creation and an_fermeture (if any).
Note that all fields are normalized (case, accents, hyphens, apostrophes) before being compared.
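The normalization and two of the checks listed above can be sketched in plain JavaScript. This is only an illustration, not the plugin's actual code: the function names (normalize, hasLabelAndNumber, existedAt) are hypothetical, the whitespace handling is inferred from the ...Appauvri fields shown in the getRnsrInfo output, and inclusive year bounds are an assumption.

```javascript
// Hypothetical helpers illustrating the rules; not the plugin's real implementation.

// Normalize a field: strip accents, lower-case, turn hyphens and
// apostrophes into spaces, then collapse runs of whitespace.
function normalize(text) {
    return text
        .normalize('NFD')                 // split letters from their accents
        .replace(/[\u0300-\u036f]/g, '')  // drop the combining accents
        .toLowerCase()
        .replace(/[-'\u2019]/g, ' ')      // hyphen, straight and curly apostrophe
        .replace(/\s+/g, ' ')
        .trim();
}

// Escape regular-expression metacharacters in a literal string.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// True when label and number appear close together and in sequence:
// "gdr2945", "gdr 2945" or "gdr <one word> 2945".
function hasLabelAndNumber(address, label, numero) {
    const pattern = new RegExp(
        `\\b${escapeRegExp(normalize(label))}\\s*(?:\\w+\\s+)?${escapeRegExp(numero)}\\b`
    );
    return pattern.test(normalize(address));
}

// True when the structure existed during the publication year
// (assumption: both bounds are inclusive, empty an_fermeture means "still open").
function existedAt(structure, publicationYear) {
    const created = Number(structure.annee_creation);
    const closed = structure.an_fermeture ? Number(structure.an_fermeture) : Infinity;
    return publicationYear >= created && publicationYear <= closed;
}
```

For instance, hasLabelAndNumber('GDR mot 2945', 'GDR', '2945') holds, while an address with two words between the label and the number does not match under this sketch.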
usage
affAlign
Find the RNSR identifiers in the authors affiliation addresses.
Input file:
[{
"xPublicationDate": ["2012-01-01", "2012-01-01"],
"authors": [{
"affiliations": [{
"address": "GDR 2989 Université Versailles Saint-Quentin-en-Yvelines, 63009"
}]
}]
}]
Script:
[use]
plugin = basics
plugin = conditor
[JSONParse]
[affAlign]
[JSONString]
indent = true
Output:
[{
"xPublicationDate": ["2012-01-01", "2012-01-01"],
"authors": [{
"affiliations": [{
"address": "GDR 2989 Université Versailles Saint-Quentin-en-Yvelines, 63009",
"conditorRnsr": ["200619958X"]
}]
}]
}]
Parameters
- year (number): Year of the RNSR to use instead of the last one (optional, default 2023)
compareRnsr
Take Conditor JSON documents and compute the recall of authors.affiliations.conditorRnsr in relation to authors.affiliations.rnsr.
Examples
Input
[{
"authors": [{
"affiliations": [{
"address": "GDR 2989 Université Versailles Saint-Quentin-en-Yvelines, 63009",
"rnsr": ["200619958X"],
"conditorRnsr": ["200619958X"]
}]
}]
}]
Output
{
"correct": 1,
"total": 1,
"recall": 1
}
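One plausible way to obtain these three figures is sketched below. It is an assumption about the counting, not the plugin's code: every identifier listed in rnsr counts toward total, and it counts as correct when it also appears in conditorRnsr. The name computeRecall is hypothetical.

```javascript
// Hypothetical sketch of the recall computation; the plugin's own counting may differ.
function computeRecall(documents) {
    let correct = 0;
    let total = 0;
    for (const doc of documents) {
        for (const author of doc.authors || []) {
            for (const affiliation of author.affiliations || []) {
                const expected = affiliation.rnsr || [];        // reference identifiers
                const found = affiliation.conditorRnsr || [];   // identifiers found by affAlign
                total += expected.length;
                correct += expected.filter((id) => found.includes(id)).length;
            }
        }
    }
    // Avoid dividing by zero when no reference identifier is present.
    return { correct, total, recall: total === 0 ? 0 : correct / total };
}
```

Applied to the input above, this returns correct 1, total 1 and recall 1. The figures printed in the Scripts section follow the same ratio: 989 / 1392 is about 0.7105.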
conditorScroll
Use scroll to return all results from Conditor API.
Warning: you have to put a valid token into a .env file, under the CONDITOR_TOKEN variable:
CONDITOR_TOKEN=eyJhbG...
Parameters
- q (string): query (optional, default "")
- scroll (string): duration of the scroll (optional, default "5m")
- page_size (number): size of the pages (optional, default 1000)
- max_page (number): maximum number of pages (optional, default 1000000)
- includes (string): fields to get in the response
- excludes (string): fields to exclude from the response
- sid (string): User-agent identifier (optional, default "ezs-conditor")
- progress (boolean): display a progress bar in stderr (optional, default false)
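To make the defaults concrete, the sketch below merges a partial parameter object with the default values listed above. The helper name resolveScrollParams and the maxDocuments field are hypothetical, and the upper bound assumes the scroll stops after max_page pages of page_size results each.

```javascript
// Defaults copied from the parameter list above.
const DEFAULTS = {
    q: '',
    scroll: '5m',
    page_size: 1000,
    max_page: 1000000,
    sid: 'ezs-conditor',
    progress: false,
};

// Hypothetical helper: fill in missing parameters with their defaults.
function resolveScrollParams(params = {}) {
    const resolved = { ...DEFAULTS, ...params };
    // Assumed upper bound on the number of documents returned.
    resolved.maxDocuments = resolved.page_size * resolved.max_page;
    return resolved;
}
```

With the example input below (page_size 1 and max_page 1), this sketch keeps scroll at "5m" and bounds the result to a single document.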
Examples
Input
{
"q": "Test",
"page_size": 1,
"max_page": 1,
"includes": "sourceUid"
}
Output
[[
{
"sourceUid": "hal$hal-01412764",
"_score": 5.634469,
"_sort": [
0
]
}
]]
CORHALFetch
Takes a String as URL and emits each chunk of the result.
Input:
[
{ q: "toto" },
]
Script:
[CORHALFetch]
url = https://corhal-api.inist.fr
Output:
[{...}, {"a": "b"}, {"a": "c" }]
Parameters
- url (String?): corhal api url
- timeout (Number): Timeout in milliseconds (optional, default 1000)
- retries (Number): The maximum amount of times to retry the connection (optional, default 5)
Returns Object
getRnsr
Find the RNSR identifier(s) matching the address and the publication year of an article.
Get objects with an id field and a value field. The value field is an object containing address and year.
Returns an object with id and value fields. The value is an array of RNSR identifiers (if any).
Input:
[{
"id": 1,
"value": {
"address": "GDR 2989 Université Versailles Saint-Quentin-en-Yvelines, 63009",
"year": "2012"
}
}]
Output:
[{ "id": 1, "value": ["200619958X"] }]
Parameters
- year (number): Year of the RNSR to use instead of the last one (optional, default 2023)
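When starting from a Conditor notice such as the affAlign input, the id/value objects expected by getRnsr can be built with a small helper. The sketch below is hypothetical (toGetRnsrInputs is not part of the plugin) and makes two assumptions: the year is taken from the first four characters of the first xPublicationDate, and ids are simply numbered from 1.

```javascript
// Hypothetical helper: turn one Conditor notice into getRnsr input objects,
// one per affiliation address.
function toGetRnsrInputs(notice) {
    const dates = notice.xPublicationDate || [];
    // Assumption: use the year of the first publication date ("2012-01-01" -> "2012").
    const year = dates.length > 0 ? dates[0].slice(0, 4) : undefined;
    const inputs = [];
    let id = 1;
    for (const author of notice.authors || []) {
        for (const affiliation of author.affiliations || []) {
            inputs.push({ id, value: { address: affiliation.address, year } });
            id += 1;
        }
    }
    return inputs;
}
```

Applied to the notice from the affAlign example, this yields a single object with id 1 and a value holding the address and the year "2012", which has the same shape as the getRnsr input shown above.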
getRnsrInfo
Find the RNSR information matching the address and the publication year of an article.
Get objects with an id field and a value field. The value field is an object containing address and year.
Returns an object with id and value fields. The value is an array of RNSR information objects (if any).
Input:
[{
"id": 1,
"value": {
"address": "Laboratoire des Sciences du Climat et de l'Environnement (LSCE), IPSL, CEA/CNRS/UVSQ Gif sur Yvette France",
"year": "2019"
}
}]
Output:
[{
"an_fermeture": "",
"annee_creation": "2014",
"code_postal": "75015",
"etabAssoc": [{
"etab": {
"libelle": "Centre national de la recherche scientifique",
"libelleAppauvri": "centre national de la recherche scientifique",
"sigle": "CNRS",
"sigleAppauvri": "cnrs"
},
"label": "UMR",
"labelAppauvri": "umr",
"numero": "8253"
}, {
"etab": {
"libelle": "Institut national de la sante et de la recherche medicale",
"libelleAppauvri": "institut national de la sante et de la recherche medicale",
"sigle": "INSERM",
"sigleAppauvri": "inserm"
},
"label": "U",
"labelAppauvri": "u",
"numero": "1151"
}, {
"etab": {
"libelle": "Université Paris Cité",
"libelleAppauvri": "universite paris cite",
"sigle": "U PARIS Cité",
"sigleAppauvri": "u paris cite"
},
"label": "UM",
"labelAppauvri": "um",
"numero": "111"
}],
"intitule": "Institut Necker Enfants Malades - Centre de médecine moléculaire",
"intituleAppauvri": "institut necker enfants malades centre de medecine moleculaire",
"num_nat_struct": "201420755D",
"sigle": "INEM",
"sigleAppauvri": "inem",
"ville_postale": "PARIS",
"ville_postale_appauvrie": "paris"
}]
Parameters
- year (number): Year of the RNSR to use instead of the last one (optional, default 2023)
WOSFetch
Takes a String as URL and emits each chunk of the result.
Input:
[
{ q: "toto" },
]
Script:
[WOSFetch]
token = SDQedaeaazedsqsd
Output:
[{...}, {"a": "b"}, {"a": "c" }]
Parameters
- url (String): WOS API url (optional, default https://wos-api.clarivate.com/api/wos)
- token (String?): WOS API TOKEN
- timeout (Number): Timeout in milliseconds (optional, default 1000)
- retries (Number): The maximum amount of times to retry the connection (optional, default 5)
Returns Object