@menome/sentences-to-graph
v1.1.0
Menome Technologies Inc: takes incoming text and turns it into a structure suitable for a graph
SentencesToGraph
A module that turns incoming text into an array of objects structured so that it can be loaded into a knowledge graph.
Objective
To build a graph from unstructured text, the text must first be decomposed into a parent-child 'graph document' that can be turned into a graph structure.
This is done in two steps. First, the source text is broken down into an array of sentences (lines) using the Pragmatic Segmenter or an equivalent process. The array is then fed into the ParentGenerator, which iterates through it one line at a time, identifies which parent each line belongs to, generates a unique reference code for that line, and returns the results as an object array ready for downstream processing.
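As a rough illustration of the first step, the sketch below splits raw text into trimmed, non-empty lines. It is only a stand-in for the "equivalent process" mentioned above: it is not the Pragmatic Segmenter, and the name `splitIntoLines` is hypothetical rather than part of this module.

```javascript
// Hypothetical stand-in for the segmentation step (not the Pragmatic Segmenter).
// Splits on line breaks, trims whitespace, and drops empty lines.
function splitIntoLines(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

const sampleLines = splitIntoLines(
  "(a) First clause.\n\n  (1) Nested clause;\n(b) Second clause."
);
console.log(sampleLines);
```

An array like `sampleLines` is the kind of input the ParentGenerator is described as consuming.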
Example Input Text
Regulatory or other types of structured text often come in a form that is difficult to analyze. By decomposing the text, we can put it into a form that is more amenable to various types of analysis.
The goal is to turn source text such as this into structured text segments. By leveraging various signals in the text, we can generate logical 'index cards' that constitute groupings of text that can be rendered into a 'graph document' structure. A graph document is a structure that is more amenable to analysis because it decomposes the contents of a monolithic file into a node-relationship pattern. The resulting structure is a 'Trail' linking the 'index cards' together.
(a) When you have established a DBE contract goal, you must award the contract only to a bidder/offeror who makes good faith efforts to meet it. You must determine that a bidder/offeror has made good faith efforts if the bidder/offeror does either of the following things:
(1) Documents that it has obtained enough DBE participation to meet the goal; or
(2) Documents that it made adequate good faith efforts to meet the goal, even though it did not succeed in obtaining enough DBE participation to do so. If the bidder/offeror does document adequate good faith efforts, you must not deny award of the contract on the basis that the bidder/offeror failed to meet the goal. See Appendix A of this part for guidance in determining the adequacy of a bidder/offeror's good faith efforts.
(b) In your solicitations for DOT-assisted contracts for which a contract goal has been established, you must require the following:
(1) Award of the contract will be conditioned on meeting the requirements of this section;
(2) All bidders or offerors will be required to submit the following information to the recipient, at the time provided in paragraph (b)(3) of this section:
(i) The names and addresses of DBE firms that will participate in the contract;
(ii) A description of the work that each DBE will perform. To count toward meeting a goal, each DBE firm must be certified in a NAICS code applicable to the kind of work the firm would perform on the contract;
(iii) The dollar amount of the participation of each DBE firm participating;
(iv) Written documentation of the bidder/offeror's commitment to use a DBE subcontractor whose participation it submits to meet a contract goal; and
(v) Written confirmation from each listed DBE firm that it is participating in the contract in the kind and amount of work provided in the prime contractor's commitment.
natory bond requirements.
(iv) The listed DBE subcontractor becomes bankrupt, insolvent, or exhibits credit unworthiness;
(c) Regulations are applicable where necessary;
Sample Resulting Structure
| Text Segment | base | code | parent code |
|---|---|---|---|
| Regulatory or other .. | 0 | 100.0 | 100 |
| a) When you have est | a | 100.a | 100 |
| 1) Documents that it | 1 | 100.a.1 | 100.a |
| 2) Documents that it | 2 | 100.a.2 | 100.a |
| b) In your solicit | b | 100.b | 100 |
| 1) Award of the | 1 | 100.b.1 | 100.b |
| 2) All bidders | 2 | 100.b.2 | 100.b |
| i) The names and | i | 100.b.2.i | 100.b.2 |
| ii) A description | ii | 100.b.2.ii | 100.b.2 |
| iii) The dollar amount | iii | 100.b.2.iii | 100.b.2 |
| iv) Written document | iv | 100.b.2.iv | 100.b.2 |
| v) Written confirm | v | 100.b.2.v | 100.b.2 |
| vi) run for the hills | vi | 100.b.2.vi | 100.b.2 |
| c) Regulations are | c | 100.c | 100 |
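To show why this shape suits a graph, here is a minimal sketch that turns a few rows from the table into parent-to-child edges and a parent index. The field names (`text`, `base`, `code`, `parent`) and both helper functions are assumptions made for illustration; the module's actual output keys may differ.

```javascript
// A few rows from the table above, written as plain objects.
// Field names are assumed for illustration only.
const rows = [
  { text: "b) In your solicit", base: "b", code: "100.b", parent: "100" },
  { text: "1) Award of the", base: "1", code: "100.b.1", parent: "100.b" },
  { text: "2) All bidders", base: "2", code: "100.b.2", parent: "100.b" },
  { text: "i) The names and", base: "i", code: "100.b.2.i", parent: "100.b.2" },
  { text: "ii) A description", base: "ii", code: "100.b.2.ii", parent: "100.b.2" },
];

// Each row yields one relationship from its parent node to its own node.
function toEdges(items) {
  return items.map((row) => ({ from: row.parent, to: row.code }));
}

// Group child codes under their parent code for quick lookup.
function childrenByParent(items) {
  const index = new Map();
  for (const row of items) {
    if (!index.has(row.parent)) index.set(row.parent, []);
    index.get(row.parent).push(row.code);
  }
  return index;
}

const edges = toEdges(rows);
const childIndex = childrenByParent(rows);
console.log(edges);
console.log(childIndex.get("100.b.2"));
```

Because every row carries both its own code and its parent code, no extra lookup is needed to build the relationships.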
Basic Algorithm
Rules for assigning the code and parent of each line. Each rule yields a (parent, code) pair.
- Start: initialize the parent to the root and increment from there: (root, root + increment).
- type(current) - type(previous) = 0, same level: use the prior parent and increment, e.g. 100.1, 100.2: (prior parent, prior parent + increment).
- type(current) - type(previous) = 1, one level deeper: the prior code becomes the parent, e.g. 100.2.i: (prior code, prior code + increment).
- type(current) - type(previous) = -1, back up a level: find the first earlier instance of the same type and use its parent code, e.g. 100.3: (parent of that instance, that parent + increment).
- Prior same and next different, decrement depth: find the prior line of the same type, use its parent, and increment the counter: (found parent, found parent + increment).
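The rules above can be approximated with a stack of open levels. The sketch below is one possible reading, not the module's actual ParentGenerator. It makes several assumptions: `parseMarker` and `assignParents` are hypothetical names, markers made only of `i`, `v`, and `x` are treated as roman numerals (so a letter-level "(i)" would be misclassified), the marker's own label stands in for the "increment", and unmarked lines are attached to the root with a running counter.

```javascript
// Hypothetical helper: read a leading marker such as "(a)", "(2)" or "(iv)".
function parseMarker(line) {
  const match = /^\(?([0-9]+|[a-z]+)\)/.exec(line);
  if (!match) return null;
  const label = match[1];
  if (/^[0-9]+$/.test(label)) return { label, type: "number" };
  // Heuristic: labels built only from i, v, x count as roman numerals.
  if (/^[ivx]+$/.test(label)) return { label, type: "roman" };
  return { label, type: "letter" };
}

// Assign a code and a parent code to every line.
function assignParents(lines, root = "100") {
  const stack = []; // open levels, shallowest first: { type, code }
  const results = [];
  let unmarked = 0;

  for (const text of lines) {
    const marker = parseMarker(text);
    if (!marker) {
      // Assumption: unmarked lines hang off the root as 100.0, 100.1, ...
      results.push({ text, base: String(unmarked), code: `${root}.${unmarked}`, parent: root });
      unmarked += 1;
      continue;
    }
    // If this marker type is already open, close that level and everything
    // nested below it; the new line becomes its sibling.
    const depth = stack.findIndex((level) => level.type === marker.type);
    if (depth !== -1) stack.length = depth;
    // Otherwise the stack is untouched and the line nests one level deeper.
    const parent = stack.length > 0 ? stack[stack.length - 1].code : root;
    const code = `${parent}.${marker.label}`;
    stack.push({ type: marker.type, code });
    results.push({ text, base: marker.label, code, parent });
  }
  return results;
}

const outline = assignParents([
  "Intro text.", "(a) A", "(1) A1", "(2) A2", "(b) B",
  "(1) B1", "(2) B2", "(i) B2i", "(ii) B2ii", "(c) C",
]);
console.log(outline.map((row) => `${row.code} <- ${row.parent}`));
```

Keeping one stack entry per marker type means that returning to an earlier type, such as "(c)" after "(ii)", closes all deeper levels in a single step.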