Semi-Formal Grammar
In structuring, I argue for the value of an automated process that turns natural language into a semi-formal structure. This page outlines my proposed grammar.
The design goals of this grammar are to be close enough to natural language to be easily readable by humans while being precise enough to be easily usable by algorithms. These goals are often in tension.
Consider the following:
This shows a directed acyclic graph of statements, with AND groupings, which represent the composite of all their child statements, and ANY groupings, which indicate that each child statement can be considered to independently support the parent. This is concise while still keeping the key statements in natural language. Such a visual interface isn’t ideal to show a human (or an LLM), so we need isomorphic (two-way) transformations to formats which are.
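As a rough sketch of what such a graph might look like as data, the vegan argument used on this page can be encoded as labelled nodes plus parent-to-child edges. The node ids, the `NODES`/`EDGES` layout, and the choice of one explicit ANY node above two AND nodes are all assumptions made for illustration; the exact layout of the graph may differ. The check uses Python's standard `graphlib` module to confirm the structure is acyclic.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical encoding: node id -> label. Grouping nodes are labelled "AND" or "ANY".
NODES = {
    "vegan": "I should be vegan",
    "any": "ANY",
    "and_suffering": "AND",
    "and_co2": "AND",
    "s1": "Being vegan minimizes suffering",
    "s2": "I should minimize suffering when easy",
    "s3": "Being vegan would be easy",
    "c1": "Being vegan results in a smaller CO2 footprint",
    "c2": "I should minimize my CO2 footprint",
}

# Parent id -> ids of its children (assumed layout, see the note above).
EDGES = {
    "vegan": ["any"],
    "any": ["and_suffering", "and_co2"],
    "and_suffering": ["s1", "s2", "s3"],
    "and_co2": ["c1", "c2"],
}


def is_acyclic(edges):
    """Return True if the parent -> children mapping contains no cycle."""
    try:
        # static_order() is lazy, so consume it to force the cycle check.
        tuple(TopologicalSorter(edges).static_order())
    except CycleError:
        return False
    return True
```

Because nodes are referenced by id, a statement that supports two different parents would appear once in `NODES` and twice in `EDGES`, which is what distinguishes a graph from a plain tree.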
Natural language form
One option is to turn it into natural language via standard conjunctions. For example:
I should be vegan because all of being vegan minimizes suffering, and I should minimize suffering when easy, and being vegan would be easy. Also I should be vegan because being vegan results in a smaller CO2 footprint and I should minimize my CO2 footprint.
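A minimal sketch of how such a rendering could work, assuming a tree in which each statement lists its supports and an AND group is a separate `AllOf` object (the names `Statement`, `AllOf`, `lower_first`, and `to_sentences` are hypothetical). It only handles one level of nesting, and its wording for the second sentence differs slightly from the hand-written example above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class AllOf:
    """An explicit AND group: all children together support the parent."""
    children: List["Statement"]


@dataclass
class Statement:
    text: str
    # Each entry independently supports this statement (implicit ANY).
    supports: List[Union["Statement", AllOf]] = field(default_factory=list)


def lower_first(text):
    """Lowercase the first letter unless the first word is all caps (e.g. "I")."""
    first_word = text.split(" ", 1)[0]
    if not text or first_word.isupper():
        return text
    return text[0].lower() + text[1:]


def to_sentences(root):
    """Render one sentence per independent support of the root statement."""
    sentences = []
    for index, support in enumerate(root.supports):
        head = root.text if index == 0 else "Also " + lower_first(root.text)
        if isinstance(support, AllOf):
            parts = [lower_first(child.text) for child in support.children]
            reason = "all of " + ", and ".join(parts)
        else:
            reason = lower_first(support.text)
        sentences.append(f"{head} because {reason}.")
    return " ".join(sentences)


vegan = Statement("I should be vegan", [
    AllOf([
        Statement("Being vegan minimizes suffering"),
        Statement("I should minimize suffering when easy"),
        Statement("Being vegan would be easy"),
    ]),
    AllOf([
        Statement("Being vegan results in a smaller CO2 footprint"),
        Statement("I should minimize my CO2 footprint"),
    ]),
])

print(to_sentences(vegan))
```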
Note: the written version has a statement order, while the graph version doesn’t imply one. This is a concern, as certain language such as “former” or “latter” can change meanings based on statement order. There is a research item here to see if such order-sensitive language can be detected and replaced by an auto-structuring preprocessor, so that such a format is practical to use.
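As a starting point for that research item, detection could begin with something as naive as a word-list check. The sketch below only flags candidate words and does not attempt the replacement step; the word list and the function name `find_order_sensitive` are assumptions, and a real preprocessor would presumably need something more context-aware.

```python
import re

# Hypothetical, deliberately small list of words whose meaning can depend on
# the order in which statements are written.
ORDER_SENSITIVE = ("former", "latter", "aforementioned", "above", "below")


def find_order_sensitive(statement):
    """Return the order-sensitive words found in a statement, in reading order."""
    words = re.findall(r"[a-z]+", statement.lower())
    return [word for word in words if word in ORDER_SENSITIVE]
```

For example, `find_order_sensitive("The latter is cheaper than the former")` flags both "latter" and "former", while a statement like "Being vegan would be easy" yields an empty list.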
Bullet list form
An alternative is to create a bullet list where ANY groups are the default implication, and AND groups must be explicit:
- I should be vegan
  - Because all of
    - Being vegan minimizes suffering
    - I should minimize suffering when easy
    - Being vegan would be easy
  - Because all of
    - Being vegan results in a smaller CO2 footprint
    - I should minimize my CO2 footprint
This stays closer to the structure in the graph version while being a textual format which is easier to store, edit and share.
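To illustrate the two-way transformation, here is a minimal sketch of a parser and renderer for the bullet list form. It assumes nesting is expressed with two spaces of indentation per level and that an AND group is the literal line "Because all of"; the names `Statement`, `AllOf`, `parse`, and `render` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union

AND_MARKER = "Because all of"


@dataclass
class AllOf:
    children: List["Statement"] = field(default_factory=list)


@dataclass
class Statement:
    text: str
    supports: List[Union["Statement", AllOf]] = field(default_factory=list)


def parse(text):
    """Parse an indented bullet list with a single root into a Statement tree."""
    root = None
    stack = []  # (depth, node) pairs along the current path from the root
    for line in text.splitlines():
        if not line.strip():
            continue
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // 2
        if not stripped.startswith("- "):
            raise ValueError(f"not a bullet line: {line!r}")
        body = stripped[2:].strip()
        node = AllOf() if body == AND_MARKER else Statement(body)
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if not stack:
            if root is not None or isinstance(node, AllOf):
                raise ValueError("expected exactly one root statement")
            root = node
        else:
            parent = stack[-1][1]
            if isinstance(parent, AllOf):
                if isinstance(node, AllOf):
                    raise ValueError("an AND group cannot directly contain another")
                parent.children.append(node)
            else:
                parent.supports.append(node)
        stack.append((depth, node))
    if root is None:
        raise ValueError("empty input")
    return root


def render(node, depth=0):
    """Turn a tree back into bullet lines (the inverse of parse)."""
    is_group = isinstance(node, AllOf)
    lines = ["  " * depth + "- " + (AND_MARKER if is_group else node.text)]
    for child in (node.children if is_group else node.supports):
        lines.extend(render(child, depth + 1))
    return lines


BULLETS = "\n".join([
    "- I should be vegan",
    "  - Because all of",
    "    - Being vegan minimizes suffering",
    "    - I should minimize suffering when easy",
    "    - Being vegan would be easy",
    "  - Because all of",
    "    - Being vegan results in a smaller CO2 footprint",
    "    - I should minimize my CO2 footprint",
])

tree = parse(BULLETS)
```

Rendering the parsed tree with `"\n".join(render(tree))` gives back the original text, which is the round-trip property the grammar needs. Note that a bullet list is a tree, so a statement that supports two parents in the graph would appear twice in this form.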
Another example:
- We should build new power plants
  - Because all of
    - Power plants lead to net human flourishing
    - We should seek human flourishing
  - Building power plants maintains national sovereignty
    - Because all of
      - If we stop building power plants we risk losing the capacity to build more later
      - Sovereign nations can build their own power plants
Aggregate claims
Consider this example:
- Power plants lead to net human flourishing
  - Because all of
    - Power plants create negative flourishing through pollution
    - Power plants create positive flourishing through electricity
The parent statement here makes a “net” claim. Supporting such a claim takes multiple child statements whose contributions should be summed. Here I propose we use the “AND” grouping (Because all of) to imply aggregation, but I’m still not sure if this is ideal.
Such aggregate claims are common. Most high-level political statements (e.g. “we should raise taxes”) are examples. There is some research to be done on how best to handle these, as this stretches the definition of “AND” logic, and it may be that yet another kind of aggregation node needs to be added to the grammar.
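To make the tension concrete, the sketch below contrasts plain AND logic with a hypothetical aggregation ("NET") rule that sums signed contributions. The function names and the numeric weights are invented purely for illustration; nothing here is a settled part of the grammar.

```python
def holds_under_and(children_hold):
    """Plain AND: the parent holds only if every child statement holds."""
    return all(children_hold)


def holds_under_net(contributions):
    """Hypothetical NET node: the parent holds if signed contributions sum above zero."""
    return sum(contributions) > 0


# Both child statements (pollution harms, electricity helps) are true,
# so plain AND accepts the parent no matter how large each effect is.
children_hold = [True, True]

# Invented weights in the order (pollution, electricity).
mostly_benefit = [-3.0, 5.0]
mostly_harm = [-5.0, 3.0]

print(holds_under_and(children_hold))   # same answer for both scenarios
print(holds_under_net(mostly_benefit))  # parent supported
print(holds_under_net(mostly_harm))     # parent not supported
```

Under plain AND the truth of the children is all that matters, whereas a net claim depends on their relative size, which is why a separate node type may be needed.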
Term expansion
In the last example all three statements start with “Power plants” and this is perhaps more verbose than ideal. If we instead wrote:
- Power plants lead to net human flourishing
  - They reduce it through pollution
  - They increase it through electricity
“They” and “it” are both pronouns: “They” refers to “Power plants” and “it” refers to “net human flourishing.” What these refer to is only clear when you’re reading the statements together. For some of the processing steps we want to do after auto-structuring, we’ll want to compare individual statements written by different people, and it’s going to be easier to do so if all such terms are expanded first.
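The substitution half of term expansion can be sketched as below, assuming the referent of each pronoun has already been worked out (which is the hard part and is not attempted here). The function name `expand_terms` and the hand-written `referents` mapping are hypothetical.

```python
import re


def expand_terms(statement, referents):
    """Replace whole-word pronouns using a lowercase pronoun -> referent mapping."""
    if not referents:
        return statement

    def substitute(match):
        replacement = referents[match.group(0).lower()]
        if match.start() == 0:
            # Keep sentence-initial capitalization.
            replacement = replacement[0].upper() + replacement[1:]
        return replacement

    pattern = r"\b(" + "|".join(re.escape(key) for key in referents) + r")\b"
    return re.sub(pattern, substitute, statement, flags=re.IGNORECASE)


# Referents supplied by hand for this example.
referents = {"they": "Power plants", "it": "net human flourishing"}

print(expand_terms("They reduce it through pollution", referents))
print(expand_terms("They increase it through electricity", referents))
```

After expansion each child statement can be read, and compared with statements from other authors, without needing its parent for context.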