
1.5.1. Blind spot in NLP

Natural language is like algebra and programming languages:

Natural language has “variables” (keywords) and “functions” (structure words). However, NLP uses only the keywords, while the natural structure of the knowledge is discarded. As a consequence, the field of NLP got stuck with “bags of keywords”, which have lost their meaning (their natural structure).

In natural language, keywords – mainly nouns and proper nouns – provide the knowledge, while the logical structure of sentences is provided by words like the definite article “the”, the conjunction “or”, the basic verb “is/are”, the possessive verb “has/have”, and the past-tense verbs “was/were” and “had”. My challenge document describes some basic reasoning constructions based on the logical structure of sentences.
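To make this concrete, here is a minimal Python sketch (with a hypothetical list of structure words) of how bag-of-keywords filtering erases exactly the logical distinctions those words carry:

```python
# Minimal sketch (hypothetical word list): bag-of-keywords filtering
# discards structure words, so sentences with opposite logical
# meanings collapse into the same "bag".
STRUCTURE_WORDS = {"the", "a", "an", "or", "and", "no", "every",
                   "is", "are", "was", "were", "has", "have", "had"}

def bag_of_keywords(sentence):
    # Keep only the keywords; throw away the logical structure.
    words = sentence.lower().rstrip(".").split()
    return frozenset(word for word in words if word not in STRUCTURE_WORDS)

print(bag_of_keywords("Every bird is an animal."))  # frozenset({'bird', 'animal'})
print(bag_of_keywords("No bird is an animal."))     # frozenset({'bird', 'animal'})
# Both sentences yield the same bag: the difference between
# "every" and "no" - the entire logic - is lost.
```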

Scientists are ignorant of the logical structure of sentences. Instead of preserving this natural structure, they teach us to throw it away and to link keywords through an artificial structure (semantic techniques). Hence the struggle of this field to grasp the deeper meaning expressed by humans, and its inability to automatically construct readable sentences from derived knowledge (automated reasoning in natural language).

As a consequence, this field has a blind spot for the conjunction of logic and language.

A science integrates its involved disciplines. However, the field of AI and NLP doesn't integrate (automated) reasoning with natural language. Roughly three categories in this field are involved with natural language and/or reasoning. However, scientists are unable – or unwilling – to integrate them beyond reasoning with the verb “is/are” in the present tense:
• Chatbots, Virtual Assistants and Natural Language Generation (NLG) techniques are unable to reason logically. They can only select human-written sentences, in which they may fill in user-written keywords;
• Reasoners like Prolog are able to reason logically, but they only deliver keywords as output. So, their results can't be expressed in automatically constructed sentences. As a consequence, laymen are unable to use this kind of reasoner (see the sketch after this list);
• Controlled Natural Language (CNL) reasoners are able to reason logically, though only within a very limited grammar. But they are able to construct sentences autonomously, word by word.
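A toy forward-chaining reasoner in Python (hypothetical facts and rule format, not the actual Prolog engine) makes the second problem visible: the derived knowledge is correct, but the raw output consists of keyword tuples rather than readable sentences:

```python
# Toy forward-chaining reasoner, illustrating the second category:
# the reasoning itself works, but the output is keywords only.
facts = {("is", "tweety", "bird")}
rules = [(("is", "bird"), ("is", "animal"))]  # every bird is an animal

for predicate, subject, noun in list(facts):
    for (cond_pred, cond_noun), (concl_pred, concl_noun) in rules:
        if (predicate, noun) == (cond_pred, cond_noun):
            facts.add((concl_pred, subject, concl_noun))

print(facts)
# {('is', 'tweety', 'bird'), ('is', 'tweety', 'animal')}
# The derived knowledge is there, but a layman is left to decode
# bare tuples instead of reading "Tweety is an animal."
```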

In order to uplift this field to a science, the following three steps are required to close the loop for reasoning in natural language:
1. Conversion from a sentence in natural language to an almost language-independent knowledge structure;
2. Logical reasoning applied to the almost language-independent knowledge structure;
3. Conversion of the result of the reasoner – the derived knowledge – into readable sentences, constructed autonomously, word by word.
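As a sketch only, assuming a deliberately tiny grammar (subject + “is/are” + article + noun) and one hard-coded rule, the loop could look like this in Python; each function corresponds to one step above:

```python
# Sketch of the three steps under a tiny, hypothetical grammar.
def parse(sentence):
    # Step 1: sentence -> almost language-independent knowledge structure.
    subject, verb, article, noun = sentence.rstrip(".").split()
    assert verb in ("is", "are")
    return ("is-a", subject.lower(), noun)

def reason(fact):
    # Step 2: logical reasoning on that structure
    # (one hard-coded rule: every bird is an animal).
    taxonomy = {"bird": "animal"}
    predicate, subject, noun = fact
    return (predicate, subject, taxonomy[noun])

def generate(fact):
    # Step 3: derived knowledge -> a readable sentence,
    # constructed word by word.
    predicate, subject, noun = fact
    article = "an" if noun[0] in "aeiou" else "a"
    return " ".join([subject.capitalize(), "is", article, noun]) + "."

print(generate(reason(parse("Tweety is a bird."))))  # Tweety is an animal.
```

A real system would of course need a far richer grammar and knowledge structure than this sketch; the point is only that the loop can be closed, from sentence to reasoning and back to a word-by-word constructed sentence.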

Only CNL reasoners tick all the boxes mentioned above for reasoning in natural language. However, they are limited to sentences with the verb “is/are” in the present tense. So, they don't accept, implement or use structure words like the definite article “the”, the conjunction “or”, the possessive verb “has/have”, and the past-tense verbs “was/were” and “had”.

Some people believe that meaning will evolve “by itself” (see Evolutionary Intelligence), while others believe that meaning is preserved by parsing all words of a sentence. But all of them fail to integrate reasoning and natural language beyond the verb “is/are” in the present tense.