
1.5.1. Blind spot in NLP

Natural language is like algebra and programming languages:

Natural language has “variables” (keywords) and “functions” (structure words). In NLP, however, only the keywords are used, while the natural structure of the knowledge is discarded. As a consequence, the field of NLP has become stuck with “bags of keywords” and unstructured texts, which have lost their meaning (that is, their natural structure).
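A minimal sketch of this loss of structure, in Python. The two example sentences and the small list of structure words are assumptions chosen for illustration, not taken from any particular NLP toolkit. Once the structure words and the word order are discarded, two sentences with opposite meanings become indistinguishable:

```python
from collections import Counter

# Hypothetical list of structure words, chosen for this illustration only.
STRUCTURE_WORDS = {
    "the", "a", "an", "of", "or",
    "is", "are", "has", "have", "was", "were", "had",
}

def bag_of_keywords(sentence: str) -> Counter:
    """Return the multiset of keywords, discarding structure words and word order."""
    words = sentence.lower().rstrip(".").split()
    return Counter(word for word in words if word not in STRUCTURE_WORDS)

sentence_a = "John is the father of Paul."
sentence_b = "Paul is the father of John."

# Both sentences reduce to the same bag: {john, father, paul}.
print(bag_of_keywords(sentence_a) == bag_of_keywords(sentence_b))  # True
```

The keywords survive, but the information about who is the father of whom was carried by the word order and the structure words, and that is exactly what the bag throws away.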

In natural language, keywords (mainly nouns and proper nouns) provide the knowledge, while the logical structure of sentences is provided by words such as the definite article “the”, the conjunction “or”, the basic verb “is/are”, the possessive verb “has/have” and the past tense verbs “was/were” and “had”. My challenge document describes some basic reasoning constructions, based on the logical structure of sentences.
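The division of labour between keywords and structure words can be sketched as follows, in Python. This is only an illustration under stated assumptions: the function name `split_sentence`, the example sentence and the word list are hypothetical (the word “of” is added to the structure words named above), and real sentences need far more than a word lookup. The sketch keeps the structure words in place as a template and pulls the keywords out as its “variables”:

```python
# Hypothetical list of structure words: those named in the text, plus "of".
STRUCTURE_WORDS = {
    "the", "or", "of",
    "is", "are", "has", "have", "was", "were", "had",
}

def split_sentence(sentence: str) -> tuple[str, list[str]]:
    """Split a sentence into a structure template and its list of keywords."""
    template_parts = []
    keywords = []
    for word in sentence.rstrip(".").split():
        if word.lower() in STRUCTURE_WORDS:
            # Structure words stay in place: they form the "function".
            template_parts.append(word.lower())
        else:
            # Keywords are replaced by numbered slots: they are the "variables".
            template_parts.append("{" + str(len(keywords)) + "}")
            keywords.append(word)
    return " ".join(template_parts), keywords

template, keywords = split_sentence("John is the father of Paul.")
print(template)   # {0} is the {1} of {2}
print(keywords)   # ['John', 'father', 'Paul']

# The same structure can be reused with other keywords.
print(template.format("Anna", "mother", "Paul"))  # Anna is the mother of Paul
```

Unlike a bag of keywords, the template preserves which keyword fills which position, so the original sentence can be reconstructed from the two parts.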

Scientists are ignorant of the logical structure of sentences. Instead of preserving this natural structure, they teach us to throw it away and to link keywords by an artificial structure (semantic techniques). This is why the field struggles to grasp the deeper meaning expressed by humans, and why it is unable to automatically construct readable sentences from derived knowledge (automated reasoning in natural language).

Moreover, this field has a blind spot on the conjunction of logic and language:

A science integrates the disciplines it involves. The field of AI and NLP, however, does not: it is unable to integrate (automated) reasoning and natural language:

• Reasoners (like Prolog) are able to reason, but their results (derived knowledge) can't be expressed in readable, automatically constructed sentences;
• Chatbots, Virtual (Personal) Assistants and Natural Language Generation (NLG) techniques are unable to reason logically. They are only able to select human-written sentences, in which they may fill in user-written keywords;
• Controlled Natural Language (CNL) reasoners are very limited in integrating both disciplines. They are limited to sentences with the present tense verb “is/are”, and don't accept words like the definite article “the”, the conjunction “or”, the possessive verb “has/have” and the past tense verbs “was/were” and “had”.

Some people believe that meaning will evolve “by itself” (see Evolutionary Intelligence), while others believe that meaning is preserved by parsing all the words of a sentence. But they all fail to integrate reasoning and natural language, and to resolve ambiguity.