Natural language processing AI



Natural language processing has been a major challenge in computer science and artificial intelligence since the late 1940s. It represents the next stride forward in artificial intelligence: making the interface between computers and humans more flexible and 'human understandable'. Various applications have been pursued since its inception, such as machine translation, speech recognition, e-teaching and auto tutor systems. Researchers saw it as a likely bridge between human spoken language and computers, which use programming languages and binary codes. Making a computer understand natural human language as such remains a challenging task. Hence, further enhancements and techniques will foster these demanding yet fruitful and futuristic computational trends.


NLP - Natural Language Processing, Semantic, Syntactic, Lexical, Phonology, MT - Machine Translation


Computing has evolved from basic sets of instructions in the form of binary codes, to mnemonic instruction codes, to the programming languages that came to dominate the later part of the twentieth century. Alongside that evolution came research on making the computer understand natural human language and interact with humans; in short, on applying natural language processing to everyday computer usage and beyond.

Natural language processing can be defined as a theoretical approach encompassing the analysis and manipulation of texts in the natural languages that humans speak and write. This is done at various levels of linguistic analysis in order to achieve 'human-like' processing of tasks and other problems.

It must be noted that NLP is not a single, standard system but a collection of numerous language processing techniques and methods. Also, true to the name, the texts must be naturally occurring language rather than a set of texts selected specifically for processing, because the latter approach would forgo the real meaning of natural language processing.

In any NLP system, the text is analysed at various linguistic levels. This is done because humans usually break language up into such levels in order to process and understand it. Because it aims at human-like processing, NLP is considered an integral part of AI. The applications of NLP are versatile and are currently being researched and implemented in fields such as military science, security systems, virtual reality simulation, medicine, and general computer science and artificial intelligence.

The techniques and approaches that have been used or researched so far constitute the basic platform of NLP. Some of them are based on classifying natural language into phonological, morphological, lexical, syntactic, semantic and pragmatic levels. Some of the notable works in this field are:

  • Machine Translation - Weaver and Booth (1946)
  • Syntactic Structures - Chomsky (1957)
  • Case grammar - Fillmore
  • Semantic Networks - Quillian
  • Conceptual Dependency - Schank
  • Augmented Transition Networks - Woods
  • Functional Grammar - Kay

Well-known prototypes have also been developed to highlight the impact of particular techniques and principles. They are:

  • ELIZA - Weizenbaum
  • SHRDLU - Winograd
  • LUNAR - Woods

The scope of the article revolves around the evolution of NLP and its implementation in security systems.


Strata of natural language processing:

The best way to describe what goes on inside a natural language processing system is through the 'strata of natural language processing'. In the early days of natural language processing, it was held that the different strata were applied in a sequential pattern. Current psycholinguistic research, however, suggests that processing is synchronic rather than sequential: humans draw on all of the strata together and do not work through them in a fixed order. For this reason, an NLP system must make use of more strata of language processing in order to be highly effective.

The phonological stratum deals with the interpretation of speech sounds within and across words. Three types of rules are typically used:

  1. Phonetic rules - for sounds within words
  2. Phonemic rules - for variations of pronunciation when words are spoken together
  3. Prosodic rules - for fluctuation in stress and intonation across a sentence.


The morphological stratum deals with the componential nature of words, which are composed of morphemes - the smallest units of meaning. For example, the word 'postproduction' can be morphologically analysed into three separate morphemes: the prefix 'post', the root 'product' and the suffix 'ion'. Since the meaning of each morpheme remains the same across words, humans break down an unknown word into its constituent morphemes in order to understand its meaning. In the same way, an NLP system recognises the meaning conveyed by each morpheme in order to interpret the meaning of the whole word.
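The morpheme analysis described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the affix lists PREFIXES and SUFFIXES and the function segment are hypothetical names invented for this example, and a real system would rely on a full lexicon and morphological rules rather than simple string matching.

```python
# Hypothetical affix inventories, for illustration only.
PREFIXES = ("post", "pre", "un", "re")
SUFFIXES = ("ion", "ness", "ing", "ed")


def segment(word):
    """Split a word into (prefix, root, suffix); a missing part is ''."""
    # Take the first listed prefix that the word starts with, if any.
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    # Take the first listed suffix that the remainder ends with,
    # as long as something is left over to serve as the root.
    suffix = next(
        (s for s in SUFFIXES if rest.endswith(s) and len(rest) > len(s)), ""
    )
    root = rest[: len(rest) - len(suffix)] if suffix else rest
    return prefix, root, suffix


print(segment("postproduction"))
```

With these assumed affix lists, 'postproduction' is split into 'post', 'product' and 'ion'. Purely string-based matching of this kind will mis-segment many words, which is one reason practical systems combine such rules with a dictionary of known roots.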


At the lexical stratum, both humans and NLP systems interpret the meaning of individual words.

Several types of processing contribute to word-level understanding, the first being the assignment of a single part-of-speech tag to each word. Words that can function as more than one part of speech are assigned the most probable part-of-speech tag based on the context in which they occur.
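As a rough sketch of context-based tag assignment, the Python fragment below chooses, for an ambiguous word, the tag seen most often after the previous word's tag. The tables LEXICON and COUNTS, the tag names and the function tag are all hypothetical and hand-made for this example; an actual tagger would estimate such statistics from a large tagged corpus.

```python
# Hypothetical tags for words assumed to be unambiguous.
LEXICON = {"the": "DET", "a": "DET", "i": "PRON", "flight": "NOUN"}

# Hypothetical counts: how often an ambiguous word carried each tag
# when it followed a word with the given previous tag.
COUNTS = {
    ("DET", "book"): {"NOUN": 9, "VERB": 1},
    ("PRON", "book"): {"NOUN": 1, "VERB": 8},
}


def tag(words):
    """Assign one tag per word, using the previous tag as context."""
    tags = []
    prev = "START"
    for word in words:
        word = word.lower()
        if word in LEXICON:
            current = LEXICON[word]
        else:
            options = COUNTS.get((prev, word), {})
            # Pick the most frequent tag in this context, or UNK if unseen.
            current = max(options, key=options.get) if options else "UNK"
        tags.append(current)
        prev = current
    return tags


print(tag(["I", "book", "a", "flight"]))
print(tag(["the", "book"]))
```

In this toy setting 'book' is tagged as a verb after a pronoun and as a noun after a determiner, which is the kind of context-dependent choice the paragraph above describes.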

Moreover, at the lexical stratum, words that have only one possible sense or meaning can be replaced by a semantic representation of that meaning. The nature of the representation varies according to the semantic theory used in the NLP system. In such a representation, a single lexical unit is decomposed into its more basic properties. If a common set of semantic primitives is used across all words, these simplified lexical representations make it possible to unify meaning across words and to produce complex interpretations, much as humans do.


The syntactic stratum analyses a sentence by examining its grammatical composition and the dependencies between its words. This requires both a grammar and a parser. The output is a representation of the sentence that reveals the structural dependency relationships between the words. How well a parser performs depends on the grammar it uses. Not all NLP applications require a full parse of sentences; the remaining challenges in parsing, namely prepositional phrase attachment and conjunction scoping, therefore no longer stymie those applications for which phrasal and clausal dependencies are sufficient. Syntax conveys meaning in most languages because order and dependency contribute to meaning. For example, the two sentences 'I smoked a cigarette.' and 'The cigarette smoked me.' involve the same verb and the same participants and differ essentially in word order, yet convey contrasting meanings.
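To make the pairing of a grammar and a parser concrete, here is a minimal chart parser (in the CKY style) written in Python. It is a sketch under stated assumptions, not a description of any particular system: the toy grammar BINARY, the word list LEXICAL and the function parse are hypothetical, and the grammar covers only the two example sentences.

```python
from itertools import product

# Hypothetical toy grammar: (left child, right child) -> parent category.
BINARY = {("NP", "VP"): "S", ("V", "NP"): "VP", ("Det", "N"): "NP"}
# Hypothetical lexicon: word -> category.
LEXICAL = {"i": "NP", "me": "NP", "smoked": "V",
           "a": "Det", "the": "Det", "cigarette": "N"}


def parse(sentence):
    """Return a nested-tuple parse tree rooted in S, or None."""
    words = sentence.lower().rstrip(".").split()
    if any(w not in LEXICAL for w in words):
        return None
    n = len(words)
    # chart[i][j] maps a category to one tree covering words[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1][LEXICAL[w]] = (LEXICAL[w], w)
    # Combine adjacent spans bottom-up, shortest spans first.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                pairs = product(chart[i][k].items(), chart[k][j].items())
                for (lcat, ltree), (rcat, rtree) in pairs:
                    parent = BINARY.get((lcat, rcat))
                    if parent:
                        chart[i][j][parent] = (parent, ltree, rtree)
    return chart[0][n].get("S")


print(parse("I smoked a cigarette."))
print(parse("The cigarette smoked me."))
```

The two trees place 'cigarette' in different structural positions (inside the verb phrase in the first sentence, as the subject in the second), which is how the parser's output captures the difference in meaning that word order conveys.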


It is usually thought that meaning is determined at the semantic stratum. Semantic processing determines the possible meanings of a sentence by looking at the interactions among word-level meanings in the sentence. This stratum includes the semantic disambiguation of words with multiple senses, analogous to the way words that can function as multiple parts of speech are disambiguated at the syntactic level. Semantic disambiguation allows one and only one sense of a polysemous word to be selected and included in the semantic interpretation of the sentence.
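One simple way to illustrate semantic disambiguation is a gloss-overlap heuristic in the spirit of the simplified Lesk approach: the sense whose dictionary definition shares the most words with the surrounding sentence is selected. The sketch below is only an illustration; the sense inventory SENSES, the STOPWORDS list and the function disambiguate are hypothetical, and the source does not say which disambiguation technique a given NLP system uses.

```python
# Hypothetical sense inventory: word -> {sense label: short gloss}.
SENSES = {
    "bank": {
        "financial_institution":
            "an institution that accepts deposits and lends money",
        "river_side":
            "sloping land beside a river or other body of water",
    }
}
# Hypothetical list of function words ignored when measuring overlap.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "at",
             "on", "in", "by", "from", "she", "he", "we"}


def disambiguate(word, sentence):
    """Select exactly one sense: the gloss with the largest word overlap."""
    context = set(sentence.lower().split()) - STOPWORDS - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & set(gloss.split()))
        # Ties keep the sense listed first in the inventory.
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "she deposited money at the bank"))
print(disambiguate("bank", "we fished from the bank of the river"))
```

Here the word 'money' pulls the first sentence towards the financial sense, while 'river' pulls the second towards the riverside sense, so one and only one sense is chosen in each case.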


The syntactic and semantic strata work with sentence-length units, whereas the discourse stratum works with texts longer than a sentence. It approaches the interpretation of a multi-sentence text holistically; that is, it interprets meaning by making connections between the component sentences. Two types of discourse processing occur at this stratum: anaphora resolution and discourse/text structure recognition. In anaphora resolution, words that are semantically vacant, such as pronouns, are replaced with the entity to which they refer. In discourse/text structure recognition, a meaningful representation of the text is achieved by determining the function of each sentence in the text. For example, essays can be broken down into components such as title, synopsis, body, quotations and conclusion.
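Anaphora resolution can be illustrated with a deliberately naive recency heuristic: each pronoun is replaced by the most recently mentioned entity that is compatible with it. The Python sketch below assumes that the entities and their matching pronouns are already known and passed in as a table; that table and the function resolve are hypothetical, and real systems use far richer syntactic and semantic cues.

```python
def resolve(tokens, entity_pronoun):
    """Replace each pronoun with the latest compatible entity mention.

    entity_pronoun maps an entity token to the pronoun that can refer
    to it, for example {"Mary": "she", "car": "it"} (assumed input).
    """
    last_seen = {}  # pronoun -> most recent compatible entity
    resolved = []
    for token in tokens:
        if token in entity_pronoun:
            # Remember this entity as the newest antecedent candidate.
            last_seen[entity_pronoun[token]] = token
            resolved.append(token)
        elif token.lower() in last_seen:
            # A pronoun with a known antecedent: substitute the entity.
            resolved.append(last_seen[token.lower()])
        else:
            resolved.append(token)
    return resolved


text = "Mary bought a car . She drove it home .".split()
print(" ".join(resolve(text, {"Mary": "she", "car": "it"})))
```

With this input, 'She' is replaced by 'Mary' and 'it' by 'car', so the second sentence becomes interpretable without looking back at the first.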


The pragmatic stratum is concerned with the use of natural language in situations where extra context is needed to understand a text. This extra information is not actually present in the text but is needed to understand it completely. It is drawn from knowledge bases and modules.

Natural language processing in textual information retrieval

Textual information retrieval relies on NLP techniques both for describing document content and for representing the user's queries.

A textual information retrieval system performs the following tasks in response to a user's query:

  1. Indexing: NLP techniques are generally used to generate an index containing document descriptions. Usually each document is described by the set of terms that best represents its content.
  2. Query analysis: When a user puts forth a query, the system analyses it and, if necessary, transforms it so that the user's need is represented in the same way as the document content.
  3. Matching: The system compares the description of each document with that of the query, and presents the user with those documents whose descriptions are closest to the query description.
  4. Ranking: The results are listed in order of relevance.
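The four tasks above can be sketched with a small vector-space retrieval example in Python. This is one common way to realise them, chosen here as an assumption: the source does not name a weighting or similarity scheme, so the TF-IDF weights, the cosine measure and the function names tokenize, build_index, cosine and search are all illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_index(docs):
    """Task 1: describe each document as a set of weighted terms."""
    term_counts = [Counter(tokenize(d)) for d in docs]
    doc_freq = Counter(t for counts in term_counts for t in counts)
    n = len(docs)
    # Terms that occur in fewer documents receive a higher weight.
    idf = {t: math.log(n / doc_freq[t]) + 1.0 for t in doc_freq}
    vectors = [{t: c * idf[t] for t, c in counts.items()}
               for counts in term_counts]
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    vectors, idf = build_index(docs)
    # Task 2: represent the query in the same way as the documents.
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    # Task 3: compare the query description with every document.
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    # Task 4: list matching documents in order of relevance.
    return [i for score, i in sorted(scored, reverse=True) if score > 0]


docs = ["machine translation of natural language",
        "speech recognition systems",
        "natural language parsing and grammar"]
print(search("language parsing", docs))
```

For the query 'language parsing', the third document ranks first because it shares both query terms, the first document follows because it shares only 'language', and the second is omitted because it shares none.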


Following the early development of NLP through syntactic theory and parsing algorithms, the ALPAC report of 1966 highlighted the inefficiencies and constraints of the then-prevailing systems. This eventually led to MT research being largely grounded, because it was clear that its goals were far beyond what was possible at the time. During the late 1960s and early 1970s, theoretical research showed that computationally tractable solutions were possible where approaches centred on grammar and other earlier theories had fallen short. In the later part of the 1970s, attention shifted to semantics, discourse phenomena, and communicative goals and plans. Grosz put forward her work on the relation between the structure of a task and the structure of a task-oriented dialogue.

By the 1980s, computing resources and research had matured enough to provide isolated results and techniques for natural language processing systems. Work from the 1980s was either improved upon or extended in the 1990s, and remained broadly similar to its predecessors. At present, NLP researchers are developing the next generation of natural language processing systems, which are intended to be more sophisticated and innovative than the models of earlier years. Present-day NLP systems cope well with ordinary text and with the complexity and ambiguity of natural language sentences.


Programming languages dominated the computing world through the later part of the twentieth century. Despite this widespread and important usage, they are still used by only a small percentage of people, mostly trained computer professionals and dedicated users. With the idea of replacing conventional programming languages with natural language on the horizon, inroads into this area would let computers reach far more people than use them now. In this article, we have seen that NLP can be used to process natural language texts in sophisticated ways.

Thus, one can infer that the traditional gap between human and computer language can be narrowed, if not eliminated completely, by using natural languages. The potential applications are wide-ranging, from personal information security to military and other security applications to text processing tools. The domain has already shown itself to be promising for the near future. Considerable research and engineering effort is being devoted to improving its capabilities. It could eventually foster a technical leap forward in computer applications.

We personally hope that the natural language processing field will become a major trendsetter in information technology as it eventually replaces the need for programming languages. Though the constraints faced are enormous, we expect the technology to spread widely across computing.

