Emotion Extraction Bayesian Network Computer Science Essay



Nowadays we communicate with each other through different media such as text messages and voice and video calls. We often start chatting without knowing the mood of the other person and may get unpredictable responses. To avoid this, we can start a topic that suits the other person's mood. This study proposes a simple technique for doing so: if the chat is conducted by voice, we convert the voice into text and then apply simple data mining techniques with Naïve Bayes to obtain the user's emotion.


Chat, chat mapper, emotion extraction, Bayesian Network, Text, Speech


Chatting through text is common today, but we may not be able to judge the other person's current mood and might start a topic that does not suit it. This paper presents an approach to emotion estimation that assesses the content of textual messages. The emotion estimation module is applied to text messages produced by a chat system and to text messages coming from the voice-recognition system.

Our objective is to adapt a multimedia presentation by detecting the emotions contained in the textual information through thematic analysis, so that we can determine how to communicate with the other person. The estimation of emotions or identification of personalities in chat rooms has several advantages, mainly guarding chatters from conflicting personalities and matching people of similar interests.

2. Materials and Methods

2.1 Related Work

A lot of work has been done on identifying emotions from text. Existing approaches can be categorized as non-verbal, semantic, and symbolic.

Lars E. Holzman and William M. Pottenger [1] automatically convert textual chat messages into speech and then generate instance vectors from frequency counts of the speech phonemes present in each message. In combination with other statistically derived attributes, the instance vectors are used in various machine-learning frameworks to build classifiers for emotional content. Anjo Anjewierden, Bas Kollöffel, and Casper Hulshof [18] derived two models for classifying chat messages using data mining techniques and tested them on an actual data set. The reliability of the classification of chat messages is established by comparing the models' performance to that of humans.

2.2 Java Speech API

The Java Speech API covers speech synthesis and speech recognition. Speech recognition technology works by converting audio input containing speech into text. The conversion passes through several phases and is not perfectly accurate. Third-party APIs built on the Java Speech API are also available.

2.3 Bayesian Network

Classification is a basic task in data analysis and pattern recognition that requires the construction of a classifier, that is, a function that assigns a class label to instances described by a set of attributes. The induction of classifiers from data sets of preclassified instances is a central problem in machine learning. Numerous approaches to this problem are based on various functional representations such as decision trees, decision lists, neural networks, decision graphs, and rules.
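
To make the idea of inducing a classifier from preclassified instances concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is an illustration only: the training messages, the labels, and the function names are hypothetical and are not taken from the system described in this paper.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Collect class counts and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
TRAINING = [
    ("i am so happy today", "happiness"),
    ("what a wonderful happy surprise", "happiness"),
    ("i feel sad and lonely", "sadness"),
    ("this is a sad day", "sadness"),
]
MODEL = train_naive_bayes(TRAINING)
```

With this toy model, `classify("happy day", MODEL)` selects the class whose words best explain the message. A real system would need far more training data and a proper tokenizer.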

3. Chat Emotion Mapper: CHATEM

3.1 Approach

We first convert voice into text. Early speech recognition systems tried to apply a set of grammatical and syntactical rules to speech [1]. If the words spoken fit into a certain set of rules, the program could determine what the words were. However, human language has numerous exceptions to its own rules, even when it is spoken consistently.

Today's speech recognition systems use powerful and complicated statistical modeling systems. These systems use probability and mathematical functions to determine the most likely outcome. According to John Garofolo, the two models that dominate the field today are the Hidden Markov Model and neural networks. These methods involve complex mathematical functions, but essentially they take the information known to the system to figure out the information hidden from it. The Hidden Markov Model is the most common, so we take a closer look at that process. During this process, the program assigns a probability score to each phoneme, based on its built-in dictionary and user training. There is some art to how one selects, compiles, and prepares this training data for "digestion" by the system and how the system models are "tuned" to a particular application.
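
As a rough illustration of how a Hidden Markov Model recovers hidden information from known information, the sketch below uses the Viterbi algorithm to find the most probable sequence of hidden states (here, phonemes) for a sequence of observations. The two phonemes, the two acoustic symbols, and all probabilities are invented for this example; a real recognizer works with far larger models trained on speech data.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               state at time t-1 on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


# Hypothetical toy model: two phonemes and two coarse acoustic symbols.
STATES = ["s", "iy"]
START_P = {"s": 0.6, "iy": 0.4}
TRANS_P = {"s": {"s": 0.3, "iy": 0.7}, "iy": {"s": 0.4, "iy": 0.6}}
EMIT_P = {"s": {"hiss": 0.9, "tone": 0.1}, "iy": {"hiss": 0.2, "tone": 0.8}}
OBSERVED = ["hiss", "tone", "tone"]

print(viterbi(OBSERVED, STATES, START_P, TRANS_P, EMIT_P))  # ['s', 'iy', 'iy']
```

The probability scores attached to each phoneme play the role of the entries in `EMIT_P` and `TRANS_P`; training adjusts them to the speaker and the application.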

Table: Emotion icons (columns: S No., Emotion Name, Emotion Shape)

Figure 1. System architecture diagram

We then analyze the text by applying the following four processes [2].

3.2 Parsing Phase

The first stage after receiving an input sentence is to create a parse tree using the Stanford Parser. The parser works out the grammatical structure of sentences, for instance which groups of words go together as "phrases" and which word is the subject or the object of a verb. We also analyze the parse to determine whether the sentence contains a negation.

3.3 Emotion Extraction Phase

In this phase we assign to every word an object that holds the following information: an array of emotions (happiness, sadness, anger, fear, surprise, and disgust), negation information, the dominant emotion of the word, and the word itself. Once we have established the POS type of each word in the sentence, we proceed by extracting the possible senses hidden behind each word using JWordNet [3], a Java interface to WordNet, which is a large lexical database of English. In this database, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms called "synsets", each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations, resulting in a network of meaningfully related words and concepts. We use this network to construct a mapping between synset offsets from WordNet and one of the possible emotion types. To do that, we needed to choose base words to represent each of the emotion types. At the end of this stage we know which of the synsets have an emotional value as described above. This allows us to update the emotion array of the object holding the word being analyzed and eventually to assign the word its most probable emotional sense out of the possible emotional senses available.
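
The per-word object described in this phase can be sketched as follows. This is only an assumed layout: the class name, the field names, and the small `TOY_LEXICON` are hypothetical, and the lexicon merely stands in for the synset-to-emotion mapping that the system derives from WordNet.

```python
from dataclasses import dataclass, field

EMOTIONS = ("happiness", "sadness", "anger", "fear", "surprise", "disgust")

# Hypothetical stand-in for the synset-to-emotion mapping built from WordNet.
TOY_LEXICON = {
    "joyful": {"happiness": 2, "surprise": 1},
    "gloomy": {"sadness": 2},
    "furious": {"anger": 3},
}


@dataclass
class WordEmotion:
    """Holds the word, its negation flag, and one score per emotion."""
    word: str
    negated: bool = False
    scores: dict = field(default_factory=lambda: {e: 0 for e in EMOTIONS})

    @property
    def dominant(self):
        """Emotion with the highest score, or None if the word is neutral."""
        best = max(EMOTIONS, key=lambda e: self.scores[e])
        return best if self.scores[best] > 0 else None


def annotate(word, negated=False):
    """Build the per-word object and fill its emotion array from the lexicon."""
    info = WordEmotion(word=word, negated=negated)
    for emotion, weight in TOY_LEXICON.get(word.lower(), {}).items():
        info.scores[emotion] += weight
    return info
```

For example, `annotate("joyful").dominant` yields `"happiness"`, while a word missing from the lexicon has no dominant emotion.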

3.4 Negation Detection

The intuitive way to deal with negation is to emphasize the counter-emotion of the emotion found to be most dominant in the word. For example, with "Happy" and "Sad", negation turns a word marked with the emotional value "Happy" into one marked with the emotional value "Sad", and vice versa.

Figure 2. Thematic analysis and emotion extraction
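
A minimal sketch of this negation rule is given below. Only the happiness/sadness pair comes from the text; how the other emotions should be handled under negation is not stated, so this sketch assumes they are left unchanged. The names `COUNTER_EMOTION` and `apply_negation` are hypothetical.

```python
# Only the happiness/sadness pair is given in the text; other emotions
# are left unchanged here, which is an assumption of this sketch.
COUNTER_EMOTION = {"happiness": "sadness", "sadness": "happiness"}


def apply_negation(dominant, negated):
    """Return the counter-emotion if the word is negated and a pair is known."""
    if negated and dominant in COUNTER_EMOTION:
        return COUNTER_EMOTION[dominant]
    return dominant
```

Under this rule, the word "happy" in "I am not happy" would be marked with sadness, because the parse reports a negation attached to it.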

3.5 Sentence Tagging

The method we use to deal with a multi-emotional sentence is as follows. When we reach a word with an emotional value, we open an appropriate tag and close it either when we reach a word with a different emotional value or at the end of the sentence. If we reach a word with a different emotional value, we open a new emotion tag; if the emotional value is the same as the previous one, we continue on to the rest of the sentence.
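
The tagging rule can be sketched as a single pass over the words of a sentence. The angle-bracket tag format and the `emotion_of` dictionary (a word-to-emotion lookup standing in for the output of the earlier phases) are assumptions made for this illustration.

```python
def tag_sentence(words, emotion_of):
    """Wrap runs of words in emotion tags, following the open/close rule."""
    output, current = [], None
    for word in words:
        emotion = emotion_of.get(word.lower())
        # A word with a new emotional value closes the open tag, if any,
        # and opens a tag for its own emotion.
        if emotion is not None and emotion != current:
            if current is not None:
                output.append(f"</{current}>")
            output.append(f"<{emotion}>")
            current = emotion
        output.append(word)
    # Any tag still open is closed at the end of the sentence.
    if current is not None:
        output.append(f"</{current}>")
    return " ".join(output)


# Hypothetical word-to-emotion lookup for the example sentence.
LOOKUP = {"happy": "happiness", "sad": "sadness"}
print(tag_sentence("I was happy but now I am sad".split(), LOOKUP))
# I was <happiness> happy but now I am </happiness> <sadness> sad </sadness>
```

Neutral words and words that repeat the current emotion stay inside the open tag, which matches the rule of continuing on to the rest of the sentence.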

4. Discussion and Conclusion

The technique described above was applied repeatedly to different groups of users. We found that the Java Speech API was not 100% accurate and had limitations, so the initial results were not appealing, but the technique performed well on chats conducted through text messages.


Figure 3. Chat using EChatem

5. Future Research Work

In our future work, we plan to improve the Emotion Estimation module, e.g. by integrating the recorded user (client) information into the analysis of emotions. Most importantly, past emotional states could be important parameters for deciding the affective meaning of the user's current message. Analysis of voice features such as pitch, frequency, and tone could also help us identify the emotions and mood of the user.