On desktop computers, text is entered unambiguously using single keys, or occasionally key combinations, on a 102-key keyboard. However, the surge in the use of mobile telephones, which have a minimal 12-button keypad, has created a need for more efficient ways to enter text. Text entry is by no means new in mobile computing, but there has been a burst of research on the topic in recent years. There are several reasons for this heightened interest. First, mobile computing is on the rise and has spawned new application domains such as wearable computing, two-way paging, and mobile web and email access. Second, word processors, spreadsheets, personal schedulers, and other traditional desktop applications are increasingly available on mobile platforms. Third, there is a strong demand for the input of text or alphanumeric information that is easily and efficiently entered, recognized, stored, forwarded, or searched via traditional software techniques. Fourth, the phenomenal success of text messaging among mobile phone users has inspired considerable speculation on future spin-off technologies, all expected to benefit from text entry. The statistics for text messaging on mobile phones are remarkable. In January 2001, GSM Europe reported that fifteen billion (15,000,000,000) SMS text messages were transmitted per month worldwide (Alsio G. & Goldstein M., 2000). This is particularly interesting in view of the limited capability for text input on the current generation of mobile phones. While the ubiquitous QWERTY keyboard reigns supreme as the primary text entry device on desktop systems, mobile phones and handheld systems use equivalent techniques for text entry. But how efficient and simple are these techniques? This is the challenge of text entry for mobile phones. Though the QWERTY keyboard has the obvious advantage of familiarity, it is bulky, and unless the keyboard is full-size, touch typing is hampered or impossible.
Basically, there are two competing paradigms for mobile text input: pen-based input and keyboard-based input. User experience with typing and handwriting greatly influences expectations for text entry on mobile devices.
To make maximal use of the limited keypads on mobile communication devices, the Multitap method is often used for text entry. Its drawbacks are that a key must often be pressed more than once to enter each desired letter, and that a significant amount of visual searching is required to find a needed letter on a key. This makes the method relatively inefficient in terms of the number of keystrokes required to enter each word (Jun, Bryan, & Peter).
To overcome the inefficiencies of the Multitap method, the predictive text entry system was developed. Predictive text entry reduces the number of keystrokes needed when entering text by analyzing the sequence of keystrokes being entered and predicting the intended word for the user. However, studies show that users are dissatisfied with how existing predictive text entry systems work because they interfere with their communication (YH af Segerstad, 2003). A Multitap text entry system combined with predictive text entry, called a predictive multi-tap text entry system, works to eliminate this dissatisfaction.
In order to predict words effectively, language models are used to select and decipher the words that correspond to the entered keystroke sequence. The language models in use include n-gram and bi-gram word models, corpus models, and dictionary models. The model proposed in this paper is used to implement a predictive multi-tap text entry system for mobile communication devices.
Text entry methods can be subdivided into two basic classes: stylus (pen)-based text entry and key-based text entry.
Stylus-based text entry uses a pointing device, typically a pen (or stylus), to select characters through tapping or gesture. Examples of this method include:
Handwriting recognition: This was once touted as the solution for mobile text entry, but handwriting recognizers must solve two major problems: segmentation and recognition. There are no mobile consumer products on the market today in which natural handwriting recognition is the sole text input method. The products that support stylus-based text input work with constraints or stylized alphabets, implying that handwriting recognition does not perform adequately.
Gesture-based Text Input: The text entry methods in this section are classified as gesture because of their informality and fluidity. Character recognition-based and soft keyboard-based input techniques have fixed characters that are entered in a certain way, or the stylus must be tapped in a certain location to select characters for input. Gesture-based text input technologies do not have a fixed set of strokes that a recognizer turns into characters; gesture text input methods have a framework in which informal stylus motions are interpreted as characters. An example of this is Cirrin, a technology presented by Mankoff and Abowd (1998).
Soft Keyboards: A soft keyboard is a keyboard implemented on a display with built-in digitizing technology. Text entry is performed by tapping on keys with a stylus or finger. The advantages of soft keyboards include simplicity, and efficient use of space. When no text entry is occurring, the soft keyboard disappears, thus freeing screen space for other purposes.
The major problem with the soft keyboard is the display of mobile phones. Mobile phones usually have a small display. Therefore, implementing a keyboard on the display means reducing the size of items on the screen, which could make the display difficult to read, or the display could become clumsy.
Key-based text entry techniques range from those that use a keyboard where each key represents one or more letters, to those with as few as three keys. Examples of this method include:
Small QWERTY Keyboards: The most prevalent text input technology for low-end PDAs is the miniature QWERTY keyboard; an example is the Nokia Communicator shown in Figure 1. The Nokia Communicator is a mobile phone with text messaging functionality.
Figure 1: Nokia Communicator 9110 (actual size 158 × 112 mm)
Joystick Text Entry: This method is similar to the soft keyboard text entry technique. A keyboard (alphabetic order or QWERTY arrangement) is presented on the display, and the joystick is used to move the cursor through the alphabet. The desired letter is selected by pressing the joystick or an ENTER key. This technique is sometimes called the date stamp method because, similar to a date stamp, the desired character is selected by rotating through the character set. Video arcade games often use this technique for players to enter their name when they achieve a high score. The technique is also commonly used for entering text into some electronic musical instruments. Although this method is reasonable for entering small amounts of text into devices with a simple interface, the method is frustratingly slow and not suitable for even modest amounts of text entry.
Telephone Keypad: The desire for an effective text entry using the telephone keypad is fuelled by the increase in text messaging services, and the movement toward consolidation of technologies such as wireless telephony and handheld computers. Text entry on most mobile phones is based on the standard 12-key telephone keypad as shown in Figure 2.
Figure 2: The standard 12-key telephone keypad
The 12-key keypad consists of number keys 0-9 and two additional keys "*" and "#". Characters A to Z are spread over keys 2-9 in alphabetical order. The placement of characters is based on an international standard (Grover, King, & Kuschler, 1998). Since there are fewer keys than the 26 needed for the characters A-Z, three or four characters are grouped on each key, and so, ambiguity arises. There are three main approaches for overcoming this ambiguity: Multitap, two-key, and one-key with disambiguation.
Multitap: The Multitap method is currently the most common and simplest text input method for mobile phones. With this approach, the user presses each key one or more times to specify the input character (Jun, Bryan, & Peter). For example, the 2 key is pressed once for the character A, twice for B, and three times for C. A problem arises when the user attempts to enter two letters from the same key consecutively. To overcome this, Multitap employs a time-out on the key presses, usually 1-2 seconds, such that the absence of key presses during the time-out indicates completion of the current letter. Although Multitap eliminates ambiguity, it is quite slow, with a keystrokes per character (KSPC) rate of approximately 2.03 (Wigdor and Balakrishnan, 2003). For example, to enter the word "ON" the user presses the 6 key three times, waits for the system to time out, and then presses the 6 key twice more to enter the N.
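The Multitap behaviour described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the function names and the "|" symbol used to mark a time-out wait are assumptions made for illustration, and only the letters A-Z are handled.

```python
# Standard 12-key letter layout (letters on keys 2-9).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Reverse lookup: letter -> (key, number of presses needed).
LETTER_TO_TAPS = {
    letter: (key, presses)
    for key, letters in KEYPAD.items()
    for presses, letter in enumerate(letters, start=1)
}


def multitap_encode(word, pause="|"):
    """Return the key presses for `word`.

    `pause` (a hypothetical notation) marks a time-out wait, which is
    needed when two consecutive letters share the same key.
    Raises KeyError for characters that are not letters a-z.
    """
    groups = []
    previous_key = None
    for letter in word.lower():
        key, presses = LETTER_TO_TAPS[letter]
        if key == previous_key:
            groups.append(pause)
        groups.append(key * presses)
        previous_key = key
    return "".join(groups)


def multitap_kspc(word):
    """Keystrokes per character for one word, ignoring time-out waits."""
    total = sum(LETTER_TO_TAPS[letter][1] for letter in word.lower())
    return total / len(word)


print(multitap_encode("on"))  # 666|66
print(multitap_kspc("on"))    # 2.5
```

The per-word KSPC computed here differs from the language-wide figure of about 2.03 quoted above, which is an average weighted by how often each letter occurs in English text.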
In the two-key method, the user presses two keys successively to specify a character. The first key selects the group of characters; the second key specifies the position within the group. For example, to enter the character 'K', the user presses the '5' key to select the group 'J', 'K', 'L', followed by the '2' key to select 'K', which is the second character in the group. The method has a KSPC of 2, since every letter requires two consecutive key presses (Wigdor and Balakrishnan, 2003).
In one-key with disambiguation, the user presses the sequence of keys that forms the required text. The system then computes all possible letter combinations for the sequence, looks them up in a dictionary, and presents only the valid combinations, from which the user picks the needed one. This method is also referred to as linguistic disambiguation. Evidently, the term "one-key" in "one-key with disambiguation" is an oversimplification. T9 was the first disambiguating technology to work with a standard mobile phone keypad, but it is not the only such technology.
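As a rough illustration of the two-key method, the following Python sketch encodes each letter as a key followed by the letter's position on that key. The function names are hypothetical and the sketch handles only the letters A-Z.

```python
# Standard 12-key letter layout (letters on keys 2-9).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def two_key_encode(word):
    """Encode each letter as two presses: group key, then position (1-based)."""
    presses = []
    for letter in word.lower():
        for key, letters in KEYPAD.items():
            if letter in letters:
                presses.append(key + str(letters.index(letter) + 1))
                break
        else:
            raise ValueError(f"no key carries the character {letter!r}")
    return presses


def two_key_decode(presses):
    """Invert two_key_encode: look up the position within the key's group."""
    return "".join(KEYPAD[pair[0]][int(pair[1]) - 1] for pair in presses)


print(two_key_encode("k"))           # ['52']
print(two_key_decode(["63", "62"]))  # on
```

Every letter costs exactly two presses, which is why the KSPC is 2 regardless of the text entered, and no time-out is needed between letters.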
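The disambiguation step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not how T9 is actually implemented: the tiny word list is hypothetical, and enumerating every combination is practical only for short key sequences.

```python
from itertools import product

# Standard 12-key letter layout (letters on keys 2-9).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Hypothetical miniature dictionary, for illustration only.
WORDS = {"the", "tie", "good", "home", "gone", "hood"}


def candidates(key_sequence):
    """All letter combinations that the key sequence could stand for."""
    letter_groups = [KEYPAD[key] for key in key_sequence]
    return ["".join(combo) for combo in product(*letter_groups)]


def disambiguate(key_sequence, dictionary):
    """Keep only the combinations that are valid dictionary words."""
    return [word for word in candidates(key_sequence) if word in dictionary]


print(len(candidates("843")))       # 27
print(disambiguate("843", WORDS))   # ['the', 'tie']
print(disambiguate("4663", WORDS))  # ['gone', 'good', 'home', 'hood']
```

A practical system would more likely index the dictionary by key sequence in advance rather than enumerate combinations at typing time, since the number of combinations grows exponentially with the length of the sequence.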
Predictive Text Input
Predictive texting is intended to simplify text entry and to reduce the input burden by predicting what the user is entering (Ling, 2008). This can be accomplished by analyzing a large collection of documents, called a corpus, to establish the relative frequency of characters, digrams (pairs of characters), trigrams, words, or phrases in the language of interest. These statistical properties are used to suggest or predict letters or words as text is entered. Instead of requiring repeated presses of a key to pick one character from its group, as in the Multitap method, the predictive text input method lets each keystroke stand for a whole group of characters and lets built-in algorithms decipher the key sequence into the most probable intended words and phrases (Rick et al). Predictive input combined with Multitap (e.g. eZiTap) increases the rate at which text is entered, with fewer taps; however, linguistic knowledge must be added to the system in order to avoid meaningless words. Naturally, linguistic disambiguation is not perfect, since multiple words may have the same key sequence. For example, pressing keys 843 gives 27 possible letter combinations, but with linguistic knowledge only two valid words remain to choose from. In this case, the user must press additional keys to obtain the desired word.
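The corpus statistics mentioned above (relative frequencies of characters, digrams, and words) can be gathered with a few lines of Python. This is a minimal sketch: the helper names are assumptions, and a real corpus would need tokenization and normalization decisions that are not addressed here.

```python
from collections import Counter


def character_counts(text):
    """Count every character, including spaces."""
    return Counter(text)


def digram_counts(text):
    """Count every pair of adjacent characters."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def word_counts(text):
    """Count whitespace-separated words, ignoring case."""
    return Counter(text.lower().split())


def relative_frequencies(counts):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


print(digram_counts("abab"))                           # Counter({'ab': 2, 'ba': 1})
print(relative_frequencies(character_counts("abab")))  # {'a': 0.5, 'b': 0.5}
```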
To motivate this approach, consider a user writing the message "Meet at home later". Assuming the standard telephone keypad with T9, a user who has typed the first two words and is keying in the sequence for "home" ('4663') will be shown "i", "in", "inn", and "good" as the keys are typed, with "home" finally entered after a 'Next' key press. However, a corpus analysis can reveal that "home" is the most likely word being typed, even when only three letters have been typed, based on the previous word, "at".
According to Yijue How and Min-Yen Kan (2005), in language modeling terms, an n-gram model, specifically a bi-gram word model, is used to make the prediction. That is, we select the word with the highest conditional probability given the joint evidence from the typed sequence and the previous word. Counts are again estimated from the collected corpus. All word suggestions can thus be ordered by their probability. However, according to R. Shriram et al. (2006), there are a few caveats to consider in basing a language model on a standard corpus. These include:
The corpus may not be representative of the user language: The idea that a corpus is "representative of a language" is questionable. This is because users typically use a much richer set of characters and words than appear in any corpus, and the statistical properties in the user's set may differ from those in the corpus. A simple example is the space key, which is the most common character in English text (Soukoreff R. W. & MacKenzie I. S., 1995).
The corpus ignores the editing process: A corpus contains no information about the editing process, and we feel this is an unfortunate omission. Users are fallible and the creation of a text message - or interaction with a system on a larger scale - involves much more than the perfect linear input of alphanumeric symbols. The input process is really the editing process.
The corpus does not capture input modalities: Text documents do not reflect how they were created. For example, a corpus includes both capital and lowercase characters. In simple language models this distinction is ignored (e.g., "A" and "a" are considered the same). A more expansive model can easily accommodate this distinction simply by treating capital and lowercase characters as distinct symbols. Yet, from the input perspective, both approaches are wrong. Uppercase and lowercase characters are never entered via separate keys on a keyboard; thus, the seemingly more accurate treatment of uppercase and lowercase characters as distinct symbols is just as wrong.
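Leaving these caveats aside, the bi-gram word prediction described earlier in this section can be sketched in Python. The sketch is illustrative only: the toy corpus and function names are assumptions, probabilities are plain relative counts with no smoothing, and there is no fallback when the previous word has never been seen.

```python
from collections import Counter, defaultdict

# Standard 12-key letter layout (letters on keys 2-9).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {l: key for key, letters in KEYPAD.items() for l in letters}

# Hypothetical toy corpus (letters a-z only), for illustration.
TOY_CORPUS = [
    "meet at home later",
    "stay at home",
    "arrive at home",
    "look at good books",
]


def to_keys(word):
    """Key sequence that types `word` with one press per letter."""
    return "".join(LETTER_TO_KEY[letter] for letter in word)


def train_bigrams(corpus):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for previous, current in zip(words, words[1:]):
            counts[previous][current] += 1
    return counts


def predict(bigrams, previous_word, typed_keys):
    """Rank words consistent with the typed keys by P(word | previous word)."""
    followers = bigrams.get(previous_word, Counter())
    total = sum(followers.values())
    scored = [
        (count / total, word)
        for word, count in followers.items()
        if to_keys(word).startswith(typed_keys)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored]


bigrams = train_bigrams(TOY_CORPUS)
print(predict(bigrams, "at", "466"))  # ['home', 'good']
```

With this toy corpus, "home" is ranked first after only three key presses because it follows "at" three times out of four.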
The dictionary model, a database of words with their frequency of use, can also be used for predictive text input. In this case, prediction is done by searching the dictionary for matches and ordering them by the relative frequency of the words. Predictions made using the dictionary model are usually not as accurate as predictions made from a corpus. A further problem with the dictionary model is that it is memory intensive. As memory is both limited and expensive, the dictionary model has a very significant cost, both in the money paid for memory and in the features that the memory could otherwise have provided, such as additional games, a larger address book, storage of more SMS messages, longer voice memos, and screen savers (R. Shriram et al., 2006). However, considering the limited processing power of mobile phones, and the fact that a corpus might not reflect a user's language, the dictionary model is mostly used. Examples of dictionary-based predictive text products include T9 and eZiText.
This paper implements text entry in three Nigerian languages for mobile communication using a predictive multi-tap text entry system. The system is an application for creating and sending text messages via SMS on mobile phones. It is aimed at reducing the time spent, and easing the problems encountered, by Nigerian users in composing text messages in their native languages. The user composes text messages using a predictive text input system that is complementary to the basic multi-tap text entry method most users are familiar with. It uses the text entry behaviour of multi-tap while at the same time predicting whole words for the user to select. For example, to write the word "nigeria", the predictive multi-tap text entry system predicts the word for selection. The prediction is done by looking up a dictionary of words for any match with the substring typed so far. Words that match are presented in a list ordered by their probability of occurring, based on frequency and recency of use, in combination with some conditions.
$$P(\mathrm{word}) = P_{\mathrm{frequency}}(\mathrm{word}) \times P_{\mathrm{recency}}(\mathrm{word}) \qquad (2)$$
where $P(\mathrm{word})$, $P_{\mathrm{frequency}}(\mathrm{word})$, and $P_{\mathrm{recency}}(\mathrm{word})$ are, respectively, the probability of the word occurring, the probability of the word based on relative frequency, and the probability of the word based on recency of use. The ordering conditions are:
Words that start with the search substring are placed at the top of the list and are ordered from most probable to least probable. The most probable word is given by $\arg\max P(\mathrm{word})$.
Words that do not start with the search substring are placed at the bottom of the list and are ordered from most probable to least probable.
If words in the same class (words that start with the search substring, or words that do not) have equal probabilities, these words are ordered by length, with the longest word first.
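The ordering conditions and Equation (2) can be illustrated with a short Python sketch. This is not the Java Micro Edition code of the system described here. In particular, the paper does not give a formula for the recency probability, so the decay used below (one over one plus the time since last use, on a hypothetical logical clock) is purely an assumption, as are the sample entries.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    word: str
    count: int      # how many times the word has been used
    last_used: int  # hypothetical logical clock value of the last use


def probability(entry, total_count, now):
    """P(word) = P_frequency(word) * P_recency(word), as in Equation (2)."""
    p_frequency = entry.count / total_count
    # ASSUMPTION: recency decays with time since last use (now >= last_used).
    p_recency = 1.0 / (1 + now - entry.last_used)
    return p_frequency * p_recency


def predict(entries, substring, now):
    """Return matching words ordered by the three conditions."""
    total = sum(entry.count for entry in entries)
    matches = [entry for entry in entries if substring in entry.word]
    matches.sort(key=lambda entry: (
        not entry.word.startswith(substring),  # prefix matches come first
        -probability(entry, total, now),       # then most probable first
        -len(entry.word),                      # ties: longest word first
    ))
    return [entry.word for entry in matches]


# Hypothetical dictionary contents, for illustration only.
ENTRIES = [
    Entry("night", 4, 9),
    Entry("nigeria", 4, 9),
    Entry("benign", 10, 10),
    Entry("lagos", 5, 10),
]
print(predict(ENTRIES, "nig", now=10))  # ['nigeria', 'night', 'benign']
```

In this example "benign" has the highest probability but is listed last because it does not start with the search substring, while "nigeria" precedes "night" because the two have equal probabilities and "nigeria" is longer.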
If no match is found or the intended word is not in the prediction list, the user can add the intended word to the system, thus improving the performance of the system in the future.
After the message has been composed, it is then sent via SMS to an address (phone number) specified by the user.
Results and Discussion
The text input model proposed above was simulated using Java Micro Edition. The results are shown as screenshots.
When the application is launched, the compose message interface is activated first; Figure 3 shows this message composition interface. The interface allows the user to type in the message using the multitap method. It also has a prediction command, which is used to ask the system to predict the word being typed. When the prediction key is pressed, the list of predicted words is displayed, allowing the user to select the intended word. After the intended word is selected, the system automatically completes the word being typed. The composed message is sent by typing a valid phone number in the destination address interface and then pressing the OK button.
Figure 3: Message composition interface
The user interface was designed to be user-friendly, simple, and straightforward. Commands for operations explicitly describe what is being done.
Figure 4: Prediction list
In Figure 4, the user enters text using multitap input method and presses the prediction button, so that the system can predict the word being typed. The user selects the desired word or adds the intended word to the dictionary by pressing the add word command.
Figure 5: Menu options
From the compose message interface shown in Figure 5, the menu contains five options. The Continue option allows users to continue with the composition of their message. The Dictionary submenu is used to select which of the three Nigerian languages to use. The "My words" submenu allows users to add new words, delete a word, and edit a word. The Help submenu provides users with contextual help, while Exit allows users to exit from the menu.
Figure 6: Dictionary selection
From the dictionary list shown in Figure 6, the desired dictionary is selected and the OK button is pressed.
From this paper, it can be seen that combining the multitap input technique with predictive text input can help reduce the number of keystrokes per letter, thereby increasing the rate at which text is entered and reducing the stress associated with text input on mobile devices. In Nigeria, applications developed from this work will help increase the daily use of our mother tongues and can help people who are not fluent in the languages (Hausa, Igbo and Yoruba) communicate better by reducing their spelling mistakes.
To develop applications for use in Nigeria, the following need to be included in the input technique: prediction based on when the word was last used, prediction of the next word to be typed, and phrase prediction. Also, given the anticipated increase in the processing power of future mobile devices, a corpus-based predictive model, which will give better predictions, can be implemented.