A Study On Speech And The Vocal Tract Biology Essay


The vocal tract (the acoustic tube) is modeled as a co-axial concatenation of lossless acoustic tubes of different lengths and diameters. The cross-sectional area of any of the tubes can be varied independently to simulate the changing shape of the vocal tract. The first tube starts at the glottis and the last tube ends at the lips or the nostrils. The vocal tracts of an average adult male and female speaker are approximately 17 cm and 15 cm long respectively (Dew and Jensen 1977). The vocal tract includes the oral cavity, pharyngeal cavity, and nasal cavity. It is the most important component in speech production. The number of tubes, diameter, and length of each tube acoustically determine the resonance and anti-resonance of the vocal tract, and place of articulation.

Vocal cord vibration classifies speech into voiced or unvoiced sounds based on periodicity [1]. The vocal tract shape is characterized by a set of formant frequencies; for an average adult male the lowest formant of the neutral tract is about 500 Hz. These formants are numbered from low to high frequency and are called the first formant F1, second formant F2, third formant F3, fourth formant F4 and so on (Pickel 1980). Their values depend on the vocal tract shape.
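The figure of about 500 Hz can be checked against the simplest idealization, a uniform lossless tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c/(4L). The sketch below is only an illustration under that assumption; the speed of sound c = 350 m/s and the function name are choices made here, not values taken from the study.

```python
def uniform_tube_formants(length_m, n_formants=4, c=350.0):
    """Resonances (Hz) of a uniform lossless tube, closed at one end and open
    at the other: F_n = (2n - 1) * c / (4 * L).
    c = 350 m/s is an assumed speed of sound in warm, moist air."""
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_formants + 1)]


# Tract lengths quoted in the text: about 17 cm (male) and 15 cm (female).
male = uniform_tube_formants(0.17)
female = uniform_tube_formants(0.15)
print([round(f) for f in male])
print([round(f) for f in female])
```

A real vocal tract is not uniform, so measured vowel formants move away from these values; the sketch only shows why a 17 cm tract places its lowest resonance near 500 Hz.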

Each vowel has its own sound spectrum and hence a unique combination of formants. The first four formants have been found sufficient for classifying the different vowels. The shape of the vocal cavity modifies the spectrum of the excitation signal to create recognizable speech sounds. This forward transformation, from the shape of the vocal tract between glottis and lips to the sound, determines the acoustic characteristics of the sound (Fant, 1960; Flanagan, 1972). Hence, from a given description of the vocal tract shape, the resulting sound can be estimated accurately. The inverse transformation, from the speech acoustics to the vocal tract shape, is not yet well understood. The problem of estimating the vocal tract shape from the speech sound is called the inverse vocal tract problem, and it has a variety of practical and theoretical applications. Estimation techniques such as measurement of acoustic impedance at the lips, measurement of formant frequencies, and LPC based analysis have been used to estimate vocal tract shapes. LPC based analysis of speech is the preferred technique, as it can provide real-time estimates of the vocal tract shape directly from the speech signal with current computational and hardware capabilities. Further, LPC co-efficients can be transformed to other parameter sets that are useful for investigating and estimating the intra-speaker vocal tract shape.

Determining the intra-speaker vocal tract shape from the speech signal is a practical necessity, because the phonetic distinctiveness and speaker individuality of the vowels uttered by an individual male or female speaker correlate highly with the formants F1 and F2. The gross vocal tract shapes estimated from vowels using F1 and F2 are found to create the widest spread among the vowels for individual speakers.

We also propose to look at the spread of formants for individual speakers at different occurrences and to investigate the role of this variability in speaker identification and speaker-specific recognition, which is useful in low-cost ASR applications for home/bank security, tele-banking and application-specific access to data and databases.

2 PROBLEM STATEMENT

Determining the intra-speaker vocal tract shapes from the speech signal is an important problem. There is some evidence that phonetic distinctiveness and speaker individuality are deeply ingrained in the vocal tract shapes estimated from the vowels using formants and area function approximation. Solving this problem is useful in speech applications such as speaker identification, forensics, speech recognition and speech coding.

In practice, the output signal is observed without direct measurement of the input (glottal excitation). The ambiguities in the vocal tract shapes arise due to the limited bandwidth of the speech signal. To solve these difficulties in determining the intra-speaker vocal tract shapes from the speech signal, additional prior knowledge of the speech production mechanism is deployed.

This research employs a vocal tract model and determines the set of vocal tract shapes for the same vowel utterance at different times, capturing the minimal and maximal movements of the articulatory parameters.

Minimal-maximal variation of acoustic features appears in speech due to changes in the speaker's state, speech environment, health and emotion, and due to intentional imitation or disguise. Previous studies [28], [32], [41] and [42] indicate that speaker variation results in a spread of spectral amplitudes, pitch, formant frequencies, formant bandwidths, turbulent noise, etc.

They are usually characterized by source of voice and filter. The other features, like spectrum, the formant vector, and F0 vector provide high probability measures enabling discrimination among various speakers. Formant frequencies, as one of these features, are rather important parameters that are typically measured and compared in actual forensic and speech related applications.

Higher formants F3 and F4 further carry speaker specific information. Gross vocal tract shape estimation from the lower formants causes the largest spread among the vowels and of individual speakers.

We propose to use area function approximation of a person taken at different times, in different contexts. The steady state vowels of adult male and female speakers are recorded at different times and the variability of the resulting vocal tract shapes and formants spread are measured on intra and inter-speaker basis.

We also propose to look at the spread of formants for individual male and female speakers at different occurrences to investigate the variability role on speaker specific recognition applications.

3 RESEARCH OBJECTIVE

The research objective is to study the variation of vocal tract shapes and formant spread for male and female speakers, and its reliability for educated speakers of Andhra Pradesh, and to establish the use of this variation for speaker recognition purposes. There is evidence that phonetic distinctiveness and speaker individuality are deeply ingrained in the vocal tract shape estimated from the vowels using formant frequencies. This is demonstrated by the Acoustic Articulatory Model on vowels and the speaker-based area function approximation of the vocal tract.

Here, a new technique is proposed to approximate the area function of a person at different times and in different contexts. The variability of the resulting shapes is measured on an intra-speaker and inter-speaker basis. Such vocal tract shapes are arrived at for each subject for a pre-defined set of phonemes, namely /a/, /e/, /i/, /o/ and /u/. The vocal tract shape correlation graphs of a vowel superimposed on itself, versus the discrimination provided against other vowels pronounced by male speakers, are calculated. The time averages of the worst and the best patterns of the ensemble are plotted.

Error minimization is carried out using an all-pole LPC filter. Each vowel is analysed in this way to obtain the vocal tract shape for vowels of male and female speakers, taking 30 samples of 30 subjects in different contexts. The vocal tract shape is arrived at for each subject from 30 sets of data recorded at different times, for the predefined set of phonemes /a/, /e/, /i/, /o/ and /u/.

Using LPC and Correlation Analysis, the vocal tract shape variability of the individual subject is found. Study of variability of the above vocal tract shape among 30 different speakers is highlighted to identify intra-speaker variability. The identified variability can be used as a cue for personal identification in speaker specific recognition.

These two techniques are used for analysis of phonetic distinctiveness and speaker individuality deeply ingrained in vocal tract shapes, estimated from the vowels using formants and area function approximation of the vocal tract shapes.

The proposed new technique approximates an area function to the vocal tract shapes. We propose to use the area function approximation of a person taken at different times and in different contexts. The resulting variability of the vocal tract shapes is then measured on an intra-speaker and inter-speaker basis, using the minimal and maximal vocal tract shapes for vowels of male and female speakers.

Phonetic distinctiveness and speaker individuality are ingrained in the vocal tract shapes estimated from the vowels using formant frequencies. This is demonstrated by the acoustic model on vowels and speaker based area function approximation, and formant spread where the formant spread will be more in F1 and F2 than in F3 and F4.

4 DESCRIPTION OF THE RESEARCH WORK

The purpose of speech is communication. Speech is characterized as a signal carrying message or information [1]. It is an acoustic waveform that carries temporal information from a speaker to the listener. Acoustic transmission and reception of speech work efficiently, but only over limited distance. At frequencies used by the vocal tract and ear, radiated acoustic energy spreads spatially and diminishes rapidly in intensity. Even if the source could produce large amounts of acoustic power, the medium supports only a fraction of it without distortion, and the remaining balance gets dissipated in air-dust particles, molecular disturbance, and in overcoming aero-molecular viscosity. The sensitivity of the ear is limited by ambient acoustic noise in the environment and by physiological noises, in and around ear drum.

4.1 The Speech Communication Pathway

A simplified view of speech [2] is a communication pathway, from the speaker to the listener. At the linguistic level of communication, an idea first originates in the mind of the speaker. The idea is then transformed into words, phrases, and sentences according to the grammatical rules of the language. At the physiological level of communication, the brain creates electrical signals that move along the motor nerves. These electric signals activate muscles in the vocal cords, and vocal tract. This vocal cord movement results in pressure changes within the vocal tract and in particular, at the lips initiating the sound wave that propagates in space.

4.2 Sound Propagation

Sound waves are created by vibration and are propagated in air or other media by vibration of the media particles. In speech production, air particles are perturbed near the lips, and this perturbation moves as a chain reaction through lossy free space to the listener. The laws of physics describe the generation and propagation of sound in the vocal system.

A detailed acoustic theory must consider the effects of the following:

Losses due to heat conduction and viscous friction at vocal tract walls.

Elasticity of the vocal tract walls, radiation of sound at the lips and nostrils.

Nasal coupling.

Excitation of sound in the vocal tract.

Time variation of the vocal tract shape.

The vocal tract is modeled as a tube of non-uniform, time-varying cross-section. For frequencies corresponding to wavelengths that are long compared to the dimensions of the vocal tract (less than about 4000 Hz), it is reasonable to assume plane wave propagation along the axis of the tube. A further simplifying assumption is that there are no losses due to viscosity or thermal conduction, either in the bulk of the fluid or at the walls of the tube. With these assumptions, and the laws of conservation of mass, momentum and energy, Portnoff has shown that sound waves in the tube satisfy the pair of equations (4.1) and (4.2).

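$$-\frac{\partial p}{\partial x} = \rho\,\frac{\partial (u/A)}{\partial t} \tag{4.1}$$

$$-\frac{\partial u}{\partial x} = \frac{1}{\rho c^{2}}\,\frac{\partial (pA)}{\partial t} + \frac{\partial A}{\partial t} \tag{4.2}$$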

where

p = p(x, t) is the variation in sound pressure in the tube at position x and time t.

u = u(x, t) is the variation in the volume velocity flow at position x and time t.

ρ is the density of air in the tube.

c is the velocity of sound.

A = A(x, t) is the "area function" of the tube; i.e., the value of cross-section area, normal to the axis of the tube, as a function of a distance along the tube and as a function of time.

Complete solution of differential equations requires that pressure and volume velocity be found for values of x and t in the region bounded by the glottis and the lips.

4.3 Concatenated Tube Model

Fig. 4.1 Concatenated tube model.

A widely used model for speech production is based upon the assumption that the vocal tract can be represented as a concatenation of many small cylindrical tubes, as illustrated in Fig. 4.1. The cross-sectional area of any of the tubes is varied independently to simulate the changing shape of the vocal tract. The length of any tubular segment is changed to reflect the movements of articulators such as the lips, the jaws, the cheeks, the tongue, and the hyoid bone. This variation in the shape and length of the tract at different places, along its length, leads to the production of different sounds.

In time-domain articulatory synthesis, wave propagation equations are used for modeling the passage of airflow through the piecewise-cylindrical model, and a time-varying airflow generated by the glottis is fed into the glottal end of the tract as an input. The time-varying pressure signal obtained at the lip-end of the vocal tract is the synthesized speech output.

Digital Signal Processing (DSP) techniques are employed during modeling, using the speech signal converted into a discrete-time sequence and assuming that all the small cylindrical segments shown in Fig. 4.1 are of equal length. Because waves propagate at the constant velocity c, sampling in the spatial domain maps directly to sampling in the time domain. Kelly and Lochbaum proposed a method for time-domain articulatory synthesis based on these principles in 1962.
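The link between spatial and temporal sampling can be made concrete. In the usual equal-length lossless tube formulation, one sample period equals the round-trip delay of one segment, T = 2Δx/c, so a tract of length L sampled at rate Fs corresponds to N = 2·L·Fs/c segments. The sketch below assumes that convention and c = 350 m/s; the function names are illustrative only.

```python
def section_length_m(fs_hz, c=350.0):
    """Segment length whose round-trip delay equals one sample period:
    T = 2 * dx / c, hence dx = c / (2 * Fs). c = 350 m/s is assumed."""
    return c / (2.0 * fs_hz)


def sections_for_tract(length_m, fs_hz, c=350.0):
    """Number of equal-length segments spanning a tract of the given length."""
    return length_m / section_length_m(fs_hz, c)


# A 17 cm tract at the 22,100 Hz sampling rate reported later in the text.
n_sections = sections_for_tract(0.17, 22100)
print(round(n_sections, 2), "segments of", round(section_length_m(22100) * 1000, 2), "mm")
```

Under these assumptions a 17 cm tract needs roughly 21 to 22 segments at that rate, which is the same order as the 25-tube model used in this work.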

4.4 Wave Propagation in Concatenated Lossless Tube

If we consider the Kth tube with cross-sectional area $A_k$, the pressure and volume velocity in that tube have the form shown in Eqs. (4.3) and (4.4).

$$p_k(x,t) = \frac{\rho c}{A_k}\left[u_k^{+}(t - x/c) + u_k^{-}(t + x/c)\right] \tag{4.3}$$

$$u_k(x,t) = u_k^{+}(t - x/c) - u_k^{-}(t + x/c) \tag{4.4}$$

where x is the distance measured from the left-hand end of the Kth tube ($0 \le x \le l_k$) and $u_k^{+}(\cdot)$ and $u_k^{-}(\cdot)$ are the positive-going and negative-going traveling waves in the Kth tube. The relation between the traveling waves in adjacent tubes can be obtained by applying the physical principle that pressure and volume velocity must be continuous in both time and space everywhere in the system. This provides boundary conditions that can be applied at both ends of each tube.

Consider in particular the junction between the Kth and (K+1)st tubes as depicted in Fig. 4.2

Fig. 4.2 Illustration of the junction between two lossless tubes.

Applying the continuity conditions at the junctions gives

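$$p_k(l_k, t) = p_{k+1}(0, t) \tag{4.5}$$

$$u_k(l_k, t) = u_{k+1}(0, t) \tag{4.6}$$

Substituting Eq. (4.3) and Eq. (4.4) into Eq. (4.5) and Eq. (4.6) gives

$$\frac{A_{k+1}}{A_k}\left[u_k^{+}(t-\tau_k) + u_k^{-}(t+\tau_k)\right] = u_{k+1}^{+}(t) + u_{k+1}^{-}(t) \tag{4.7}$$

$$u_k^{+}(t-\tau_k) - u_k^{-}(t+\tau_k) = u_{k+1}^{+}(t) - u_{k+1}^{-}(t) \tag{4.8}$$

where $\tau_k = l_k/c$ is the time for a wave to travel the length of the Kth tube. From Fig. 4.2 we observe that part of the positive-going wave that reaches the junction is propagated on to the right while part is reflected back to the left. Likewise, part of the backward traveling wave is propagated on to the left while part is reflected back to the right. Thus, if we solve for $u_{k+1}^{+}(t)$ and $u_k^{-}(t+\tau_k)$ in terms of $u_{k+1}^{-}(t)$ and $u_k^{+}(t-\tau_k)$, we observe how the forward and reverse traveling waves propagate in the overall system. Solving Eq. (4.8) for $u_k^{-}(t+\tau_k)$ and substituting the result into Eq. (4.7) yields

$$u_{k+1}^{+}(t) = \frac{2A_{k+1}}{A_{k+1}+A_k}\,u_k^{+}(t-\tau_k) + \frac{A_{k+1}-A_k}{A_{k+1}+A_k}\,u_{k+1}^{-}(t) \tag{4.9}$$

Subtracting Eq. (4.8) from Eq. (4.7) gives

$$u_k^{-}(t+\tau_k) = -\frac{A_{k+1}-A_k}{A_{k+1}+A_k}\,u_k^{+}(t-\tau_k) + \frac{2A_k}{A_{k+1}+A_k}\,u_{k+1}^{-}(t) \tag{4.10}$$

It can be seen from Eqs. (4.9) and (4.10) that the quantity

$$r_k = \frac{A_{k+1}-A_k}{A_{k+1}+A_k} \tag{4.11}$$

sets the fraction of each incident wave that is reflected at the junction. The quantity $r_k$ is called the reflection co-efficient for the Kth junction.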
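The junction relations can be exercised numerically. The sketch below assumes the standard lossless-tube form r_k = (A_{k+1} - A_k)/(A_{k+1} + A_k), with the transmitted parts weighted by (1 + r_k) and (1 - r_k); the function names are illustrative. It checks that volume velocity and pressure stay continuous across the junction, which is the condition the derivation starts from.

```python
def reflection_coefficient(a_k, a_next):
    """r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k) for two adjoining tubes."""
    return (a_next - a_k) / (a_next + a_k)


def scatter(u_fwd_in, u_bwd_in, a_k, a_next):
    """Scattering at the junction between tube k and tube k+1.

    u_fwd_in : forward wave arriving from tube k, u_k^+(t - tau_k)
    u_bwd_in : backward wave arriving from tube k+1, u_{k+1}^-(t)
    Returns (u_{k+1}^+(t), u_k^-(t + tau_k)).
    """
    r = reflection_coefficient(a_k, a_next)
    u_fwd_out = (1.0 + r) * u_fwd_in + r * u_bwd_in   # 2*A_{k+1}/(A_k+A_{k+1}) = 1 + r
    u_bwd_out = -r * u_fwd_in + (1.0 - r) * u_bwd_in  # 2*A_k/(A_k+A_{k+1}) = 1 - r
    return u_fwd_out, u_bwd_out


# Illustrative areas (arbitrary units): a narrow tube opening into a wider one.
a_k, a_next = 1.0, 3.0
fwd_out, bwd_out = scatter(1.0, 0.0, a_k, a_next)

# Continuity of volume velocity and of pressure (pressure scales as 1/A).
flow_left, flow_right = 1.0 - bwd_out, fwd_out - 0.0
press_left, press_right = (1.0 + bwd_out) / a_k, (fwd_out + 0.0) / a_next
print(fwd_out, bwd_out, flow_left == flow_right, press_left == press_right)
```

When the two areas are equal the coefficient is zero and both waves pass through unchanged, which is a quick sanity check on the signs.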

5 PROPOSED PLAN OF THESIS CHAPTERS

The thesis report is structured into six main sections, namely Introduction, Literature survey, Theoretical background, Technical design, Discussion of results, and Conclusion with recommendations for future work.

Chapter 2 reviews the literature on estimating dynamic vocal tract shapes, covering the techniques used by different researchers and their limitations in human vocal tract modeling, followed by the mechanical measuring methods and their limitations. It gives a brief description of the progression from mechanical to electrical synthesis of vowels. It explains the different techniques used for estimating vocal tract shapes from acoustic measurements and speech signals, and their limitations. Phonetic distinctiveness and formant frequency spread are important parameters that are typically measured and compared on an intra-speaker and inter-speaker basis in actual speaker identification and forensic applications. Nevertheless, both between-speaker and within-speaker variations in the 'F' pattern are still not well established. Therefore, an attempt is made to focus further on this context.

Chapter 3 is focused on the basic uniform lossless acoustic tube model, the sound wave propagation in concatenated lossless tube models. Emphasis is placed on the time-dependent processing of speech signal. In addition it gives an introduction to different signal processing methods using LPC for speech analysis that are used in this work.

Chapter 4 elaborates the implementation techniques using LPC based dynamic vocal tract shape estimation for vowels of male and female speakers. It briefly describes the intra-speaker vocal tract shape variability estimation for vowels using LPC for male and female speakers. It includes time varying minimal and maximal vocal tract shape variability estimation of 30 samples of 30 subjects of male and female speakers. At the end, this chapter discusses the bounds for average maximal and average minimal for the test sample vocal tract shape estimation of vowels for male and female speakers. It also explains the Correlation Analysis of Vowels superimposed on itself versus the discrimination provided against the vowels pronounced by them and their percentage lying in various vowels.

Chapter 5 is devoted to the classification of vowels based on formants, using peak detection on the LPC spectrum. It emphasizes an algorithm for intra-speaker formant estimation for vowels of 30 samples from 30 subjects, both male and female, and the averages of F1 and F2 are plotted. Further, it presents an algorithm for inter-speaker formant estimation of 30 samples from 30 subjects, both male and female, and for calculating the mean and standard deviation of the vowels. The proposed technique is compared with the Praat software, and the formants F1 and F2 of male speakers are compared with those of female speakers. Independent speaker recognition for vowels using Euclidean distance is also discussed in this chapter.
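Peak detection on an LPC spectrum can be sketched as follows. This is a minimal illustration, not the algorithm of Chapter 5: it evaluates the all-pole envelope 1/|A(e^{jw})| on a frequency grid and reports the local maxima as formant candidates. The test filter is synthetic, built from two resonances whose frequencies, bandwidths and sampling rate are hypothetical values chosen only for the demonstration.

```python
import cmath
import math


def lpc_spectrum_db(a, fs_hz, n_points=512):
    """Envelope 20*log10(1/|A(e^{jw})|) on n_points from 0 Hz up to fs/2,
    where a = [1, a1, ..., ap] are the coefficients of A(z)."""
    freqs, mags = [], []
    for i in range(n_points):
        f = i * (fs_hz / 2.0) / n_points
        w = 2.0 * math.pi * f / fs_hz
        denom = sum(a[m] * cmath.exp(-1j * w * m) for m in range(len(a)))
        freqs.append(f)
        mags.append(-20.0 * math.log10(abs(denom)))
    return freqs, mags


def pick_formants(freqs, mags, max_formants=4):
    """Frequencies of the local maxima of the envelope, lowest first."""
    peaks = [freqs[i] for i in range(1, len(mags) - 1)
             if mags[i - 1] < mags[i] >= mags[i + 1]]
    return peaks[:max_formants]


def resonator_poly(formants_hz, bandwidths_hz, fs_hz):
    """Synthetic A(z): a product of second-order sections, one per resonance."""
    a = [1.0]
    for f, bw in zip(formants_hz, bandwidths_hz):
        radius = math.exp(-math.pi * bw / fs_hz)
        theta = 2.0 * math.pi * f / fs_hz
        sec = [1.0, -2.0 * radius * math.cos(theta), radius * radius]
        # Polynomial multiplication (convolution) of a with the new section.
        a = [sum(a[i] * sec[j - i] for i in range(len(a)) if 0 <= j - i < 3)
             for j in range(len(a) + 2)]
    return a


# Hypothetical two-formant filter at an assumed 8 kHz sampling rate.
a_demo = resonator_poly([700.0, 1200.0], [80.0, 100.0], 8000.0)
freqs, mags = lpc_spectrum_db(a_demo, 8000.0)
found = pick_formants(freqs, mags)
print(found)
```

The peaks land close to, but not exactly on, the pole frequencies because of the finite grid and the overlap between neighbouring resonances; root-finding on A(z) is a common alternative when closely spaced formants merge into one peak.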

Chapter 6, the last chapter, gives a summary of the investigations, conclusions drawn from the results and some suggestions for further work.

The vocal tract shape area values obtained from 30 samples of a speaker for vowel /a/ are tabulated in Appendix A. Also the formant frequencies obtained from 30 samples of 30 different speakers for vowel /a/ are tabulated in Appendix B.

6 RESULTS

6.1 Analysis of Vocal Tract Shape Bounds for Male Speakers

The bounds of vocal tract shape for maximal average and minimal average for the test samples collected from male speakers for vowel /a/ are obtained. The vocal tract shape correlation graph of a vowel /a/ is superimposed on itself versus the discrimination provided against other vowels pronounced by male speakers. The experimental results of gross vocal tract shapes for vowels are obtained using LPC modeling implemented in MATLAB. The table 6.1 summarizes the vocal tract shape correlation graph of a vowel superimposed on itself versus the discrimination provided against other phonemes. When the vowel /a/ is superimposed on itself and other vowels, the correlations or percentage matching are as follows:

/a/ in /a/ = 87%

/a/ in /e/ = 74%

/a/ in /i/ = 70%

/a/ in /o/ = 54%

/a/ in /u/ = 41%

The correlations (percentage matching) of the remaining vowels /e/, /i/, /o/ and /u/ and their discriminations are shown in Table 6.1. The percentage matching with the correct vowel is the highest in every case, which suggests that the bounds can serve as a parameter in speech pattern recognition. Table 6.1 also shows that vowel /a/ is most likely to be misrecognized as vowel /e/, followed by /i/, /o/ and /u/ respectively. Misrecognition results when there is least variability in the correlation graph.

The Fig. 6.1 shows the test sample of male speaker. The bounds of vocal tract shape for maximal average and minimal average for vowel /a/ is shown in the Fig. 6.2.

Fig. 6.1 The speech signal for vowel /a/ test sample of male speaker.

[Figure 6.2 plot data. Legend: vts = test sample, min, max]
min: 0.03, 0.03, 0.03, 0.03, 0.04, 0.04, 0.04, 0.04, 0.05, 0.06, 0.07, 0.1, 0.13, 0.16, 0.183, 0.21, 0.27, 0.33, 0.26, 0.25, 0.46, 0.08, 0.55, 0.08
max: 0.09, 0.09, 0.09, 0.1, 0.11, 0.11, 0.12, 0.12, 0.14, 0.17, 0.2, 0.28, 0.38, 0.45, 0.517, 0.56, 0.67, 0.75, 0.58, 0.52, 0.88, 0.21, 0.82, 0.51

Fig. 6.2 The approximate vocal tract shape representation for vowel /a/, for single male speaker test sample, where vts = test sample, Min vts and Max vts is the minimal and maximal bounds of vowel /a/.

The vocal tract shape correlation graph of a vowel /a/ of a male speaker is superimposed on itself versus the discrimination provided against other speaker pronounced vowels as shown in the Fig. 6.3.

[Figure 6.3 bar chart. Axes: vowel (a, e, i, o, u) and % percentage matching (0 to 100)]

Fig. 6.3 Percentage of matching of vowel /a/ in various vowels for single male speaker test sample where BLUE represents the vowel that lies inside the boundary, BROWN represents the vowel that lies outside the boundary.

1 Vowel /a/ in /a/,

2 Vowel /a/ in /e/,

3 Vowel /a/ in /i/,

4 Vowel /a/ in /o/,

5 Vowel /a/ in /u/.

[Figure 6.4 plot data. Legend: vts = test samples, min, max]
min: 0.03, 0.03, 0.03, 0.03, 0.04, 0.04, 0.04, 0.04, 0.05, 0.06, 0.07, 0.1, 0.13, 0.16, 0.183, 0.21, 0.27, 0.33, 0.26, 0.25, 0.46, 0.08, 0.55, 0.08
max: 0.09, 0.09, 0.09, 0.1, 0.11, 0.11, 0.12, 0.12, 0.14, 0.17, 0.2, 0.28, 0.38, 0.45, 0.517, 0.56, 0.67, 0.75, 0.58, 0.52, 0.88, 0.21, 0.82, 0.51

Fig. 6.4 The approximate vocal tract shape representation for vowel /a/, for three male speakers test samples, where vts = test sample, Min vts and Max vts is the minimal and maximal bounds of vowel /a/.

Table 6.1 Vocal tract shape correlation (percentage matching) of a vowel superimposed on itself versus the discrimination provided against other vowels pronounced by male speakers. Each row is the test vowel; each column is the vowel it is matched against.

Vowels   /a/   /e/   /i/   /o/   /u/
/a/       87    74    70    54    41
/e/       50    82    58    58    41
/i/       62    65    89    40    74
/o/       30    54    34    95    10
/u/       58    65    82    65    96
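The claim that each vowel matches itself best can be checked directly from the Table 6.1 values. The sketch below reads each row as one test vowel matched against the five reference vowels, which follows the "/a/ in /e/" notation used above; the helper names are illustrative. It reports the best-matching reference vowel for every row and the margin between the correct match and its nearest rival.

```python
VOWELS = ["a", "e", "i", "o", "u"]

# Percentage matching for male speakers, copied from Table 6.1.
TABLE_6_1 = {
    "a": [87, 74, 70, 54, 41],
    "e": [50, 82, 58, 58, 41],
    "i": [62, 65, 89, 40, 74],
    "o": [30, 54, 34, 95, 10],
    "u": [58, 65, 82, 65, 96],
}


def best_match(table):
    """Reference vowel with the highest percentage for each test vowel."""
    return {v: VOWELS[row.index(max(row))] for v, row in table.items()}


def margins(table):
    """Correct-vowel percentage minus the highest competing percentage."""
    out = {}
    for v, row in table.items():
        own_index = VOWELS.index(v)
        rival = max(x for j, x in enumerate(row) if j != own_index)
        out[v] = row[own_index] - rival
    return out


matches = best_match(TABLE_6_1)
gaps = margins(TABLE_6_1)
print(matches)
print(gaps)
```

A small margin marks the vowel pairs most at risk of misrecognition; for these figures /a/ has the smallest margin (against /e/) and /o/ the largest.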

6.2 Analysis of Vocal Tract Shape Bounds for Female Speakers

The bounds of vocal tract shape for maximal average and minimal average for the test samples collected from female speakers for vowel /a/ are obtained. The vocal tract shape correlation graph of a vowel /a/ is superimposed on itself versus the discrimination provided against other vowels pronounced by female speakers. The experimental results of gross vocal tract shapes for vowels are obtained using LPC modeling implemented in MATLAB. The table 6.2 summarizes the vocal tract shape correlation graph of a vowel superimposed on itself versus the discrimination provided against other phonemes. When the vowel /a/ is superimposed on itself and other vowels, the correlations or percentage matching are as follows:

/a/ in /a/ = 96%

/a/ in /e/ = 76%

/a/ in /i/ = 84%

/a/ in /o/ = 82%

/a/ in /u/ = 78%

The correlation or percentage of remaining vowels /e/, /i/, /o/ and /u/, their results and discriminations are shown in table 6.2. This clearly shows that the percentage of matching with the right vowel is the highest, compared to the other vowels. This also explains that there is an opportunity of parametric use in speech pattern recognition. The table 6.2 also shows the probability of fault recognition for vowel /a/ in vowel /e/ followed by vowel /i/, /o/ and /u/ respectively.

The fault recognition results when there is least variability in the correlation graph.

Fig. 6.5 shows the test sample of a female speaker. The bounds of the vocal tract shape for maximal average and minimal average for vowel /a/ are shown in Fig. 6.6.

Fig. 6.5 The speech signal for vowel /a/ test sample of female speaker.

[Figure 6.6 plot data. Legend: vts = test sample, min, max]
min: 0.03, 0.03, 0.03, 0.03, 0.04, 0.04, 0.04, 0.04, 0.05, 0.06, 0.07, 0.1, 0.13, 0.16, 0.183, 0.21, 0.27, 0.33, 0.26, 0.25, 0.46, 0.08, 0.55, 0.08
max: 0.09, 0.09, 0.09, 0.1, 0.11, 0.11, 0.12, 0.12, 0.14, 0.17, 0.2, 0.28, 0.38, 0.45, 0.517, 0.56, 0.67, 0.75, 0.58, 0.52, 0.88, 0.21, 0.82, 0.51

Fig. 6.6 The approximate vocal tract shape representation for vowel /a/, for single female speaker test sample, where vts = test sample, Min vts and Max vts is the minimal and maximal bounds of vowel /a/.

The vocal tract shape correlation graph of a vowel /a/ of a female speaker is superimposed on itself versus the discrimination provided against other speaker pronounced vowels as shown in the Fig. 6.7.

[Figure 6.7 bar chart. Axes: vowel (a, e, i, o, u) and % percentage matching (0 to 100)]

Fig. 6.7 Percentage of matching of vowel /a/ in various vowels for single female speaker test sample where BLUE represents the vowel that lies inside the boundary, BROWN represents the vowel that lies outside the boundary.

1 Vowel /a/ in /a/,

2 Vowel /a/ in /e/,

3 Vowel /a/ in /i/,

4 Vowel /a/ in /o/,

5 Vowel /a/ in /u/.

[Figure 6.8 plot data. Legend: vts = test samples, min, max]
min: 0.03, 0.03, 0.03, 0.03, 0.04, 0.04, 0.04, 0.04, 0.05, 0.06, 0.07, 0.1, 0.13, 0.16, 0.183, 0.21, 0.27, 0.33, 0.26, 0.25, 0.46, 0.08, 0.55, 0.08
max: 0.09, 0.09, 0.09, 0.1, 0.11, 0.11, 0.12, 0.12, 0.14, 0.17, 0.2, 0.28, 0.38, 0.45, 0.517, 0.56, 0.67, 0.75, 0.58, 0.52, 0.88, 0.21, 0.82, 0.51

Fig. 6.8 The approximate vocal tract shape representation for vowel /a/, for three female speakers test samples, where vts = test sample, Min vts and Max vts is the minimal and maximal bounds of vowel /a/.

Table 6.2 Vocal tract shape correlation (percentage matching) of a vowel superimposed on itself versus the discrimination provided against other vowels pronounced by female speakers. Each row is the test vowel; each column is the vowel it is matched against.

Vowels   /a/   /e/   /i/   /o/   /u/
/a/       96    72    84    82    78
/e/       82    97    80    70    62
/i/       79    76    96    82    75
/o/       75    66    79    87    66
/u/       29    75    52    25    96

7 SUMMARY OF RESULTS

The auto-regressive method of speech analysis based on linear prediction has been used. This model depends only on the previous outputs of the system. The simplest model of the vocal tract consists of many coaxially linked cylindrical tubes, which produce an all-pole transfer function. Vocal tract shape is estimated from the reflection co-efficients obtained by LPC analysis of speech signals, using Wakita's speech analysis model. Vocal tract area values are obtained for the natural vowels of male and female speakers. The speech samples are acquired at a sampling frequency of 22,100 Hz in 30 ms blocks with an overlap of 10 ms. An LPC filter order of 25 is used.
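The analysis chain described above (blocking, autocorrelation, linear prediction) can be sketched with the autocorrelation method and the Levinson-Durbin recursion. This is a generic textbook version written for illustration, not the MATLAB implementation used in the study; the function names are assumptions, and the sign of the returned PARCOR co-efficients depends on the chosen convention for A(z) = 1 + a1 z^-1 + ... + ap z^-p.

```python
def split_frames(samples, fs_hz, frame_ms=30.0, overlap_ms=10.0):
    """Cut a signal into fixed-length blocks that overlap by overlap_ms."""
    frame_len = int(round(fs_hz * frame_ms / 1000.0))
    hop = frame_len - int(round(fs_hz * overlap_ms / 1000.0))
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def autocorrelation(frame, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one block."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.
    Returns (a, k, err): a = [1, a1, ..., ap], the PARCOR co-efficients k,
    and the final prediction error energy."""
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("non-positive prediction error; frame is degenerate")
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + ki * a[i - j]
        new_a[i] = ki
        a = new_a
        err *= 1.0 - ki * ki
    return a, k, err


# One second of silence at 22,100 Hz, used only to show the block layout.
frames = split_frames([0.0] * 22100, 22100)
# Autocorrelation of a first-order process (r[n] = 0.5**n): one pole suffices.
a, k, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(len(frames), len(frames[0]), a, k, err)
```

For a real vowel block one would call `autocorrelation(frame, 25)` and `levinson_durbin(r, 25)`, usually after applying a tapering window; windowing and pre-emphasis are left out here because the text does not specify them.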

The dynamic vocal tract models are obtained using 25 cylindrical lossless tubes with 24 reflection co-efficients. From the calculated reflection co-efficients, the denominator co-efficients of the transfer function V(z) = VL(z)/VG(z) are computed. The reflection co-efficient at the lips and the area values vary across the vowels /a/, /e/, /i/, /o/ and /u/. The largest reflection co-efficients occur where the relative changes in area are greatest.
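The step from reflection co-efficients to an area function can be sketched as below. It assumes the junction definition r_k = (A_{k+1} - A_k)/(A_{k+1} + A_k), which rearranges to A_{k+1} = A_k (1 + r_k)/(1 - r_k); the starting area is arbitrary, so only relative areas are recovered. Function names are illustrative, and mapping LPC PARCOR co-efficients onto r_k may require a sign flip or reversed ordering depending on the convention in use.

```python
def areas_from_reflection(r, first_area=1.0):
    """Relative tube areas from junction reflection co-efficients.
    len(r) junctions give len(r) + 1 tube sections."""
    areas = [first_area]
    for rk in r:
        if not -1.0 < rk < 1.0:
            raise ValueError("reflection co-efficients must lie in (-1, 1)")
        areas.append(areas[-1] * (1.0 + rk) / (1.0 - rk))
    return areas


def reflection_from_areas(areas):
    """Inverse mapping: one reflection co-efficient per adjacent pair of tubes."""
    return [(b - a) / (b + a) for a, b in zip(areas, areas[1:])]


# 24 junctions give 25 sections, matching the model size quoted above.
flat = areas_from_reflection([0.0] * 24)
# Illustrative values only: a widening, then a constriction.
demo_r = [0.5, -0.2, 0.1]
demo_areas = areas_from_reflection(demo_r)
print(len(flat), demo_areas)
```

Because each area depends on the ratio (1 + r)/(1 - r), co-efficients close to ±1 produce very large area jumps, in line with the remark that the largest reflection co-efficients occur where the relative area change is greatest.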

Phonetic distinctiveness and speaker individuality are ingrained in the vocal tract shapes estimated from the vowels. This is demonstrated by the acoustic models on vowels and speaker based area function approximation. A new technique has been proposed to approximate an area function of a person, taken at different times, and in different contexts. Then the resulting variability of the vocal tract shapes are measured on intra and inter-speaker basis using the minimal and maximal vocal tract shapes for vowels of male and female speakers. The bounds of vocal tract shapes for maximal average and minimal average for the test samples of vowels collected from male and female speakers have been obtained. The vocal tract shape correlation graphs of a vowel superimposed on itself versus the discrimination provided against other male and female speakers pronounced vowels have been computed.

Phonetic distinctiveness and speaker individuality are ingrained in the vocal tract shapes estimated from the vowels using formant frequencies. This is demonstrated by the acoustic model on vowels and speaker based area function approximation and formants spread.

The area function approximation approach is discussed and implemented in the 4th chapter. The 5th chapter covers the development and implementation of a new technique for measuring intra and inter-speaker variation in the form of the formant spreads F12, F23 and F34.

Phonetic distinctiveness and speaker individuality are measured in the form of formant spreads, i.e., of F1, F2 and of F3, F4. Phonetic distinctiveness among the vowels, on both an intra-speaker and an inter-speaker basis, is well understood and is mostly ingrained in the lowest formants F1 and F2, i.e., the formant spread is larger for F1 and F2. It is probable that the higher formants carry more speaker-specific information, i.e., their spread is smaller than that of F1 and F2.

In the previous section, the experimental results for formants F1 and F2 obtained from AR modeling using MATLAB are compared with the Praat software. The first formant F1 of male is compared with the first formant F1 of female and the second formant F2 of male is compared with the second formant F2 of female speaker.

In the section 5.11.2, the experimental results obtained by the Euclidean distance method and MLR using Standard Deviation method for the vowels of the male and female speakers are plotted.

8 FUTURE SCOPE

From the investigations reported in this thesis, it is concluded that the vocal tract shape correlation graph of a vowel superimposed on itself, versus the discrimination provided against the other phonemes, is a usable recognition measure. When the vowel /a/ is superimposed on itself and on other vowels, the percentage matching for the correct vowel is the highest, and the resulting recognition score is better than that of the other methods considered. This technique also explains the possibility of misrecognition when there is least variability in the correlation graph.

In this thesis, it can also be concluded that the phonetic distinctiveness can be measured for front vowels in the form of formant spread F1 and F2. The formant spread will be more for F1 and F2 than F3 and F4.

The dynamic vocal tract shape estimates are found to be consistent when the LPC order is fixed at 25, although the order can be varied depending on the application. It is suggested that future investigations of LPC based shape estimation model VCV and CVC words for word recognition and speaker identification. It is further suggested that LPC based shape estimation of vocal tract signatures for glides and nasal phonemes may further reduce uncertainty and thus improve the accuracy of identification and recognition.

In this thesis, LPC analysis has been applied to vocal tract shape estimation based on reflection co-efficients. Vocal tract shape estimation may also be investigated with other techniques such as excited LPC, MFCC, etc. for modeling vowels. Further work could record a large number of speakers of different age groups and different formal backgrounds, and investigate vocal tract shape estimation using other analysis techniques such as formant tracking and articulatory analysis-by-synthesis. These techniques can also be applied to regional languages, for recognizing words and detecting vowels within words, such as /a/ in father.
