This paper investigates photonic reservoir computing as a new approach to optical speech recognition, evaluated on an isolated digit recognition task. An analytical approach to simulating the photonic reservoir is used to reduce computation time compared with numerical methods, which is very important when processing large signals such as speech. Adjusting the reservoir parameters and choosing a good nonlinear mapping from the input signal into the reservoir were also observed to improve recognition accuracy. Perfect recognition accuracy (100%) can be achieved for speech signals without noise. For noisy signals with signal-to-noise ratios of 0 to 10 dB, the accuracy ranged between 92% and 98%. The photonic reservoir showed a 9% to 18% improvement over classical reservoir networks with hyperbolic tangent nodes.
Keywords: Photonic reservoir computing, Classic reservoir computing, Semiconductor optical amplifiers, Speech recognition.
1. Introduction
Speech recognition has proved to be a considerably difficult task. In this domain, methods based on artificial neural networks have long held the state of the art. Many of these methods seem to have reached maturity, given the strides already made to remove their limitations. Photonic reservoir computing is a new approach to the speech recognition task.
Reservoir computing (RC) is a training concept from the field of machine learning that was developed independently by Maass [1,2] and Jaeger. They showed that a recurrent neural network can be used without adapting its internal weights, with the output generated by a simple linear discriminant or regression algorithm.
Reservoir computing is a particular kind of neural network that combines the advantages of recurrent and feedforward neural networks. It is based on a recurrent network of simple computational nodes with complex nonlinear dynamics, combined with a readout function. Because the internal weights need no training, modelling a reservoir is much simpler than modelling other neural network structures. Using optical components is a route towards all-optical neural networks, which could outperform the speed of electronics in the future [5,6].
In this paper, the reservoir is modelled using semiconductor optical amplifiers (SOAs), and this structure is applied to the speech recognition task. SOAs show gain saturation, and their steady-state characteristic somewhat resembles the upper branch of the hyperbolic tangent curve used in classical reservoirs. This is why SOAs are regarded as a natural bridge between classical reservoir computing and photonic reservoir computing.
This paper shows that applying an analytical approach to photonic reservoir computing, adjusting the reservoir parameters, and choosing a good nonlinear mapping from the input signal into the reservoir can improve recognition accuracy for speech with white noise. The results further show that the photonic reservoir improves accuracy by about 9% to 18% compared with classical reservoirs with hyperbolic tangent nodes. Section 2 describes the structure of photonic reservoir computing. The isolated digit recognition task and the results obtained are discussed in Section 3, followed by a brief conclusion in Section 4.
2. Photonic reservoir computing
Photonic reservoir computing is a photonic implementation of the reservoir computing concept. It is a new framework for classification and recognition problems that offers the potential for high-speed, power-efficient and massively parallel information processing [8-11].
2.1. Principles of photonic reservoir computing
The field of machine learning is, in general, an attempt to emulate human-like computational power. Neural networks, as one component of this field, are trained to perform a specific task. Reservoir computing is an umbrella term for a family of methods based on dynamical recurrent neural networks. A reservoir computing system consists of two parts. The first part, the so-called reservoir, is a network of simple computational nodes with recurrent connections. The second part is a simple readout function. Training the system only involves updating the readout weights, usually by a simple linear regression on the reservoir states [12,13].
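As a minimal sketch of the training step described above, the readout weights can be fitted by ridge regression on a matrix of collected reservoir states. The function names `train_readout` and `apply_readout`, the regularisation value and the synthetic data are assumptions made for illustration; they are not taken from the simulations reported in this paper.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Fit linear readout weights by ridge regression.

    states:  (T, N) array, one reservoir state vector per time step
    targets: (T, K) array of desired outputs
    Returns an (N + 1, K) weight matrix; the last row acts as the bias.
    """
    # Append a constant column so the readout can learn an offset.
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    # Solve (X^T X + ridge * I) W = X^T Y; the bias is regularised too.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def apply_readout(states, w_out):
    """Apply trained readout weights to a matrix of reservoir states."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return X @ w_out

# Synthetic demonstration: the "states" here are random numbers standing in
# for real reservoir states, and the targets are an exact linear function.
rng = np.random.default_rng(0)
states = rng.standard_normal((200, 10))
true_w = rng.standard_normal((10, 3))
targets = states @ true_w + 0.5
w_out = train_readout(states, targets)
pred = apply_readout(states, w_out)
```

Only `w_out` is learned; the reservoir that produced `states` is never modified, which is what keeps training cheap.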
Fig. 1 is a schematic representation of the reservoir with its input and output. The reservoir acts as a preprocessing or filtering stage: it performs a nonlinear mixing of the input into a feature space so that the readout can extract the important features more easily. This is the basis of kernel methods in machine learning. The reservoir dynamics are formulated in Eq. 1:
where $x(t)$ is the reservoir state at time $t$, $u(t)$ is the input at time $t$, $W_{in}$ is the weight matrix from input to reservoir, $W_{res}$ is the internal weight matrix of the reservoir, $f$ is a nonlinear function (most often the hyperbolic tangent), and $\alpha$ is the leak rate, which sets how strongly the new state depends on previous states of the reservoir.
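The reservoir dynamics can be illustrated with a short sketch. It assumes one common leaky-integrator form of the update, $x(t+1) = (1-\alpha)\,x(t) + \alpha\,f(W_{in}u(t) + W_{res}x(t))$ with $f = \tanh$; other arrangements of the leak term exist, so this should be read as an assumption rather than as the exact equation used here. The function name `run_reservoir`, the matrix sizes and the random weights are likewise illustrative.

```python
import numpy as np

def run_reservoir(inputs, w_in, w_res, leak=0.5):
    """Drive a leaky tanh reservoir with an input sequence.

    inputs: (T, D) array, one input vector per time step
    w_in:   (N, D) input-to-reservoir weights
    w_res:  (N, N) internal reservoir weights
    leak:   leak rate in (0, 1]; 1 means no memory of the previous state
    Returns a (T, N) array of reservoir states.
    """
    n_nodes = w_res.shape[0]
    x = np.zeros(n_nodes)                      # reservoir starts at rest
    states = np.empty((inputs.shape[0], n_nodes))
    for t, u in enumerate(inputs):
        pre_activation = w_in @ u + w_res @ x
        # Assumed leaky-integrator update (one of several common forms).
        x = (1.0 - leak) * x + leak * np.tanh(pre_activation)
        states[t] = x
    return states

# Illustrative sizes and random weights (assumptions, not tuned values).
rng = np.random.default_rng(1)
n_nodes, n_inputs = 20, 3
w_in = rng.uniform(-1.0, 1.0, (n_nodes, n_inputs))
w_res = 0.1 * rng.uniform(-1.0, 1.0, (n_nodes, n_nodes))
inputs = rng.standard_normal((50, n_inputs))
states = run_reservoir(inputs, w_in, w_res, leak=0.3)
```

The resulting `states` matrix is what a linear readout would be trained on; neither `w_in` nor `w_res` is changed during training.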
For hardware implementation of reservoir computing, light is an attractive medium because of advantages including speed, power efficiency, large bandwidth and fast nonlinear effects on different time scales. Semiconductor optical amplifiers, with gain saturation in their behaviour, are the devices closest to the hyperbolic tangent function in the reservoir equation and are therefore an interesting candidate technology for building reservoirs. In addition, because of the carrier lifetime, an SOA carries information about past inputs into its response to new input: excited carriers live for a certain time before they return to their ground state. This property of the SOA can model the leak rate parameter in Eq. 1.
These properties suggest using a network of coupled semiconductor optical amplifiers as a reservoir with an off-line electronic readout. This network differs from classical reservoirs in some respects. In contrast to classical reservoirs, the interconnection weights in a photonic reservoir are fixed by the gain and losses in the SOAs. Also, the leak rate is a property of the SOA and cannot be adjusted. Below, the properties and simulation results of a semiconductor optical amplifier used as a reservoir node are investigated.
2.2. Semiconductor optical amplifier simulation model
The Agrawal model of the SOA was used in the simulations. This model captures basic features such as carrier lifetime, gain saturation and a gain-dependent phase shift. Moreover, the simulations assume coherent light (for example, from a laser), so all state variables and connection weights are complex; however, only power values are used for the readout. In this model, the rate equation below must be solved.
where $h$ is the gain $g$ integrated over the amplifier length $L$, $P_{out}$ and $\phi_{out}$ are the output power and phase of the reservoir node, $P_{in}$ and $\phi_{in}$ are the input power and phase, $\tau_c$ and $P_{sat}$ are the carrier lifetime and the saturation power that governs gain saturation, $g_0$ is the small-signal gain and $\alpha_H$ is the linewidth enhancement factor. Numerical methods could be used to solve this rate equation, but in our simulation an analytical approach was used, which was shown to give the closed form for $h$ in Eq. 5. The output power and phase then take the form of Eq. 6 and Eq. 7, in which $h(0)$ and $\alpha_H$ are the initial condition of $h$ and the linewidth enhancement factor.
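For reference, the Agrawal model is usually written in the following standard form. The symbol names ($h$, $P_{sat}$, $\tau_c$, $g_0$, $\alpha_H$) are a notational assumption made for this illustration, and the equation numbering used elsewhere in the text is not implied.

```latex
\begin{align}
\frac{dh}{dt} &= \frac{g_0 L - h}{\tau_c}
  - \frac{P_{in}(t)}{P_{sat}\,\tau_c}\left(e^{h} - 1\right), \\
P_{out}(t) &= P_{in}(t)\, e^{h(t)}, \\
\phi_{out}(t) &= \phi_{in}(t) - \tfrac{1}{2}\,\alpha_H\, h(t).
\end{align}
```

Here $h(t) = \int_0^L g(z,t)\,dz$ is the gain integrated over the amplifier length, and the product $P_{sat}\,\tau_c$ is the saturation energy. The first term of the rate equation describes gain recovery with time constant $\tau_c$; the second describes depletion of the gain by the input power.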
The output power was therefore obtained as follows:
and the output phase can be expressed as:
With these closed forms for the output power and phase of the SOA, an acceptable simulation can be achieved with much lower computation time than with numerical methods. This is an important feature in speech recognition because of the large volume of speech data. The present simulations reduced computation time by several orders of magnitude, a step ahead of previous work.
Simulation results showed that the SOA's output curve exhibits gain saturation and can therefore model the hyperbolic tangent in the reservoir equation. Fig. 2 shows that the gain curve of an SOA is a good approximation of the hyperbolic tangent. Another important SOA parameter in modelling the reservoir is the carrier lifetime, which makes the SOA respond like a first-order system with the carrier lifetime as its time constant. Fig. 3 shows the response of this SOA model to a pulse input. This behaviour can model the leak rate parameter in the reservoir equations.
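The two behaviours described above, gain saturation and first-order recovery, can be reproduced with a small numerical sketch. It integrates the standard integrated-gain rate equation of the Agrawal model with a plain forward-Euler step, which is the slower numerical route and not the closed-form solution used in this work. All parameter values (`g0L`, `tau_c`, `p_sat`, `alpha_h`) and the function name `simulate_soa` are illustrative assumptions.

```python
import numpy as np

def simulate_soa(p_in, dt, g0L=3.0, tau_c=300e-12, p_sat=10e-3, alpha_h=5.0):
    """Forward-Euler integration of the integrated-gain rate equation.

    p_in: 1-D array of input powers in watts, sampled every dt seconds
    Returns (p_out, phase_shift, h), each with the same length as p_in.
    Default parameter values are illustrative, not fitted to a device.
    """
    p_in = np.asarray(p_in, dtype=float)
    h = g0L                                  # start from the unsaturated gain
    hs = np.empty(len(p_in))
    for k, p in enumerate(p_in):
        hs[k] = h
        # Recovery towards g0L minus depletion by the input power.
        dh = (g0L - h) / tau_c - p / (p_sat * tau_c) * (np.exp(h) - 1.0)
        h = h + dt * dh
    p_out = p_in * np.exp(hs)                # power gain is exp(h)
    phase_shift = -0.5 * alpha_h * hs        # gain-dependent phase shift
    return p_out, phase_shift, hs

dt = 1e-12                                   # 1 ps step, well below tau_c
steps = 3000                                 # 3 ns, ten carrier lifetimes

# Gain saturation: the steady-state gain drops as the input power grows.
powers = [1e-6, 1e-3, 1e-2]
gains = []
for p in powers:
    p_out, _, _ = simulate_soa(np.full(steps, p), dt)
    gains.append(p_out[-1] / p)

# Pulse response: the gain is depleted during the pulse and then recovers
# roughly exponentially with the carrier lifetime as time constant.
pulse = np.concatenate([np.full(500, 1e-2), np.zeros(1500)])
_, _, h_pulse = simulate_soa(pulse, dt)
```

In this sketch `gains` decreases with input power while the output power still increases, which gives the saturating curve, and `h_pulse` relaxes back towards `g0L` after the pulse ends, which gives the first-order memory.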
3. Speech recognition task
Speech recognition is a large-scale pattern recognition problem that is very difficult to solve [18,19]. Photonic reservoir computing is a new approach to recognition and classification tasks, and in the present paper it was successfully applied to a speech recognition problem.
The speech recognition task used in the present simulations consisted of ten digits, zero to nine. Each of these words was uttered 10 times by five different speakers, giving 500 speech samples from the TI46 speech corpus. The speech signals required preprocessing, which involved transformation to the frequency domain together with selective filtering based on properties of the human ear, using the Lyon ear model.
The reservoir topology used in the present simulations was the swirl topology, illustrated in Fig. 4. In this structure, the inputs were fed to each of the SOAs and information was directed through the nodes.
3.3. Network parameters
Although the reservoir itself was left untrained, its performance depended on operating near the edge of stability. The dynamic regime of the reservoir was determined by the spectral radius, the largest eigenvalue (in magnitude) of the system's Jacobian at its maximal gain state. The best performance was found for a spectral radius equal to one.
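As a hedged illustration of this tuning step, the sketch below computes the spectral radius of a generic weight matrix and rescales the matrix to a target value. It works on an arbitrary random matrix rather than on the Jacobian of the SOA network, and the function names `spectral_radius` and `scale_to_spectral_radius` are assumptions made for this example.

```python
import numpy as np

def spectral_radius(w):
    """Largest eigenvalue magnitude of a square (real or complex) matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(w))))

def scale_to_spectral_radius(w, target=1.0):
    """Rescale w so that its spectral radius equals target."""
    rho = spectral_radius(w)
    if rho == 0.0:
        raise ValueError("matrix has zero spectral radius and cannot be scaled")
    return w * (target / rho)

# Illustrative random interconnection matrix (not the swirl topology).
rng = np.random.default_rng(2)
w = rng.standard_normal((50, 50))
w_scaled = scale_to_spectral_radius(w, target=1.0)
```

Scaling every weight by the same factor scales all eigenvalues by that factor, so a single multiplication is enough to place the matrix at the target spectral radius.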
The second parameter affecting the results was the phase. The optimized value of the phase was 3.953 rad in Eq. 9, where $\lambda_0$ is the vacuum wavelength and $n_{eff}$ is the effective index of the waveguide on a silicon-on-insulator platform. The third parameter was the input weight matrix of the reservoir, which maps the inputs into the reservoir. Through this mapping, a combination of speech features is formed as the input of each node. The weights of this matrix must therefore be selected carefully to obtain an appropriate combination of features, which was an important stage in the simulations. The final parameter was the size of the reservoir. A greater number of nodes resulted in better performance but also increased the size of the reservoir in hardware implementations. A balance was required between the number of nodes and network accuracy. Fig. 5 shows the clean speech recognition accuracy for different reservoir sizes. As shown in this figure, the best performance was obtained with 100 SOAs in the reservoir.
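For the phase parameter discussed above, the generic relation between waveguide length and accumulated phase can be written as follows. The interconnection length $d$ is a symbol assumed here for illustration; this is the textbook propagation-phase relation and is not asserted to be identical to Eq. 9.

```latex
\phi = \frac{2\pi\, n_{eff}\, d}{\lambda_0} \pmod{2\pi}
```

Because $d$ is typically many wavelengths long, small changes in length or effective index sweep $\phi$ over its full range, which is why the phase can be treated as a separate parameter to optimize.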
3.4. Speech recognition without noise
The present simulation results were compared with the results of a classical reservoir with hyperbolic tangent nodes. The comparison showed that using an adequate photonic reservoir structure and tuning the weight matrices and parameters appropriately could improve recognition accuracy at different SNRs by about 9% to 18%. Fig. 6 shows one of the input speech signals, the speech signal mapped into the photonic reservoir, and the output of the photonic reservoir (the state vector of the reservoir for this speech signal).
3.5. Noisy speech recognition
In this stage, noisy speech signals were used in order to investigate a more demanding task. White noise was added at signal-to-noise ratios from 10 dB down to 0 dB. Table 1 shows the error rate, and Figs. 7 and 8 show the speech recognition accuracy of the photonic reservoir, indicating that this kind of pre-processing can yield a marked improvement in the results.
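Adding white noise at a prescribed signal-to-noise ratio can be sketched as below. The helper names `add_white_noise` and `measured_snr_db`, the 440 Hz test tone and the sampling rate are assumptions for illustration; the actual experiments used recorded speech, not a synthetic tone.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng):
    """Return signal plus white Gaussian noise at the requested SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    noise = rng.standard_normal(signal.shape)
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for the noise power.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise *= np.sqrt(target_noise_power / noise_power)
    return signal + noise

def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean version."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

# Synthetic stand-in for a speech signal: one second of a 440 Hz tone.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = np.sin(2.0 * np.pi * 440.0 * t)
noisy = {snr: add_white_noise(clean, snr, rng) for snr in (10, 5, 0)}
```

Rescaling the sampled noise to the exact target power means each noisy signal has the requested SNR for that realisation, rather than only on average.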
4. Conclusion
In this paper, photonic reservoir computing was used as a new approach to optical speech recognition. A network of coupled SOAs was applied as a reservoir, and its performance was evaluated on a benchmark noisy speech recognition task. The results showed that this structure can produce satisfactory results in recognition and classification tasks and could therefore motivate future photonic hardware implementations of neural networks. In the experiments, an analytical approach was used to reduce computation time compared with earlier work, which is of utmost importance when processing large signals such as speech. In addition, adjusting the reservoir parameters and choosing a good mapping from the input signal into the reservoir were very significant in the simulations. Perfect recognition accuracy (100%) could be achieved for speech signals without noise. For noisy signals with signal-to-noise ratios of 0 to 10 dB, the accuracy ranged from 92% to 98%, an improvement of about 9% to 18% over classical reservoirs.