Bound On The Variance Of Biased Estimators English Language Essay



Direction of Arrival (DOA) estimation algorithms, which may take various forms, generally follow from the homogeneous solution of the wave equation. The models of interest in this dissertation apply equally to an EM wave and to an acoustic wave. Since the propagation model is fundamentally the same, we will, for analytical expediency, show that it follows from the solution of Maxwell's equations, which clearly are only valid for EM waves. In empty space, Maxwell's equations can be written as:

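\nabla \cdot \mathbf{E} = 0 (3.1)

\nabla \cdot \mathbf{B} = 0 (3.2)

\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} (3.3)

\nabla \times \mathbf{B} = \mu\varepsilon \frac{\partial \mathbf{E}}{\partial t} (3.4)

where "·" and "×", respectively, denote the "divergence" and "curl." Furthermore, B is the magnetic induction and E denotes the electric field, whereas μ and ε are the magnetic and dielectric constants respectively. Invoking 3.1, the following curl property results:

\nabla \times (\nabla \times \mathbf{E}) = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\nabla^2 \mathbf{E} (3.5)

\nabla \times (\nabla \times \mathbf{E}) = -\frac{\partial}{\partial t}(\nabla \times \mathbf{B}) = -\mu\varepsilon \frac{\partial^2 \mathbf{E}}{\partial t^2} (3.6)

\nabla^2 \mathbf{E} - \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 (3.7)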

The constant c is generally referred to as the speed of propagation. For EM waves in free space, it follows from the derivation that c = 1/\sqrt{\mu\varepsilon} \approx 3 \times 10^8 m/s. The homogeneous wave equation (3.7) constitutes the physical motivation for our assumed data model, regardless of the type of wave or medium. In some applications, the underlying physics is irrelevant, and it is merely the mathematical structure of the data model that counts.

3.2 Plane wave

In the physics of wave propagation, a plane wave is a constant-frequency wave whose wave fronts are infinite parallel planes of constant peak-to-peak amplitude normal to the phase velocity vector.

Figure 3.1

In practice, it is impossible to have a true plane wave, since only a plane wave of infinite extent can propagate as a plane wave. However, many waves can be approximately regarded as plane waves in a localized region of space, e.g., a localized source such as an antenna produces a field which is approximately a plane wave far enough from the antenna, in its far-field region. Likewise, when the length scales are much longer than the wavelength, as is often the case for light in optics, we can treat the waves as light rays which correspond locally to plane waves.

3.2.1 Mathematical definition

Two functions which meet the criteria of having a constant frequency and constant amplitude are the sine and cosine functions. One of the simplest ways to use such a sinusoid involves defining it along the direction of the x axis. The equation below uses the cosine function to express a plane wave travelling in the positive x direction.

A(x,t)=A_o \cos(kx - \omega t + \varphi)\, (3.8)

where A(x,t) is the magnitude of the wave at a given point in space and time. A_o is the amplitude of the wave, which is the peak magnitude of the oscillation. k is the wave number, or more specifically the angular wave number, and equals 2π/λ, where λ is the wavelength of the wave. k has units of radians per unit distance and is a measure of how rapidly the disturbance changes over a given distance at a particular point in time.

x is a point along the x axis. y and z are not considered in the equation because the wave's magnitude and phase are the same at every point on any given y-z plane. This equation defines what that magnitude and phase are.

ω is the wave's angular frequency, which equals 2π/T, where T is the period of the wave. ω has units of radians per unit time and is a measure of how rapidly the disturbance changes over a given length of time at a particular point in space.

t is a given point in time, and φ is the phase shift of the wave in radians. Note that a positive phase shift shifts the wave along the negative x axis direction at a given point in time. A phase shift of 2π radians shifts it by exactly one wavelength. Other formulations, which directly use the wavelength λ, period T, frequency f and velocity c, are as follows:

A=A_o \cos[2\pi(x/\lambda- t/T) + \varphi]\, (3.9)

A=A_o \cos[2\pi(x/\lambda- ft) + \varphi]\, (3.10)

A=A_o \cos[(2\pi/\lambda)(x- ct) + \varphi]\, (3.11)

To appreciate the equivalence of the above set of equations, note that

f = 1/T

c = \lambda/T = \omega/k
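The equivalence of these formulations can also be checked numerically. The following is a minimal sketch in Python; the function name `plane_wave_forms` and the sample values are illustrative assumptions and are not taken from the original derivation.

```python
import math

def plane_wave_forms(x, t, amplitude, wavelength, period, phase):
    """Evaluate the plane wave at (x, t) in four equivalent ways."""
    k = 2 * math.pi / wavelength    # angular wave number, rad per unit distance
    omega = 2 * math.pi / period    # angular frequency, rad per unit time
    f = 1 / period                  # frequency
    c = wavelength / period         # phase velocity
    with_k_omega = amplitude * math.cos(k * x - omega * t + phase)
    form_3_9 = amplitude * math.cos(2 * math.pi * (x / wavelength - t / period) + phase)
    form_3_10 = amplitude * math.cos(2 * math.pi * (x / wavelength - f * t) + phase)
    form_3_11 = amplitude * math.cos((2 * math.pi / wavelength) * (x - c * t) + phase)
    return with_k_omega, form_3_9, form_3_10, form_3_11

# Arbitrary sample point and wave parameters (illustrative values only).
values = plane_wave_forms(x=0.37, t=1.2, amplitude=2.0,
                          wavelength=0.5, period=0.25, phase=0.3)
print(values)
```

All four returned values should agree up to floating-point rounding, since each form only rewrites the argument of the cosine using f = 1/T and c = λ/T.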




3.2.2 Application

Plane waves are solutions of a scalar wave equation in a homogeneous medium. For vector wave equations, e.g., those describing waves in an elastic solid or electromagnetic radiation, the solution for a homogeneous medium is similar. In vector wave equations, the scalar amplitude A_o is replaced by a constant vector \mathbf{A}_o; e.g., in electromagnetism \mathbf{A}_o is the vector of the electric field, magnetic field, or vector potential. A transverse wave is one in which the amplitude vector is perpendicular to k, which is the case for electromagnetic waves in an isotropic space. In contrast, a longitudinal wave is one in which the amplitude vector is parallel to k, as is typical for acoustic waves in a gas or fluid.

The plane wave equation is true for arbitrary combinations of ω and k. However, any real physical medium only allows such waves to propagate for those combinations of ω and k that satisfy the dispersion relation of the medium. The dispersion relation is often expressed as a function, ω(k), where the ratio ω/|k| gives the magnitude of the phase velocity and dω/dk gives the group velocity. For electromagnetism in an isotropic medium with index of refraction n, the phase velocity is c/n, which equals the group velocity provided that the index is frequency independent.

In the linear uniform case, a solution of the wave equation can be expressed as a superposition of plane waves. This method is known as the Angular Spectrum method. In fact, the plane wave form of the solution is a general consequence of translational symmetry. In the more general case of periodic structures with discrete translational symmetry, the solution takes the form of Bloch waves, best known from crystalline atomic materials but also found in photonic crystals and other periodic wave equations.

3.3 Propagation

Many physical phenomena are either a result of waves propagating through a medium or exhibit a wave-like physical manifestation. Though 3.7 is a vector equation, we only consider one of its components, say E(r,t), where r is the radius vector. It will later be assumed that the measured sensor outputs are proportional to E(r,t). Interestingly enough, any field of the form E(r,t) = f(t - r^T α) satisfies 3.7, provided |α| = 1/c, with T denoting transposition. Through its dependence on t - r^T α only, the solution can be interpreted as a wave traveling in the direction α, with the speed of propagation 1/|α| = c. For the latter reason, α is referred to as the slowness vector. The chief interest herein is in narrowband forcing functions. The details of generating such a forcing function can be found in the classic book by Jordan [59]. In complex notation [63] and taking the origin as a reference, a narrowband transmitted waveform can be expressed as:

E(\mathbf{0},t) = s(t) e^{j\omega t} (3.12)

where s(t) is slowly time varying compared to the carrier e^{jωt}. For |r| ≪ c/B, where B is the bandwidth of s(t), we can write:

E(\mathbf{r},t) = s(t - \mathbf{r}^T\boldsymbol{\alpha}) e^{j\omega(t - \mathbf{r}^T\boldsymbol{\alpha})} \approx s(t) e^{j(\omega t - \mathbf{r}^T\mathbf{k})} (3.13)

In the last equation 3.13, the so-called wave vector k = αω was introduced, and its magnitude |k| = k = ω/c is the wavenumber. One can also write k = 2π/λ, where λ is the wavelength. Note that k also points in the direction of propagation, e.g., in the x-y plane we get:

\mathbf{k} = k (\cos\theta, \sin\theta)^T (3.14)

where θ is the direction of propagation, defined counter-clockwise relative to the x axis. It should be noted that 3.13 implicitly assumes far-field conditions, since an isotropic point source (one that radiates uniformly in all directions) gives rise to a spherical traveling wave whose amplitude is inversely proportional to the distance to the source. All points lying on the surface of a sphere of radius R will then share a common phase and are referred to as a wave front. This indicates that the distance between the emitters and the receiving antenna array determines whether the sphericity of the wave should be taken into account. The reader is referred to e.g., [10, 24] for treatments of near field reception. Far field receiving conditions imply that the radius of propagation is so large that a flat plane of constant phase can be considered, thus resulting in a plane wave as indicated in Eq. 3.13. Though not necessary, the latter will be our assumed working model for convenience of exposition.

Note that a linear medium implies the validity of the superposition principle, and thus allows for more than one traveling wave. Equation 3.13 carries both spatial and temporal information and represents an adequate model for distinguishing signals with distinct spatial-temporal parameters. These may come in various forms, such as DOA (in general azimuth and elevation), signal polarization, transmitted waveforms, temporal frequency etc. Each emitter is generally associated with a set of such characteristics. The interest in unfolding the signal parameters forms the essence of sensor array signal processing as presented herein, and continues to be an important and active topic of research.
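To make the role of the wave vector concrete, the sketch below evaluates the far-field narrowband model at a few sensor positions. It assumes the usual plane-wave form s(t) exp(j(ωt - r^T k)) with k = (2π/λ)(cos θ, sin θ)^T; the sensor layout (four sensors on the x axis, half a wavelength apart), the function names and the numerical values are hypothetical choices made for illustration only.

```python
import cmath
import math

def wave_vector(wavelength, theta):
    """Wave vector in the x-y plane; theta is counter-clockwise from the x axis."""
    k = 2 * math.pi / wavelength
    return (k * math.cos(theta), k * math.sin(theta))

def narrowband_field(s_t, omega, t, r, k_vec):
    """Assumed far-field narrowband model s(t) * exp(j(omega*t - r^T k)) at position r."""
    phase = omega * t - (r[0] * k_vec[0] + r[1] * k_vec[1])
    return s_t * cmath.exp(1j * phase)

wavelength = 0.1                      # metres (illustrative)
theta = math.radians(30)              # direction of propagation
omega = 2 * math.pi * 3e8 / wavelength
k_vec = wave_vector(wavelength, theta)

# Hypothetical sensors on the x axis, spaced half a wavelength apart.
sensors = [(m * wavelength / 2, 0.0) for m in range(4)]
fields = [narrowband_field(1.0, omega, 0.0, r, k_vec) for r in sensors]

# Phase difference between neighbouring sensors, in radians.
phase_steps = [cmath.phase(fields[m + 1] / fields[m]) for m in range(3)]
print(phase_steps)
```

Under these assumptions every neighbouring pair of sensors sees the same phase step, -(2π/λ) d cos θ, which is the quantity that DOA estimators exploit.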

3.4 Smart antenna

Smart antennas are devices which adapt their radiation pattern to achieve improved performance - either range or capacity or some combination of these [1].

The rapid growth in demand for mobile communications services has encouraged research into the design of wireless systems to improve spectrum efficiency and increase link quality [7]. By using existing resources more effectively, smart antenna technology has the potential to significantly increase the capacity of wireless systems. With intelligent control of signal transmission and reception, the capacity and coverage of mobile wireless networks can be significantly improved [2].

In a communication system, the ability to distinguish different users is essential. The smart antenna can be used to add spatial diversity, which is referred to as Space Division Multiple Access (SDMA). Conventionally, the most common multiple access schemes are Frequency Division Multiple Access (FDMA), Time Division Multiple Access (TDMA), and Code Division Multiple Access (CDMA). These schemes separate users in the frequency, time and code domains respectively, giving three different kinds of diversity.

The potential benefits of the smart antenna show in many ways: resistance to multipath fading, reduced delay spread to support high data rates, interference suppression, reduction of the near-far effect, lower outage probability, improved BER (Bit Error Rate) performance, increased system capacity, improved spectral efficiency, flexible and efficient handoff, expanded cell coverage, flexible cell management, extended battery life of the mobile station, and lower maintenance and operating costs.

3.4.1 Types of Smart Antennas

The environment and the system's requirements decide the type of Smart Antennas. There are two main types of Smart Antennas. They are as follows:

Phased Array Antenna

In this type of smart antenna, a number of fixed beams are available, and one of them is switched on or steered towards the target signal. This can be done with only a first stage of adjustment. In other words, the beam is steered as required by the moving target [2].

Adaptive Array Antenna

Combined with adaptive digital signal processing, this type of smart antenna uses a signal processing algorithm to measure the signal strength of the beam, so that the antenna can dynamically change the direction in which the transmit power is concentrated, as Figure 3.2 shows. Spatial processing can enhance capacity, so that multiple users share a channel.

An adaptive antenna array is a closed-loop feedback control system consisting of an antenna array and a real-time adaptive receiver processor, which uses feedback control to align the antenna array pattern automatically. It forms nulls in the directions of interfering signals to cancel them and strengthens the useful signal, thereby achieving anti-jamming [3].


Figure 3.2

3.4.2 Advantages and disadvantages of smart antenna


First of all, the smart antenna provides a high level of efficiency and power for the target signal. Smart antennas generate narrow pencil beams when a large number of antenna elements are used at high frequency. Thus, in the direction of the target signal, the efficiency is significantly high. Adaptive array antennas produce a power gain of the same order, provided that a fixed number of antenna elements is used.

Another improvement is in the amount of interference that is suppressed. Phased array antennas suppress interference by means of the narrow beam, whereas adaptive array antennas suppress it by adjusting the beam pattern [2].

The main disadvantage is the cost. Such devices cost more than conventional ones, not only in electronics but also in energy. That is to say, the device is expensive and will also decrease the life of other devices. The number of receiver chains used must be reduced in order to lower the cost. Costs also rise because RF electronics and an A/D converter are needed for each antenna.

Moreover, the size of the antenna is another problem. Large base stations are needed to make this method efficient, which increases the size; apart from this, multiple external antennas are needed on each terminal.

Finally, disadvantages arise where diversity is concerned. When mitigation is needed, diversity becomes a serious problem, since the terminals and base stations must be equipped with multiple antennas.

3.5 White noise 

White noise is a random signal with a flat power spectral density. In other words, the signal contains equal power within a fixed bandwidth at any centre frequency. White noise draws its name from white light, in which the power spectral density is distributed over the visible band in such a way that the eye's three colour receptors are approximately equally stimulated.

In the statistical sense, a time series w_t is characterized as weak white noise if {w_t} is a sequence of serially uncorrelated random variables with zero mean and finite variance. Strong white noise additionally has the property of being independent and identically distributed, which implies no autocorrelation. In particular, if w_t is normally distributed with zero mean and standard deviation σ, the series is called Gaussian white noise [1].

An infinite-bandwidth white noise signal is purely a theoretical construction which cannot be realized. In practice, the bandwidth of white noise is restricted by the transmission medium, the mechanism of noise generation, and finite observation capabilities. If a random signal is observed to have a flat spectrum over the widest possible bandwidth of a medium, we refer to it as "white noise".

3.5.1 Mathematical definition

White random vector

A random vector w is a white random vector if and only if its mean vector and autocorrelation matrix are the following:

\mu_w = \mathbb{E}\{ \mathbf{w} \} = 0 (3.15)

R_{ww} = \mathbb{E}\{ \mathbf{w} \mathbf{w}^T\} = \sigma^2 \mathbf{I} . (3.16)

That is to say, a white random vector is a zero-mean vector whose autocorrelation matrix is a multiple of the identity matrix. When the autocorrelation matrix is a multiple of the identity, we say that it has spherical correlation.
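Conditions 3.15 and 3.16 can be illustrated by drawing samples and estimating the mean vector and autocorrelation matrix. This is a minimal sketch under the assumption of Gaussian entries; the dimension, the value of σ, the sample count and the helper name `sample_autocorrelation` are illustrative choices.

```python
import random

def sample_autocorrelation(samples):
    """Average of w w^T over all samples, returned as a nested list."""
    dim = len(samples[0])
    acc = [[0.0] * dim for _ in range(dim)]
    for w in samples:
        for i in range(dim):
            for j in range(dim):
                acc[i][j] += w[i] * w[j]
    return [[value / len(samples) for value in row] for row in acc]

rng = random.Random(0)        # fixed seed so the sketch is repeatable
sigma = 2.0                   # assumed standard deviation of each entry
samples = [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(20000)]

mean = [sum(w[i] for w in samples) / len(samples) for i in range(3)]
R = sample_autocorrelation(samples)

print(mean)   # each entry should be close to 0
print(R)      # should be close to sigma**2 times the identity matrix
```

With independent, zero-mean entries the estimated matrix approaches σ²I as the number of samples grows; the off-diagonal terms shrink towards zero but are not exactly zero for any finite sample.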

White random process

A continuous-time random process w(t), where t ∈ ℝ, is a white noise signal if and only if its mean function and autocorrelation function satisfy the following:

\mu_w(t) = \mathbb{E}\{ w(t)\} = 0 (3.17)

R_{ww}(t_1, t_2) = \mathbb{E}\{ w(t_1) w(t_2)\} = (N_{0}/2)\delta(t_1 - t_2). (3.18)

That is, it is a zero-mean process for all time and has infinite power at zero time shift, since its autocorrelation function is the Dirac delta function.

The autocorrelation function 3.18 implies the following power spectral density 3.19.

S_{ww}(\omega) = N_{0}/2 ,\! (3.19)

This follows because the Fourier transform of the delta function is equal to one. Because the power spectral density 3.19 is the same at all frequencies, we call the process white, by analogy with the frequency spectrum of white light. The generalization to random elements on infinite-dimensional spaces, e.g. random fields, is the white noise measure.
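The step from 3.18 to 3.19 can be written out explicitly. The following short derivation assumes the usual definition of the power spectral density as the Fourier transform of the autocorrelation function, with the lag written as τ = t_1 - t_2 (a symbol introduced here for convenience).

```latex
S_{ww}(\omega)
  = \int_{-\infty}^{\infty} R_{ww}(\tau)\, e^{-j\omega\tau}\, d\tau
  = \frac{N_0}{2} \int_{-\infty}^{\infty} \delta(\tau)\, e^{-j\omega\tau}\, d\tau
  = \frac{N_0}{2}\, e^{-j\omega \cdot 0}
  = \frac{N_0}{2}
```

The sifting property of the delta function removes all dependence on ω, which is why the spectrum is flat.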

3.6 Normal Distribution

In probability theory, the normal distribution (or Gaussian distribution) is a continuous probability distribution which has a bell-shaped probability density function, known as the Gaussian function or informally as the bell curve [1].

f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} e^{ -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2 } (3.20)

where the parameter μ is the mean or expectation (location of the peak), σ² is the variance, and σ is the standard deviation. The distribution with μ = 0 and σ² = 1, shown in Figure 3.3, is called the standard normal distribution or the unit normal distribution. The normal distribution is often used as a first approximation to describe real-valued random variables that cluster around a single mean value.

Figure 3.3

The normal distribution is considered the most prominent probability distribution in statistics. There are several reasons for this [1]. Firstly, the normal distribution arises from the central limit theorem, which states that, under mild conditions, the mean of a large number of random variables drawn from the same distribution is distributed approximately normally, irrespective of the form of the original distribution. This gives it exceptionally wide application in, e.g., sampling. Secondly, the normal distribution is very tractable analytically. That is to say, a large number of results involving this distribution can be derived in explicit form.

For these reasons, the normal distribution is commonly encountered in practice, and is used throughout statistics, the natural sciences, and even the social sciences [2]. In this dissertation, the observational error in direction of arrival estimation is assumed to follow a normal distribution, and the propagation of uncertainty is computed under this assumption. Normally distributed data are symmetric about the mean.

3.6.1 Mathematical Definition

The standard normal distribution is the simplest case of a normal distribution, and it is described by the probability density function (PDF) 3.21:

\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{- \frac{\scriptscriptstyle 1}{\scriptscriptstyle 2} x^2}. (3.21)

where the factor 1/\sqrt{2\pi} ensures that the total area under the curve ϕ(x) is equal to one, and the 1/2 in the exponent makes the "width" of the curve also equal to one. It is conventional in statistics to denote this function by the Greek letter ϕ, whereas the probability density functions of all other distributions are usually denoted by the letter f [5].

More generally, every normal distribution is the result of exponentiating a quadratic function, as in 3.22:

f(x) = e^{a x^2 + b x + c}. \, (3.22)

This yields the classic bell curve shape, provided that a < 0 so that the quadratic function is concave for x close to 0; f(x) > 0 everywhere. We can adjust the parameter a to control the width of the bell, and adjust b to move the central peak of the bell along the x axis. Finally, c must be chosen so that \int_{-\infty}^\infty f(x)\,dx = 1, which is only possible when a < 0.

However, it is more common to describe a normal distribution by its mean μ = -b/(2a) and variance σ² = -1/(2a), rather than by a, b, and c. Substituting these new parameters into 3.22, we can rewrite the probability density function (PDF) in the more familiar form 3.23:

f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{\frac{-(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sigma}\, \phi\!\left(\frac{x-\mu}{\sigma}\right). (3.23)

For the standard normal distribution, μ = 0 and σ² = 1. As the last part of 3.23 shows, any other normal distribution can be seen as a version of the standard normal distribution which has been stretched horizontally by the factor σ and then translated rightward by the distance μ. Therefore, the parameter μ specifies the position of the bell curve's central peak, and the parameter σ specifies the width of the bell curve.
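The relationship between the three descriptions (the quadratic form 3.22, the density 3.23, and the stretched and shifted standard normal) can be checked numerically. The sketch below is illustrative only: the coefficient formulas follow from expanding the exponent of 3.23, and the function names, the test point and the simple trapezoidal integrator are assumptions made for this example.

```python
import math

def standard_normal_pdf(x):
    """Density of N(0, 1), as in 3.21."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2), as in 3.23."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def quadratic_coefficients(mu, sigma):
    """Coefficients (a, b, c) for which exp(a x^2 + b x + c) equals the N(mu, sigma^2) density."""
    a = -1.0 / (2 * sigma ** 2)
    b = mu / sigma ** 2
    c = -(mu ** 2) / (2 * sigma ** 2) - math.log(sigma * math.sqrt(2 * math.pi))
    return a, b, c

def trapezoid(func, lower, upper, steps=20000):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h

mu, sigma, x = 1.5, 0.7, 2.1                                    # illustrative values
a, b, c = quadratic_coefficients(mu, sigma)
from_quadratic = math.exp(a * x * x + b * x + c)                # form 3.22
from_standard = standard_normal_pdf((x - mu) / sigma) / sigma   # right-hand side of 3.23
direct = normal_pdf(x, mu, sigma)                               # left-hand side of 3.23
area = trapezoid(lambda u: normal_pdf(u, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)
print(from_quadratic, from_standard, direct, area)
```

The three density values should coincide and the area should be very close to one; recovering μ as -b/(2a) and σ² as -1/(2a) from the computed coefficients gives back the inputs.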

The parameter μ is at the same time the mean, the median and the mode of the normal distribution. The parameter σ² is the variance, which is an important criterion in this dissertation for comparing DOA estimators; for any random variable, it describes how concentrated the distribution is around its mean. The square root of σ² is called the standard deviation and describes the width of the density function.

The normal distribution is usually denoted by N(μ, σ²) [6]. Therefore, if a random variable X is normally distributed with mean μ and variance σ², we write it as 3.24:

X\ \sim\ \mathcal{N}(\mu,\,\sigma^2). \, (3.24)

3.7 Cramer-Rao Bound

In estimation theory and statistics, the Cramér-Rao bound (CRB) or Cramér-Rao lower bound (CRLB), named in honor of Harald Cramér and Calyampudi Radhakrishna Rao who were among the first to derive it,[1][2][3] expresses a lower bound on the variance of estimators of a deterministic parameter. The bound is also known as the Cramér-Rao inequality or the information inequality.

In its simplest form, the bound states that the variance of any unbiased estimator is at least as high as the inverse of the Fisher information. An unbiased estimator which achieves this lower bound is said to be (fully) efficient. Such a solution achieves the lowest possible mean squared error among all unbiased methods, and is therefore the minimum variance unbiased (MVU) estimator. However, in some cases, no unbiased technique exists which achieves the bound. This may occur even when an MVU estimator exists.

The Cramér-Rao bound can also be used to bound the variance of biased estimators of given bias. In some cases, a biased approach can result in both a variance and a mean squared error that are below the unbiased Cramér-Rao lower bound.


The Cramér-Rao bound is stated in this section for several increasingly general cases, beginning with the case in which the parameter is a scalar and its estimator is unbiased. All versions of the bound require certain regularity conditions, which hold for most well-behaved distributions. These conditions are listed later in this section.

Scalar unbiased case

Suppose \theta is an unknown deterministic parameter which is to be estimated from measurements x, distributed according to some probability density function f(x;\theta). The variance of any unbiased estimator \hat{\theta} of \theta is then bounded by the reciprocal of the Fisher information I(\theta):

\mathrm{var}(\hat{\theta}) \geq \frac{1}{I(\theta)}

where the Fisher information I(\theta) is defined by

I(\theta) = \mathrm{E} \left[ \left( \frac{\partial \ell(x;\theta)}{\partial\theta} \right)^2 \right] = -\mathrm{E}\left[ \frac{\partial^2 \ell(x;\theta)}{\partial\theta^2} \right]

and \ell(x;\theta)=\log f(x;\theta) is the natural logarithm of the likelihood function and \mathrm{E} denotes the expected value.

The efficiency of an unbiased estimator \hat{\theta} measures how close this estimator's variance comes to this lower bound; estimator efficiency is defined as

e(\hat{\theta}) = \frac{I(\theta)^{-1}}{{\rm var}(\hat{\theta})}

or the minimum possible variance for an unbiased estimator divided by its actual variance. The Cramér-Rao lower bound thus gives

e(\hat{\theta}) \le 1.\
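As an illustration of the scalar bound and of efficiency, consider estimating the mean θ of n independent N(θ, σ²) samples with the sample mean. For this model the Fisher information is n/σ², a standard result that is assumed here rather than derived in the text. The sketch below compares the bound with a Monte Carlo estimate of the variance; the function names, seed and parameter values are illustrative.

```python
import random
import statistics

def fisher_information_gaussian_mean(n, sigma):
    """Fisher information about the mean in n i.i.d. N(theta, sigma^2) samples."""
    return n / sigma ** 2

def simulate_sample_mean_variance(theta, sigma, n, trials, seed=0):
    """Monte Carlo estimate of the variance of the sample mean."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        data = [rng.gauss(theta, sigma) for _ in range(n)]
        estimates.append(sum(data) / n)
    return statistics.variance(estimates)

theta, sigma, n = 1.0, 2.0, 25            # illustrative values
crb = 1.0 / fisher_information_gaussian_mean(n, sigma)
empirical_variance = simulate_sample_mean_variance(theta, sigma, n, trials=4000)
efficiency = crb / empirical_variance

print(crb, empirical_variance, efficiency)
```

The sample mean is an efficient estimator in this model, so the ratio should be close to one. Because the variance is itself estimated from a finite number of trials, the printed efficiency can land slightly above or below one.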

General scalar case

A more general form of the bound can be obtained by considering an unbiased estimator T(X) of a function \psi(\theta) of the parameter \theta. Here, unbiasedness is understood as stating that E\{T(X)\} = \psi(\theta). In this case, the bound is given by

\mathrm{var}(T) \geq \frac{[\psi'(\theta)]^2}{I(\theta)}

where \psi'(\theta) is the derivative of \psi(\theta) (by \theta), and I(\theta) is the Fisher information defined above.

Bound on the variance of biased estimators

Apart from being a bound on estimators of functions of the parameter, this approach can be used to derive a bound on the variance of biased estimators with a given bias, as follows. Consider an estimator \hat{\theta} with bias b(\theta) = E\{\hat{\theta}\} - \theta, and let \psi(\theta) = b(\theta) + \theta. By the result above, any unbiased estimator whose expectation is \psi(\theta) has variance greater than or equal to (\psi'(\theta))^2/I(\theta). Thus, any estimator \hat{\theta} whose bias is given by a function b(\theta) satisfies

\mathrm{var} \left(\hat{\theta}\right) \geq \frac{[1+b'(\theta)]^2}{I(\theta)}.

The unbiased version of the bound is a special case of this result, with b(\theta)=0.

It is trivial to have a small variance: an "estimator" that is constant has a variance of zero. But from the above equation we find that the mean squared error of a biased estimator is bounded by

\mathrm{E}\left((\hat{\theta}-\theta)^2\right) \geq \frac{[1+b'(\theta)]^2}{I(\theta)} + b(\theta)^2

using the standard decomposition of the MSE. Note, however, that this bound can be less than the unbiased Cramér-Rao bound 1/I(θ).
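A small numerical example shows how the biased bound can fall below 1/I(θ). It assumes a hypothetical shrinkage estimator, equal to a constant factor times the sample mean of n independent N(θ, σ²) samples, so that b(θ) = (factor - 1)θ and I(θ) = n/σ². The helper names and values are illustrative, and the MSE bound is taken as the variance bound plus the squared bias.

```python
def biased_variance_bound(bias_derivative, fisher_information):
    """Lower bound on the variance of an estimator with bias derivative b'(theta)."""
    return (1.0 + bias_derivative) ** 2 / fisher_information

def mse_lower_bound(bias, bias_derivative, fisher_information):
    """Variance bound plus squared bias (standard MSE decomposition)."""
    return biased_variance_bound(bias_derivative, fisher_information) + bias ** 2

theta, sigma, n, shrink = 0.5, 2.0, 25, 0.8   # illustrative values
fisher = n / sigma ** 2                       # Fisher information for the Gaussian mean
bias = (shrink - 1.0) * theta                 # b(theta) for the shrinkage estimator
bias_derivative = shrink - 1.0                # b'(theta)

variance_bound = biased_variance_bound(bias_derivative, fisher)
actual_variance = shrink ** 2 * sigma ** 2 / n    # variance of shrink * sample mean
mse_bound = mse_lower_bound(bias, bias_derivative, fisher)
unbiased_bound = 1.0 / fisher

print(variance_bound, actual_variance, mse_bound, unbiased_bound)
```

For these values the shrinkage estimator attains its variance bound, and its MSE bound is smaller than the unbiased bound. The advantage depends on θ: for large |θ| the squared bias term dominates and the comparison reverses.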

Multivariate case

Extending the Cramér-Rao bound to multiple parameters, define a parameter column vector

\boldsymbol{\theta} = \left[ \theta_1, \theta_2, \dots, \theta_d \right]^T \in \mathbb{R}^d

with probability density function f(x; \boldsymbol{\theta}) which satisfies the two regularity conditions below.

The Fisher information matrix is a d \times d matrix with element I_{m, k} defined as

I_{m, k} = \mathrm{E} \left[ \frac{\partial}{\partial\theta_m} \log f\left(x; \boldsymbol{\theta}\right) \frac{\partial}{\partial\theta_k} \log f\left(x; \boldsymbol{\theta}\right) \right].

Let \boldsymbol{T}(X) be an estimator of any vector function of parameters, \boldsymbol{T}(X) = (T_1(X), \ldots, T_n(X))^T, and denote its expectation vector \mathrm{E}[\boldsymbol{T}(X)] by \boldsymbol{\psi}(\boldsymbol{\theta}). The Cramér-Rao bound then states that the covariance matrix of \boldsymbol{T}(X) satisfies

\mathrm{cov}_{\boldsymbol{\theta}}\left(\boldsymbol{T}(X)\right) \geq \frac {\partial \boldsymbol{\psi} \left(\boldsymbol{\theta}\right)} {\partial \boldsymbol{\theta}} [I\left(\boldsymbol{\theta}\right)]^{-1} \left( \frac {\partial \boldsymbol{\psi}\left(\boldsymbol{\theta}\right)} {\partial \boldsymbol{\theta}} \right)^T


The matrix inequality A \ge B is understood to mean that the matrix A-B is positive semidefinite, and \partial \boldsymbol{\psi}(\boldsymbol{\theta})/\partial \boldsymbol{\theta} is the Jacobian matrix whose ijth element is given by \partial \psi_i(\boldsymbol{\theta})/\partial \theta_j.

If \boldsymbol{T}(X) is an unbiased estimator of \boldsymbol{\theta} (i.e., \boldsymbol{\psi}\left(\boldsymbol{\theta}\right) = \boldsymbol{\theta}), then the Cramér-Rao bound reduces to

\mathrm{cov}_{\boldsymbol{\theta}}\left(\boldsymbol{T}(X)\right) \geq I\left(\boldsymbol{\theta}\right)^{-1}.

If it is inconvenient to compute the inverse of the Fisher information matrix, then one can simply take the reciprocal of the corresponding diagonal element to find a (possibly loose) lower bound (See eqn. (11) of Bobrovsky, Mayer-Wolf, Zakai, "Some classes of global Cramer-Rao bounds", Ann. Stats., 15(4):1421-38, 1987).

\mathrm{var}_{\boldsymbol{\theta}}\left(T_m(X)\right) = \left[\mathrm{cov}_{\boldsymbol{\theta}}\left(\boldsymbol{T}(X)\right)\right]_{mm} \geq \left[I\left(\boldsymbol{\theta}\right)^{-1}\right]_{mm} \geq \left(\left[I\left(\boldsymbol{\theta}\right)\right]_{mm}\right)^{-1}.
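The difference between the two lower bounds in the last inequality can be seen on a small example. The Fisher information matrix below is a made-up 2 x 2 positive definite matrix chosen for illustration, and `invert_2x2` is a helper written for this sketch.

```python
def invert_2x2(matrix):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical Fisher information matrix with coupled parameters.
fisher = [[4.0, 2.0], [2.0, 3.0]]
crb = invert_2x2(fisher)

tight_bounds = [crb[m][m] for m in range(2)]           # [I^-1]_mm
loose_bounds = [1.0 / fisher[m][m] for m in range(2)]  # ([I]_mm)^-1

print(tight_bounds)
print(loose_bounds)
```

The diagonal of the inverse is never smaller than the reciprocal of the diagonal, so the second bound is cheaper to compute but looser. The two coincide when the off-diagonal entries are zero, i.e. when the parameters are decoupled.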

Regularity conditions

The bound relies on two weak regularity conditions on the probability density function, f(x; \theta), and the estimator T(X):

The Fisher information is always defined; equivalently, for all x such that f(x; \theta) > 0,

\frac{\partial}{\partial\theta} \ln f(x;\theta)

exists, and is finite.

The operations of integration with respect to x and differentiation with respect to \theta can be interchanged in the expectation of T; that is,

\frac{\partial}{\partial\theta} \left[ \int T(x) f(x;\theta) \,dx \right] = \int T(x) \left[ \frac{\partial}{\partial\theta} f(x;\theta) \right] \,dx

whenever the right-hand side is finite.

This condition can often be confirmed by using the fact that integration and differentiation can be swapped when either of the following cases holds:

The function f(x;\theta) has bounded support in x, and the bounds do not depend on \theta;

The function f(x;\theta) has infinite support, is continuously differentiable, and the integral converges uniformly for all \theta.

Simplified form of the Fisher information

Suppose, in addition, that the operations of integration and differentiation can be swapped for the second derivative of f(x;\theta) as well, i.e.,

\frac{\partial^2}{\partial\theta^2} \left[ \int T(x) f(x;\theta) \,dx \right] = \int T(x) \left[ \frac{\partial^2}{\partial\theta^2} f(x;\theta) \right] \,dx.

In this case, it can be shown that the Fisher information equals

I(\theta) = -\mathrm{E} \left[ \frac{\partial^2}{\partial\theta^2} \log f(X;\theta) \right].

The Cramér-Rao bound can then be written as

\mathrm{var} \left(\widehat{\theta}\right) \geq \frac{1}{I(\theta)} = \frac{1} { -\mathrm{E} \left[ \frac{\partial^2}{\partial\theta^2} \log f(X;\theta) \right] }.

In some cases, this formula gives a more convenient technique for evaluating the bound.
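The agreement between the two expressions for the Fisher information can be checked on a simple model. The sketch below uses an exponential density f(x; θ) = θ exp(-θx) for x ≥ 0, chosen here only as an example because its support does not depend on θ. The derivatives of log f are written out by hand, and the expectations are approximated with a trapezoidal rule on a truncated range.

```python
import math

def trapezoid(func, lower, upper, steps=100000):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h

rate = 2.0   # the parameter theta of the exponential density (illustrative value)

def pdf(x):
    return rate * math.exp(-rate * x)

def score(x):
    """First derivative of log f with respect to the rate: 1/rate - x."""
    return 1.0 / rate - x

def second_derivative(x):
    """Second derivative of log f with respect to the rate: -1/rate^2."""
    return -1.0 / rate ** 2

upper = 40.0 / rate   # truncation point; the neglected tail mass is about exp(-40)
info_from_score = trapezoid(lambda x: score(x) ** 2 * pdf(x), 0.0, upper)
info_from_curvature = -trapezoid(lambda x: second_derivative(x) * pdf(x), 0.0, upper)

print(info_from_score, info_from_curvature, 1.0 / rate ** 2)
```

Both integrals should agree with 1/θ² up to the accuracy of the numerical integration, so the bound on the variance of an unbiased estimator of θ from a single observation is θ².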

Chapter 4 Direction of Arrival estimation algorithm

In the previous chapters, the problem of Direction of Arrival (DOA) estimation was introduced and the background knowledge needed to implement it was presented. In this chapter, DOA estimation algorithms are discussed, and the MUSIC, Root-MUSIC and ESPRIT algorithms are compared in terms of theory and simulation.

4.1 MUSIC Algorithm

MUSIC is an acronym which stands for MUltiple SIgnal Classification. The MUSIC algorithm is a relatively simple and efficient spectral estimation method based on the eigenvalue decomposition of the covariance matrix [5, 6]. Geometrically, the observation space can be decomposed into a signal subspace and a noise subspace, and these two subspaces are orthogonal. The signal subspace is spanned by the eigenvectors of the array data covariance matrix that correspond to the received signals, while the noise subspace is spanned by the eigenvectors that correspond to the smallest eigenvalue (the noise variance) of the covariance matrix. The MUSIC algorithm uses the orthogonality between these two complementary subspaces to estimate the directions of the signals in space. All vectors of the noise subspace are used to construct the spectrum, whose peak positions correspond to the azimuth and elevation of the incoming waves. The MUSIC algorithm greatly improves direction-finding resolution and adapts to antenna arrays of arbitrary shape. However, the basic MUSIC algorithm requires the incoming signals to be uncorrelated.

The Data Formulation

For the intended simulations, a few reasonable assumptions can be proposed to make the problem analytically reasonable. The transmission medium in MUSIC algorithm is assumed to be isotropic and non-dispersed which means the radiation propagating in straight lines, and the signals are assumed to be in the far-field of the smart antenna, so that the radiation impinging on the array is in the form of a sum of plane waves.

Mathematically, assume that the smart antenna has M array elements and receives linear combinations of D incident signals plus noise, with D < M. The received complex vector X in the multiple signal classification algorithm can be formulated as:

$$X = AF + W$$

The incident signals are represented in amplitude and phase at some arbitrary reference point by the complex parameters $F_1, F_2, \ldots, F_D$, which form the complex vector F. The noise appears as the complex vector W. The matrix A, which describes the relation among the array elements, is also complex. Its elements $a_{ij}$ depend on the relationship between the signal arrival angles and the array element locations. In addition, the basic assumption of the MUSIC algorithm is that the incident signals and the noise are uncorrelated.
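As an illustration of this data model, the following minimal sketch generates snapshots of the form X = AF + W. It assumes a uniform linear array with half-wavelength spacing, angles measured from broadside, and a phase sign convention chosen for the example; the function names `ula_steering` and `simulate_snapshots`, the source angles and the SNR are hypothetical and not taken from the text.

```python
import numpy as np

def ula_steering(theta_deg, M, d=0.5):
    """Steering matrix A (M x D) of a ULA; d is the element spacing in wavelengths."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    m = np.arange(M)[:, None]                       # element index as a column
    return np.exp(2j * np.pi * d * m * np.sin(theta)[None, :])

def simulate_snapshots(theta_deg, M=10, K=200, snr_db=10.0, seed=0):
    """Return (X, A) with X = A F + W of shape (M, K); sources have unit power."""
    rng = np.random.default_rng(seed)
    A = ula_steering(theta_deg, M)
    D = A.shape[1]
    # Uncorrelated complex Gaussian source signals F (D x K)
    F = (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))) / np.sqrt(2)
    # White complex Gaussian noise W (M x K), uncorrelated with the sources
    sigma = 10 ** (-snr_db / 20)
    W = sigma * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    return A @ F + W, A

X, A = simulate_snapshots([-20.0, 30.0])
print(X.shape, A.shape)   # (10, 200) (10, 2)
```

Each column of X is one snapshot of the array output, and each column of A is the steering vector of one source.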

The Covariance Matrix

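MUSIC is an eigenstructure algorithm, which means that the eigenvalue decomposition of the estimated covariance matrix R is the first step of the algorithm:

$$R = \overline{XX^*} = A\,\overline{FF^*}\,A^* + \overline{WW^*}$$

where the overbar denotes averaging; it is dropped in the text below for brevity. The noise can be regarded as white, which means that the elements of the vector W have zero mean and variance $\sigma^2$. Under the basic assumption that the incident signals and the noise are uncorrelated, we get:

$$R = AFF^*A^* + \lambda R_0$$

where $R_0$ is the normalized noise covariance matrix, so that $\overline{WW^*} = \lambda R_0$.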

The incident signals represented by the complex vector F may be uncorrelated or may contain completely correlated pairs. In general, FF* is positive definite, which reflects the arbitrary degrees of pairwise correlation occurring among the incident signals.

The number of incident wavefronts D is less than the number of array elements M, so the matrix AFF*A* is singular and has rank less than M. Therefore:

$$|AFF^*A^*| = |R - \lambda R_0| = 0$$

This equation is satisfied only when $\lambda$ is equal to one of the eigenvalues of R. However, since A has full rank and FF* is positive definite, AFF*A* must be non-negative definite. Therefore $\lambda$ can only be the minimum eigenvalue $\lambda_{\min}$. Any measured R = XX* matrix can then be written as:

$$R = AFF^*A^* + \lambda_{\min} R_0, \qquad \lambda_{\min} \ge 0$$

where $\lambda_{\min}$ is the smallest solution of $|R - \lambda R_0| = 0$. In the special case where the elements of the noise vector W have zero mean and variance $\sigma^2$, this implies that:

$$\lambda_{\min} R_0 = \sigma^2 I$$

Eigenvalue and Eigen Matrix

The eigenvalues and eigenstructure of R are the key to MUSIC. After the decomposition we obtain the eigenvalues of R, which directly determine the rank of AFF*A* (it is D). In the complete set of eigenvalues of R, $\lambda_{\min}$ is not always simple. In fact, in all cases the eigenvalues of R and those of $AFF^*A^* = R - \lambda_{\min} R_0$ differ by $\lambda_{\min}$. Since AFF*A* is singular with rank D, its minimum eigenvalue, zero, is repeated N = M - D times, so $\lambda_{\min}$ must also be repeated N times. Therefore, the number of incident signal sources is:

D = M - N

where N is the multiplicity of $\lambda_{\min}(R, R_0)$, read as "$\lambda_{\min}$ of R in the metric of $R_0$".
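The relation D = M - N can be checked numerically. The sketch below builds an exact covariance matrix for a hypothetical scene (ten-element ULA, two uncorrelated unit-power sources, white noise so that the noise metric is the identity) and counts the multiplicity of the smallest eigenvalue. All numerical values are assumptions made for the example. With a covariance matrix estimated from finite data the small eigenvalues are no longer exactly equal, so a threshold or a statistical test would be needed instead of the exact comparison used here.

```python
import numpy as np

M, d = 10, 0.5                                   # elements, spacing in wavelengths (assumed)
theta = np.deg2rad([-20.0, 30.0])                # two hypothetical source directions
A = np.exp(2j * np.pi * d * np.arange(M)[:, None] * np.sin(theta)[None, :])

sigma2 = 0.1                                     # assumed noise variance
# Exact covariance for unit-power uncorrelated sources: R = A A* + sigma^2 I
R = A @ A.conj().T + sigma2 * np.eye(M)

eigvals = np.linalg.eigvalsh(R)                  # real eigenvalues in ascending order
lam_min = eigvals[0]
# N is the multiplicity of the smallest eigenvalue (up to rounding error)
N = int(np.sum(np.isclose(eigvals, lam_min, rtol=1e-6)))
D_est = M - N
print(D_est)   # 2
```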

Signal and Noise Subspace

It is important to note that the eigenvalues of R, and with them the eigenvectors, can be subdivided into two groups when the data consist of uncorrelated desired signals corrupted by uncorrelated white noise. The eigenvectors associated with $\lambda_{\min}(R, R_0)$ are perpendicular to the space spanned by the columns of A, the incident signal mode vectors. For each eigenvalue $\lambda_i$ that is equal to $\lambda_{\min}$ (there are N of them), the corresponding eigenvector $e_i$ satisfies $AFF^*A^* e_i = 0$, or equivalently $A^* e_i = 0$.

Therefore, we can define the N-dimensional noise subspace, which is spanned by the N noise eigenvectors, and the D-dimensional signal subspace, which is spanned by the D incident signal eigenvectors. These two subspaces are orthogonal.

Direction of Arrival Scan

Once the noise subspace has been estimated, the search for directions is made by scanning for steering vectors that are as perpendicular to the noise subspace as possible. We can now solve for the incident signal vectors. Let $E_N$ be the M × N matrix whose columns are the N noise eigenvectors, and take as the criterion the ordinary squared Euclidean distance from a vector $a(\theta)$, which is a continuous function of $\theta$, to the signal subspace:

$$d^2 = a^*(\theta)\, E_N E_N^*\, a(\theta)$$

To make the directions easier to distinguish, we plot $1/d^2$ rather than $d^2$ and define the MUSIC spectrum as:

$$P_{MU}(\theta) = \frac{1}{a^*(\theta)\, E_N E_N^*\, a(\theta)}$$

where $a(\theta)$ does not depend on the data. The DOAs are then obtained by searching for peaks in this spectrum. Since R is asymptotically perfectly measured, $E_N$ is also asymptotically perfectly measured. It follows that MUSIC is asymptotically unbiased even for multiple incident signals. After the directions of arrival of the D incident signals have been found, the A matrix becomes available and can be used to compute other parameters of the incident signals.


The MUSIC algorithm combines experimental and theoretical techniques for estimating the parameters of multiple signals received by a smart antenna. The MUSIC approach solves the DOA problem by searching for peaks in the spectrum defined by the M × N noise subspace matrix, whose N columns are the eigenvectors corresponding to the smallest eigenvalues of the array correlation matrix R.

The steps of the MUSIC algorithm in practice can be shown in summary as:

Step 0: collect data, form correlation matrix R;

Step 1: calculate the eigenstructure of R in the metric of $R_0$;

Step 2: decide the number of signals D;

Step 3: choose N columns to form the noise subspace $E_N$;

Step 4: evaluate $P_{MU}(\theta)$ versus $\theta$;

Step 5: pick the D peaks of $P_{MU}(\theta)$.
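The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation. It assumes a ULA with half-wavelength spacing, angles measured from broadside, white noise (so the noise metric is the identity), and a number of signals D that is already known, so Step 2 is skipped. The helper names `steering`, `music_spectrum` and `pick_peaks`, as well as the simulated scene, are hypothetical.

```python
import numpy as np

def steering(theta_deg, M, d=0.5):
    """ULA steering vectors as columns; d is the spacing in wavelengths."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    return np.exp(2j * np.pi * d * np.arange(M)[:, None] * np.sin(theta)[None, :])

def music_spectrum(X, D, grid_deg, d=0.5):
    M, K = X.shape
    R = X @ X.conj().T / K                        # Step 0: sample correlation matrix
    _, vecs = np.linalg.eigh(R)                   # Step 1: eigenvalues in ascending order
    E_N = vecs[:, :M - D]                         # Step 3: N = M - D noise eigenvectors
    proj = E_N.conj().T @ steering(grid_deg, M, d)
    return 1.0 / np.sum(np.abs(proj) ** 2, axis=0)    # Step 4: 1 / (a* E_N E_N* a)

def pick_peaks(P, grid_deg, D):
    """Step 5: return the angles of the D largest local maxima of the spectrum."""
    idx = np.where((P[1:-1] > P[:-2]) & (P[1:-1] > P[2:]))[0] + 1
    best = idx[np.argsort(P[idx])[-D:]]
    return np.sort(grid_deg[best])

# Hypothetical scene: two uncorrelated sources at -20 and 30 degrees
rng = np.random.default_rng(0)
M, K = 10, 500
A = steering([-20.0, 30.0], M)
F = (rng.standard_normal((2, K)) + 1j * rng.standard_normal((2, K))) / np.sqrt(2)
W = 0.3 * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
X = A @ F + W

grid = np.arange(-90.0, 90.1, 0.1)
est = pick_peaks(music_spectrum(X, D=2, grid_deg=grid), grid, D=2)
print(est)   # estimated directions in degrees, sorted
```

The resolution of the estimate is limited by the grid step, which is one reason the scanning step dominates the cost of spectral MUSIC.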

In conclusion, the MUSIC algorithm is significant because it can be implemented as a basic algorithm to provide asymptotically unbiased estimates of number of signals, directions of arrival (DOA), strengths and cross correlations among the directional signals, polarizations, and strength of noise interference.

4.2 Root-MUSIC

The Root-MUSIC method [11], as the name suggests, is a polynomial-rooting version of the original MUSIC method. For a ULA, the search for the DOAs can be transformed into finding the roots of a corresponding polynomial. Root-MUSIC therefore solves a polynomial rooting problem rather than identifying and locating spectral peaks as in the MUSIC algorithm. Research and simulation have shown that Root-MUSIC performs better than spectral MUSIC []. The pre-processing in Root-MUSIC is the same as in MUSIC; the only difference between the two is the direction-finding method.

Direction of Arrival Scan

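From the MUSIC algorithm, we have:

$$P_{MU}(\theta) = \frac{1}{a^*(\theta)\, E_N E_N^*\, a(\theta)} = \frac{1}{a^*(\theta)\, C\, a(\theta)}, \qquad C = E_N E_N^*$$

which is evaluated by scanning over $\theta$. If, however, we restrict our attention to uniform linear arrays with inter-element spacing d (in wavelengths) and measure $\theta$ from the array broadside, the mth element of $a(\theta)$ may be written as:

$$a_m(\theta) = \exp(j 2\pi m d \sin\theta), \qquad m = 1, 2, \ldots, M$$

The denominator $P_{MU}^{-1}(\theta)$ may then be written as:

$$P_{MU}^{-1}(\theta) = \sum_{m=1}^{M}\sum_{n=1}^{M} \exp(-j 2\pi m d \sin\theta)\, C_{mn}\, \exp(j 2\pi n d \sin\theta) = \sum_{l=-(M-1)}^{M-1} C_l \exp(-j 2\pi l d \sin\theta)$$

where $C_l$ is the sum of the entries of C along the lth diagonal:

$$C_l = \sum_{m-n=l} C_{mn}$$

If we define the polynomial Q(z) as:

$$Q(z) = \sum_{l=-(M-1)}^{M-1} C_l z^{-l}$$

then evaluating the spectrum $P_{MU}(\theta)$ is equivalent to evaluating the polynomial Q(z) on the unit circle. We can therefore use the roots of Q(z) for direction-of-arrival estimation rather than scanning for peaks in $P_{MU}(\theta)$. Peaks in $P_{MU}(\theta)$ are due to roots of Q(z) lying close to the unit circle. Take, for example, a root of Q(z) at:

$$z_1 = |z_1| \exp(j \arg z_1)$$

It results in a peak in $P_{MU}(\theta)$ at:

$$\sin\theta = \frac{\arg z_1}{2\pi d}$$

Therefore, after solving the polynomial, we obtain the D roots that lie closest to the unit circle. Then, based on the relationship between z and $\theta$, the directions of arrival can be found.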


The Root-MUSIC algorithm calculates the directions of arrival from the information contained in the noise subspace and can distinguish closely spaced signal sources. The steps of the Root-MUSIC algorithm in practice can be summarized as:

Step 0: collect data, form correlation matrix R;

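Step 1: calculate the eigenstructure of R in the metric of $R_0$;

Step 2: decide the number of signals D;

Step 3: choose N columns to form the noise subspace $E_N$;

Step 4: transform the MUSIC spectrum into a polynomial in z;

Step 5: solve for the roots of this polynomial;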

Step 6: calculate the direction by D roots.
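A possible sketch of these steps is given below. It assumes a ULA with half-wavelength spacing, angles measured from broadside, white noise and a known number of signals D. The polynomial coefficients are the sums of the diagonals of $E_N E_N^*$, and `np.roots` is used for the rooting step. The function name `root_music` and the simulated scene are hypothetical.

```python
import numpy as np

def root_music(X, D, d=0.5):
    M, K = X.shape
    R = X @ X.conj().T / K                        # sample correlation matrix
    _, vecs = np.linalg.eigh(R)                   # eigenvalues in ascending order
    E_N = vecs[:, :M - D]                         # noise subspace
    C = E_N @ E_N.conj().T
    # The coefficient of z**l is the sum of C along the diagonal with offset l
    # (column index minus row index equals l). np.roots expects the highest
    # power first, so l runs from M-1 down to -(M-1).
    coeffs = np.array([np.trace(C, offset=l) for l in range(M - 1, -M, -1)])
    roots = np.roots(coeffs)
    roots = roots[np.abs(roots) < 1.0]            # one root of each reciprocal pair
    roots = roots[np.argsort(1.0 - np.abs(roots))[:D]]   # D roots closest to the unit circle
    sin_theta = np.angle(roots) / (2 * np.pi * d)
    return np.sort(np.rad2deg(np.arcsin(np.clip(sin_theta, -1.0, 1.0))))

# Hypothetical scene: two uncorrelated sources at -20 and 30 degrees
rng = np.random.default_rng(0)
M, K = 10, 500
theta = np.deg2rad([-20.0, 30.0])
A = np.exp(2j * np.pi * 0.5 * np.arange(M)[:, None] * np.sin(theta)[None, :])
F = (rng.standard_normal((2, K)) + 1j * rng.standard_normal((2, K))) / np.sqrt(2)
W = 0.3 * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
X = A @ F + W

est = root_music(X, D=2)
print(est)   # estimated directions in degrees, sorted
```

No angular grid is needed here, so the estimate is not quantized by a scan step.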

It has been shown [112, 113] that Root-MUSIC has the same asymptotic properties as spectral MUSIC, although for small data samples Root-MUSIC has empirically been found to perform significantly better. Compared with the MUSIC approach, Root-MUSIC performs especially well when dealing with closely spaced signal sources.


4.3 ESPRIT

ESPRIT stands for Estimation of Signal Parameters via Rotational Invariance Techniques [7] and is another subspace-based DOA estimation algorithm. It does not involve an exhaustive search through all possible steering vectors to estimate the DOA, and it dramatically reduces the computational and storage requirements compared with MUSIC [3]. The goal of the ESPRIT technique is to exploit the rotational invariance in the signal subspace that is created by two arrays with a translational invariance structure. The idea is that the sensor array is decomposed into two identical sub-arrays in which every pair of corresponding elements is separated by the same displacement; that is, the array has translational invariance. Fortunately, many practical arrays satisfy this condition, for example the uniform linear array. Several improved variants also exist, such as least squares ESPRIT (LS-ESPRIT) and total least squares ESPRIT (TLS-ESPRIT).

The ESPRIT algorithm has the following advantages. Firstly, unlike the MUSIC algorithm, it does not scan all steering vectors using the Euclidean distance, which greatly reduces the computation compared with MUSIC. Secondly, it does not need precise knowledge of the array manifold vector and therefore does not require strict array calibration. However, neither the ESPRIT algorithm nor the MUSIC algorithm can deal with coherent signals.

Array Geometry

ESPRIT is a robust and computationally efficient DOA-finding algorithm. Its main idea is to use two identical arrays whose elements form matched pairs with an identical displacement vector; that is, the second element of each pair is located at the same distance and in the same direction relative to the first element.

In detail, assume an array of arbitrary geometry composed of M sensor doublets. The elements in each doublet have identical sensitivity patterns and are separated by a known constant displacement vector. This structure imposes few requirements. Apart from the requirement that each sensor has nonzero sensitivity in all directions of interest, the gain, phase and polarization sensitivity of the elements in a doublet are arbitrary. Furthermore, the doublets are not required to share the same sensitivity patterns.

The Data Model

The ESPRIT algorithm is based on the following results for the case in which the underlying 2M-dimensional signal subspace containing the entire array output is known. Assume that the signals received by the smart antenna with M array elements are linear combinations of D narrow-band source signals plus noise, with D < M. The sources are located sufficiently far from the array that, in a homogeneous isotropic transmission medium, the wavefronts impinging on the array elements are planar. The received vector X of the ESPRIT algorithm can be formulated as:



The incident signals, which appear as the complex vector F, are represented in amplitude and phase at some arbitrary reference point by the complex parameters $F_1, F_2, \ldots, F_D$. The noise appears as the complex vector W. As before, the sources may be assumed to be stationary zero-mean random processes or deterministic signals. Additive noise is present at all 2M sensors and is assumed to be a stationary zero-mean random process with a spatial covariance matrix. The elements of X and A are also complex in general. The elements $a_{ij}$ of A depend on the relationship between the signal arrival angles and the array element locations.

To describe the translational invariance of the array mathematically, the array is regarded as consisting of two subarrays that are identical in every element and geometrically displaced from each other by a known displacement vector. The combined signals received at each subarray can then be expressed as:



ESPRIT does not require any knowledge of the sensor sensitivities or gain and phase patterns, so the subarray displacement vector sets not only the scale for the problem but also the reference direction. The DOA estimates are therefore angles of arrival relative to the direction of the displacement vector. A natural consequence of this fact is that a corresponding displacement vector is needed for each dimension in which parameters are desired. The subarray outputs can be combined to yield:


Defining the combination of :


Covariance matrix

The basic idea of ESPRIT is to exploit the rotational invariance of the underlying signal subspace induced by the translational invariance of the sensor array []. The relevant signal subspace is the one that contains the outputs from the two subarrays. Combining the outputs of the two subarrays leads to two sets of eigenvectors, one corresponding to each subarray.

After the noise has been filtered out, the signal subspace can be obtained by collecting sufficient measurement data and finding any set of D linearly independent measurement vectors, which span the D-dimensional subspace. Consequently, it is important to filter the noise by means of the covariance matrix. The signal subspace can be obtained from knowledge of the covariance of the original data:


Eigen matrix

Like MUSIC, ESPRIT is an eigenstructure algorithm, which means that it relies on the eigenvalue decomposition of the estimated covariance matrix. Because the matrix of signal eigenvectors of the covariance matrix has the same rank as the steering matrix A and spans the same subspace, there must exist a unique nonsingular matrix T that transforms one into the other.


Direction of Arrival finding

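Let $E_X$ and $E_Y$ denote the signal-subspace eigenvector matrices of the two subarrays, and let $\Phi$ be the diagonal matrix of phase delays between the doublet sensors for the D wavefronts. Define:

$$E_{XY} = [E_X \mid E_Y]$$

The rank of $E_{XY}$ is D, with D < M. This implies that there exists a unique matrix F of rank D such that:

$$0 = [E_X \mid E_Y]\,F = E_X F_X + E_Y F_Y = A T F_X + A \Phi T F_Y$$

where F spans the null space of $[E_X \mid E_Y]$. Defining:

$$\Psi = -F_X F_Y^{-1}$$

the equation above gives:

$$A T \Psi = A \Phi T \;\Rightarrow\; T \Psi T^{-1} = \Phi$$

Therefore, the eigenvalues of $\Psi$ must be equal to the diagonal elements of $\Phi$, and the eigenvectors of $\Psi$ are determined by T. This is the key relationship in the development of ESPRIT and its properties. The signal parameters are obtained as non-linear functions of the eigenvalues of $\Psi$, which can be regarded as an operator that rotates one set of vectors spanning an M-dimensional signal subspace into another.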

Once these eigenvalues have been obtained, the directions can be calculated directly from the configuration of the antenna array and the relationship among its elements, which are represented in the matrix A.


ESPRIT is an eigenstructure algorithm based on the covariance model, and its computation steps can be summarized as follows:

Step 1: make measurements from two identical subarrays that are displaced from each other by a known displacement vector, estimate the two array correlation matrices from the measurements, and find their eigenvalues and eigenvectors;

Step 2: find the number of directional sources D;

Step 3: decompose the eigenvectors to obtain the two signal-subspace matrices, one per subarray, each formed from the D eigenvectors associated with the largest eigenvalues;

Step 4: form a (2D) × (2D) matrix from these two signal-subspace matrices, compute its eigendecomposition, and let its eigenvectors be the columns of a matrix E;

Step 5: divide the matrix E into four D × D parts, $E_{11}$, $E_{12}$, $E_{21}$ and $E_{22}$;

Step 6: calculate the eigenvalues of the matrix $\Psi = -E_{12} E_{22}^{-1}$;

Step 7: estimate the angles of arrival using the eigenvalues of $\Psi$.
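The following sketch illustrates a total least squares variant of these steps under simplifying assumptions that are not taken from the text: a single ULA with half-wavelength spacing is split into two overlapping subarrays (elements 1 to M-1 and elements 2 to M), so the displacement equals one element spacing, and the signal subspace is taken from one covariance matrix of the full array rather than from two separate ones. The number of signals D is assumed known. The function name `tls_esprit` and the simulated scene are hypothetical.

```python
import numpy as np

def tls_esprit(X, D, d=0.5):
    M, K = X.shape
    R = X @ X.conj().T / K                        # sample correlation matrix
    _, vecs = np.linalg.eigh(R)                   # eigenvalues in ascending order
    E_S = vecs[:, M - D:]                         # D principal (signal) eigenvectors
    E_X, E_Y = E_S[:-1, :], E_S[1:, :]            # subarrays shifted by one element
    E_XY = np.hstack([E_X, E_Y])                  # (M-1) x 2D
    _, E = np.linalg.eigh(E_XY.conj().T @ E_XY)   # 2D x 2D eigenproblem, ascending order
    # With ascending order, the first D columns belong to the D smallest
    # eigenvalues; they play the role of the blocks E12 (top) and E22 (bottom).
    E12, E22 = E[:D, :D], E[D:, :D]
    Psi = -E12 @ np.linalg.inv(E22)
    phases = np.angle(np.linalg.eigvals(Psi))     # eigenvalues ~ exp(j 2 pi d sin(theta))
    sin_theta = np.clip(phases / (2 * np.pi * d), -1.0, 1.0)
    return np.sort(np.rad2deg(np.arcsin(sin_theta)))

# Hypothetical scene: two uncorrelated sources at -20 and 30 degrees
rng = np.random.default_rng(0)
M, K = 10, 500
theta = np.deg2rad([-20.0, 30.0])
A = np.exp(2j * np.pi * 0.5 * np.arange(M)[:, None] * np.sin(theta)[None, :])
F = (rng.standard_normal((2, K)) + 1j * rng.standard_normal((2, K))) / np.sqrt(2)
W = 0.3 * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
X = A @ F + W

est = tls_esprit(X, D=2)
print(est)   # estimated directions in degrees, sorted
```

Only small eigenproblems are solved and no steering vectors are scanned, which is the source of the computational saving over spectral MUSIC.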

In practice, the ESPRIT algorithm has been developed for and applied in a wide range of areas. ESPRIT retains most of the essential advantages of an arbitrary array of sensors and achieves a significant reduction in computational complexity compared with MUSIC by imposing a constraint on the structure of the antenna array.

Summary of Estimators

Since the number of algorithms is extensive, it makes sense to give an overview of their properties already at this stage. The following "glossary" is useful for this purpose:

Coherent signals Two signals are coherent if one is a scaled and delayed version of the other.

Consistency An estimate is consistent if it converges to the true value when the number of data tends to infinity.

Statistical efficiency An estimator is statistically efficient if it asymptotically attains the Cramér-Rao Bound (CRB), which is a lower bound on the covariance matrix of any unbiased estimator (see e.g. [63]).

We will distinguish between methods that are applicable to arbitrary array geometries and those that require a uniform linear array. Tables 1 and 2 summarize the message conveyed in the algorithm descriptions. The major computational requirements for each method are also included, assuming that the sample covariance matrix has already been acquired.

Here, 1-D search means that the parameter estimates are computed from M one-dimensional searches over the parameter space, whereas M-D search refers to a full M-dimensional numerical optimization. By a "good" statistical performance, we mean that the theoretical mean square error of the estimates is close to the CRB, typically within a few dB in practical scenarios.

Many spectral methods in the past have implicitly called upon the spectral decomposition of a covariance matrix to carry out the analysis (e.g., the Karhunen-Loève representation).

One of the most significant contributions came about when the eigen-structure of the covariance matrix was explicitly invoked, and its intrinsic properties were directly used to provide a solution to an underlying estimation problem for a given observed process. Early approaches involving invariant subspaces of observed covariance matrices include principal component factor analysis [55] and errors-in-variables time series analysis [65]. In the engineering literature, Pisarenko's work [94] in harmonic retrieval was among the first to be published. However, the tremendous interest in the subspace approach is mainly due to the introduction of the MUSIC (Multiple SIgnal Classification) algorithm [13, 105]. It is interesting to note that while earlier works were mostly derived in the context of time series analysis and later applied to the sensor array problem, MUSIC was indeed originally presented as a DOA estimator. It has later been successfully brought back to the spectral analysis/system identification problem with its later developments (see e.g. [118, 135]).



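[Tables 1 and 2: the recoverable column headings are "Coherent Signal" and "Statistical Performance"; the recoverable computational entries are "EVD, M-D search" and "EVD, polynomial".]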






Parametric Methods

While the spectral-based methods presented in the previous section are computationally attractive, they do not always yield sufficient accuracy. In particular, for scenarios involving highly correlated (or even coherent) signals, the performance of spectral-based methods may be insufficient. An alternative is to more fully exploit the underlying data model, leading to so-called parametric array processing methods. As we shall see, coherent signals impose no conceptual difficulties for such methods. The price to pay for this increased efficiency and robustness is that the algorithms typically require a multidimensional search to find the estimates. For uniform linear arrays (ULAs), the search can, however, be avoided with little (if any) loss of performance.

Perhaps the most well known and frequently used model-based approach in signal processing is the maximum likelihood (ML) technique. This methodology requires a statistical framework for the data generation process. Two different assumptions about the emitter signals have led to corresponding ML approaches in the array processing literature. In this section we will briefly review both of these approaches, discuss their relative merits, and present subspace-based ML approximations. Parametric DOA estimation methods are in general computationally quite complex. However, for ULAs a number of less demanding algorithms are known, as presented shortly.

Chapter 5 Array Geometry

This chapter introduces several kinds of array geometry. We consider different array geometries with ten elements and an inter-element spacing equal to p, which is half the wavelength of the signal, and analyse the characteristics of each geometry based on the time delays between elements. This analysis prepares the ground for the array geometry simulations and comparisons in later chapters.

On the one hand, the chapter describes the configuration of the elements. On the other hand, it derives the delays among the elements, which are determined by the element configuration and are the key factor affecting DOA performance. The array geometry thus largely determines DOA performance through the different combinations of element delays.

It is necessary to assume that the signals originate from a very distant source, so that a plane wave associated with each signal advances through a non-dispersive medium that only introduces propagation delay. The signal at any element can then be expressed as a time-advanced or time-delayed version of the signal at the reference element.

Data Model

First of all, we define the coordinate system and the incident angles. In this chapter, we consider the geometry in a rectangular coordinate system and denote the azimuth by $\phi$ and the elevation by $\theta$. As shown in the figure, D narrow-band signals transmitted from D far-field sources travel through a homogeneous isotropic fluid medium and impinge on an array of M identical isotropic elements or sensors located at positions $r_m$ for m ∈ [1, M]. Let us denote the DOA of the dth source by $[\theta_d, \phi_d]$, where the elevation angle $\theta_d$ is measured clockwise relative to the z-axis and the azimuth angle $\phi_d \in [0, 2\pi]$ is measured counter-clockwise relative to the x-axis in the x-y plane.

3D system showing a signal arriving from azimuth and elevation

The received signal of the antenna array is modelled as:

$$x(t) = A\,s(t) + n(t)$$

where $x(t)$ is the M × 1 snapshot vector of the signals received simultaneously on all the sensors, $s(t)$ is the D × 1 vector of the source signals, and $n(t)$ is the M × 1 noise vector, which is assumed to be white, Gaussian and uncorrelated with the source signals. The M × D steering matrix A defines the array manifold and consists of the steering vectors, whose components are:




Here $\tau_{md}$ is the propagation delay of source signal d received at sensor m, c is the speed of propagation of the waves in the medium, and $u_d$ is the unit vector pointing towards the dth source. We consider the array steering vector to be a function of the angle of arrival only. In general, the steering vector is also a function of the array geometry, the signal frequency and the individual element responses.
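The relation between sensor positions, the source unit vector and the propagation delays can be sketched for an arbitrary geometry as follows. The sketch assumes that the elevation is measured from the z-axis and the azimuth from the x-axis, that the delay of sensor m is the projection of its position onto the source unit vector divided by c, and that positions are expressed in wavelengths so that the phase is 2π times that projection. The sign convention of the delay and the function names are assumptions made for the example.

```python
import numpy as np

def unit_vector(az_deg, el_deg):
    """Unit vector towards a source; elevation from the z-axis, azimuth from the x-axis."""
    az, el = np.deg2rad(az_deg), np.deg2rad(el_deg)
    return np.array([np.sin(el) * np.cos(az), np.sin(el) * np.sin(az), np.cos(el)])

def propagation_delays(positions, az_deg, el_deg, c=1.0):
    """tau_m = (r_m . u) / c for every sensor position r_m (rows of `positions`)."""
    return positions @ unit_vector(az_deg, el_deg) / c

def steering_vector(positions_wl, az_deg, el_deg):
    """Steering vector for positions given in wavelengths: exp(j 2 pi r_m . u)."""
    return np.exp(2j * np.pi * propagation_delays(positions_wl, az_deg, el_deg))

# Ten-element ULA on the x axis with half-wavelength spacing
ula = np.column_stack([0.5 * np.arange(10), np.zeros(10), np.zeros(10)])
a_endfire = steering_vector(ula, az_deg=0.0, el_deg=90.0)          # source along the x axis
tau_broadside = propagation_delays(ula, az_deg=90.0, el_deg=90.0)  # source along the y axis
print(np.round(a_endfire.real[:4]))   # alternating signs for half-wavelength spacing
```

For a source along the array axis, neighbouring elements differ by half a period, and for a source at broadside all delays vanish.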

To compare the performance of different geometries, we first consider the different delays with which the sources are received by the sensors.

One Dimension

The one-dimensional case is the simplest and most feasible. In this case, the unit vector pointing towards the dth source can be written as:


There is only one kind of array geometry in the one-dimensional case.

Uniform Linear Array (ULA)

A uniform linear array (ULA) composed of M elements (in this dissertation, M = 10) placed on the x axis with inter-element spacing equal to p (p = 0.5) is presented. The element placed at the origin serves as the common reference. The ULA is a classical array geometry and the basic configuration of antenna arrays. A ULA can only estimate the DOA in one parameter, e.g. the elevation.

We express the signal received at each element in terms of the signal received at the first, or reference, element. In the figure, the ULA receives the signal from a direction at an angle relative to the array broadside. The locations of the elements can be represented as:

The propagation time delay manifests itself as a phase shift of the received signal relative to the reference element. The delays of the elements can be expressed as:
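As a hedged illustration of these ULA delays, the sketch below assumes element positions (m p, 0, 0) for m = 0, ..., M-1, an arrival angle measured from broadside, and a delay equal to the projection of the element position onto the arrival direction divided by c. With p in wavelengths and c = 1, the delays come out in carrier periods. The function name `ula_delays` and the sign convention are assumptions.

```python
import numpy as np

def ula_delays(theta_deg, M=10, p=0.5, c=1.0):
    """Delay of each ULA element relative to the reference element at the origin."""
    m = np.arange(M)                               # element index, 0 is the reference
    return m * p * np.sin(np.deg2rad(theta_deg)) / c

tau = ula_delays(30.0)
phase = 2 * np.pi * tau                            # phase shift per element in radians
print(np.round(tau[:3], 3))                        # [0.   0.25 0.5 ]
```

The delay grows linearly with the element index, which is the structure that Root-MUSIC and ESPRIT exploit.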



Two Dimension

The two-dimensional case is the most common and has been studied for decades. In this case, the unit vector pointing towards the dth source can be written as:


There are various kinds of array geometry in the two-dimensional case. In this chapter, we focus on some classical array configurations: the Uniform Circle Array, L Shaped Array, Y Shaped Array and Rectangular Array.

Uniform Circle Array (UCA)

A Uniform Circle Array (UCA) composed of M elements (in this dissertation, M = 10) placed in the x-y plane with inter-element spacing equal to p (p = 0.5) is presented. The UCA is an important array geometry. The figures show a UCA with 10 elements located in the x-y plane, in a 3D perspective and in an x-y plane perspective.

The UCA is isotropic. Owing to this characteristic, it performs well across varying incident angles.

In this dissertation, we consider a UCA whose radius is chosen so that the elements are spaced by half a wavelength. The position vector can be written as:

and the propagation delay is then:
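A small sketch of the UCA geometry and its delays follows. It assumes that "spaced by half a wavelength" refers to the straight-line (chord) distance between neighbouring elements, which gives a radius of p / (2 sin(π/M)); if the spacing is meant along the arc, the radius would be M p / (2π) instead. It also assumes elevation from the z-axis, azimuth from the x-axis, and delays given by the projection of the position onto the source unit vector. The function names are hypothetical.

```python
import numpy as np

def uca_positions(M=10, p=0.5):
    """Element positions (M x 3) and radius of a UCA in the x-y plane."""
    r = p / (2 * np.sin(np.pi / M))                # assumption: chord between neighbours = p
    gamma = 2 * np.pi * np.arange(M) / M           # angular position of each element
    pos = np.column_stack([r * np.cos(gamma), r * np.sin(gamma), np.zeros(M)])
    return pos, r, gamma

def unit_vector(az_deg, el_deg):
    """Unit vector towards a source; elevation from the z-axis, azimuth from the x-axis."""
    az, el = np.deg2rad(az_deg), np.deg2rad(el_deg)
    return np.array([np.sin(el) * np.cos(az), np.sin(el) * np.sin(az), np.cos(el)])

pos, r, gamma = uca_positions()
az, el = 45.0, 60.0                                # hypothetical source direction
tau = pos @ unit_vector(az, el)                    # delays with c = 1, positions in wavelengths
# Closed form for a circular array: r sin(el) cos(az - gamma_m)
tau_closed = r * np.sin(np.deg2rad(el)) * np.cos(np.deg2rad(az) - gamma)
print(round(r, 3))   # 0.809
```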


L Shaped Array (LSA)

An L Shaped Array (LSA) composed of two uniform linear sub-arrays placed on the x and y axes, with inter-element spacing equal to p (p = 0.5), is presented. The element placed at the origin serves as the common reference. We choose ten elements in this case, with six elements on the x axis and five elements on the y axis, the element at the origin being shared by both sub-arrays.

The L Shaped Array (LSA) is an array geometry that is often used as a benchmark when comparing the DOA estimation performance of other geometries. It is easy to see that the LSA is not isotropic. The locations of the elements can be represented as:

The delay of every element:


Y shaped array (YSA)

A Y Shaped Array (YSA) composed of three uniform linear sub-arrays placed along the x axis and at angles of 2π/3 and -2π/3 from it, with inter-element spacing equal to p (p = 0.5), is presented. We take ten sensors into account in this case, with four elements in every uniform linear sub-array, including the shared element at the origin, which serves as the common reference.

The Y Shaped Array (YSA) is a particular array geometry that has been shown to perform well. The locations of the elements can be represented as:

The delay of elements can be shown as:
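The element layouts of the L shaped and Y shaped arrays, and the resulting delays, can be sketched as below. The sketch assumes that the origin element is shared between the sub-arrays (six plus five elements for the LSA, three arms of four elements for the YSA, ten elements in total in both cases), that the Y arms lie at 0, 2π/3 and -2π/3 from the x axis, and that delays are the projection of the position onto the source unit vector with c = 1. The function names are hypothetical.

```python
import numpy as np

def lsa_positions(nx=6, ny=5, p=0.5):
    """L shaped array: nx elements on the x axis, ny on the y axis, origin shared."""
    on_x = [(i * p, 0.0, 0.0) for i in range(nx)]
    on_y = [(0.0, j * p, 0.0) for j in range(1, ny)]   # skip the shared origin
    return np.array(on_x + on_y)

def ysa_positions(n_arm=4, p=0.5):
    """Y shaped array: three arms of n_arm elements each, origin shared."""
    pts = [(0.0, 0.0, 0.0)]
    for ang in (0.0, 2 * np.pi / 3, -2 * np.pi / 3):
        for k in range(1, n_arm):
            pts.append((k * p * np.cos(ang), k * p * np.sin(ang), 0.0))
    return np.array(pts)

def delays(positions, az_deg, el_deg, c=1.0):
    """tau_m = (r_m . u) / c; elevation from the z-axis, azimuth from the x-axis."""
    az, el = np.deg2rad(az_deg), np.deg2rad(el_deg)
    u = np.array([np.sin(el) * np.cos(az), np.sin(el) * np.sin(az), np.cos(el)])
    return positions @ u / c

lsa, ysa = lsa_positions(), ysa_positions()
print(lsa.shape, ysa.shape)   # (10, 3) (10, 3)
tau_lsa = delays(lsa, az_deg=0.0, el_deg=90.0)    # source along the x axis
```

For a source along the x axis only the x-axis elements of the LSA see a nonzero delay, which illustrates why this geometry is not isotropic.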


Rectangular Array (RA)

The Rectangular Array is composed of two uniform linear sub-arrays placed parallel to the x axis in the x-y plane; the axes of the two ULAs are the lines (y = 0.25, z = 0) and (y = -0.25, z = 0) in this case. The inter-element spacing is equal to p (p = 0.5). The element placed at (0, 0.25, 0) serves as the common reference.


The antenna array in the figure has a rectangular configuration composed of two ULAs placed in parallel in the x-y plane. The inter-sensor distance p is taken to be half a wavelength of the signal waves. The position vector can then be expressed as:

The propagation delay for the dth source on the mth sensor is derived as:


Three Dimension

The three-dimensional case is a new area in array geometry, and little research has been done on it. In the 3D case, the position vector of the sensors can be expressed as:


The performance depends on the geometry of the antenna array. There are various kinds of array geometry in the three-dimensional case. In this chapter, we focus on two configurations: the Double Uniform Circle Array and the Double L Shaped Array.

Double L Shaped Array (DLSA)

The 3D Double L Shaped Array with M elements (in this dissertation, M = 10), composed of three uniform linear sub-arrays placed on the x, y and z axes (four elements on each axis in this dissertation, including the shared element at the origin) with inter-element spacing equal to p (p = 0.5), is presented. The element placed at the origin serves as the common reference.


This antenna array configuration has already been proposed in [21] for the estimation of the 2D directions of arrival. However, the authors of [21] estimated independently the elevation angle by the sub-array placed on the z axis and the azimuth angle by both the sub-arrays placed on the x axis and the y axis. The position vector of the mth element is expressed as:

Then the propagation delay for the dth source on the mth sensor is derived as

Double Uniform Circle Array (DUCA)

The Double Uniform Circle Array (DUCA), composed of two uniform circle sub-arrays (five elements in each sub-array) placed in the planes z = p/2 and z = -p/2, with inter-element spacing equal to p (p = 0.5), is presented. In this dissertation, we take ten sensors into account. As the figure shows, each circle has five elements, and the circles are located in the planes z = 0.25 and z = -0.25 respectively. Given the radius of the sub-array circles, the position vector of the mth element is expressed as:

Then the propagation delay for the dth source on the mth sensor is derived as:
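The two 3D layouts and their delays can be sketched as follows. The sketch assumes that the Double L Shaped Array shares its origin element among the three axes (one plus three times three elements), that the two rings of the DUCA are aligned and have a radius chosen so that the chord between neighbouring ring elements equals p, and that delays are the projection of the position onto the source unit vector with c = 1. The ring radius, the ring alignment and the function names are assumptions made for the example.

```python
import numpy as np

def dlsa_positions(n_axis=4, p=0.5):
    """Double L shaped array: n_axis elements on each of x, y, z, origin shared."""
    pts = [(0.0, 0.0, 0.0)]
    for axis in range(3):
        for k in range(1, n_axis):
            pt = [0.0, 0.0, 0.0]
            pt[axis] = k * p
            pts.append(tuple(pt))
    return np.array(pts)

def duca_positions(n_ring=5, p=0.5):
    """Double uniform circle array: two aligned rings in the planes z = +p/2 and z = -p/2."""
    r = p / (2 * np.sin(np.pi / n_ring))           # assumption: chord between ring neighbours = p
    gamma = 2 * np.pi * np.arange(n_ring) / n_ring
    ring = np.column_stack([r * np.cos(gamma), r * np.sin(gamma)])
    top = np.column_stack([ring, np.full(n_ring, p / 2)])
    bottom = np.column_stack([ring, np.full(n_ring, -p / 2)])
    return np.vstack([top, bottom])

def delays(positions, az_deg, el_deg, c=1.0):
    """tau_m = (r_m . u) / c; elevation from the z-axis, azimuth from the x-axis."""
    az, el = np.deg2rad(az_deg), np.deg2rad(el_deg)
    u = np.array([np.sin(el) * np.cos(az), np.sin(el) * np.sin(az), np.cos(el)])
    return positions @ u / c

dlsa, duca = dlsa_positions(), duca_positions()
print(dlsa.shape, duca.shape)   # (10, 3) (10, 3)
tau_duca = delays(duca, az_deg=0.0, el_deg=0.0)   # source on the z axis
```

For a source on the z axis the delays reduce to the z coordinates of the elements, so a planar array would see no delay at all while both 3D layouts do.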