Advantages And Disadvantages Of Smart Antenna

✅ Paper Type: Free Essay	✅ Subject: English Language
✅ Wordcount: 5339 words	✅ Published: 25th Apr 2017

The Direction of Arrival (DOA) estimation algorithm, which may take various forms, generally follows from the homogeneous solution of the wave equation. The models of interest in this dissertation apply equally to an EM wave and to an acoustic wave. Since the propagation model is fundamentally the same in both cases, we will, for analytical expediency, show that it follows from the solution of Maxwell's equations, which strictly are valid only for EM waves. In empty space the equations can be written as:

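$$\nabla \cdot \mathbf{E} = 0 \tag{3.1}$$

$$\nabla \cdot \mathbf{B} = 0 \tag{3.2}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{3.3}$$

$$\nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} \tag{3.4}$$

where "·" and "×", respectively, denote the "divergence" and "curl." Furthermore, $\mathbf{B}$ is the magnetic induction, $\mathbf{E}$ denotes the electric field, and $\mu_0$ and $\varepsilon_0$ are the magnetic and dielectric constants respectively. Invoking 3.1, the following curl property results:

$$\nabla \times (\nabla \times \mathbf{E}) = \nabla(\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\nabla^2 \mathbf{E} \tag{3.5}$$

$$\nabla^2 \mathbf{E} = \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} \tag{3.6}$$

$$\nabla^2 \mathbf{E} - \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 \tag{3.7}$$

The constant $c$ is generally referred to as the speed of propagation. For EM waves in free space, it follows from the derivation that $c = 1/\sqrt{\mu_0 \varepsilon_0} \approx 3 \times 10^8$ m/s. The homogeneous wave equation (3.7) constitutes the physical motivation for our assumed data model, regardless of the type of wave or medium. In some applications, the underlying physics are irrelevant, and it is merely the mathematical structure of the data model that counts.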
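As a quick numerical check of the relation between the speed of propagation and the two free-space constants, the following Python sketch evaluates $c = 1/\sqrt{\mu_0 \varepsilon_0}$. The constant values and the function name `propagation_speed` are assumptions introduced here for illustration only.

```python
import math

# Free-space constants in SI units (values assumed for illustration).
MU_0 = 4.0e-7 * math.pi        # magnetic constant, H/m
EPSILON_0 = 8.8541878128e-12   # dielectric constant, F/m


def propagation_speed(mu: float, epsilon: float) -> float:
    """Speed of propagation implied by the wave equation: c = 1 / sqrt(mu * epsilon)."""
    return 1.0 / math.sqrt(mu * epsilon)


c = propagation_speed(MU_0, EPSILON_0)
print(f"c = {c:.4e} m/s")
```

The result is close to the familiar $3 \times 10^8$ m/s quoted in the text.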

3.2 Plane wave

In the physics of wave propagation, a plane wave is a constant-frequency wave whose wave fronts are infinite parallel planes of constant peak-to-peak amplitude normal to the phase velocity vector.

In practice, a true plane wave cannot exist, since only a plane wave of infinite extent propagates as a plane wave. However, many waves are approximately plane waves in a localized region of space. For example, a localized source such as an antenna produces a field that is approximately a plane wave in its far-field region, far enough from the antenna. Similarly, when the length scales are much longer than the wavelength, as is often the case for light in optics, the waves can be treated as light rays that correspond locally to plane waves.

3.2.1 Mathematical definition

Two functions that meet the criteria of constant frequency and constant amplitude are the sine and cosine functions. One of the simplest ways to use such a sinusoid is to define it along the direction of the x axis. The equation below uses the cosine function to express a plane wave travelling in the positive x direction.

$$A(x,t) = A_o \cos(kx - \omega t + \varphi) \tag{3.8}$$

where $A(x,t)$ is the magnitude of the wave at a given point in space and time. $A_o$ is the amplitude of the wave, which is the peak magnitude of the oscillation. $k$ is the wave number, or more specifically the angular wave number, and equals $2\pi/\lambda$, where $\lambda$ is the wavelength of the wave. $k$ has units of radians per unit distance and is a measure of how rapidly the disturbance changes over a given distance at a particular point in time.

x is a point along the x axis. y and z are not considered in the equation because the wave's magnitude and phase are the same at every point on any given y-z plane. This equation defines what that magnitude and phase are.

$\omega$ is the wave's angular frequency, which equals $2\pi/T$, where $T$ is the period of the wave. $\omega$ has units of radians per unit time and is a measure of how rapidly the disturbance changes over a given length of time at a particular point in space.

$t$ is a given point in time, and $\varphi$ is the phase shift of the wave, in radians. Note that a positive phase shift shifts the wave along the negative x axis direction at a given point in time. A phase shift of $2\pi$ radians shifts it by exactly one wavelength. Other formulations, which directly use the wavelength $\lambda$, period $T$, frequency $f$ and velocity $c$, are as follows:

$$A = A_o \cos[2\pi(x/\lambda - t/T) + \varphi] \tag{3.9}$$

$$A = A_o \cos[2\pi(x/\lambda - ft) + \varphi] \tag{3.10}$$

$$A = A_o \cos[(2\pi/\lambda)(x - ct) + \varphi] \tag{3.11}$$

To appreciate the equivalence of the above set of equations, note that

$$f = 1/T$$

and

$$c = \lambda/T = \omega/k$$
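The equivalence of the four formulations (3.8) to (3.11) can also be checked numerically. The sketch below evaluates each form at a few sample points and reports the largest disagreement. All function names and the numerical values of the amplitude, wavelength, period and phase are arbitrary assumptions chosen for illustration.

```python
import math


def wave_k_omega(x, t, a0, k, omega, phi):
    """Form (3.8): A = A_o cos(kx - wt + phi)."""
    return a0 * math.cos(k * x - omega * t + phi)


def wave_lambda_period(x, t, a0, lam, period, phi):
    """Form (3.9): A = A_o cos[2 pi (x/lambda - t/T) + phi]."""
    return a0 * math.cos(2 * math.pi * (x / lam - t / period) + phi)


def wave_lambda_freq(x, t, a0, lam, freq, phi):
    """Form (3.10): A = A_o cos[2 pi (x/lambda - f t) + phi]."""
    return a0 * math.cos(2 * math.pi * (x / lam - freq * t) + phi)


def wave_lambda_speed(x, t, a0, lam, speed, phi):
    """Form (3.11): A = A_o cos[(2 pi/lambda)(x - c t) + phi]."""
    return a0 * math.cos((2 * math.pi / lam) * (x - speed * t) + phi)


# Arbitrary illustrative parameters (assumptions, not taken from the text).
a0, lam, period, phi = 1.5, 0.5, 0.02, 0.3
k = 2 * math.pi / lam          # angular wave number
omega = 2 * math.pi / period   # angular frequency
freq = 1 / period              # f = 1/T
speed = lam / period           # c = lambda/T = omega/k

samples = [(0.0, 0.0), (0.1, 0.005), (0.37, 0.013), (1.2, 0.05)]
max_diff = 0.0
for x, t in samples:
    reference = wave_k_omega(x, t, a0, k, omega, phi)
    alternatives = [
        wave_lambda_period(x, t, a0, lam, period, phi),
        wave_lambda_freq(x, t, a0, lam, freq, phi),
        wave_lambda_speed(x, t, a0, lam, speed, phi),
    ]
    for value in alternatives:
        max_diff = max(max_diff, abs(value - reference))

print(f"largest disagreement between the four forms: {max_diff:.2e}")
```

Any disagreement is at the level of floating-point rounding, which reflects the identities $f = 1/T$ and $c = \lambda/T = \omega/k$.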

3.2.2 Application

Plane waves are solutions of the scalar wave equation in a homogeneous medium. For vector wave equations, e.g., those describing waves in an elastic solid or electromagnetic radiation, the solution for a homogeneous medium is similar: the scalar amplitude $A_o$ is replaced by a constant vector $\mathbf{A}_o$. For example, in electromagnetism $\mathbf{A}_o$ is the vector of the electric field, magnetic field, or vector potential. A transverse wave is one in which the amplitude vector is perpendicular to $\mathbf{k}$, which is the case for electromagnetic waves in an isotropic medium. In contrast, a longitudinal wave is one in which the amplitude vector is parallel to $\mathbf{k}$, as is typical for acoustic waves in a gas or fluid.

The plane wave equation holds for arbitrary combinations of $\omega$ and $k$. However, any real physical medium only allows such waves to propagate for those combinations of $\omega$ and $k$ that satisfy the dispersion relation of the medium. The dispersion relation is often expressed as a function $\omega(k)$, where the ratio $\omega/|k|$ gives the magnitude of the phase velocity and $d\omega/dk$ gives the group velocity. For electromagnetism in an isotropic medium with index of refraction $n$, the phase velocity is $c/n$, which equals the group velocity provided that the index is frequency independent.
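To make the relation between the dispersion relation and the two velocities concrete, the sketch below assumes the simplest case mentioned in the text, a frequency-independent index $n$, so that $\omega(k) = ck/n$. The function names, the index value and the wavelength are illustrative assumptions; the group velocity is approximated by a central finite difference rather than derived analytically.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def omega_of_k(k: float, n: float) -> float:
    """Assumed dispersion relation for a frequency-independent index n."""
    return C_VACUUM * k / n


def phase_velocity(k: float, n: float) -> float:
    """Magnitude of the phase velocity, omega / |k|."""
    return omega_of_k(k, n) / abs(k)


def group_velocity(k: float, n: float, dk: float = 1.0) -> float:
    """Group velocity d(omega)/dk, approximated by a central difference."""
    return (omega_of_k(k + dk, n) - omega_of_k(k - dk, n)) / (2.0 * dk)


n_index = 1.5                      # illustrative refractive index
k_value = 2.0 * math.pi / 500e-9   # wave number for a 500 nm wavelength

v_phase = phase_velocity(k_value, n_index)
v_group = group_velocity(k_value, n_index)
print(f"phase velocity: {v_phase:.4e} m/s")
print(f"group velocity: {v_group:.4e} m/s")
```

Both values agree with $c/n$ here because the assumed dispersion relation is linear in $k$; for a frequency-dependent index the two would differ.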

In a linear uniform medium, a solution of the wave equation can be expressed as a superposition of plane waves. This approach is known as the Angular Spectrum method. The plane-wave form of the solution is a general consequence of translational symmetry. In the more general case of periodic structures with discrete translational symmetry, the solutions take the form of Bloch waves, most famously in crystalline atomic materials but also in photonic crystals and other periodic wave equations.

3.3 Propagation

Many physical phenomena are either a result of waves propagating through a medium or exhibit a wave-like physical manifestation. Though 3.7 is a vector equation, we only consider one of its components, say $E(\mathbf{r},t)$, where $\mathbf{r}$ is the radius vector. It will later be assumed that the measured sensor outputs are proportional to $E(\mathbf{r},t)$. Interestingly enough, any field of the form $E(\mathbf{r},t) = f(t - \mathbf{r}^T\boldsymbol{\alpha})$ satisfies 3.7, provided $|\boldsymbol{\alpha}| = 1/c$, with $T$ denoting transposition. Through its dependence on $t - \mathbf{r}^T\boldsymbol{\alpha}$ only, the solution can be interpreted as a wave traveling in the direction $\boldsymbol{\alpha}$, with the speed of propagation $1/|\boldsymbol{\alpha}| = c$. For the latter reason, $\boldsymbol{\alpha}$ is referred to as the slowness vector. The chief interest herein is in narrowband forcing functions. The details of generating such a forcing function can be found in the classic book by Jordan [59]. In complex notation [63] and taking the origin as a reference, a narrowband transmitted waveform can be expressed as:

$$E(\mathbf{0},t) = s(t)\,e^{j\omega t} \tag{3.12}$$

where $s(t)$ is slowly time varying compared to the carrier $e^{j\omega t}$. For $|\mathbf{r}| \ll c/B$, where $B$ is the bandwidth of $s(t)$, we can write:

$$E(\mathbf{r},t) = s(t - \mathbf{r}^T\boldsymbol{\alpha})\,e^{j\omega(t - \mathbf{r}^T\boldsymbol{\alpha})} \approx s(t)\,e^{j(\omega t - \mathbf{r}^T\mathbf{k})} \tag{3.13}$$

In the last equation 3.13, the so-called wave vector $\mathbf{k} = \boldsymbol{\alpha}\omega$ was introduced, and its magnitude $|\mathbf{k}| = k = \omega/c$ is the wavenumber. One can also write $k = 2\pi/\lambda$, where $\lambda$ is the wavelength. Note that $\mathbf{k}$ also points in the direction of propagation, e.g., in the x-y plane we get:

$$\mathbf{k} = k\,(\cos\theta \;\; \sin\theta)^T \tag{3.14}$$

where $\theta$ is the direction of propagation, defined counterclockwise relative to the x axis. It should be noted that 3.12 implicitly assumed far-field conditions, since an isotropic point source (one that radiates uniformly in all directions) gives rise to a spherical traveling wave whose amplitude is inversely proportional to the distance to the source. All points lying on the surface of a sphere of radius R then share a common phase and are referred to as a wave front. This indicates that the distance between the emitters and the receiving antenna array determines whether the sphericity of the wave should be taken into account. The reader is referred to e.g., [10, 24] for treatments of near field reception. Far field receiving conditions imply that the radius of propagation is so large that a flat plane of constant phase can be considered, thus resulting in a plane wave as indicated in Eq. 3.13. Though not necessary, the latter will be our assumed working model for convenience of exposition.
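The narrowband plane-wave model can be illustrated with a small numerical sketch. It assumes the far-field form $E(\mathbf{r},t) \approx s(t)\,e^{j(\omega t - \mathbf{r}^T\mathbf{k})}$ with $\mathbf{k} = k(\cos\theta, \sin\theta)^T$ and $k = 2\pi/\lambda$, and compares the field at two sensor positions. The function names, the sensor positions, the wavelength and the propagation speed are hypothetical values chosen for illustration.

```python
import cmath
import math


def wave_vector(wavelength: float, theta: float) -> tuple:
    """Wave vector k = k (cos theta, sin theta)^T in the x-y plane, with k = 2 pi / lambda."""
    k = 2.0 * math.pi / wavelength
    return (k * math.cos(theta), k * math.sin(theta))


def narrowband_field(s: complex, omega: float, t: float, r: tuple, k_vec: tuple) -> complex:
    """Far-field narrowband model: E(r, t) ~ s(t) exp(j (omega t - r^T k))."""
    phase = omega * t - (r[0] * k_vec[0] + r[1] * k_vec[1])
    return s * cmath.exp(1j * phase)


wavelength = 1.0                  # metres (illustrative)
speed = 3.0e8                     # assumed propagation speed, m/s
omega = 2.0 * math.pi * speed / wavelength
theta = math.radians(60.0)        # direction of propagation, counterclockwise from the x axis
k_vec = wave_vector(wavelength, theta)

t = 2.5e-9
origin = (0.0, 0.0)
sensor = (0.5 * wavelength, 0.0)  # second sensor, half a wavelength along the x axis

e_origin = narrowband_field(1.0, omega, t, origin, k_vec)
e_sensor = narrowband_field(1.0, omega, t, sensor, k_vec)

# The ratio isolates the spatial phase factor exp(-j r^T k) between the two sensors.
ratio = e_sensor / e_origin
print(f"phase difference: {math.degrees(cmath.phase(ratio)):.1f} degrees")
```

For this geometry the spatial phase lag is $k d \cos\theta = \pi/2$, so the phase difference between sensors carries the direction information that DOA estimation exploits.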

Note that a linear medium implies the validity of the superposition principle, and thus allows for more than one traveling wave. Equation 3.13 carries both spatial and temporal information and represents an adequate model for distinguishing signals with distinct spatial-temporal parameters. These may come in various forms, such as DOA (in general, azimuth and elevation), signal polarization, transmitted waveforms, temporal frequency, etc. Each emitter is generally associated with a set of such characteristics. The interest in unfolding the signal parameters forms the essence of sensor array signal processing as presented herein, and continues to be an important and active topic of research.

3.4 Smart antenna

Smart antennas are devices which adapt their radiation pattern to achieve improved performance - either range or capacity or some combination of these [1].

The rapid growth in demand for mobile communications services has encouraged research into the design of wireless systems to improve spectrum efficiency and increase link quality [7]. By using existing resources more effectively, smart antenna technology has the potential to significantly increase wireless system performance. With intelligent control of signal transmission and reception, the capacity and coverage of mobile wireless networks can be significantly improved [2].

In a communication system, the ability to distinguish different users is essential. The smart antenna can be used to add spatial diversity, which is referred to as Space Division Multiple Access (SDMA). Conventionally, the most common multiple access schemes are Frequency Division Multiple Access (FDMA), Time Division Multiple Access (TDMA), and Code Division Multiple Access (CDMA). These schemes separate users in the frequency, time, and code domains respectively, giving three different dimensions of diversity.

The potential benefits of the smart antenna show in many ways: resistance to multipath fading, reduced delay spread to support high data rates, interference suppression, a reduced near-far effect, lower outage probability, improved BER (Bit Error Rate) performance, increased system capacity, improved spectral efficiency, flexible and efficient handoff, expanded cell coverage, flexible cell management, extended battery life for mobile stations, and lower maintenance and operating costs.

3.4.1 Types of Smart Antennas

The environment and the system's requirements decide the type of Smart Antennas. There are two main types of Smart Antennas. They are as follows:

Phased Array Antenna

In this type of smart antenna, there are a number of fixed beams, among which one is switched on or steered toward the target signal. This provides only a first stage of adjustment. In other words, the beam is steered to follow the target as it moves [2].

Adaptive Array Antenna

Integrated with adaptive digital signal processing technology, this type of smart antenna uses digital signal processing algorithms to measure the signal strength of the beam, so that the antenna can dynamically change the beam in which the transmit power is concentrated, as Figure 3.2 shows. Applying spatial processing can increase signal capacity, so that multiple users can share a channel.

An adaptive antenna array is a closed-loop feedback control system consisting of an antenna array and a real-time adaptive receive signal processor, which uses feedback control to adjust the antenna array pattern automatically. It forms nulls in the directions of interfering signals to cancel them, and can strengthen the useful signal, thereby achieving anti-jamming [3].

Figure 3.2

3.4.2 Advantages and disadvantages of smart antenna

Advantages

First of all, the smart antenna provides a high level of efficiency and power for the target signal. Smart antennas generate narrow pencil beams when a large number of antenna elements are used at high frequencies. Thus, in the direction of the target signal, the efficiency is significantly higher. With adaptive array antennas, a fixed number of antenna elements produces a power gain of the same factor, i.e., N elements give an N-fold power gain.

Another improvement is in the amount of interference that is suppressed. Phased array antennas suppress interference by means of their narrow beam, and adaptive array antennas suppress it by adjusting the beam pattern [2].
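The link between the number of antenna elements and the width of the beam can be sketched numerically. The example below is a minimal illustration under stated assumptions: a uniform linear array with half-wavelength spacing and simple phase-steered (delay-and-sum) weights, which the text does not specify. The function names `array_factor` and `half_power_beamwidth_deg` are hypothetical.

```python
import cmath
import math


def array_factor(n_elements, spacing_wavelengths, steer_deg, look_deg):
    """Normalized magnitude response of a phase-steered uniform linear array.

    Angles are measured from broadside; the response is 1 in the steering direction.
    """
    psi = 2.0 * math.pi * spacing_wavelengths * (
        math.sin(math.radians(look_deg)) - math.sin(math.radians(steer_deg))
    )
    total = sum(cmath.exp(1j * m * psi) for m in range(n_elements))
    return abs(total) / n_elements


def half_power_beamwidth_deg(n_elements, spacing_wavelengths, steer_deg=0.0, step=0.01):
    """Width of the main lobe between its two half-power points, found by scanning."""
    def power(deg):
        return array_factor(n_elements, spacing_wavelengths, steer_deg, deg) ** 2

    upper = steer_deg
    while upper < 90.0 and power(upper) >= 0.5:
        upper += step
    lower = steer_deg
    while lower > -90.0 and power(lower) >= 0.5:
        lower -= step
    return upper - lower


bw4 = half_power_beamwidth_deg(4, 0.5)
bw16 = half_power_beamwidth_deg(16, 0.5)
print(f"4 elements:  half-power beamwidth ~ {bw4:.1f} degrees")
print(f"16 elements: half-power beamwidth ~ {bw16:.1f} degrees")
```

Under these assumptions the 16-element array produces a main lobe several times narrower than the 4-element array, which is the "pencil beam" effect described above.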

Disadvantages

The main disadvantage is cost. Such devices cost more than conventional antennas, not only in electronics but also in energy consumption. That is to say, the device is expensive and can also shorten the life of other devices. The number of receiver chains must be kept low in order to reduce the cost, because each antenna requires its own RF electronics and A/D converter.

Moreover, the size of the antenna is another problem. Large base stations are needed to make this method efficient, which increases the size; in addition, multiple external antennas are needed on each terminal.

Finally, diversity brings disadvantages of its own. When mitigation is needed, diversity becomes a serious issue, because both the terminals and the base stations must be equipped with multiple antennas.

3.5 White noise

White noise is a random signal with a flat power spectral density. In other words, the signal contains equal power within any band of a given width, regardless of the centre frequency. White noise draws its name from white light, in which the power spectral density is distributed over the visible band in such a way that the eye's three colour receptors are approximately equally stimulated.

In the statistical sense, a time series $w_t$ is characterized as weak white noise if $\{w_t\}$ is a sequence of serially uncorrelated random variables with zero mean and finite variance. Strong white noise additionally is independent and identically distributed, which implies no autocorrelation. In particular, if $w_t$ is normally distributed with zero mean and standard deviation $\sigma$, the series is called Gaussian white noise [1].

An infinite-bandwidth white noise signal is purely a theoretical construction that cannot be realized. In practice, the bandwidth of white noise is limited by the transmission medium, the mechanism of noise generation, and finite observation capabilities. A random signal is referred to as "white noise" if it is observed to have a flat spectrum over the widest possible bandwidth of the medium.

3.5.1 Mathematical definition

White random vector

A random vector $\mathbf{w}$ is a white random vector if and only if its mean vector and autocorrelation matrix are the following:

$$\mu_w = \mathbb{E}\{\mathbf{w}\} = 0 \tag{3.15}$$

$$R_{ww} = \mathbb{E}\{\mathbf{w}\mathbf{w}^T\} = \sigma^2 \mathbf{I} \tag{3.16}$$

That is to say, it is a zero mean random vector, and its autocorrelation matrix is a multiple of the identity matrix. When the autocorrelation matrix is a multiple of the identity, we say that it has spherical correlation.
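The two defining properties of a white random vector can be checked empirically. The sketch below draws many realizations of a three-dimensional vector with independent Gaussian entries (one possible white vector, assumed here for illustration) and estimates the mean vector and the autocorrelation matrix by averaging. The function names, the dimension, the seed and the value of $\sigma$ are arbitrary choices.

```python
import random


def sample_white_vector(dim, sigma, rng):
    """One realization of a zero-mean vector with independent Gaussian entries."""
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


def estimate_mean_and_autocorrelation(samples):
    """Sample estimates of E{w} and E{w w^T} from a list of vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    r_ww = [
        [sum(s[i] * s[j] for s in samples) / n for j in range(dim)]
        for i in range(dim)
    ]
    return mean, r_ww


rng = random.Random(0)   # fixed seed so the run is repeatable
sigma = 2.0
samples = [sample_white_vector(3, sigma, rng) for _ in range(20000)]
mean, r_ww = estimate_mean_and_autocorrelation(samples)

print("estimated mean vector:", [round(m, 3) for m in mean])
for row in r_ww:
    print("estimated R_ww row:  ", [round(v, 3) for v in row])
```

The estimated mean is close to the zero vector and the estimated autocorrelation matrix is close to $\sigma^2 \mathbf{I}$, with small off-diagonal entries that shrink as the number of realizations grows.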

White random process

A continuous-time random process $w(t)$, where $t \in \mathbb{R}$, is a white noise signal if and only if its mean function and autocorrelation function satisfy the following:

$$\mu_w(t) = \mathbb{E}\{w(t)\} = 0 \tag{3.17}$$

$$R_{ww}(t_1, t_2) = \mathbb{E}\{w(t_1)\,w(t_2)\} = (N_0/2)\,\delta(t_1 - t_2) \tag{3.18}$$

That is to say, it is zero mean for all time and has infinite power at zero time shift since its autocorrelation function is the Dirac delta function.

The above autocorrelation function implies the following power spectral density. Since the Fourier transform of the delta function is equal to 1, it follows that:

$$S_{ww}(\omega) = N_0/2 \tag{3.19}$$

Since this power spectral density is the same at all frequencies, we call it white, by analogy with the frequency spectrum of white light. A generalization to random elements on infinite dimensional spaces, e.g. random fields, is the white noise measure.

3.5.2 Statistical properties

White noise is uncorrelated in time, and this property does not restrict the values a signal can take. Any distribution of values is possible. Even a binary signal that can only take the values 1 or -1 will be white provided that the sequence is statistically uncorrelated. Noise with a continuous distribution, such as a normal distribution, can of course be white as well.
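The point that a binary signal can be white is easy to illustrate. The sketch below generates a sequence of independent, equally likely values 1 and -1 (an assumed construction) and estimates its autocorrelation at a few lags. The function name `sample_autocorrelation`, the seed and the sequence length are illustrative choices.

```python
import random


def sample_autocorrelation(sequence, lag):
    """Average of x[i] * x[i + lag] over the available samples."""
    n = len(sequence) - lag
    return sum(sequence[i] * sequence[i + lag] for i in range(n)) / n


rng = random.Random(1)   # fixed seed so the run is repeatable
binary_noise = [rng.choice((-1, 1)) for _ in range(20000)]

autocorr = {lag: sample_autocorrelation(binary_noise, lag) for lag in range(5)}
for lag, value in autocorr.items():
    print(f"lag {lag}: {value:+.4f}")
```

The estimate equals 1 at lag 0, since every sample squared is 1, and is close to 0 at every other lag, which is the discrete-time counterpart of the delta-shaped autocorrelation function.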

It is often incorrectly assumed that Gaussian noise is necessarily white noise, yet neither property implies the other. Gaussianity refers to the probability distribution of the signal's value, in this context the probability of the signal reaching a given amplitude, while the term 'white' refers to the way the signal power is distributed over time or among frequencies.

Figure: Spectrogram of pink noise (left) and white noise (right), shown with a linear frequency axis (vertical).

We can therefore find Gaussian white noise, but also Poisson, Cauchy, etc. white noises. Thus, the two words "Gaussian" and "white" are often both specified in mathematical models of systems. Gaussian white noise is a good approximation of many real-world situations and generates mathematically tractable models. These models are used so frequently that the term additive white Gaussian noise has a standard abbreviation: AWGN. White noise is the generalized mean-square derivative of the Wiener process or Brownian motion.

3.6 Normal Distribution

In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution that has a bell-shaped probability density function, known as the Gaussian function or informally as the bell curve[1].

$$f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

The parameter $\mu$ is the mean or expectation (location of the peak) and $\sigma^2$ is the variance. $\sigma$ is known as the standard deviation. The distribution with $\mu = 0$ and $\sigma^2 = 1$ is called the standard normal distribution or the unit normal distribution. A normal distribution is often used as a first approximation to describe real-valued random variables that cluster around a single mean value.

The normal distribution is considered the most prominent probability distribution in statistics. There are several reasons for this:[1] First, the normal distribution arises from the central limit theorem, which states that under mild conditions, the mean of a large number of random variables drawn from the same distribution is distributed approximately normally, irrespective of the form of the original distribution. This gives it exceptionally wide application in, for example, sampling. Secondly, the normal distribution is very tractable analytically, that is, a large number of results involving this distribution can be derived in explicit form.

For these reasons, the normal distribution is commonly encountered in practice, and is used throughout statistics, natural sciences, and social sciences [2] as a simple model for complex phenomena. For example, the observational error in an experiment is usually assumed to follow a normal distribution, and the propagation of uncertainty is computed using this assumption. Note that a normally distributed variable has a symmetric distribution about its mean. Quantities that grow exponentially, such as prices, incomes or populations, are often skewed to the right, and hence may be better described by other distributions, such as the log-normal distribution or Pareto distribution. In addition, the probability of seeing a normally distributed value that is far (i.e. more than a few standard deviations) from the mean drops off extremely rapidly. As a result, statistical inference using a normal distribution is not robust to the presence of outliers (data that are unexpectedly far from the mean, due to exceptional circumstances, observational error, etc.). When outliers are expected, data may be better described using a heavy-tailed distribution such as the Student's t-distribution.

3.6.1 Mathematical Definition

The simplest case of a normal distribution is known as the standard normal distribution, described by the probability density function

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}.$$

The factor $1/\sqrt{2\pi}$ in this expression ensures that the total area under the curve $\phi(x)$ is equal to one, and the $1/2$ in the exponent makes the "width" of the curve (measured as half the distance between the inflection points) also equal to one. It is traditional in statistics to denote this function with the Greek letter $\phi$ (phi), whereas density functions for all other distributions are usually denoted with the letters f or p.[5] The alternative glyph $\varphi$ is also used quite often; however, within this article "$\varphi$" is reserved to denote characteristic functions.

Every normal distribution is the result of exponentiating a quadratic function (just as an exponential distribution results from exponentiating a linear function):

$$f(x) = e^{a x^2 + b x + c}.$$

This yields the classic "bell curve" shape, provided that $a < 0$ so that the quadratic function is concave for $x$ close to 0. $f(x) > 0$ everywhere. One can adjust $a$ to control the "width" of the bell, then adjust $b$ to move the central peak of the bell along the x-axis, and finally one must choose $c$ such that $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (which is only possible when $a < 0$).

Rather than using $a$, $b$, and $c$, it is far more common to describe a normal distribution by its mean $\mu = -\frac{b}{2a}$ and variance $\sigma^2 = -\frac{1}{2a}$. Changing to these new parameters allows one to rewrite the probability density function in a convenient standard form,

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{\frac{-(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sigma}\, \phi\!\left(\frac{x-\mu}{\sigma}\right).$$

For a standard normal distribution, $\mu = 0$ and $\sigma^2 = 1$. The last part of the equation above shows that any other normal distribution can be regarded as a version of the standard normal distribution that has been stretched horizontally by a factor $\sigma$ and then translated rightward by a distance $\mu$. Thus, $\mu$ specifies the position of the bell curve's central peak, and $\sigma$ specifies the "width" of the bell curve.

The parameter $\mu$ is at the same time the mean, the median and the mode of the normal distribution. The parameter $\sigma^2$ is called the variance; as for any random variable, it describes how concentrated the distribution is around its mean. The square root of $\sigma^2$ is called the standard deviation and is the width of the density function.

The normal distribution is usually denoted by $N(\mu, \sigma^2)$.[6] Thus when a random variable $X$ is distributed normally with mean $\mu$ and variance $\sigma^2$, we write

$$X \sim \mathcal{N}(\mu, \sigma^2).$$
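The relations between the three ways of writing the density can be verified with a short sketch. It evaluates the standard form, the scaled standard normal $\frac{1}{\sigma}\phi\!\left(\frac{x-\mu}{\sigma}\right)$, and the exponentiated quadratic $e^{ax^2+bx+c}$, and also compares against `statistics.NormalDist` from the Python standard library. The helper names and the values of $\mu$ and $\sigma$ are assumptions chosen for illustration; the expression used for $c$ is the normalizing choice implied by the standard form.

```python
import math
from statistics import NormalDist


def standard_normal_pdf(x):
    """phi(x) = exp(-x^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) in the standard form."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)


def quadratic_coefficients(mu, sigma):
    """Coefficients a, b, c such that exp(a x^2 + b x + c) is the N(mu, sigma^2) density."""
    a = -1.0 / (2.0 * sigma ** 2)
    b = mu / sigma ** 2
    c = -(mu ** 2) / (2.0 * sigma ** 2) - math.log(sigma * math.sqrt(2.0 * math.pi))
    return a, b, c


mu, sigma = 1.5, 0.7   # illustrative parameters
a, b, c = quadratic_coefficients(mu, sigma)

for x in (-1.0, 0.0, 1.5, 2.2, 4.0):
    direct = normal_pdf(x, mu, sigma)
    scaled = standard_normal_pdf((x - mu) / sigma) / sigma
    quadratic = math.exp(a * x * x + b * x + c)
    library = NormalDist(mu, sigma).pdf(x)
    print(f"x={x:+.1f}  direct={direct:.6f}  scaled={scaled:.6f}  "
          f"quadratic={quadratic:.6f}  library={library:.6f}")

print("mean recovered from a, b:    ", -b / (2.0 * a))
print("variance recovered from a:   ", -1.0 / (2.0 * a))
```

All four columns agree, and the mean and variance recovered from the quadratic coefficients match $\mu = -b/(2a)$ and $\sigma^2 = -1/(2a)$.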

3.6.2 Alternative formulations

Some authors advocate using the precision instead of the variance. The precision is normally defined as the reciprocal of the variance ($\tau = \sigma^{-2}$), although it is occasionally defined as the reciprocal of the standard deviation ($\tau = \sigma^{-1}$).[7] This parameterization has an advantage in numerical applications where $\sigma^2$ is very close to zero and is more convenient to work with in analysis as $\tau$ is a natural parameter of the normal distribution. This parameterization is common in Bayesian statistics, as it simplifies the Bayesian analysis of the normal distribution. Another advantage of using this parameterization is in the study of conditional distributions in the multivariate normal case. The form of the normal distribution with the more common definition $\tau = \sigma^{-2}$ is as follows:

$$f(x;\mu,\tau) = \sqrt{\frac{\tau}{2\pi}}\, e^{\frac{-\tau(x-\mu)^2}{2}}.$$

The question of which normal distribution should be called the "standard" one is also answered differently by various authors. Starting from the works of Gauss the standard normal was considered to be the one with variance $\sigma^2 = \frac{1}{2}$:

$$f(x) = \frac{1}{\sqrt{\pi}}\, e^{-x^2}$$

Stigler (1982) goes even further and insists the standard normal to be with the variance $\sigma^2 = \frac{1}{2\pi}$:

$$f(x) = e^{-\pi x^2}$$

According to the author, this formulation is advantageous because of a much simpler and easier-to-remember formula, the fact that the pdf has unit height at zero, and simple approximate formulas for the quantiles of the distribution.
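The two alternative "standard" forms are special cases of the precision parameterization, which the following sketch checks numerically: $\sigma^2 = 1/2$ corresponds to $\tau = 2$ and $\sigma^2 = 1/(2\pi)$ corresponds to $\tau = 2\pi$. The function names are hypothetical and the test points are arbitrary.

```python
import math


def normal_pdf_precision(x, mu, tau):
    """Normal density written with the precision tau = 1 / sigma^2."""
    return math.sqrt(tau / (2.0 * math.pi)) * math.exp(-tau * (x - mu) ** 2 / 2.0)


def gauss_standard_pdf(x):
    """Gauss's standard form, variance 1/2: exp(-x^2) / sqrt(pi)."""
    return math.exp(-x * x) / math.sqrt(math.pi)


def stigler_standard_pdf(x):
    """Stigler's standard form, variance 1/(2 pi): exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)


for x in (-1.2, 0.0, 0.3, 0.9):
    gauss_via_tau = normal_pdf_precision(x, 0.0, 2.0)
    stigler_via_tau = normal_pdf_precision(x, 0.0, 2.0 * math.pi)
    print(f"x={x:+.1f}  Gauss: {gauss_standard_pdf(x):.6f} vs {gauss_via_tau:.6f}  "
          f"Stigler: {stigler_standard_pdf(x):.6f} vs {stigler_via_tau:.6f}")

print("Stigler form at zero:", stigler_standard_pdf(0.0))
```

The last line shows the unit height at zero that the text cites as one advantage of Stigler's form.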

3.7 Cramer-Rao Bound

In estimation theory and statistics, the Cramér-Rao bound (CRB) or Cramér-Rao lower bound (CRLB), named in honor of Harald Cramér and Calyampudi Radhakrishna Rao, who were among the first to derive it,[1][2][3] expresses a lower bound on the variance of estimators of a deterministic parameter. The bound is also known as the Cramér-Rao inequality or the information inequality.

In its simplest form, the bound states that the variance of any unbiased estimator is at least as high as the inverse of the Fisher information. An unbiased estimator which achieves this lower bound is said to be (fully) efficient. Such a solution achieves the lowest possible mean squared error among all unbiased methods, and is therefore the minimum variance unbiased (MVU) estimator. However, in some cases, no unbiased technique exists which achieves the bound. This may occur even when an MVU estimator exists.

The Cramér-Rao bound can also be used to bound the variance of biased estimators of given bias. In some cases, a biased approach can result in both a variance and a mean squared error that are below the unbiased Cramér-Rao lower bound; see estimator bias.

Statement

The Cramér-Rao bound is stated in this section for several increasingly general cases, beginning with the case in which the parameter is a scalar and its estimator is unbiased. All versions of the bound require certain regularity conditions, which hold for most well-behaved distributions. These conditions are listed later in this section.

Scalar unbiased case

Suppose $\theta$ is an unknown deterministic parameter which is to be estimated from measurements $x$, distributed according to some probability density function $f(x;\theta)$. The variance of any unbiased estimator $\hat{\theta}$ of $\theta$ is then bounded by the reciprocal of the Fisher information $I(\theta)$:

$$\mathrm{var}(\hat{\theta}) \geq \frac{1}{I(\theta)}$$

where the Fisher information $I(\theta)$ is defined by

$$I(\theta) = \mathrm{E}\left[\left(\frac{\partial \ell(x;\theta)}{\partial\theta}\right)^2\right] = -\mathrm{E}\left[\frac{\partial^2 \ell(x;\theta)}{\partial\theta^2}\right]$$

and $\ell(x;\theta) = \log f(x;\theta)$ is the natural logarithm of the likelihood function and $\mathrm{E}$ denotes the expected value.

The efficiency of an unbiased estimator $\hat{\theta}$ measures how close this estimator's variance comes to this lower bound; estimator efficiency is defined as

$$e(\hat{\theta}) = \frac{I(\theta)^{-1}}{\mathrm{var}(\hat{\theta})}$$

or the minimum possible variance for an unbiased estimator divided by its actual variance. The Cramér-Rao lower bound thus gives

$$e(\hat{\theta}) \le 1.$$
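A small simulation can make the bound and the notion of efficiency tangible. The sketch below assumes a specific textbook setting that is not part of the text above: $n$ independent Gaussian measurements with unknown mean $\theta$ and known standard deviation $\sigma$, for which the Fisher information is $I(\theta) = n/\sigma^2$, with the sample mean as the estimator. The function names, the seed and all numerical values are illustrative.

```python
import random
import statistics


def fisher_information_gaussian_mean(n, sigma):
    """Fisher information for the mean of n i.i.d. N(theta, sigma^2) samples (assumed model)."""
    return n / sigma ** 2


def simulate_sample_mean_variance(theta, sigma, n, trials, rng):
    """Empirical variance of the sample-mean estimator over repeated experiments."""
    estimates = [
        statistics.fmean(rng.gauss(theta, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.variance(estimates)


rng = random.Random(42)   # fixed seed so the run is repeatable
theta, sigma, n, trials = 3.0, 2.0, 25, 4000

crb = 1.0 / fisher_information_gaussian_mean(n, sigma)
empirical_var = simulate_sample_mean_variance(theta, sigma, n, trials, rng)
efficiency = crb / empirical_var

print(f"Cramér-Rao bound 1/I(theta): {crb:.4f}")
print(f"empirical estimator variance: {empirical_var:.4f}")
print(f"estimated efficiency:         {efficiency:.3f}")
```

In this setting the sample mean attains the bound, so the estimated efficiency is close to 1; because the variance is itself estimated from a finite number of trials, the printed value can fall slightly above or below 1.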

General scalar case

A more general form of the bound can be obtained by considering an unbiased estimator $T(X)$ of a function $\psi(\theta)$ of the parameter $\theta$. Here, unbiasedness is understood as stating that $\mathrm{E}\{T(X)\} = \psi(\theta)$. In this case, the bound is given by

$$\mathrm{var}(T) \geq \frac{[\psi'(\theta)]^2}{I(\theta)}$$

where $\psi'(\theta)$ is the derivative of $\psi(\theta)$ (with respect to $\theta$), and $I(\theta)$ is the Fisher information defined above.

Bound on the variance of biased estimators

Apart from being a bound on estimators of functions of the parameter, this approach can be used to derive a bound on the variance of biased estimators with a given bias, as follows. Consider an estimator $\hat{\theta}$ with bias $b(\theta) = \mathrm{E}\{\hat{\theta}\} - \theta$, and let $\psi(\theta) = b(\theta) + \theta$. By the result above, any unbiased estimator whose expectation is $\psi(\theta)$ has variance greater than or equal to $(\psi'(\theta))^2/I(\theta)$. Thus, any estimator $\hat{\theta}$ whose bias is given by a function $b(\theta)$ satisfies

$$\mathrm{var}\left(\hat{\theta}\right) \geq \frac{[1+b'(\theta)]^2}{I(\theta)}.$$

The unbiased version of the bound is a special case of this result, with $b(\theta)=0$.

It is trivial to have a small variance: an "estimator" that is constant has a variance of zero. But from the above equation we find that the mean squared error of a biased estimator is bounded by

$$\mathrm{E}\left((\hat{\theta}-\theta)^2\right) \geq \frac{[1+b'(\theta)]^2}{I(\theta)} + b(\theta)^2,$$

using the standard decomposition of the MSE. Note, however, that this bound can be less than the unbiased Cramér-Rao bound $1/I(\theta)$. See the example of estimating variance below.
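The remark that the biased bound can fall below $1/I(\theta)$ can be illustrated with simple arithmetic. The sketch below assumes a hypothetical shrinkage estimator $\hat{\theta} = a\bar{x}$ of a Gaussian mean with known $\sigma$, for which $I(\theta) = n/\sigma^2$, the bias is $b(\theta) = (a-1)\theta$ and $b'(\theta) = a - 1$. The estimator, the function names and the numbers are assumptions chosen for illustration, not taken from the text.

```python
def unbiased_crb(n, sigma):
    """Unbiased Cramér-Rao bound 1 / I(theta) with I(theta) = n / sigma^2 (assumed model)."""
    return sigma ** 2 / n


def biased_variance_bound(a, n, sigma):
    """[1 + b'(theta)]^2 / I(theta) for the bias b(theta) = (a - 1) * theta."""
    return (1.0 + (a - 1.0)) ** 2 * sigma ** 2 / n


def biased_mse_bound(a, theta, n, sigma):
    """Variance bound plus squared bias, following the MSE decomposition."""
    bias = (a - 1.0) * theta
    return biased_variance_bound(a, n, sigma) + bias ** 2


n, sigma = 25, 2.0
a, theta = 0.8, 0.5   # shrinkage factor and true parameter (illustrative)

print(f"unbiased bound 1/I(theta):   {unbiased_crb(n, sigma):.4f}")
print(f"biased variance bound:       {biased_variance_bound(a, n, sigma):.4f}")
print(f"biased MSE bound:            {biased_mse_bound(a, theta, n, sigma):.4f}")
```

With these numbers the MSE bound for the biased estimator is smaller than the unbiased bound; for a larger $|\theta|$ the squared-bias term grows and the comparison reverses.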

Multivariate case

Extending the Cramér-Rao bound to multiple parameters, define a parameter column vector

$$\boldsymbol{\theta} = \left[\theta_1, \theta_2, \dots, \theta_d\right]^T \in \mathbb{R}^d$$

with probability density function $f(x; \boldsymbol{\theta})$ which satisfies the two regularity conditions below.

The Fisher information matrix is a $d \times d$ matrix with element $I_{m,k}$ defined as

$$I_{m,k} = \mathrm{E}\left[\frac{\partial}{\partial\theta_m} \log f\left(x; \boldsymbol{\theta}\right) \frac{\partial}{\partial\theta_k} \log f\left(x; \boldsymbol{\theta}\right)\right].$$

Let $\boldsymbol{T}(X)$ be an estimator of any vector function of parameters, $\boldsymbol{T}(X) = (T_1(X), \ldots, T_n(X))^T$, and denote its expectation vector $\mathrm{E}[\boldsymbol{T}(X)]$ by $\boldsymbol{\psi}(\boldsymbol{\theta})$. The Cramér-Rao bound then states that the covariance matrix of $\boldsymbol{T}(X)$ satisfies

$$\mathrm{cov}_{\boldsymbol{\theta}}\left(\boldsymbol{T}(X)\right) \geq \frac{\partial \boldsymbol{\psi}\left(\boldsymbol{\theta}\right)}{\partial \boldsymbol{\theta}} \left[I\left(\boldsymbol{\theta}\right)\right]^{-1} \left(\frac{\partial \boldsymbol{\psi}\left(\boldsymbol{\theta}\right)}{\partial \boldsymbol{\theta}}\right)^T$$

where

the matrix inequality $A \ge B$ is understood to mean that the matrix $A-B$ is positive semidefinite, and

$\partial \boldsymbol{\psi}(\boldsymbol{\theta})/\partial \boldsymbol{\theta}$ is the Jacobian matrix whose $ij$th element is given by $\partial \psi_i(\boldsymbol{\theta})/\partial \theta_j$.

If boldsymbol{T}(X) is an unbiased estimator of boldsymbol{theta} (i.e., boldsymbol{psi}left(boldsymbol{theta}rig