Branching Processes History And Examples Biology Essay


Branching processes have traditionally been studied as part of Markov processes and renewal theory, and have long served as a tool for applications in the natural sciences, particularly in biology, microbiology and epidemiology. The correlation is easy to see, as a branching process can describe a population according to its birth and/or death rates. In addition, great interest exists in continuous-time branching processes, particularly when they can be written in the form of stochastic differential equations. Another aspect of branching processes that we are interested in is the identification of families of estimators for their parameters.

A Branching Process (BP) is a stochastic process, and more specifically a Markov process, that models the size of a population at a given time t, based on assumptions on the length of the life of any individual and the resulting generation produced by that individual. It comes as no surprise that the process was first described, in its simplest form, in an attempt to study the survival of family names. The first recorded scholar to work on that problem was the French statistician Irenee-Jules Bienayme in 1845 (Jagers, 1975; Heyde and Seneta, 1972). His contribution was to represent mathematically the extinction of noble family names based on the mean number of male offspring for every male ancestor of the prior generation. Though he lacked the proof, he did manage to set up a first formulation. The process is better known as the Galton-Watson process, based on the work of the English scientist Francis Galton and the mathematician Rev. Henry W. Watson. Galton was interested in investigating the survival/extinction of English aristocratic surnames, and published the question in the Educational Times in 1873. Watson answered his inquiry with a solution, which led to the publication of [2]. As such, the process is referred to either as the Galton-Watson (GW) process or the Bienayme-Galton-Watson (BGW) process. The process is commonly denoted as the family $\{Z_t,\ t \ge 0\}$, with $Z_t$ the size of the population at time $t$.

Definition 3.1 (Branching Process) (Kimmel and Axelrod, 2002)

Let $\{Z_t(\omega),\ t \ge 0\}$ be a family of non-negative integer-valued random variables defined on a probability space $(\Omega, \mathcal{F}, P)$, with $Z_t(\omega)$ the number of existing members of the colony at time $t$. Let also the birth of the original ancestor occur at $t = 0$, let $\tau(\omega)$ be the lifetime of the ancestor and let $X(\omega)$ be the count of the descendants produced at the death of the ancestor. Then:

$$Z_t = 1, \qquad 0 \le t < \tau. \tag{3.1}$$

Since it has a self-recurrent property, the subtree initiated by each of the $X$ descendants is an independent copy of the whole process, i.e.

$$Z^{(k)} \stackrel{d}{=} Z, \qquad k = 1, \dots, X, \tag{3.2}$$

in distribution.

Replacing (3.2) in (3.1) we get:

$$Z_t = \mathbf{1}_{\{t < \tau\}} + \mathbf{1}_{\{t \ge \tau\}} \sum_{k=1}^{X} Z^{(k)}_{t - \tau}. \tag{3.3}$$

Next, it is important to discuss the probability generating function (p.g.f) of the process. We use M. Kimmel and D. Axelrod's definition as it appears in [3]:

Definition 3.2 The Probability Generating Function

The p.g.f. of a $\mathbb{Z}_+$-valued random variable $X$ is a function

$$f_X(s) = E\big[s^X\big] = \sum_{k=0}^{\infty} P(X = k)\, s^k, \qquad s \in [0, 1].$$
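As a concrete numerical illustration (a minimal sketch; the helper name `pgf` is ours, not from [3]), a p.g.f. with finitely many atoms can be evaluated directly from its probability vector:

```python
def pgf(probs, s):
    """Evaluate f(s) = sum_k P(X = k) * s**k for a Z+-valued random
    variable X given by a finite probability vector probs."""
    return sum(p * s**k for k, p in enumerate(probs))

# Offspring law P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4:
offspring = [0.25, 0.5, 0.25]
print(pgf(offspring, 1.0))  # 1.0  (a proper distribution gives f(1) = 1)
print(pgf(offspring, 0.0))  # 0.25 (f(0) = P(X = 0))
```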

So, in essence, we adopt their notation $f_X(s) = E[s^X]$. In addition, the following theorem is a collection of properties of the p.g.f., as also provided in [3].

Theorem 3.1 Properties (Kimmel and Axelrod, 2002)

Suppose $X$ is a $\mathbb{Z}_+$-valued random variable with p.g.f. $f_X$, which may not be proper. Also assume the non-triviality condition:

$$P(X = 0) + P(X = 1) < 1; \tag{2.6}$$

then:

$f_X$ is non-negative and continuous with all derivatives on $[0, 1)$. Under (2.6), $f_X$ is increasing and convex.

If $X$ is proper, $f_X(1) = 1$; otherwise $f_X(1^-) = P(X < \infty) < 1$.

If $X$ is proper, the $k$th factorial moment of $X$, $\mu_k = E[X(X-1)\cdots(X-k+1)]$, is finite if and only if $f_X^{(k)}(1^-)$ is finite. In such case $\mu_k = f_X^{(k)}(1^-)$.

If $X$ and $Y$ are independent $\mathbb{Z}_+$-valued random variables, then $f_{X+Y}(s) = f_X(s)\, f_Y(s)$.

If $N$ is a $\mathbb{Z}_+$-valued random variable and $(X_i)$ is a sequence of iid $\mathbb{Z}_+$-valued random variables independent of $N$, then $Y = \sum_{i=1}^{N} X_i$ has the p.g.f. $f_Y(s) = f_N(f_X(s))$.

Suppose that $(X_n)$ is a sequence of $\mathbb{Z}_+$-valued random variables. Then $\lim_{n \to \infty} f_{X_n}(s)$ exists for each $s \in [0, 1)$ if and only if the sequence converges in distribution to a random variable $X$. In that case the limit is the p.g.f. of $X$.
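A quick sanity check of the independence property above (our own toy example, not from [3]): the p.g.f. of a sum of independent variables equals the product of their p.g.f.s, which can be compared against the exact convolution of two small distributions:

```python
def pgf(probs, s):
    """f(s) = sum_k P(X = k) * s**k for a finitely supported X."""
    return sum(p * s**k for k, p in enumerate(probs))

def convolve(px, py):
    """Exact distribution of X + Y for independent X and Y."""
    out = [0.0] * (len(px) + len(py) - 1)
    for i, a in enumerate(px):
        for j, b in enumerate(py):
            out[i + j] += a * b
    return out

px, py, s = [0.5, 0.5], [0.5, 0.5], 0.3
lhs = pgf(convolve(px, py), s)   # p.g.f. of X + Y
rhs = pgf(px, s) * pgf(py, s)    # product of the individual p.g.f.s
assert abs(lhs - rhs) < 1e-12
```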

By the relationship (3.3) and Theorem 3.1, one can deduce that the p.g.f.s $f_t(s) = E[s^{Z_t}]$ satisfy the composition rule $f_{t+1}(s) = f(f_t(s))$, where $f$ is the offspring p.g.f.

As we have discussed the general properties of Branching processes, we would like to proceed to a classification that depends on the lifespan of the members of the colony.

Definition 3.3 (BGW or GW) The Galton-Watson Process

Let $\{Z_t\}$ be a branching process and assume each member generates a random number of offspring $X$ at the time of death, where $X$ follows the offspring distribution with p.g.f. $f$. Furthermore, assume that the lifespan is identical for each member and equal to 1. Then the branching process is a GW process.
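The generation-by-generation dynamics just defined are straightforward to simulate (a sketch under the assumption of iid offspring counts; the function name and signature are ours):

```python
import random

def gw_generations(offspring, n_gens, z0=1, rng=random):
    """Simulate Z_0, ..., Z_{n_gens} of a Galton-Watson process: every
    member of generation n draws an independent offspring count, and
    Z = 0 is absorbing (an empty sum stays empty)."""
    sizes = [z0]
    for _ in range(n_gens):
        sizes.append(sum(offspring(rng) for _ in range(sizes[-1])))
    return sizes

# Deterministic binary splitting doubles the colony every generation:
print(gw_generations(lambda rng: 2, 4))  # [1, 2, 4, 8, 16]
```

Passing a genuinely random sampler (e.g. `lambda rng: rng.randint(0, 2)`) gives random trajectories with the same interface.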

Definition 3.4 Markov Branching Process

Let $\{Z_t\}$ be a branching process whose individuals have continuous lifetimes that are exponentially distributed. Then the process is a Markov branching process.

Definition 3.5 Bellman-Harris

Let $\{Z_t\}$ be a branching process where the lifetimes of the members are iid copies of an arbitrary non-negative random variable. Then the process is a Bellman-Harris branching process.

Figure: Example of a Galton-Watson branching process.

Next we want to discuss the mean of the process, i.e. $E[Z_t]$. It will be stated for the GW process, though it is true for all cases. Based on the properties from Theorem 3.1 and Definition 3.3 we have that

$$m = E[X] = f'(1^-), \quad \text{and} \quad E[Z_t] = f_t'(1^-).$$

In addition, we have that

$$f_t(s) = f(f(\cdots f(s) \cdots)), \quad \text{composed } t \text{ times},$$

and we can therefore get that

$$E[Z_t] = m^t. \tag{3.10}$$

Because of (3.10) we can discuss the criticality of the process based on the value of $m$, i.e. the trichotomy:

For $m > 1$, $E[Z_t] \to \infty$; the process is supercritical.

For $m = 1$, $E[Z_t] = 1$ for all $t$; the process is critical.

For $m < 1$, $E[Z_t] \to 0$; the process is subcritical.

In the last two cases one can easily see that the process does not have enough energy, i.e. the members of the colony are not generating enough offspring to replace the original members. In fact, the population becomes extinct almost surely.
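These claims can be checked numerically: the extinction probability $q$ is the smallest root of $f(q) = q$ on $[0, 1]$, reached by iterating $q \leftarrow f(q)$ from $q = 0$; since the $n$th iterate equals $P(Z_n = 0) = f_n(0)$, this also exercises the composition rule $f_n = f \circ f_{n-1}$. The offspring laws below are our own toy examples:

```python
def extinction_prob(probs, iters=500):
    """Smallest fixed point of the offspring p.g.f. f on [0, 1],
    computed by the monotone iteration q <- f(q) starting at q = 0;
    the n-th iterate equals P(Z_n = 0)."""
    q = 0.0
    for _ in range(iters):
        q = sum(p * q**k for k, p in enumerate(probs))
    return q

print(extinction_prob([0.5, 0.5]))         # subcritical, m = 0.5: q = 1
print(extinction_prob([0.25, 0.25, 0.5]))  # supercritical, m = 1.25: q = 1/2
```

In the supercritical example $f(q) = 0.25 + 0.25q + 0.5q^2$ has roots $1/2$ and $1$; the iteration selects the smaller one, so extinction is possible but not certain.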

Next we would like to show the probability generating function for the continuous Markov branching process (Kimmel and Axelrod, 2002). By definition, the lifetime of an individual follows an exponential distribution with parameter $\lambda$. Then the cumulative distribution for $Y$, the lifetime of the individual, takes the form

$$F_Y(t) = 1 - e^{-\lambda t}, \qquad t \ge 0,$$

which gives the p.d.f.

$$f_Y(t) = \lambda e^{-\lambda t}, \qquad t \ge 0.$$

Any ancestor, during its lifetime, will produce progeny according to the p.g.f. $f(s)$. Defining $Z_t$ as the total population size at time $t$, then $\{Z_t\}$ is a continuous-time Markov process with initial condition $Z_0 = 1$. Our purpose here is to derive a differential equation for the p.g.f. $Q(s, t) = E[s^{Z_t}]$ for the population size $Z_t$. We first use the fact that, conditioning on the lifetime of the ancestor,

$$Q(s, t) = s\, e^{-\lambda t} + \int_0^t \lambda e^{-\lambda u} f\big(Q(s, t - u)\big)\, du.$$

Taking $\Delta t$ small enough we get the analogous expression for $Q(s, t + \Delta t)$, and as such we can write the difference $Q(s, t + \Delta t) - Q(s, t)$ in terms of these two expressions. Now, we divide both sides by $\Delta t$ and take the limit as $\Delta t$ goes to zero, so that, by applying L'Hospital's rule, we get:

$$\frac{\partial Q(s, t)}{\partial t} = \lambda \big[ f\big(Q(s, t)\big) - Q(s, t) \big],$$

with initial condition $Q(s, 0) = s$ and a unique solution provided the offspring p.g.f. $f$ is Lipschitz on $[0, 1]$ (in particular when $f'(1^-) < \infty$).
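As a numerical sanity check (our own sketch), take binary splitting $f(s) = s^2$, for which the Markov branching process is the Yule process with the known p.g.f. $Q(s,t) = s e^{-\lambda t} / (1 - s(1 - e^{-\lambda t}))$; Euler integration of $\partial Q/\partial t = \lambda[f(Q) - Q]$ reproduces the closed form:

```python
import math

def q_euler(s, t, lam, steps=20000):
    """Euler-integrate dQ/dt = lam * (f(Q) - Q) for f(s) = s**2, Q(s, 0) = s."""
    q, dt = s, t / steps
    for _ in range(steps):
        q += dt * lam * (q * q - q)
    return q

def q_yule(s, t, lam):
    """Closed-form p.g.f. of the binary-splitting (Yule) Markov branching process."""
    p = math.exp(-lam * t)
    return s * p / (1.0 - s * (1.0 - p))

err = abs(q_euler(0.5, 1.0, 1.0) - q_yule(0.5, 1.0, 1.0))
assert err < 1e-3  # first-order Euler error at this step size
```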

Example 2.1 Drug Resistance in Cancer Cells (Kimmel and Axelrod, 2002) & (Allen 2003)

As mentioned earlier, branching processes can be applied in various fields, including but not limited to biology. What we found particularly attractive was the application to cancer and cancer therapy. An excellent example concerns chemotherapy and drug resistance, found in various books (Kimmel and Axelrod, 2002) & (Allen, 2003). In the scenario presented here the existence of two different types of cancer cells is assumed: Type A being the drug-sensitive cells in a tumor and Type B being the drug-resistant cells. Define $y$ as the time it takes for a cell to divide (split), where $y$ is exponentially distributed with parameter $\lambda$. Also let us define $p$ as the probability that, of the two cells generated from a Type A cell, one will be a Type B cell. Another reasonable assumption is that every Type B cell will always generate two Type B cells. In such a case a multi-type branching process is necessary. Therefore the p.g.f.s satisfy:

$$f_A(s_A, s_B) = (1 - p)\, s_A^2 + p\, s_A s_B, \quad \text{and} \quad f_B(s_A, s_B) = s_B^2.$$

Also, the p.g.f.s $Q_A(s_A, s_B, t)$ and $Q_B(s_A, s_B, t)$ of the populations generated by a single Type A and Type B cell satisfy (3.16), the system of backward equations

$$\frac{\partial Q_A}{\partial t} = \lambda \big[ (1 - p)\, Q_A^2 + p\, Q_A Q_B - Q_A \big], \qquad \frac{\partial Q_B}{\partial t} = \lambda \big[ Q_B^2 - Q_B \big],$$

with initial conditions $Q_A(\cdot, 0) = s_A$ and $Q_B(\cdot, 0) = s_B$. Then, by separation of variables, the second equation gives

$$Q_B(s_B, t) = \frac{s_B e^{-\lambda t}}{1 - s_B (1 - e^{-\lambda t})}.$$

Once (3.20), the equation for $Q_B$, is solved, replace $Q_B$ in (3.19), the equation for $Q_A$, to get $Q_A$.

The most interesting result for this example comes when the probability of no resistant cells at time $t$ is evaluated, i.e. $P(Z_B(t) = 0) = Q_A(1, 0, t)$, which solves $\frac{dF}{dt} = \lambda \big[(1 - p) F^2 - F\big]$, $F(0) = 1$, giving

$$P\big(Z_B(t) = 0\big) = \frac{1}{(1 - p) + p\, e^{\lambda t}}. \tag{3.23}$$

It is easy to see that the limit of (3.23) as $t \to \infty$ is equal to zero. This is a very interesting result, as it indicates that without any outside influence the probability of having no resistant cells tends to zero, except when $p = 0$; $p$ being equal to zero would mean that the probability that a drug-sensitive cell generates a drug-resistant cell is itself zero (Allen, 2003).
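The probability of no resistant cells can be sketched in code (a hedged reconstruction under the stated assumptions: progeny p.g.f.s $f_A = (1-p)s_A^2 + p\,s_A s_B$ and $f_B = s_B^2$, so $F(t) = Q_A(1, 0, t)$ solves $F' = \lambda[(1-p)F^2 - F]$, $F(0) = 1$):

```python
import math

def p_no_resistant(t, p, lam):
    """P(no Type B cells at time t | one Type A cell at time 0):
    closed-form solution of dF/dt = lam * ((1 - p) * F**2 - F), F(0) = 1."""
    return 1.0 / ((1.0 - p) + p * math.exp(lam * t))

# If p = 0, a sensitive cell can never spawn a resistant one: F(t) = 1 forever.
print(p_no_resistant(10.0, 0.0, 1.0))   # 1.0
# For any p > 0 the probability decays to zero as t grows.
print(p_no_resistant(50.0, 0.01, 1.0))  # essentially 0
```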

R.W. Brown, Estimation of Branching Processes with Immigration by Adaptive Control, Master's Thesis supervised by B. Pasik-Duncan, 2000.

H.W. Watson and F. Galton, On the Probability of the Extinction of Families, Journal of the Anthropological Institute of Great Britain and Ireland, vol. 4, pp. 138-144, 1875.

M. Kimmel and D.E. Axelrod, Branching Processes in Biology, Springer, 2002.

L.J.S. Allen, An Introduction to Stochastic Processes with Applications to Biology, Prentice Hall, 2003.

B. Bercu, Weighted estimation and tracking for branching processes with immigration, IEEE Transactions on Automatic Control, vol. 46, pp. 43-50, 2001.

B. Bercu, Weighted Estimation and Tracking for ARMAX Models, SIAM Journal on Control and Optimization, vol. 33, no. 1, pp. 89-106, 1995.

J. Winnicki, Estimation of the Variances in the Branching Process with Immigration, Probability Theory and Related Fields, vol. 88, pp. 77-106, 1991.

A.J. Gao and B. Pasik-Duncan, Stochastic Linear Quadratic Adaptive Control for Continuous-Time First-Order Systems, Systems & Control Letters, vol. 31, pp. 149-154, 1997.

P. Jagers, Branching Processes with Biological Applications, Wiley, 1975.

C.C. Heyde and E. Seneta, The simple branching process, a turning point test and a fundamental identity: a historical note on I.J. Bienayme, Biometrika, vol. 59, pp. 680-683, 1972.


As we mentioned in the prior chapter, we were attracted to branching processes due to their biological applications, and more particularly the applications in cancer therapy and chemotherapy. What attracted us to investigate estimators for the parameters of branching processes is the work of Bernard Bercu. The field is not necessarily new, and many authors have provided estimators. The innovation in Bercu's approach, though, is that he introduces the process in its ARMAX form and, by choosing an appropriate form of the control function, he can manipulate the process with fewer restrictions on the mean and variance. As such, we present his approach as well as some of his theorems that we want to replicate in the continuous case. The following two segments cover first the scenario with no immigration and then the scenario with immigration.

BGW (No Immigration)

As we discussed earlier, the value of $m$ is of great significance. Estimating $m$, as well as estimating $\sigma^2$, is a challenging task that requires strict assumptions. As the LSE does not necessarily provide strong convergence for the family of estimators, in [7] B. Bercu offers an approach applying Weighted Least Squares (WLS) with appropriate weights. Though his ultimate task is to estimate the parameters in the BGWI (Bienayme-Galton-Watson with Immigration) scenario, it is only reasonable to initially investigate the case where immigration is zero (null), i.e.:

$$X_{n+1} = \sum_{k=1}^{X_n + U_n} Y_{n,k}, \tag{2.1}$$

where $X_n$ is the size of the colony for generation $n$, $Y_{n,k}$ is the progeny count of each member of the $n$th generation and $U_n$ is an adaptive control. The purpose of the control is to control $X$, i.e. boost it when it drastically declines towards zero or force it lower when it drastically grows. Also, another important element of the approach in [7] is the rewriting of (2.1) in ARMAX form, i.e. by setting

$$\varepsilon_{n+1} = X_{n+1} - m (X_n + U_n),$$

so that

$$X_{n+1} = m (X_n + U_n) + \varepsilon_{n+1},$$

where $(\varepsilon_n)$ is a martingale difference sequence with $E[\varepsilon_{n+1} \mid \mathcal{F}_n] = 0$ and $E[\varepsilon_{n+1}^2 \mid \mathcal{F}_n] = \sigma^2 (X_n + U_n)$.

In order to formulate the estimators, the quadratic criterion is

$$\Delta_n(m) = \sum_{k=0}^{n-1} a_k \big( X_{k+1} - m (X_k + U_k) \big)^2,$$

which is minimized by

$$\widehat{m}_n = \frac{\sum_{k=0}^{n-1} a_k (X_k + U_k)\, X_{k+1}}{\sum_{k=0}^{n-1} a_k (X_k + U_k)^2},$$

where $(a_k)$ is the chosen sequence of positive weights. Also, the variance can be estimated by

$$\widehat{\sigma}_n^2 = \frac{1}{n} \sum_{k=0}^{n-1} a_k \big( X_{k+1} - \widehat{m}_k (X_k + U_k) \big)^2,$$

and the choice of the control for $n \ge 0$ is

$$U_n = P_{\mathbb{N}}\big( \xi_n - X_n \big),$$

where $(\xi_n)$ is a sequence of non-negative integer-valued random variables and $P_{\mathbb{N}}$ the projection operator on $\mathbb{N}$.
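To make the recipe concrete, here is a toy sketch (our own simplification: weights $a_k = (X_k + U_k)^{-1}$ and a max-type control keeping $X_n + U_n \ge 1$; Bercu's exact weights and control in [7] differ in detail):

```python
import random

def controlled_bgw(offspring, n, rng=random):
    """Simulate X_{k+1} = sum of (X_k + U_k) iid offspring counts,
    with control U_k = max(1 - X_k, 0) so the colony is never empty."""
    X, U = [1], []
    for _ in range(n):
        U.append(max(1 - X[-1], 0))
        X.append(sum(offspring(rng) for _ in range(X[-1] + U[-1])))
    return X, U

def wls_mean(X, U):
    """WLS estimate of m; with weights a_k = 1/(X_k + U_k) the estimator
    reduces to sum_k X_{k+1} / sum_k (X_k + U_k)."""
    phi = [x + u for x, u in zip(X[:-1], U)]  # X_k + U_k >= 1, so a_k is finite
    a = [1.0 / f for f in phi]                # assumed weight sequence
    num = sum(ak * fk * x1 for ak, fk, x1 in zip(a, phi, X[1:]))
    den = sum(ak * fk * fk for ak, fk in zip(a, phi))
    return num / den

X, U = controlled_bgw(lambda rng: 2, 5)  # deterministic binary offspring
print(wls_mean(X, U))                    # exactly 2.0 in this noiseless case
```

With a genuinely random offspring law the same estimate converges to $m$ almost surely, which is the content of Theorem 1 below.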

This particular choice of control satisfies the purposes we stated earlier. By employing the projection operator onto the natural numbers, we are assured that the summation in (2.1) makes sense. In addition, in the case where $X_n = 0$, i.e. the process has run out of energy and would therefore die out, it forces $X_n + U_n$ to be at least equal to 1. Furthermore, due to this setup, the following theorems and lemmas were obtained (Bercu, 2001):

Theorem 1. Assume that the offspring distribution has a finite moment of order greater than 2 and that the control reference sequence converges a.s. to an integer limit. If we use the adaptive control from (2.6), then $\widehat{m}_n$ is a strongly consistent estimator of $m$.


In addition, we have the central limit theorem


the law of iterated logarithm


and the quadratic strong law


In fact, the above theorem (Bercu, 2001) is in essence the basis for what we want to prove in the next chapter for the continuous case, as it covers the main properties for estimation and, most importantly, provides the strongest rate of convergence possible for a family of estimators.

In addition, the same properties were shown (Bercu, 2001) for the family of estimators for the variance:

Theorem 2. Assume that the offspring distribution has a finite moment of order greater than 2 and that the control reference sequence converges a.s. to an integer limit. If we use the adaptive control from (2.6), then $\widehat{\sigma}_n^2$ is a strongly consistent estimator of $\sigma^2$.


In addition, let the fourth moment of the offspring distribution be finite; then we have the central limit theorem


the law of iterated logarithm


and the quadratic strong law


BGWI (with Immigration)

Another interesting result (Bercu, 2001) concerns the case where the same approach is used when the BGW process has non-zero immigration involved (therefore the BGWI case), given by

$$X_{n+1} = \sum_{k=1}^{X_n + U_n} Y_{n,k} + I_{n+1},$$

where $(I_n)$ is the immigration sequence; in this case we are also interested in the parameters of $I_n$, namely its mean $\lambda$ and its variance $b^2$. So, in similar fashion, he sets (Bercu, 2001)

$$\theta = \begin{pmatrix} m \\ \lambda \end{pmatrix}, \qquad \Phi_n = \begin{pmatrix} X_n + U_n \\ 1 \end{pmatrix},$$

therefore attaining the stochastic regression equation

$$X_{n+1} = \theta^T \Phi_n + \varepsilon_{n+1},$$

with $\varepsilon_{n+1} = X_{n+1} - E[X_{n+1} \mid \mathcal{F}_n]$ and $E[\varepsilon_{n+1}^2 \mid \mathcal{F}_n] = \sigma^2 (X_n + U_n) + b^2$.

Then the quadratic criterion is

$$\Delta_n(\theta) = \sum_{k=0}^{n-1} a_k \big( X_{k+1} - \theta^T \Phi_k \big)^2,$$

with $(a_k)$ a suitable sequence of positive weights. This was solved in [6] to get

$$\widehat{\theta}_n = S_{n-1}^{-1} \sum_{k=0}^{n-1} a_k \Phi_k X_{k+1},$$

where $S_n = \sum_{k=0}^{n} a_k \Phi_k \Phi_k^T + S$, with $S$ a deterministic, symmetric and positive definite matrix (added so that $S_n$ is always invertible).
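The immigration case is the same computation in two dimensions; a sketch under our own illustrative choices (weights $a_k = (1 + X_k + U_k)^{-1}$ and regularizer $S = \epsilon I$; the exact sequences in [6] and [7] differ):

```python
import numpy as np

def wls_theta(X, U, eps=1e-8):
    """Weighted least squares for theta = (m, lambda)^T in
    X_{k+1} = theta^T Phi_k + eps_{k+1}, with Phi_k = (X_k + U_k, 1)^T."""
    n = len(X) - 1
    Sn = eps * np.eye(2)              # S: deterministic SPD regularizer
    rhs = np.zeros(2)
    for k in range(n):
        phi = np.array([X[k] + U[k], 1.0])
        a = 1.0 / (1.0 + phi[0])      # assumed weight sequence
        Sn += a * np.outer(phi, phi)
        rhs += a * phi * X[k + 1]
    return np.linalg.solve(Sn, rhs)

# Noise-free data generated with m = 2 and immigration mean 3:
X, U = [1], []
for _ in range(10):
    U.append(1)
    X.append(2 * (X[-1] + U[-1]) + 3)
print(wls_theta(X, U))  # close to [2, 3]
```

Because the data are noise-free, the estimator recovers $\theta$ up to the tiny bias introduced by the regularizer.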

For the variances, the parameter of interest is

$$\eta = \begin{pmatrix} \sigma^2 \\ b^2 \end{pmatrix};$$

then the suggested family of estimators is the WLS estimator built from the squared residuals,

$$\widehat{\eta}_n = Q_{n-1}^{-1} \sum_{k=0}^{n-1} a_k \Phi_k \big( X_{k+1} - \widehat{\theta}_k^T \Phi_k \big)^2,$$

where $Q_n = \sum_{k=0}^{n} a_k \Phi_k \Phi_k^T + Q$, with $Q$ a deterministic, symmetric and positive definite matrix. Similarly to the BGW scenario, the adaptive control would be expected to be of the same projection form as in the BGW case.

Using the above control, though, the estimator for $\theta$ is not strongly consistent. Therefore an excitation $\xi_n$ is considered, i.e.

where $(\xi_n)$ is an exogenous bounded sequence of i.i.d. positive integer-valued random variables, with $V$ the nondegenerate distribution of $\xi_n$.

The two Lemmas and two Theorems that we need from (Bercu, 2001) are:

Lemma 1. Assume that the prescribed reference sequence converges to an integer limit. By using the adaptive control in (3.9), we get a.s.


Theorem 3. Assume that the prescribed reference sequence converges to an integer limit. By using the adaptive control in (3.9), $\widehat{\theta}_n$ is a strongly consistent estimator of $\theta$.


In addition, assume that the offspring and immigration distributions have finite moments of order higher than 2; then the central limit theorem is


the law of iterated logarithm


where the limiting covariance matrix is as given in (Bercu, 2001).

Lemma 2. Assume that the prescribed reference sequence converges to an integer limit and that the offspring and immigration distributions have finite moments of order 4. If the adaptive control from (3.9) is used, then a.s.:

Finally, the last theorem (Bercu, 2001) we would like to mention is:

Theorem 4. Assume that the prescribed reference sequence converges to an integer limit and that the offspring and immigration distributions have finite moments of order 4. By using the adaptive control in (3.9), $\widehat{\eta}_n$ is a strongly consistent estimator of $\eta$.


The central limit theorem


and the law of iterated logarithm


where the limiting covariance matrices are as given in (Bercu, 2001).