# Design Of Pharmacokinetic Studies For Latent Covariates Biology Essay


## Abstract

Latent covariates are covariates that are either not available or unobservable at the time of the clinical study. Single nucleotide polymorphisms (SNPs) are of interest in pharmacokinetic studies and are often latent at the time the patient is enrolled. The amount of information provided by a latent covariate depends on whether the covariate distribution is continuous, ordinal or nominal. In this work we consider the effect of a SNP on the influence of age on drug clearance. The aim of this study was to investigate designs for clinical studies with latent covariates that accommodate the unknown covariate distribution, i.e. continuous, ordinal or nominal. Initially, the informativeness of a covariate was explored using linear regression assuming continuous, ordinal and nominal models, with clearance (CL) as the dependent variable. The standard error (SE) of each parameter in each model was derived from the Fisher Information Matrix (FIM). Secondly, the linear covariate model was considered within a nonlinear mixed effects modelling framework. The population pharmacokinetic (PK) model was a one-compartment intravenous (iv) bolus model with a unit dose. Three simulation scenarios were considered: (1) a direct influence of the SNP on CL; (2) an influence of the SNP on age, followed by the effect of age on CL; and (3) the same scenario as in (2) but with age arising from a stratified rather than a normal distribution. In all scenarios the SNP was assumed to conform to a continuous, ordinal or nominal distribution. A power analysis for each scenario was conducted by simulation in MATLAB, with estimation performed in NONMEM according to predefined criteria. For the linear regression model, the calculated SEs were lowest for the continuous model and highest for the nominal model, with the ordinal model falling between.
The power analysis for population PK scenario (1), where the SNP had a direct influence on CL, showed the highest power for the continuous model, followed by the ordinal model, with the nominal model having the least power. For scenario (2), where age and SNP were considered together and age followed a normal distribution, the power was highest for the continuous model and the same for the ordinal and nominal models. Stratification of age gave the same power as age following a normal distribution. Parameter estimation was more precise for continuous models and generally less precise for nominal models.

Keywords: latent covariate; single nucleotide polymorphism; design studies; linear regression; non-linear mixed effects models.

## Introduction

Population pharmacokinetic (PK) or pharmacodynamic (PD) studies need to be designed to minimise the chance of experimental failure. Covariate modelling is a part of population PK modelling in which adding a covariate accounts for part of the between-subject variability (BSV) in a parameter, which helps improve the predictive performance of the model. A covariate is any variable that is specific to an individual and may influence the PK or PD of a drug, e.g. weight, age and sex. Covariates can be classified as intrinsic (e.g. age, race, weight and height) or extrinsic (e.g. dose, compliance and co-medication). They can also be categorised either as continuous variables or as categorical variables, in which case they can be nominal (non-ordered) or ordinal (ordered).

The word latent has differing meanings in various disciplines of science. For the purpose of this study, latent covariates are covariates that are not available at the time of the clinical study. For example, we cannot be sure which genotype a subject belongs to until genotyping has been performed. A latent covariate may be measurable but not measured at the time of the study (e.g. height may not have been recorded), or observable but not yet observed (e.g. blood samples may not yet have been analysed for a genetic polymorphism).

A SNP is defined as a single base change in a DNA sequence that is present in a significant proportion (more than 1 percent) of a large population. An individual's genome carries a characteristic set of SNPs, and individuals can be categorised on that basis. This SNP profile is important for identifying response to drug therapy. SNPs provide covariate information that can be used to individualise drug therapy further, in the same way as body weight or renal function. This approach considers the patient as an individual rather than as a member of a population, which is helpful because each patient responds differently to drugs. The discovery of SNPs associated with aging and disease will allow a better understanding of these phenomena. SNPs are of interest in pharmacokinetic studies and are often latent at the time the patient is enrolled. The amount of information provided by a latent covariate depends on whether the covariate distribution is continuous, ordinal or nominal.

This study aims to investigate designs for pharmacokinetic studies in which a single nucleotide polymorphism is a latent covariate that may be continuous, ordinal or nominal.

## Theory

Linear regression: (Lee & Duffull 2010)

In linear regression, linear functions are used to model a dependent variable from the data in hand. Linear regression is a type of model in which the relationship between a dependent (response) variable and one or more independent (explanatory) variables is studied. Covariates are independent variables that can be related to the dependent variable y, where

$$y = X\theta + \varepsilon$$

In the above equation, X is the covariate matrix, θ represents the covariate parameters and ε the random errors. The random errors are assumed to be distributed with mean 0 and variance σ². Linear regression is solved by the ordinary least squares method, where the sum of the squared residuals is calculated as

$$S(\theta) = (y - X\theta)^{T}(y - X\theta)$$

The value of θ is determined by minimising the sum of squared residuals, which gives the estimate $\hat{\theta} = (X^{T}X)^{-1}X^{T}y$. The variance-covariance matrix of the estimate is $\sigma^{2}(X^{T}X)^{-1}$, which reduces to $(X^{T}X)^{-1}$ if the variance is assumed to be 1. The estimate of θ is more precise when the elements of the variance-covariance matrix are smaller. The covariate matrix X is essentially the sensitivity matrix, and the Fisher Information Matrix (FIM) is obtained as $X^{T}X$. Maximising the determinant of the FIM minimises the determinant of its inverse, and so the overall variance of the parameter estimates. The standard errors of the parameters can then be approximated as the square roots of the variances, i.e. of the diagonal elements of the inverse of the FIM.
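The calculation described above can be sketched in a few lines of Python with NumPy. This is only an illustration: the design matrix (an intercept plus a covariate coded 0, 1, 2), the response values and the function names are assumptions made for the example, not the matrices or code used in this study.

```python
import numpy as np

def fisher_information(X, sigma2=1.0):
    """FIM for the linear model y = X @ theta + eps with Var(eps) = sigma2."""
    X = np.asarray(X, dtype=float)
    return X.T @ X / sigma2

def standard_errors(X, sigma2=1.0):
    """Square roots of the diagonal of the inverse FIM."""
    cov = np.linalg.inv(fisher_information(X, sigma2))
    return np.sqrt(np.diag(cov))

def ols_estimate(X, y):
    """Least-squares estimate, equal to (X'X)^-1 X'y for a full-rank X."""
    theta, *_ = np.linalg.lstsq(np.asarray(X, dtype=float),
                                np.asarray(y, dtype=float), rcond=None)
    return theta

# Hypothetical design: intercept column plus a covariate coded 0, 1, 2.
x = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
X = np.column_stack([np.ones_like(x), x])
y = 0.7 - 0.05 * x  # noise-free response, so OLS recovers (0.7, -0.05)

theta_hat = ols_estimate(X, y)
se = standard_errors(X)  # depends only on the design, not on y
```

Because the SEs depend only on X, they can be compared across candidate designs before any data are collected, which is the property the design comparison relies on.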

Nonlinear mixed effects modelling framework (Nick, 2010; Duffull book)

Nonlinear mixed effects models are used to account for inter- and intra-individual variation in pharmacokinetics in a population. The goal of these PK models is to determine the population mean parameters, how they change within subpopulations and how individual characteristics explain variation across the population. Clinical trial simulation (CTS) is the generation of virtual subjects by approximating the design variables and statistical variables using mathematical and numerical methods. CTS can be used to optimise study design, for example by evaluating the statistical power of a proposed trial. Adequate power can only be achieved if the parameter precision is high. CTS is more practical than other optimal design methods because a variety of scenarios can be tested. A CTS consists of an input-output model, a covariate distribution model and an execution model. The input-output model comprises a structural model, a covariate model and stochastic models. The covariate distribution model is distinct from the covariate model of the input-output model; it typically describes the expected covariate frequency distributions in the population under study. The trial execution model accounts for expected protocol deviations during the conduct of the clinical study.

## Methods

Linear regression:

In the case of a SNP, the presence of alleles either enhances or decreases activity. A typical SNP may have a baseline effect when no alleles are present, an effect of magnitude x when one allele is present and an effect of magnitude 2x when two alleles are present. The SNP can therefore be treated as a continuous, ordinal or nominal covariate, and the linear models can be written accordingly as continuous, ordinal and nominal models.

Continuous model:

$$Y = \theta_1 + \theta_2 X_i$$

Ordinal model:

$$Y = \theta_1 + \theta_2 X_1 + (\theta_2 + \theta_3) X_2 + (\theta_2 + \theta_3 + \theta_4) X_3$$

Nominal model:

$$Y = \theta_1 + \theta_2 X_1 + \theta_3 X_2 + \theta_4 X_3$$

From these models, the covariate matrices are derived for each model as:

Continuous model

Ordinal model

Nominal model

The SEs were then determined as the diagonal elements of the inverse of the square root of the FIM. The effect of covariate frequency on the parameter SEs was also assessed by incorporating the frequency of the covariate in the model: the rows of the covariate matrix were replicated according to the corresponding genotype frequencies to give the final covariate matrix. The FIM was then derived from this matrix as $X^{T}X$ and the SEs were calculated.
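The replication step can be illustrated with a short Python sketch. The covariate coding used here (an intercept plus a genotype score of 0, 1 or 2), the subject counts and the function names are hypothetical, chosen only so that the matrix is full rank; the SEs are taken as the square roots of the diagonal of the inverse FIM, as described in the Theory section.

```python
import numpy as np

def replicate_design(rows, counts):
    """Repeat each genotype's design row by the number of subjects carrying it."""
    return np.repeat(np.asarray(rows, dtype=float), counts, axis=0)

def se_from_design(X):
    """Standard errors as sqrt of the diagonal of inv(X'X), unit variance."""
    fim = X.T @ X
    return np.sqrt(np.diag(np.linalg.inv(fim)))

# Hypothetical coding: intercept plus genotype score 0, 1, 2.
rows = [[1, 0], [1, 1], [1, 2]]

X_even = replicate_design(rows, [33, 33, 33])    # even genotype frequencies
X_uneven = replicate_design(rows, [50, 40, 10])  # fractions 0.50, 0.40, 0.10

se_even = se_from_design(X_even)
se_uneven = se_from_design(X_uneven)
```

Under this assumed coding the slope SE is larger for the uneven frequencies, because few subjects carry the rarest genotype and the design carries less information about the covariate effect.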

Nonlinear mixed effects modelling framework (CTS)

Clinical trial simulation (CTS) is used to determine the sample size needed for a study, taking the study variables into account. Such simulation identifies potential problems with, and possible consequences of, the study design. The linear covariate models above were considered within a nonlinear mixed effects modelling framework. For this purpose, a simple one-compartment iv bolus unit-dose model was used, as described previously by Jakob Ribbing et al. (ref)

Input-output model:

A one-compartment intravenous bolus pharmacokinetic model was used for simulation following a single unit dose. The pharmacokinetic parameters were CL = 0.693 L/h and V = 1 L. The between-subject variability (BSV) was 30% for both CL and V. The residual unexplained proportional error was 10% and the residual unexplained additive error was 0.01 mg/L.

The BSV of CL and V was described by an exponential model, e.g. $CL_i = TVCL \cdot e^{\eta_{i,CL}}$, where the random effect $\eta_i$ is normally distributed with mean zero and variance $\omega^2$. $C_{ij}$ is the concentration in the ith subject at the jth time, which is a function of θ, t and D (one-compartment model), where θ is the individual value of the PK parameter, t is the time and D is the dose.

A combined additive and proportional error model was used to describe the residual unexplained variability (RUV) when simulating concentrations:

$$C_{obs,ij} = C_{ij} \, e^{\varepsilon_{1,ij}} + \varepsilon_{2,ij}$$

where the random errors $\varepsilon_1$ and $\varepsilon_2$ are normally distributed with mean zero and variance $\sigma^2$ (proportional and additive, respectively). The simulations included 7 observations per individual, at 0, 0.08, 0.5, 1, 2, 3 and 4 hours post dose.
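The input-output model can be sketched in Python as follows. This is not the MATLAB code used in the study: the function name and the random seed are illustrative, the values 0.3, 0.1 and 0.01 are treated as standard deviations (an assumption), and the concentration uses the standard one-compartment bolus form with the dose divided by V.

```python
import numpy as np

TIMES = np.array([0.0, 0.08, 0.5, 1.0, 2.0, 3.0, 4.0])  # hours post dose

def simulate_concentrations(tvcl, n_subjects, rng, tvv=1.0, dose=1.0,
                            omega_cl=0.3, omega_v=0.3,
                            sigma_prop=0.1, sigma_add=0.01, times=TIMES):
    """Simulate observed concentrations, one row per subject."""
    tvcl = np.broadcast_to(np.asarray(tvcl, dtype=float), (n_subjects,))
    # Between-subject variability: log-normal individual parameters.
    cl = tvcl * np.exp(rng.normal(0.0, omega_cl, n_subjects))
    v = tvv * np.exp(rng.normal(0.0, omega_v, n_subjects))
    # One-compartment iv bolus; standard form with dose / V (assumption).
    conc = (dose / v)[:, None] * np.exp(-np.outer(cl / v, times))
    # Residual variability: exponential (proportional-type) plus additive.
    eps1 = rng.normal(0.0, sigma_prop, conc.shape)
    eps2 = rng.normal(0.0, sigma_add, conc.shape)
    return conc * np.exp(eps1) + eps2

rng = np.random.default_rng(1)
obs = simulate_concentrations(0.693, n_subjects=100, rng=rng)
```

Passing an array of subject-specific TVCL values instead of a scalar lets the same function be reused once a covariate model has been applied.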

Covariate distribution model:

The latent covariate was simulated with genotype fractions of 0.33 each in the even scenario and 0.50, 0.40 and 0.10 of the number of patients in the uneven scenario. Ages were simulated from a normal distribution truncated at 25 years at the lower end and 100 years at the upper end. For the normal distribution of ages, random positions, equal in number to the patients required, were generated within a population of ages truncated between 25 and 100 years, and the ages at those positions were included in the study. The stratified distribution of ages was obtained by randomly sampling from the truncated distribution and stratifying the sampled ages into three strata: 25-50, 50-75 and 75-100 years.
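A minimal Python sketch of such a covariate distribution model is given below. The mean and standard deviation of the age distribution are not stated in the text, so the values 60 and 20 years are placeholders, as are the function names; the stratum boundaries follow the three age bands above.

```python
import numpy as np

def sample_genotypes(n, fractions, rng):
    """Draw genotype codes 0, 1, 2 with the given population fractions."""
    p = np.asarray(fractions, dtype=float)
    p = p / p.sum()  # normalise, e.g. (0.33, 0.33, 0.33) -> thirds
    return rng.choice(len(p), size=n, p=p)

def sample_truncated_ages(n, rng, mean=60.0, sd=20.0, low=25.0, high=100.0):
    """Rejection-sample ages from a normal truncated to [low, high].

    mean and sd are placeholder values, not taken from the study.
    """
    ages = np.empty(0)
    while ages.size < n:
        draw = rng.normal(mean, sd, size=2 * n)
        ages = np.concatenate([ages, draw[(draw >= low) & (draw <= high)]])
    return ages[:n]

def age_stratum(ages):
    """Map ages to strata 0: 25-50, 1: 50-75, 2: 75-100 years."""
    return np.digitize(ages, bins=[50.0, 75.0])

rng = np.random.default_rng(0)
genotypes = sample_genotypes(300, (0.50, 0.40, 0.10), rng)  # uneven scenario
ages = sample_truncated_ages(300, rng)
strata = age_stratum(ages)
```

How subjects were then allocated across the strata is not described in detail, so the sketch only labels each sampled age with its stratum.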

Simulation scenarios: The following simulation scenarios were considered for the study.

1. Direct influence of SNP on the typical value of clearance (TVCL)
2. Influence of SNP on age (normal distribution), then of the modified age on TVCL
3. Influence of SNP on age (stratified distribution), then of the modified age on TVCL

Scenario I: Direct influence of SNP on TVCL

Under this scenario, the SNP was assumed to have a direct effect on population clearance, so that each of the three SNP categories gives a different value of TVCL. Three levels of reduction of TVCL from the intercept value were assumed: 'low', 'mid' and 'high'. The 'low' level corresponded to reductions of about 4% for SNP1, 8% for SNP2 and 16% for SNP3; the 'mid' level to about 7%, 14% and 28%; and the 'high' level to about 14%, 28% and 56%, respectively. A linear intercept model was assumed. Table I summarises the values used for simulation.

## Table I Simulation model for direct influence of SNP on TVCL

| Model description | Random effects | Parameter values |
|---|---|---|
| TVCL = θ1 - θ2·Xi | ηiCL ~ N(0, ωCL) | θ1 = ln(2) |
| TVV = θ5 | ηiV ~ N(0, ωV) | θ5 = 1 |
| CLi = TVCL·exp(ηiCL) | correlation(ηiCL, ηiV) = 0 | θ2 ∈ {0.03, 0.05, 0.1} |
| Vi = TVV·exp(ηiV) | eps1ij ~ N(0, σ-prop) | ωCL = 0.3 |
| Cij = dose·exp(-tj·(CLi/Vi)) | eps2ij ~ N(0, σ-add) | ωV = 0.3 |
| Cobs.ij = Cij·exp(eps1ij) + eps2ij | | σ-prop = 0.1 |
| | | σ-add = 0.01 mg/L |
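As a rough check on how the covariate coefficients in Table I relate to the quoted reductions, the following Python sketch computes the percentage reduction in TVCL relative to the intercept. The covariate value x = 1 is an assumption made for illustration, since the numerical coding of each SNP category is not given in the text.

```python
import math

THETA1 = math.log(2)  # intercept of TVCL (L/h), from Table I
THETA2_LEVELS = {"low": 0.03, "mid": 0.05, "high": 0.1}

def tvcl_direct(theta2, x, theta1=THETA1):
    """TVCL = theta1 - theta2 * x for the direct SNP effect."""
    return theta1 - theta2 * x

def percent_reduction(theta2, x, theta1=THETA1):
    """Reduction in TVCL relative to the intercept, in percent."""
    return 100.0 * (theta1 - tvcl_direct(theta2, x, theta1)) / theta1

# x = 1 is a hypothetical covariate value for the first SNP category.
reductions = {level: percent_reduction(t2, x=1)
              for level, t2 in THETA2_LEVELS.items()}
```

With x = 1 the three coefficients give reductions of roughly 4%, 7% and 14%, in line with the values quoted for SNP1 at the 'low', 'mid' and 'high' levels.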

Scenario II & III: Influence of SNP on age, then modified age on TVCL

The presence of the SNP is assumed to modify the effect of chronological age (CA) so that it reflects the biological age (BA) of an individual. We assume that individuals with SNP1 (wild type) are biologically younger, so their clearance changes only a little. Individuals with SNP2 are biologically much older than those with SNP1, but younger than those with SNP3. Three levels of reduction of TVCL (for 80 year olds compared with 20 year old adults) were considered for the simulations: 'low', 'mid' and 'high'. The 'low' level corresponded to reductions of about 5% for SNP1, 10% for SNP2 and 20% for SNP3; the 'mid' level to about 10%, 20% and 40%; and the 'high' level to about 15%, 30% and 60%, respectively. Table II summarises the values used for simulation.

## Table II Simulation model used for SNP influencing age, then on TVCL

| Model description | Random effects | Parameter values |
|---|---|---|
| SNP_effect = θ2·Xi | ηiCL ~ N(0, ωCL) | θ1 = ln(2) |
| TVCL = θ1 - SNP_effect·AGE | ηiV ~ N(0, ωV) | θ5 = 1 |
| TVV = θ5 | correlation(ηiCL, ηiV) = 0 | θ2 ∈ {0.00055, 0.0011, 0.0015} |
| CLi = TVCL·exp(ηiCL) | eps1ij ~ N(0, σ-prop) | ωCL = 0.3 |
| Vi = TVV·exp(ηiV) | eps2ij ~ N(0, σ-add) | ωV = 0.3 |
| Cij = dose·exp(-tj·(CLi/Vi)) | | σ-prop = 0.1 |
| Cobs.ij = Cij·exp(eps1ij) + eps2ij | | σ-add = 0.01 mg/L |
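The covariate part of Table II can be sketched in Python to show how the coefficients translate into a change in TVCL between a 20 year old and an 80 year old. The covariate value x = 1 and the choice of the 20 year old as the reference are assumptions made for this illustration.

```python
import math

THETA1 = math.log(2)  # intercept from Table II

def tvcl_age(theta2, x, age, theta1=THETA1):
    """TVCL = theta1 - (theta2 * x) * age, as in Table II."""
    snp_effect = theta2 * x
    return theta1 - snp_effect * age

def reduction_80_vs_20(theta2, x):
    """Percent drop in TVCL for an 80 year old relative to a 20 year old."""
    young = tvcl_age(theta2, x, 20)
    old = tvcl_age(theta2, x, 80)
    return 100.0 * (young - old) / young

levels = {"low": 0.00055, "mid": 0.0011, "high": 0.0015}
# x = 1 is a hypothetical covariate value for the first SNP category.
drops = {name: reduction_80_vs_20(t2, x=1) for name, t2 in levels.items()}
```

Under these assumptions the reductions come out close to the approximate 5%, 10% and 15% quoted for SNP1.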

Estimation of PK parameters:

The simulated data were written to a sim_data.CSV file for NONMEM estimation using a one-compartment model (ADVAN2 TRANS2) with the FOCE method with interaction. The data were analysed using the models used for simulation, with the covariate model written as continuous, ordinal or nominal for the estimation. Each data set was estimated independently using:

Continuous covariate model

TVCL = THETA(1) - THETA(3)*Xi

Ordinal covariate model

TVCL = THETA(1) - [THETA(3)*X1 + (THETA(3) + THETA(4))*X2 + (THETA(3) + THETA(4) + THETA(5))*X3]

Nominal covariate model

TVCL = THETA(1) - [THETA(3)*X1 + THETA(4)*X2 + THETA(5)*X3]
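The difference between the three parameterisations can be made concrete with a small Python sketch. This is not NONMEM code: the function names are illustrative, and the genotype is assumed to be coded as an index 0, 1 or 2 standing in for the indicator variables X1, X2 and X3.

```python
import numpy as np

def tvcl_continuous(theta, x):
    """TVCL = theta1 - theta3 * x, with x a numeric covariate."""
    theta1, theta3 = theta
    return theta1 - theta3 * np.asarray(x, dtype=float)

def tvcl_ordinal(theta, genotype):
    """Effects accumulate: theta3, theta3 + theta4, theta3 + theta4 + theta5."""
    theta1, theta3, theta4, theta5 = theta
    effects = np.cumsum([theta3, theta4, theta5])
    return theta1 - effects[np.asarray(genotype)]

def tvcl_nominal(theta, genotype):
    """Each genotype has its own unconstrained effect."""
    theta1, theta3, theta4, theta5 = theta
    effects = np.array([theta3, theta4, theta5])
    return theta1 - effects[np.asarray(genotype)]

genotype = [0, 1, 2]  # hypothetical index for X1, X2, X3
ordinal = tvcl_ordinal((0.7, 0.1, 0.1, 0.2), genotype)
nominal = tvcl_nominal((0.7, 0.1, 0.2, 0.4), genotype)
```

The two calls describe the same TVCL values with different parameters: the ordinal model estimates increments between adjacent categories, whereas the nominal model estimates each category effect separately.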

Only data sets that were successfully estimated with the continuous model, including estimation of the standard errors of the parameters, were subjected to further estimation with the ordinal and nominal models. A data set was retained only if the covariance step was successful for all three models. This was done by calling NONMEM from MATLAB inside while loops. The NONMEM outputs, i.e. the objective function value (OBJF), structural parameter values and covariate coefficients along with their SEs, were read from the NONMEM .smr file and written to a .csv file. The covariate coefficients and their SEs were used for testing the power of each model.
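The retention logic can be sketched as a loop in Python. The callables `simulate` and `fit`, the `cov_ok` flag and the `max_tries` safeguard are hypothetical stand-ins for the MATLAB simulation and the NONMEM run; they are not part of either tool.

```python
def collect_successful(simulate, fit, model_names, n_required, max_tries=10_000):
    """Keep simulating until n_required data sets pass every model.

    simulate() returns a data set; fit(data, name) returns a dict with a
    boolean "cov_ok" flag. Both are hypothetical stand-ins for the MATLAB
    simulation step and the NONMEM estimation step.
    """
    kept, tries = [], 0
    while len(kept) < n_required and tries < max_tries:
        tries += 1
        data = simulate()
        results = {}
        for name in model_names:  # continuous first, as in the text
            results[name] = fit(data, name)
            if not results[name]["cov_ok"]:
                break  # skip the remaining models for this data set
        else:
            kept.append(results)  # reached only if no model failed
    return kept, tries
```

A usage example would pass functions that write the data file, launch the estimation and parse its output; the number of tries needed to reach the target gives a rough indication of how often the covariance step fails.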

Power calculation for each model:

Null hypothesis for the continuous model: (H0: θ2 = 0)

Null hypothesis for the ordinal model: (H0: θ2 = 0 and θ2 + θ3 = 0 and θ2 + θ3 + θ4 = 0)

Null hypothesis for the nominal model: (H0: θ2 = 0 and θ3 = 0 and θ4 = 0)

If H0 is rejected,

Success = 1

Else

Success = 0

End

Lower limit of the 95% confidence interval for hypothesis testing:

From each simulation-estimation, the estimated values of the latent covariate parameters and their SEs were used to calculate the lower limit of the 95% confidence interval as

Parameter estimate - (1.96 × SE of the parameter)

For the ordinal model, the lower limit of the 95% confidence interval was calculated from the pooled parameter values and pooled SEs, and this limit was used for the power analysis.

Power was compared between the continuous, ordinal and nominal models for the two scenarios of latent covariate distribution in the population, i.e. even vs. uneven genotype frequencies.
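The power calculation can be sketched in Python as the fraction of simulated data sets in which the null hypothesis is rejected. The rejection rule used here, a lower 95% limit above zero, is an assumption about how the confidence limit was applied, and the function names and example numbers are illustrative only.

```python
import numpy as np

def lower_ci_limit(estimate, se, z=1.96):
    """Lower limit of the 95% confidence interval: estimate - z * SE."""
    return np.asarray(estimate, dtype=float) - z * np.asarray(se, dtype=float)

def power(estimates, ses):
    """Fraction of simulated data sets in which H0 is rejected.

    Assumption: H0 (coefficient = 0) is rejected when the lower 95% limit
    lies above zero, giving Success = 1 for that data set.
    """
    success = lower_ci_limit(estimates, ses) > 0.0
    return success.mean()

# Hypothetical estimates and SEs from four simulation-estimation runs.
example_power = power([0.05, 0.05, 0.05, 0.05], [0.01, 0.02, 0.03, 0.04])
```

In this made-up example two of the four lower limits are positive, so the estimated power is one half; in practice the estimate would be based on many more replicates.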

## Results

Linear regression: The SEs for the continuous, ordinal and nominal linear regression models were calculated from the derived FIM. Table III shows the SEs calculated assuming even frequency of the genotypes. The SE of θ1 is the same for all models because it is the intercept in every case. The parameter θ2 has equal SE in the continuous and ordinal models as it carries equal information in both, but in the nominal model θ2 has a higher SE than in the other models. For θ3, the SE is lower for the ordinal model than for the nominal model. The parameter θ4 has equal SE for the ordinal and nominal models. Table IV shows similar results for the parameter SEs when unequal genotype frequencies were included in the model.

Figure 3 shows the power of each model under continuous, ordinal and nominal distributions of the latent covariate for scenario I. The power for the continuous model is higher than for the ordinal and nominal models, whether the genotype frequency in the population is even or uneven. The ordinal model in turn has higher power than the nominal model, which has the least power of all the models. The 'low' reduction of TVCL gives low power for all models because the value of the covariate coefficient is small. With the 'high' reduction of TVCL, where a larger covariate coefficient was used, the power is higher for all models than with the 'low' and 'mid' values. A similar trend was observed for both even and uneven genotype frequencies across the levels of reduction in TVCL.

Figure 4 displays the power for the different covariate models when age follows a normal distribution, assuming even or uneven frequency of the genotype in the population. The continuous model has greater power than the ordinal and nominal models. However, there is no difference between the ordinal and nominal models for either even or uneven genotype frequencies. The 'low', 'mid' and 'high' reductions in TVCL all showed a similar trend.

Figure 5 displays the power for the different covariate models when age follows a stratified distribution, assuming even or uneven frequency of the genotype in the population. The continuous model has greater power than the ordinal and nominal models. However, there is no difference between the ordinal and nominal models for either even or uneven genotype frequencies. The 'low', 'mid' and 'high' reductions in TVCL all showed a similar trend.

Comparison of the normal against the stratified distribution of age shows no statistically significant difference in power between them.

## Discussion

Designing clinical studies for covariates requires the distribution of the covariates to be known. The covariate information for the population to be studied must be at hand before the study is designed, including whether the covariate is continuous or categorical. A better design can be assured if the covariate information is known, and this is feasible in most studies. In studies with latent covariates, however, we cannot be sure how the covariates behave until the study is completed or the latent covariate is determined. It is therefore important to design studies for latent covariates expecting the worst possible scenario, to avoid failure of the study.

The aim of this study was to investigate designs for clinical studies with latent covariates that accommodate the unknown covariate distribution, i.e. continuous, ordinal or nominal. Initially, the informativeness of a covariate was explored using linear regression assuming continuous, ordinal and nominal models, with clearance (CL) as the dependent variable. The standard error (SE) of each parameter in each model was derived from the Fisher Information Matrix (FIM). A continuous covariate distribution carries more information because all the covariate values contribute to estimating a single parameter. A categorical covariate distribution carries less information for each parameter and hence gives higher standard errors. The two categorical models, ordinal and nominal, also differ in the information available for each parameter: an ordinal model has more information than a nominal model. This disparity in covariate information results in different precision of the estimated parameters. In the linear regression results, the continuous model had better precision than the categorical models, and of the two categorical models the ordinal model had better precision than the nominal model.

Secondly, the linear covariate model was considered within a nonlinear mixed effects modelling framework. The population pharmacokinetics (PK) model was a one compartment iv bolus unit model. This simple PK model was used to investigate whether the observations of the linear regression apply to the nonlinear mixed effect models. Three simulation scenarios were considered: (1) the influence of the SNP directly on CL (2) the influence of SNP on age and then the effect of age on CL and (3) the same scenario as in (2) but with age arising from a stratified rather than normal distribution.

The power analysis from the population PK model (1) where the SNP had direct influence on CL showed a higher power with the continuous model, followed by the power of the ordinal model and with the nominal model having the least power of all. This implies that the linear regression observation is valid in the non-linear mixed effects modelling.

We anticipated that the frequency of the covariate in the population would influence the study design, so the SEs were calculated using linear regression assuming both even and uneven frequencies. However, both even and uneven covariate frequencies resulted in the best overall precision for the continuous model and the least precision for the nominal model. The power analysis in the nonlinear mixed effects modelling led to a similar observation: both even and uneven frequencies resulted in a similar trend in power for the continuous, ordinal and nominal models.

Modelling is data driven, so different levels of reduction in TVCL were examined to explore any aberrations. Our results clearly show that the different levels, with their varying covariate coefficients, gave similar results.

In conclusion, we have shown using linear regression that assuming a nominal distribution model for a latent covariate, the least informative case, represents the better design for clinical studies. This was supported by CTS for scenario I, where the SNP had a direct influence on CL. When age and SNP were considered together and age followed a normal distribution, the power was highest for the continuous model and the same for the ordinal and nominal models. Stratification of age gave the same power as age following a normal distribution. Parameter estimation was more precise for continuous models and generally less precise for nominal models.

## Table III SE calculated assuming even frequency of genotypes

| Parameter | SE (Continuous) | SE (Ordinal) |
|---|---|---|
| θ1 | 0.1005 | 0.1005 |
| θ2 | 0.1005 | 0.1005 |
| θ3 | - | 0.1231 |
| θ4 | - | 0.1741 |
| Sum | 0.2010 | 0.4982 |

## Table IV SE calculated assuming uneven frequency of genotypes

| Parameter | SE (Continuous) | SE (Ordinal) |
|---|---|---|
| θ1 | 0.1005 | 0.1005 |
| θ2 | 0.1005 | 0.1005 |
| θ3 | - | 0.1429 |
| θ4 | - | 0.3333 |
| Sum | 0.2010 | 0.6772 |

## Figure 1 Levels of reduction in TVCL considered for simulation for the SNP

## Figure 2 Levels of reduction in TVCL considered for simulation for SNP and age

## Figure 3 Power for the direct influence of SNP on TVCL

## Figure 4 Power for models following normal distribution of age

## Figure 5 Power for models following stratified distribution of age