Preferred Germline Pairing In Antibodies Of Mouse Biology Essay


Antibodies are now an important element of drug development: about a third of new drugs in development are antibodies. Making an effective antibody in the laboratory is a complicated process, because the molecule must be stable, available in high yield, non-aggregating, bind its target well and not be immunogenic. Light and heavy chains are the two essential parts of an antibody. Their sequences are derived from germline sequences and then undergo somatic mutation to optimise binding. Whether light and heavy chains pair preferentially is still unclear, and pairing is generally assumed to be random. The Structural and Molecular Biology (SMB) group at UCL (University College London) has created a new database of antibody sequence and structure data (http://www.bioinf.org.uk/abisys/), in which considerable effort has gone into pairing light and heavy chain sequence data. In this project, paired mouse antibody sequences are extracted from Abysis, and mouse V-gene sequences are extracted from VBASE2 (http://www.vbase2.org/vbdownload.php) to create a BLAST database. Each expressed antibody sequence is then searched against this database with tblastn, and the best-scoring hit is taken as its nearest germline sequence. A chi-squared test is performed on the resulting pairings, and individual p-values are calculated to see whether each pairing is significantly up or down, that is, whether particular pairings of germline sequences are more or less favourable than expected by chance.

Introduction

An antibody, or immunoglobulin (Ig), is shaped like a capital letter "Y" and consists of four protein chains linked by disulphide bonds.

Figure 1. Basic Antibody Structure (Brief Introduction to Antibody Structure, Figure 1, 2000)

These four chains fall into two categories: light and heavy chains. The light chain makes contact with antigen through its amino-terminal region. In humans it has two classes, one kappa and one lambda, whereas in mice it has three, one kappa and two lambdas. The heavy chain, on the other hand, falls into five different classes:

mu chains (IgM)- IgM plays a crucial role in the body: it is one of the largest antibodies and the first to appear during an immune response. The mu chain is composed of 576 amino acids, and IgM makes up 8% of the antibody in serum. Its basic difference from IgA and IgG is that it has four C regions instead of three. It stays in the blood and is vital for killing bacteria.

gamma chains (IgG)- This antibody consists of two light and two heavy chains covalently linked by disulphide bonds, and makes up 80% of the antibody in serum. It has four subclasses (IgG1, IgG2, IgG3, IgG4), numbered according to serum concentration, which are more or less identical in sequence. It is found in blood as well as in tissue spaces, where it coats antigens and speeds up antigen uptake.

alpha chains (IgA)- IgA makes up 13% of the antibody in serum and is present in secretory glands, where it guards the entrances to the body. The alpha chain is composed of one V region and three C regions. Two kinds of IgA are observed, IgA1 and IgA2, which differ in the size of their hinge regions.

delta chains (IgD)- The delta chain consists of 383 residues and has three C regions; IgD makes up less than 1% of the antibody in serum. Its function is unclear, though it may act as a regulator during antigen internalisation in the immune response.

epsilon chains (IgE)- IgE acts as a trigger of allergic responses and makes up less than 0.003% of the antibody in serum.8

Both chains are composed of a series of similar but non-identical domains. The light chain consists of about 200 amino acids, and the heavy chain of about twice as many. Each domain consists of two antiparallel β sheets connected by a disulphide bond. In Figure 2 the disulphide bond is shown in yellow and the two sheets in red and green.

Figure 2. Basic Antibody Structure (Brief Introduction to Antibody Structure, Figure 2, 2000)

The first domain of each chain, known as the variable region (VH for the heavy chain and VL for the light chain), consists of approximately 100 amino acids and lies at the N-terminus, at the tips of the antibody structure.

The rest of each chain forms the constant region, which has a conserved amino acid sequence; its domains are called "constant domains". The constant region defines the type of the light or heavy chain and is responsible for the function of the antibody molecule.

Amino acid variability is not uniform across the variable domain: it is concentrated in three hypervariable regions, which together form the antigen-binding site and largely determine specificity and affinity for antigen. These hypervariable regions are known as complementarity determining regions (CDRs).

Cleaving the antibody at the hinge region releases three fragments, two Fab (fragment antigen binding) and one Fc (fragment crystallisable), as shown in Figure 1. The Fv (fragment variable) comprises only the VH and VL domains; these can be joined by a peptide linker to form a single-chain Fv (scFv).

Generating antibodies that can recognise antigen is a very complex process that requires two proteins, encoded by the recombination activating genes:

RAG-1

RAG-2

RAG proteins are present in higher (vertebrate) organisms such as mammals, fish and birds, which have the unique ability to mount an adaptive immune response. In genetic knock-out mice (mice in which one or more genes have been switched off by genetic engineering or targeted mutation, so that normal and modified mice can be compared) with inactive RAG, T-cell receptors are not produced.7 The two proteins cut through both strands of DNA at the recombination signal sequences (RSSs), which lie adjacent to the gene segments. Joining the cut ends forms (Figure 3) a 'coding joint' (V-DJ or D-J for the heavy chain, V-J for the light chain) and a 'signal joint', a loop of DNA that carries away the intervening sequence originally present between the two segments.5 During this process a two-turn RSS can only join with a one-turn RSS. In heavy chains, one of 51 VH (variable) gene segments is combined with one of 27 DH (diversity) and one of 6 JH (joining) gene segments; VH encodes the first two CDRs, DH encodes the third CDR and JH encodes the remainder of the variable region. In light chains, on the other hand, a VL segment combines directly with a JL segment to form the V-region exon (Figure 4). The process that generates antibody diversity is shown in detail in Figure 5. It comprises the following steps:

DNA Rearrangements

Transcription

RNA splicing

Translation

Peptide processing

Figure 3. RSS, coding joint and signal joint (Antigen Receptors Diversity, 2009)

Figure 4. Gene Segments construction (Lecture 5, Recognition of antigens by adaptive immune system)

Figure 5. Molecular basis of kappa gene expression (CHAPTER 8, GENETIC BASIS OF ANTIBODY DIVERSITY)

The first step is DNA rearrangement, in which one V segment and one J segment are joined at random; the DNA between them is cut out as a closed circular molecule and lost. Transcription then runs through to the end of the C region to give an immature mRNA. RNA splicing removes the intron between the J segment and the C region, producing the mature mRNA. Translation on ribosomes of the rough endoplasmic reticulum (RER) then produces a polypeptide. This polypeptide is identical to the immunoglobulin kappa chain except that it carries 13 or more additional amino acids at its amino terminus, known as the 'leader' or 'signal' peptide, which is removed during peptide processing. Lambda and heavy chains are produced in the same way, except that heavy chains have an additional D (diversity) segment, which joins with the J and V segments.7

Antibody molecules perform two main functions in the body:

Binding to a specific epitope on an antigen

Triggering the response against the bound antigen.3

The variable regions are responsible for antigen specificity, while the C-region domains of the heavy chains are responsible for effector functions. Three major biological activities are observed in the body:

Protection

Placental transfer

Cytophilic

Protection: an antibody cannot carry out phagocytosis itself, but it can identify the foreign particle; IgM or IgG bound to antigen activates the complement system and promotes phagocytic uptake. Three Fc receptors, which have immunoglobulin-like extracellular domains, mediate the interaction between several cell types and the antigen-antibody complex and so trigger multiple biological functions.

The second activity, placental transfer, matters for the newborn baby, whose immune system takes some time after birth to reach full strength. In the meantime immunity is provided by maternal IgG, which crosses the placenta during pregnancy, and by IgA supplied through breast feeding, which inactivates pathogens in the infant's gut.

Cytophilic ('cell-loving') antibodies bind to cells: IgE binds through its Fc region to receptors on mast cells and basophils, where it acts as a trigger of allergic responses.8

Because of these roles, antibodies have a huge commercial impact in the pharmaceutical industry: they bind foreign particles and activate the host immune system. Monoclonal antibodies are among the most effective and safest medicines available. The antibody market is currently worth about $20 bn and is expected to exceed $30 bn within 5-6 years. Antibody drugs are effective therapies, especially in oncology and severe immunological disease.9

Figure 1 of the paper by Ruud M.T. de Wildt, Rene M.A. Hoet, Walter J. van Venrooij, Ian M. Tomlinson and Greg Winter shows that pairing between heavy and light chains is random: the authors observed no preference during pairing.

The present study re-examines this question by statistical analysis of the mouse variable-region sequences in the Kabat database, to see whether pairing occurs at random or shows any preference.

The study starts with the extraction of mouse light and heavy chain sequences from the Kabat database using the KabatMan query language, and of V-gene sequences with their germline IDs from the VBASE2 database. The data are converted to FASTA format; the VBASE2 sequences are used to build a BLAST database, and the KabatMan sequences are searched against it with tblastn so that each expressed sequence is assigned its best-scoring germline gene. Pairings are then counted with a query in Microsoft Access, and an overall p-value is calculated to establish the nature of the pairing. If the overall p-value is significant, individual p-values are calculated to see whether each pairing is significantly up or down, that is, whether it occurs more or less often than expected by chance. The study is also relevant to the human immune system, because mouse and human antibodies are more or less similar, and it may have a large impact on clinical antibody drug development.

Materials and Methods

Extraction of sequence from KabatMan database and Vbase2 database

This step is divided into two parts-

Light and heavy chain pairing data from KabatMan Query database.

Sequence of mouse V-region from Vbase2.

In the first step, paired mouse light and heavy chain sequences were extracted from the KabatMan database on the BIOINF website (http://bioinf.org.uk/abs/kabatman.html) using the KabatMan query language, with the query 'SELECT pir WHERE source includes mouse complete eq true AND '. The query has two clauses: the first restricts the source to mouse, and 'complete eq true' requires that both the light and the heavy chain are present. The output was saved as 'kabatseq.txt'.

In the second step, mouse V-region sequences were downloaded from the 'vbdownload' page of the VBASE2 website (http://www.vbase2.org/vbdownload.php) by selecting 'Mouse' for organism and 'all' for chain. The resulting file holds the V-gene sequences in FASTA format, with each header composed of the 'VBASE2 ID', the V-gene names and the length of the DNA sequence. The output file was saved as 'mouseseq.txt'.

Filtering the extracted data and converting it into a suitable format for creating the BLAST and tblastn databases

Filtering the sequences is an essential part of this project. The 'kabatseq.txt' file cannot be used with BLAST as it stands, because the sequences contain many '?' characters, which must be replaced with 'X', and any spaces in a line must be replaced with '|' where needed. The crucial task in this step is to split the file into heavy chain (heavychain.faa) and light chain (lightchain.faa) FASTA files using the Perl program 'kabat2fasta.pl', so that each antibody has the same header in both files.
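The character clean-up described above can be sketched in Python as follows. This is only an illustration, not the code of 'kabat2fasta.pl', which is not shown in this report. The function names and the example header are hypothetical, and applying the '|' replacement to header lines only is an assumption.

```python
def clean_record(header, sequence):
    """Return a (header, sequence) pair that BLAST tools can read safely.

    Assumptions (not taken from kabat2fasta.pl): spaces are replaced only
    in the header, and a run of spaces collapses to a single '|'.
    """
    # Replace whitespace in the header with '|' so the ID stays one token.
    clean_header = "|".join(header.split())
    # '?' marks an unknown residue in the extracted data; BLAST expects 'X'.
    clean_sequence = sequence.replace("?", "X")
    return clean_header, clean_sequence


def to_fasta(header, sequence):
    """Format one cleaned record as a two-line FASTA entry."""
    clean_header, clean_sequence = clean_record(header, sequence)
    return ">" + clean_header + "\n" + clean_sequence + "\n"
```

For example, to_fasta("000123 mouse heavy", "EVQL?QSG") gives a record with the header '>000123|mouse|heavy' and the sequence 'EVQLXQSG'.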

In the 'mouseseq.txt' file, germline sequence names take forms such as IGHV2-7*02 and IGHV2-7*01, where '*01' and '*02' denote different alleles of the parent gene. The allele number after the '*' sign was therefore removed, so that expressed sequences derived from any of these alleles are assigned to the same gene11 (here IGHV2-7). The 'vbase2fasta.pl' program was used to create a new file, 'vbase2fasta.faa', in FASTA format, whose headers contain only the 'VBASE2 ID' and the germline gene ID, followed by the DNA sequence.
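A minimal Python sketch of the allele-stripping rule described above. It is not the code of 'vbase2fasta.pl'; the function name is hypothetical, and it assumes the allele suffix always has the form '*' followed by digits at the end of the name.

```python
import re

# Matches an allele suffix such as '*01' or '*02' at the end of a gene name.
_ALLELE_SUFFIX = re.compile(r"\*\d+$")


def strip_allele(gene_name):
    """Map an allele name (e.g. 'IGHV2-7*02') to its parent gene ('IGHV2-7')."""
    return _ALLELE_SUFFIX.sub("", gene_name)
```

With this rule, 'IGHV2-7*01' and 'IGHV2-7*02' both map to 'IGHV2-7', while a name without a suffix is returned unchanged.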

Creating BLAST and tblastn database

To create the BLAST database, the file 'vbase2fasta.faa', which holds the set of functional mouse nucleotide germline sequences (VH, Vλ and VK), was combined into a single BLAST database11, producing three files with the base name 'blastdb'. The command for this step is 'makeblastdb -in vbase2fasta.faa -dbtype nucl -out blastdb', where the database type is nucleotide and the input file is 'vbase2fasta.faa'.

The mouse protein sequences were then queried against the BLAST database with tblastn, so that the expressed mouse antibody sequences could be grouped according to their parent germline sequence11. The tblastn run creates one output file for each chain (heavytblastdb and lighttblastdb); the command for the light chain is 'tblastn -db blastdb -query lightchain.faa -out lighttblastdb', and the heavy chain is run in the same way with 'heavychain.faa'.

Select best score from tblastn file and merge together

In this step the highest-scoring hit for every query was extracted using the 'select_best_score.pl' program and saved in two files, 'heavybest' and 'lightbest', in which each line contains the 'VBASE2 ID', the V-gene name, the bit score and the E-value (expected value).
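The selection of the best hit per query can be sketched in Python as below. This is not the code of 'select_best_score.pl'. It assumes the tblastn hits have already been parsed into (query ID, germline ID, bit score, E-value) tuples; the parsing itself, the function name and the tuple layout are hypothetical.

```python
def best_hits(hits):
    """Keep the highest bit-score hit for every query.

    `hits` is an iterable of (query_id, germline_id, bit_score, e_value)
    tuples. On a tie the first hit seen is kept (an assumption).
    """
    best = {}
    for query_id, germline_id, bit_score, e_value in hits:
        current = best.get(query_id)
        # Replace the stored hit only if this one scores strictly higher.
        if current is None or bit_score > current[1]:
            best[query_id] = (germline_id, bit_score, e_value)
    return best
```

Each value in the returned dictionary corresponds to one line of 'heavybest' or 'lightbest': germline ID, bit score and E-value for that query.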

The expected value (E-value) and the bit score are related, but the E-value carries more information, and percentage similarity is more important than the bit score. The lower the E-value, the better the alignment. The bit score is derived from the alignment of similar residues and the gaps introduced to align the sequences, so the higher the bit score, the better the alignment. The bit score is normalised, which means that bit scores can be compared between alignments even when different scoring matrices have been used12. The standard formula for the bit score is

\[ S' = \frac{\lambda S - \ln K}{\ln 2} \qquad (1) \]

Here S is the raw alignment score, K and λ are statistical parameters of the scoring system, and the bit score S' is expressed in standard units (bits)13.
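As a worked illustration of equation (1), the following Python sketch converts a raw score into a bit score. The values of λ and K used in the usage note are illustrative assumptions only; in practice BLAST reports the parameters for the scoring system actually used.

```python
import math


def bit_score(raw_score, lam, k):
    """Bit score S' = (lambda * S - ln K) / ln 2, as in equation (1)."""
    return (lam * raw_score - math.log(k)) / math.log(2)
```

With the assumed values lam = 0.267 and k = 0.041, a raw score of 100 gives a bit score of about 43.1. Because λ is positive, a higher raw score always gives a higher bit score.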

Finally, the two best-score files ('heavybest' and 'lightbest') were merged into one file ('mergefilescore') using the 'merge_file.pl' program, which also removes the '|' sign from the start and end of every line.

Calculation of overall p-value

This step is statistical. Before the calculation, the 'mergefilescore' file was modified using the 'dat.pl' program, which groups germline IDs into families: for example, IGHV1-10 and IGHV1-11 become IGHV1, and IGHV2S4 becomes IGHV2. The result was loaded into a Microsoft Access 2007 database and queried to produce the 'modchisqcount.dat' file. The query is 'SELECT modmergefilescore.LightgermID,modmergefilescore.HeavygermID, COUNT(*) FROM modmergefilescore GROUP BY modmergefilescore.HeavygermID, modmergefilescore.LightgermID;', which counts the occurrences of every pairing between heavy and light V genes.
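The family grouping and the pair counting described above can be sketched in Python as below. This stands in for 'dat.pl' and the Access GROUP BY query rather than reproducing them. The function names are hypothetical, and the regular expression is an assumption based on the examples given (IGHV1-10 to IGHV1, IGHV2S4 to IGHV2).

```python
import re
from collections import Counter

# Family = chain prefix (IGHV, IGKV or IGLV) plus the leading family number.
_FAMILY = re.compile(r"^(IG[HKL]V\d+)")


def family(germline_id):
    """Reduce a germline ID to its family; unmatched IDs are returned as-is."""
    match = _FAMILY.match(germline_id)
    return match.group(1) if match else germline_id


def count_pairs(pairs):
    """Count each (light family, heavy family) pairing.

    `pairs` is an iterable of (light_germline_id, heavy_germline_id) tuples,
    one per antibody. Equivalent in spirit to COUNT(*) ... GROUP BY.
    """
    return Counter((family(light), family(heavy)) for light, heavy in pairs)
```

The resulting counts form the contingency table on which the chi-squared test is run.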

The contingency table is 'n x n', so the test statistic follows a chi-squared distribution with (rows-1)(columns-1) degrees of freedom, provided the expected values are not too small (>5)14. The chi-squared test was run with the command '~martin/bin/chisq -d modchisqcount.dat', where '-d' displays the chi-squared table with the observed and expected value for each cell. The test uses the formula

\[ \chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \qquad (2) \]

where O is the observed frequency and E is the expected frequency, calculated as E = (nc/n) × (nr/n) × n. Here nc/n and nr/n are the probabilities of the column and the row, nc and nr are the column and row totals respectively, and n is the overall total. Once the chi-squared value and the degrees of freedom (df) are known, the p-value is calculated to see whether the result is statistically significant, using the command '~martin/bin/chisig "chisquared value" "df"'.
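The calculation in equation (2) can be sketched in Python as below. This is an illustration of the formula, not the source of the 'chisq' program. The function name is hypothetical, and the sketch assumes that every row and column total is greater than zero.

```python
def chi_squared(table):
    """Chi-squared statistic, degrees of freedom and expected values.

    `table` is a list of rows of observed counts. The expected value of a
    cell is (row total * column total) / overall total, which equals
    (nc/n) * (nr/n) * n.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            expected_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(expected_row)

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected
```

For the small table [[10, 20], [30, 40]] the expected values are 12, 18, 28 and 42, giving a chi-squared value of about 0.79 with one degree of freedom.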

Calculation of individual p-values

Two calculations were carried out: one on the data merged into groups (families) and one on the original, unmerged data.

For the first calculation, the 'modfile.pl' program was used to modify the 'mergefilescore' file and create the 'modmergescore' file, converting IDs such as IGHV1-10 and IGHV1-11 into IGHV1. A database was then created in Microsoft Access 2007, and the new file 'modgermmatrix.ods' was produced by running the query

'SELECT modmergescore.LightgermID,modmergescore.HeavygermID, COUNT(*) FROM modmergescore GROUP BY modmergescore.HeavygermID, modmergescore.LightgermID;'

An individual 2x2 'dat' file was then created for each cell of the matrix. 'Table 1' shows how the four values in each file are calculated.

Value of column i and row j

Total of column i - value of column i and row j

Total of row j - value of column i and row j

Total of matrix - total of column i - (total of row j - value of column i and row j)

Table 1. Procedure for calculation of individual 2x2 dat file

The next task was to run '~martin/bin/chisq -d -y filename.dat' to obtain the chi-squared value for each 2x2 table, and then to obtain the individual p-value with '~martin/bin/chisig "chisquared value" "df"'. Strictly, the chi-squared test should not be used uncorrected for a 2x2 table, that is, with one degree of freedom. The simplest remedy is the Yates correction, which subtracts 0.5 from the absolute difference |Oi - Ei| before squaring15. The formula is

\[ \chi^2 = \sum_{i=1}^{n} \frac{(|O_i - E_i| - 0.5)^2}{E_i} \qquad (3) \]

Once the p-values were obtained, each significant pairing was classified as significantly up or down by checking whether the first cell of 'Table 1' (the observed value) is greater or less than its expected value.

For the original (unmerged) data, the individual p-values were calculated in the same way, except that 'modfile.pl' was not used, so that 'mergefilescore' kept its original germline IDs.
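The per-cell procedure (Table 1, the Yates-corrected statistic and the up/down check) can be sketched in Python as below. It is an illustration under stated assumptions, not the code of the 'chisq' and 'chisig' programs. All function names are hypothetical. The p-value uses the identity that holds for one degree of freedom, p = erfc(sqrt(chi2/2)), and clamping the corrected difference at zero is an added safeguard that is not part of the formula in the text.

```python
import math


def cell_to_2x2(matrix, row, col):
    """Collapse a count matrix to the 2x2 table for one cell (Table 1)."""
    n = sum(sum(r) for r in matrix)
    row_total = sum(matrix[row])
    col_total = sum(r[col] for r in matrix)
    a = matrix[row][col]                     # value of column i and row j
    b = col_total - a                        # rest of column i
    c = row_total - a                        # rest of row j
    d = n - col_total - (row_total - a)      # everything else
    return [[a, b], [c, d]]


def yates_chi_squared(table):
    """Yates-corrected chi-squared for a 2x2 table (non-zero margins assumed)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Assumption: clamp at 0 so the correction never overshoots.
            diff = max(abs(table[i][j] - expected) - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    return chi2


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def direction(table):
    """'up' if the first cell exceeds its expected value, 'down' if below."""
    (a, b), (c, d) = table
    expected = (a + b) * (a + c) / (a + b + c + d)
    if a > expected:
        return "up"
    if a < expected:
        return "down"
    return "none"
```

A pairing would be reported as significantly up or down only when p_value_1df falls below the chosen threshold; direction alone says nothing about significance.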

Results

Light and heavy chain and variable region domains retrieved from databanks

A total of 1495 mouse sequences of each chain type were analysed from the KabatMan databank. From VBASE2, 860 V-domain sequences were extracted. Not all of these carry V-gene names, and those without were omitted from the file, leaving 403 sequences. The heavy and light chain families are shown in 'Table 2'.

ChainFamily NumberOfSequences

IGHV(II) 4

IGHV(III) 6

IGHV1 89

IGHV10 5

IGHV11 4

IGHV12 9

IGHV13 4

IGHV14 7

IGHV15 1

IGHV2 27

IGHV3 15

IGHV4 4

IGHV5 51

IGHV6 13

IGHV7 9

IGHV8 8

IGHV9 10

IGKV1 13

IGKV10 12

IGKV11 1

IGKV12 10

IGKV13 4

IGKV14 3

IGKV15 3

IGKV16 1

IGKV17 2

IGKV18 1

IGKV19 1

IGKV2 5

IGKV3 10

IGKV4 31

IGKV5 5

IGKV6 13

IGKV7 1

IGKV8 10

IGKV9 6

IGLV1 2

IGLV2 2

IGLV3 1

Table 2. Summary of heavy and light chains family for mouse antibody sequences

This table gives the number of sequences in each variable chain family. Among the heavy chains, the highest count is 89 for the 'IGHV1' family and the lowest is 1 for 'IGHV15'. Among the kappa light chains, the highest count is 31 for 'IGKV4' and the lowest is 1 for 'IGKV11', 'IGKV16', 'IGKV18', 'IGKV19' and 'IGKV7'. Among the lambda light chains, the highest count is 2 for 'IGLV1' and 'IGLV2' and the lowest is 1 for 'IGLV3'.

'Table 3' summarises the number of alleles and of unique germline genes.

Chain NumberOfAlleles NumberOfUniqueGenes

VH 266 209

VK 132 118

Vλ 5 3

Total 403 330

Table 3. Summary of germline sequence retrieved from 'VBASE2' database.

These data were extracted from the VBASE2 databank and matched against the mouse antibody sequences from Kabat using tblastn. The 403 sequences extracted from VBASE2 are alleles of 330 unique germline genes, and these 330 genes were used against the 1495 Kabat mouse sequences.

Bit score for heavy and light chain

Each of the 1495 mouse sequences was searched against the 330 mouse germline sequences, giving 1495 best alignments each for the light and heavy chains. Every alignment was scored as a bit score, and the highest-scoring hit was taken as the final assignment for each sequence (see Materials and Methods for details of the score). 'Table 4' shows a sample of the highest scores for heavy and light chains, with their E-values and VBASE2 IDs.

VbaseID VgeneID Highestscore E-value

musIGHV115 IGHV6-7 211 4.00E-058

musIGHV168 IGHV7-1 209 2.00E-057

musIGHV168 IGHV7-1 208 3.00E-057

musIGHV168 IGHV7-1 208 3.00E-057

musIGHV168 IGHV7-1 208 2.00E-057

musIGKV185 IGKV7-33 207 5.00E-057

musIGKV069 IGKV8-28 207 7.00E-057

musIGKV185 IGKV7-33 207 6.00E-057

musIGKV185 IGKV7-33 207 5.00E-057

musIGKV185 IGKV7-33 207 6.00E-057

Table 4. Summary of pattern for heavy light chain alignments

Analysis of overall p-value

The nature of the pairing between light and heavy chains was assessed statistically, using the chi-squared value and the p-value. The criterion is simple: if the p-value is significant, the pairing of germline sequences is not random, that is, it differs from what is expected by chance. The threshold for the p-value is 0.05. In statistical terms, a significant p-value means that the null hypothesis (random pairing) is rejected and the difference is statistically significant16. 'Table 5' summarises the overall result.

Observation Value

Highest expected value 142.8

Lowest expected value 4.6

Chi-squared value 336.890042

Degrees of freedom 54

P-value 0

Table 5. Summary of overall test

The table shows that the expected values (between 4.6 and 142.8) are large enough for the chi-squared test on an 'n x n' table (see Materials and Methods), although the lowest, 4.6, falls slightly below the guideline of 5. The reported p-value is 0, which indicates that the result is statistically significant. This is because the chi-squared value, 336.890042, is very large for 54 degrees of freedom, and the p-value falls as the chi-squared value rises; a p-value far below 10^-16 is reported as 0. The significant overall p-value justifies proceeding to the analysis of individual pairings.

Score of merge data

Whether a pairing is preferred is judged from the direction of its significant p-value, which in turn depends on the number of times each pair of germline sequences is observed. After merging all subgroups into families, the observations in 'Table 6' were obtained.

Observation ObservedValue

Total of IGHV1 458

Total of IGKV4 332

Total pairings between IGHV1 and IGKV4 119

Table 6. Observation of V-gene number

Figure 6. Frequencies of group merge data

This graph shows the frequency of each germline pairing; the highest value is 119, as shown in 'Table 6', and the average frequency is below 20.

HeavyGermID LightGermID

IGHV1 IGKV10

IGHV1 IGKV15

IGHV1 IGKV3

IGHV1 IGKV4

IGHV1 IGLV1

IGHV10 IGKV10

IGHV10 IGKV16

IGHV10S3 IGKV1

IGHV11 IGKV14

IGHV12 IGKV4

IGHV14 IGKV1

IGHV1S113 IGKV10

IGHV1S113 IGKV6

IGHV1S12 IGKV2

IGHV1S12 IGKV5

IGHV1S123 IGKV1

IGHV1S2 IGLV1

IGHV1S24 IGKV1

IGHV1S35 IGKV4

IGHV1S43 IGKV10

IGHV1S5 IGKV6

IGHV1S53 IGKV6

IGHV1S72 IGKV19

IGHV1S72 IGKV3

IGHV1S72 IGKV9

IGHV1S94 IGKV19

IGHV1S94 IGKV9

IGHV2 IGKV12

IGHV2 IGKV3

IGHV2 IGLV2

IGHV2S4 IGKV3

IGHV2S4 IGKV4

IGHV3 IGKV10

IGHV3 IGKV12

IGHV3 IGKV14

IGHV3 IGKV4

IGHV3 IGKV5

IGHV4 IGKV10

IGHV4 IGKV2

IGHV4 IGKV4

IGHV4 IGLV1

IGHV5 IGKV2

IGHV5 IGKV4

IGHV5S10 IGKV2

IGHV5S10 IGKV6

IGHV5S9 IGKV2

IGHV6 IGKV1

IGHV6 IGKV11

IGHV7 IGKV4

IGHV7 IGKV7

IGHV7 IGKV8

IGHV8 IGKV1

IGHV8 IGKV13

IGHV8 IGKV4

IGHV9 IGKV10

IGHV9 IGKV6

P-values (in the order extracted; their assignment to rows and to the 'Significantly Up' and 'Significantly Down' columns is not preserved):

0.002076348, 2.0425E-05, 0.023426039, 0.039302623, 0.035056358, 1.88817E-05, 0, 6.5514E-06, 0.021655919, 0.010804379, 8.62984E-05, 0.020875396, 1.19112E-09, 0.013770489, 0, 0.038681425, 0.00025231, 0, 6.87073E-12, 2.13789E-09, 0.004468809, 0.00466241, 0.005833878, 6.41239E-06, 1.10603E-05, 1.02889E-06, 9.79978E-05, 0, 0.000478379, 0.003283494, 0.032136882, 0.005072862, 0.045765496, 0.000211699, 0.028728158, 0.028528642, 1.989E-05, 0.011092505, 7.80833E-10, 0.044152745, 0, 0, 8.26006E-14, 0, 4.80725E-08, 1.80908E-05, 0.000410214, 7.83883E-05, 7.25E-005, 0.038231003, 0.039762081, 0.030196982, 0.001782797, 0.001348265, 0.005083853, 0.03872939

Table 7. Red P-value indicates significantly up and blue one is significantly down.

'Table 7' includes only the significant p-values, because these indicate whether a pairing shows a preference. The table shows that the 'IGKV4' kappa light chain has more significant values, and more significantly down values, than any other light chain. Likewise 'IGHV1' and 'IGHV3' have more significant p-values, and more significantly down values, than the other heavy chains. Interestingly, 'IGKV4' also has the most significantly up values among the light chains, whereas among the heavy chains this is true of 'IGHV4'.

Score of original data

Figure 7. Frequency of original data

In this graph the average frequency is less than 5, and the highest frequency is 46, for the pairing of IGKV4-59 with IGHV2S4. The numbers on the x-axis identify the individual pairings (cells).

LightGermID HeavyGermID SignificantPvalue

IGKV10-94 IGHV1S113 1.20E-008

IGKV10-94 IGHV4-1 9.76E-007

IGKV10-94 IGHV5-16 0

IGKV10-94 IGHV9-1 0

IGKV10-96 IGHV1-34 0.002857693

IGKV10-96 IGHV1-53 0.011115355

IGKV10-96 IGHV1S43 0

IGKV10-96 IGHV1S75 0.031885436

IGKV1-110 IGHV1-11 0.000387894

IGKV1-110 IGHV14-3 3.48E-006

IGKV1-110 IGHV1-5 3.66E-006

IGKV1-110 IGHV1-55 0.000615839

IGKV1-110 IGHV1S24 0.000211505

IGKV1-110 IGHV2-6-2 7.85E-008

IGKV1-110 IGHV5-9-3 0.046208733

IGKV1-110 IGHV6-7 0

IGKV11-125 IGHV6-6 0

IGKV1-117 IGHV1-52 9.36E-013

IGKV1-117 IGHV1S69 0.0075952

IGKV1-117 IGHV3-8 1.38E-006

IGKV1-117 IGHV5-12 0.003167004

IGKV1-117 IGHV5-9-4 0.000229762

IGKV1-117 IGHV7-3 0

IGKV1-117 IGHV9-2-1 0.015293006

IGKV1-117 IGHV9-3 0.000102923

IGKV1-122 IGHV10S3 0

IGKV1-132 IGHV14-1 8.07E-006

IGKV1-132 IGHV1-66 0.00099796

IGKV1-133 IGHV1-14 0.045321923

IGKV1-133 IGHV14-3 0.045321923

IGKV1-133 IGHV1-53 0.008189527

IGKV1-133 IGHV1-54 4.41E-007

IGKV1-133 IGHV1S132 1.72E-014

IGKV1-135 IGHV1-39 0.008189527

IGKV1-135 IGHV1S24 0.004980157

IGKV1-135 IGHV3S5 0.025403987

IGKV1-135 IGHV5-15 1.05E-012

IGKV12-38 IGHV1-52 0.003177905

IGKV12-38 IGHV1-55 0.006432729

IGKV12-40 IGHV1-9 0.006566491

IGKV12-41 IGHV14-1 0.000146335

IGKV12-41 IGHV14-3 0.026022458

IGKV12-41 IGHV2-2 1.88E-010

IGKV12-41 IGHV2-6-7 2.28E-006

IGKV12-41 IGHV9-4 0.028676428

IGKV12-44 IGHV1-4 0.000600495

IGKV12-44 IGHV1-42 8.66E-015

IGKV12-44 IGHV1-66 0.002587584

IGKV12-44 IGHV1S113 0.001693445

IGKV12-44 IGHV2-6-1 7.62E-006

IGKV12-44 IGHV2S4 0.01207426

IGKV12-46 IGHV1-42 0.006432729

IGKV12-46 IGHV1S33 0.043187364

IGKV12-46 IGHV1S53 0.043187364

IGKV12-46 IGHV2-3 0

IGKV12-46 IGHV3-6 0.002418361

IGKV12-89 IGHV1-63 0.013709937

IGKV12-89 IGHV1S22 0

IGKV12-98 IGHV1-15 0.016308464

IGKV12-98 IGHV3-8 0

IGKV13-84 IGHV8-8 0

IGKV13-85 IGHV5S15 1.18E-011

IGKV14-100 IGHV1-7 0

IGKV14-111 IGHV11-2 0

IGKV14-111 IGHV1-14 0.004107064

IGKV14-111 IGHV14-1 0.043411254

IGKV14-111 IGHV1-66 0.00010283

IGKV14-111 IGHV3-2 0.000289101

IGKV14-111 IGHV5-16 3.48E-007

IGKV15-103 IGHV2-6 8.66E-015

IGKV16-104 IGHV10-1 0.004631658

IGKV16-104 IGHV1-47 0.016200091

IGKV16-104 IGHV3-6 2.60E-008

IGKV16-104 IGHV9-4 0.004631658

IGKV17-121 IGHV1-53 0.000939845

IGKV17-127 IGHV1-55 0.000655887

IGKV17-127 IGHV5-17 0.005688731

IGKV1-88 IGHV1S123 8.81E-008

IGKV1-88 IGHV2-3-1 8.07E-006

IGKV1-88 IGHV2-5-1 0.000317443

IGKV1-88 IGHV4-1 0.013037178

IGKV1-88 IGHV8-11 8.07E-006

IGKV19-93 IGHV1S72 0.004468809

IGKV19-93 IGHV1S94 6.41E-006

IGKV19-93 IGHV5-12-2 2.00E-010

IGKV2-109 IGHV1-50 3.48E-008

IGKV2-109 IGHV4-1 4.48E-008

IGKV2-109 IGHV5-6-4 0.008657363

IGKV2-109 IGHV6-3 0.000124272

IGKV2-112 IGHV1-7 6.57E-006

IGKV2-112 IGHV7-1 1.84E-005

IGKV2-137 IGHV5-6-4 4.29E-008

IGKV2-137 IGHV5-9 0.047240784

IGKV2-137 IGHV5-9-4 0.00059018

IGKV2-137 IGHV5S10 2.24E-009

IGKV2-137 IGHV5S9 0

IGKV3-1 IGHV1-4 0

IGKV3-1 IGHV5-6-1 3.98E-013

IGKV3-10 IGHV14-3 0.001525662

IGKV3-10 IGHV1-5 3.61E-006

IGKV3-10 IGHV5-4 0.005067789

IGKV3-12 IGHV1-14 0.00017034

IGKV3-12 IGHV13-2 0.00061789

IGKV3-12 IGHV1-37 0.043818341

IGKV3-12 IGHV1-55 0.026498362

IGKV3-12 IGHV1-66 0.03458965

IGKV3-12 IGHV4S2 7.78E-006

IGKV3-12 IGHV5-9-3 0.005766629

IGKV3-2 IGHV1-69 1.69E-006

IGKV3-2 IGHV5-6-1 2.45E-005

IGKV3-4 IGHV1-11 0.011308879

IGKV3-4 IGHV1-12 0

IGKV3-4 IGHV1-56 2.85E-008

IGKV3-4 IGHV1S72 0.04367278

IGKV3-4 IGHV5-6-5 0

IGKV3-5 IGHV1-19 0.015769666

IGKV3-5 IGHV1-67 0.026048267

IGKV3-5 IGHV1S72 0.019619169

IGKV3-5 IGHV3-1 0.000414538

IGKV3-5 IGHV7-1 2.10E-005

IGKV3-7 IGHV1-53 7.65E-010

IGKV3-9 IGHV1-63 0.000294358

IGKV4-50 IGHV14-3 9.61E-012

IGKV4-51 IGHV1-66 0.026110068

IGKV4-53 IGHV1-28 1.97E-006

IGKV4-53 IGHV1-61 4.80E-005

IGKV4-53 IGHV1-63 0

IGKV4-53 IGHV1S22 0.00099796

IGKV4-53 IGHV8-5 0.008589793

IGKV4-55 IGHV1-61 1.14E-005

IGKV4-55 IGHV3-2 0.009850668

IGKV4-55 IGHV5-4 0.005067789

IGKV4-55 IGHV9-1 0.00390677

IGKV4-56 IGHV1-47 0

IGKV4-57 IGHV14-3 0.008579883

IGKV4-57 IGHV1-61 0.000147872

IGKV4-57 IGHV1-9 6.00E-008

IGKV4-57 IGHV5-16 0.011969833

IGKV4-57 IGHV5-4 0.011969833

IGKV4-58 IGHV1-63 0.000294358

IGKV4-59 IGHV1-26 1.95E-010

IGKV4-59 IGHV1-9 0.004955131

IGKV4-59 IGHV2-6-5 0.010559441

IGKV4-59 IGHV2S4 0

IGKV4-61 IGHV1-14 4.13E-011

IGKV4-61 IGHV1-15 0.026313023

IGKV4-62 IGHV1-63 4.96E-009

IGKV4-62 IGHV1S12 5.17E-006

IGKV4-62 IGHV5S15 7.77E-016

IGKV4-63 IGHV1S1 1.41E-009

IGKV4-63 IGHV7-3 5.41E-014

IGKV4-68 IGHV1-18 0

IGKV4-68 IGHV3-6 0.000489135

IGKV4-69 IGHV1-9 0.006566491

IGKV4-70 IGHV1-9 0

IGKV4-70 IGHV5-4 0.001077546

IGKV4-70 IGHV6-3 0.000235035

IGKV4-72 IGHV1-12 3.70E-012

IGKV4-72 IGHV1-19 0.003560516

IGKV4-72 IGHV1-22 0.020704606

IGKV4-72 IGHV1-43 2.26E-005

IGKV4-72 IGHV5-17 0.02419021

IGKV4-73 IGHV1-39 8.10E-009

IGKV4-74 IGHV4-1 0.001039444

IGKV4-74 IGHV8-12 0

IGKV4-74 IGHV9-2-1 0.007956923

IGKV4-78 IGHV2-6-2 3.16E-010

IGKV4-78 IGHV8-8 3.61E-006

IGKV4-79 IGHV1S35 0

IGKV4-80 IGHV1-4 0.003177905

IGKV4-80 IGHV8-8 0.004631658

IGKV4-81 IGHV1-34 0.013968311

IGKV4-86 IGHV4-1 0

IGKV4-86 IGHV4S2 0.02761805

IGKV4-90 IGHV7-3 0.007928884

IGKV4-91 IGHV12-3 0

IGKV4-91 IGHV2-5 0.002719024

IGKV4-91 IGHV5-2 1.35E-005

IGKV5-37 IGHV1-9 0.006566491

IGKV5-39 IGHV3-1 2.08E-006

IGKV5-43 IGHV1-26 2.17E-008

IGKV5-43 IGHV1S12 0

IGKV5-43 IGHV3-8 0.009175858

IGKV5-45 IGHV1-14 0.001525662

IGKV5-45 IGHV1-20 0

IGKV5-45 IGHV1S39 0

IGKV5-48 IGHV1-69 0

IGKV5-48 IGHV5S10 1.05E-005

IGKV6-13 IGHV14-4 0.001763883

IGKV6-13 IGHV8-12 7.78E-006

IGKV6-14 IGHV1-54 0.002067773

IGKV6-15 IGHV2-6-5 0.038060732

IGKV6-15 IGHV5-6-1 7.62E-006

IGKV6-15 IGHV5S10 1.71E-013

IGKV6-15 IGHV9-3 0.000600495

IGKV6-17 IGHV1-37 0.028875679

IGKV6-17 IGHV9-3 2.15E-006

IGKV6-20 IGHV1-11 0.024623039

IGKV6-20 IGHV1-54 1.05E-005

IGKV6-20 IGHV1S113 0.001138763

IGKV6-20 IGHV3-2 0.024623039

IGKV6-20 IGHV9-1 0.011345999

IGKV6-23 IGHV1S113 8.35E-006

IGKV6-23 IGHV1S5 0

IGKV6-23 IGHV1S53 7.77E-016

IGKV6-23 IGHV5-9-2 1.41E-009

IGKV6-25 IGHV1S5 0

IGKV6-32 IGHV1-61 0.046791834

IGKV6-32 IGHV5-4 1.82E-005

IGKV7-33 IGHV7-1 0

IGKV8-16 IGHV1-11 0.000221589

IGKV8-19 IGHV1-11 2.25E-006

IGKV8-19 IGHV14-3 0.047571776

IGKV8-19 IGHV1S72 0.011471629

IGKV8-19 IGHV5-4 0.000527071

IGKV8-19 IGHV5-6-1 5.32E-006

IGKV8-21 IGHV10-3 0.020317492

IGKV8-21 IGHV1-14 0.006883341

IGKV8-21 IGHV1-55 0.026498362

IGKV8-21 IGHV1-63 4.19E-005

IGKV8-24 IGHV1-15 9.63E-006

IGKV8-24 IGHV1-37 6.57E-007

IGKV8-24 IGHV2-6-4 2.00E-007

IGKV8-24 IGHV3S5 0.002719024

IGKV8-27 IGHV1S123 1.37E-006

IGKV8-27 IGHV8-11 2.40E-005

IGKV8-28 IGHV1-20 0.000586306

IGKV8-28 IGHV1-34 0.006573837

IGKV8-28 IGHV5-9 1.09E-010

IGKV8-28 IGHV7-1 0

IGKV8-30 IGHV1-15 4.71E-008

IGKV8-30 IGHV1-64 0.02761805

IGKV8-30 IGHV1-67 0.000229458

IGKV8-30 IGHV1S45 1.20E-007

IGKV8-30 IGHV1S53 0.02761805

IGKV8-30 IGHV5-12 0.048482456

IGKV9-120 IGHV1-15 0.016308464

IGKV9-120 IGHV1-48 0.000416104

IGKV9-120 IGHV1S94 3.33E-016

IGKV9-120 IGHV2-4 3.67E-007

IGKV9-123 IGHV1-20 0.000235035

IGKV9-123 IGHV1-39 5.87E-005

IGKV9-124 IGHV1-9 0.00070547

IGKV9-124

IGHV5-17

0.002465413

IGKV9-124

IGHV9-4

0.003177905

IGLV1

IGHV1S2

0

IGLV1

IGHV2-6-7

0.000252415

IGLV1

IGHV3-8

0.037745168

IGLV1

IGHV4-1

0.028728158

IGLV2

IGHV2-6-2

0.016308464

IGLV2

IGHV2-7

3.67E-007

IGLV3

IGHV5-9

2.26E-005

Table 8. Significant p-values for the original data

All p-values in this table are shown in red, which means that every one of them is significantly up. The largest number of significant p-values occurs for IGKV1-110 among the light chains (47) and for IGHV1-14 among the heavy chains (23).
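The per-gene counts quoted above can be obtained by tallying the rows of the table. The following is a minimal sketch in Python, assuming the table is available as plain text with one cell per line (light-chain gene, heavy-chain gene, p-value). The three rows in `flat` are copied from Table 8 purely for illustration; the variable names and the 0.05 cut-off applied here are assumptions of this sketch and not necessarily the procedure used in the project.

```python
from collections import Counter

# Three rows copied from Table 8, one cell per line (illustration only).
flat = """IGKV3-4

IGHV1-56

2.85E-008

IGKV3-4

IGHV1S72

0.04367278

IGKV3-5

IGHV1-19

0.015769666
"""

# Drop blank lines, then regroup every three cells into one row.
cells = [line.strip() for line in flat.splitlines() if line.strip()]
rows = [
    (cells[i], cells[i + 1], float(cells[i + 2]))
    for i in range(0, len(cells) - 2, 3)
]

# Count how many significant pairings each gene takes part in.
ALPHA = 0.05  # assumed significance threshold
light_counts = Counter(light for light, heavy, p in rows if p < ALPHA)
heavy_counts = Counter(heavy for light, heavy, p in rows if p < ALPHA)

print(light_counts.most_common(1))
print(heavy_counts.most_common(1))
```

Run on the complete table, `most_common(1)` would report the light-chain and heavy-chain genes that appear in the most significant pairings.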

Discussion

Some researchers have proposed that the pairing of V-genes is random. The aim of this report was to test that hypothesis statistically. The variable region is crucial for antigen specificity, so the result is of interest given the resurgence of antibody immunotherapies [17]. 'Table 5' shows a significant p-value (< 0.05), which means that pairing between V-genes is not random and that some pairings are preferred. Once this preference had been detected, the experiment was repeated on merged data to find which results were significant. There, 25% of the results are significant, but significance alone does not mean that a pairing is preferred. In 'Table 7', 84% of the data (red) are significantly up, which indicates that the group-merged data show a pairing preference and justified performing the same experiment on the original data. In that experiment 34% of the data are significant, and all of the significant results are significantly up, which supports the conclusion that particular pairings of germline sequences are more favourable than expected by chance.
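The distinction between "significant" and "significantly up" can be illustrated with a small sketch. The essay does not state the exact formula used for the individual p-values, so this example assumes a per-cell chi-square term, (observed - expected)^2 / expected, with one degree of freedom, where the expected count comes from the row and column totals under random pairing. The gene labels `L1`, `L2`, `H1`, `H2` and the counts are made up for illustration and are not data from this project.

```python
import math


def cell_tests(counts):
    """Per-pairing chi-square test against random pairing.

    counts maps (light, heavy) -> observed number of antibodies.
    Returns (light, heavy) -> (observed, expected, p_value, direction).
    """
    total = sum(counts.values())
    light_tot, heavy_tot = {}, {}
    for (light, heavy), n in counts.items():
        light_tot[light] = light_tot.get(light, 0) + n
        heavy_tot[heavy] = heavy_tot.get(heavy, 0) + n

    results = {}
    for light, lt in light_tot.items():
        for heavy, ht in heavy_tot.items():
            expected = lt * ht / total  # count expected if pairing is random
            if expected == 0:
                continue  # gene never observed, nothing to test
            observed = counts.get((light, heavy), 0)
            chi2 = (observed - expected) ** 2 / expected
            # Survival function of chi-square with 1 degree of freedom.
            p_value = math.erfc(math.sqrt(chi2 / 2))
            direction = "up" if observed > expected else "down"
            results[(light, heavy)] = (observed, expected, p_value, direction)
    return results


# Hypothetical counts, for illustration only.
toy = {("L1", "H1"): 30, ("L1", "H2"): 10,
       ("L2", "H1"): 10, ("L2", "H2"): 30}
results = cell_tests(toy)
for pair, (obs, exp, p, direction) in sorted(results.items()):
    flag = "significant" if p < 0.05 else "not significant"
    print(pair, obs, exp, round(p, 4), direction, flag)
```

In this toy example every pairing is significant at the 0.05 level, but only two of the four are "up" (observed more often than expected). This is why a significant p-value on its own does not show that a pairing is preferred; the direction of the deviation has to be checked as well.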
