A single nucleotide polymorphism (SNP) is a nucleotide variant of the normal sequence with a population frequency greater than 1% (Perkel, 2008). A SNP commonly involves two alleles. Changes between T and C account for two thirds of allele changes in humans because of mutational bias. The number of SNPs can vary within and between populations. SNPs can be classified as synonymous (no change in the amino acid produced by the gene), conservative (a change to a chemically similar amino acid) or non-conservative (a change to a chemically different amino acid). SNPs can also be insertions or deletions, commonly known as "deletion insertion polymorphisms (DIPs)" (McEntyre, 2002). SNPs can also be used to design drugs and play an important part in developing treatments for genetic diseases (Twyman and Primrose, 2003).
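The three classes of coding SNP described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a simple grouping of amino acids into non-polar, polar, acidic and basic; that grouping, and the names classify_snp, CODON_TABLE and GROUPS, are illustrative assumptions rather than part of any cited method.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumed chemical grouping; other groupings are equally defensible.
GROUPS = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
    "stop": "*",
}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def classify_snp(ref_codon, alt_codon):
    """Classify a single-codon change by its effect on the amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if GROUP_OF[ref_aa] == GROUP_OF[alt_aa]:
        return "conservative"
    return "non-conservative"


if __name__ == "__main__":
    print(classify_snp("GTT", "GTC"))  # valine to valine
    print(classify_snp("GTG", "ATG"))  # valine to methionine
    print(classify_snp("GAG", "GTG"))  # glutamate to valine
```

Under this grouping a valine to methionine change counts as conservative, because both amino acids are non-polar; a different grouping could classify some changes differently.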
SNPs can be discovered by several molecular methods that identify variant DNA molecules. The basic idea of these methods is to amplify the region of interest by PCR and then look for differences in physical and chemical properties that reveal a single base change. Examples are single-stranded conformational polymorphism (SSCP) analysis and denaturing gradient gel electrophoresis (DGGE). SSCP involves denaturing double-stranded DNA into single strands, which fold up into a globular shape; differences in folding reveal the variants in the DNA. DGGE, on the other hand, involves denaturing the DNA to change its mobility. However, sequencing techniques such as pyrosequencing (e.g. 454) are used to discover SNPs as they are cheaper and reliable (Chang et al., 2010). This technique involves the emission of a chemical signal when a particular dNTP is incorporated. The order in which the dNTPs are dispensed is set in advance and can be based on the known SNP, and the signals are recorded as a pyrogram. The chemical signal produced is proportional to the number of dNTPs incorporated (Chang et al., 2010).
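The proportionality between signal and the number of dNTPs incorporated can be sketched as a toy simulation. This is an idealised model, not the 454 chemistry itself: it assumes a noise-free signal of exactly one unit per incorporated nucleotide, and the function name simulate_pyrogram and its arguments are hypothetical.

```python
def simulate_pyrogram(strand_to_synthesise, dispensation_order):
    """Return (nucleotide, peak height) pairs for each dispensed dNTP.

    strand_to_synthesise is the sequence that the polymerase builds, i.e. the
    complement of the template. Peak height is the number of nucleotides
    incorporated in that dispensation (0 when the dNTP does not match).
    """
    position = 0
    peaks = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # A run of identical bases is extended in a single dispensation.
        while (position < len(strand_to_synthesise)
               and strand_to_synthesise[position] == nucleotide):
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks


if __name__ == "__main__":
    # Two alleles that differ at a single base (T or C after the GG run).
    for allele in ("GGTCA", "GGCCA"):
        print(allele, simulate_pyrogram(allele, "GTCA"))
```

With the same dispensation order, the two alleles give different peak patterns, which is how a known SNP could be read from the pyrogram in this simplified picture.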
SNP genotyping methods need to be fast, cheap and precise (Kim and Mishra, 2007). Several SNP genotyping methods are in use, e.g. the Invader assay and mass spectrometry. However, the basic principles of all methods are allele discrimination and allele detection (Twyman and Primrose, 2003). The method used depends on the SNP and the genome scale. For instance, array-based methods, i.e. high-throughput techniques, are used to genotype SNPs genome-wide. The Illumina platform is one of the commonly used methods. It involves denaturing genomic DNA to single-stranded DNA, which is attached to the "Bead Array platform" or "Beadxpress". A set of oligonucleotides is annealed to the genomic DNA producing a well. Then "allele-specific extension and ligation" take place. The sequences are amplified by PCR with the addition of fluorescently labelled primers (Chen, Kowk and Levine, 1999) to produce "amplicons" that contain several SNPs. These are then hybridised onto the Illumina beads, which carry complementary sequences. When the correct sequence hybridises to the sequence on an Illumina bead, a unique signal is produced that corresponds to the hybridised sequence and determines the genotype of the SNP.
The other commonly used method is the "Affymetrix GeneChip 500K Mapping Array Set", which can assay around 500,000 different SNPs at a time and is therefore time efficient (Butcher et al., 2007). The genomic DNA is cleaved with restriction enzymes (NspI or StyI) and adaptors are then ligated to the fragments. The fragments are amplified by PCR under optimal conditions and the ends are labelled so that they can hybridise onto the chips (AFFYMETRIX®, 2005). Each position on the chip determines a unique SNP genotype in the hybridised DNA sample. However, the method requires a lot of DNA, and it is sometimes not possible to obtain that much.
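The restriction step can be pictured with a short sketch that scans a sequence for enzyme recognition sites. The recognition sequences used here (RCATGY for NspI and CCWWGG for StyI, where R is A or G, Y is C or T and W is A or T) are assumptions not given in the sources cited above and should be checked against the supplier's documentation. The function name find_sites is hypothetical, and the sketch reports only where each site starts, not the exact cut position.

```python
import re

# Assumed recognition sequences written as regular expressions.
RECOGNITION_SITES = {
    "NspI": "[AG]CATG[CT]",   # RCATGY
    "StyI": "CC[AT][AT]GG",   # CCWWGG
}


def find_sites(sequence, enzyme):
    """Return 0-based start positions of every recognition site in sequence."""
    pattern = RECOGNITION_SITES[enzyme]
    # A lookahead is used so that overlapping sites are all reported.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence.upper())]


if __name__ == "__main__":
    demo = "AAACATGTTTGGCCAAGGTT"  # made-up sequence for illustration only
    for enzyme in RECOGNITION_SITES:
        print(enzyme, find_sites(demo, enzyme))
```

Both recognition sequences are palindromic, so scanning one strand is enough to find the sites on both.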
There are other, non-array-based methods that are used to identify a specific set of SNPs in specific genes. These include dynamic allele-specific hybridisation (DASH). This is a more efficient approach as it can be customised; however, it only works for a small number of SNPs. DASH uses stringency conditions to find which DNA sequence of interest has mismatches to the hybridised probe (Brookes et al., 1999).
A BLAT search was performed on the query sequence for one locus at a time in the UCSC browser, which was accessed at the following URL: http://genome.ucsc.edu/. The "Browser" link of the first result was clicked. This gave information about the location of the SNP in the chromosome and several SNP IDs. The chromosomal position was used to find the gene in which the SNP is located. This was done by accessing the NCBI mapviewer at the following URL: http://www.ncbi.nlm.nih.gov/projects/mapview/. The chromosomal position was entered under the "region on chromosome" link.
Moreover, the right SNP ID was identified by selecting the "details" link of the first result. Two lighter blue bases flanked a black base in the middle, indicating the location of the SNP in the sequence. The position of the SNP in the chromosome was worked out, then entered in the "position box" on the "browser" page, and "jump" was clicked. This gave one SNP ID that corresponded to the SNP in each locus. Clicking on that SNP ID gave a lot of information about the alleles, validation, coding region and more.
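The value typed into the position box has the form chromosome:start-end, and it can also be passed to the browser in a URL. The sketch below builds both for a single-base position. The helper names are hypothetical, and the assembly name hg19 is an assumption, since the text does not state which assembly was used; chromosome 7 is taken from the 7q31.2 location reported for these loci.

```python
from urllib.parse import urlencode

UCSC_TRACKS_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def position_string(chromosome, position):
    """Format a single-base position the way the position box expects it."""
    return "chr{}:{}-{}".format(chromosome, position, position)


def browser_url(chromosome, position, assembly="hg19"):
    """Build a browser URL for the position (the assembly is an assumption)."""
    query = urlencode({"db": assembly,
                       "position": position_string(chromosome, position)})
    return UCSC_TRACKS_URL + "?" + query


if __name__ == "__main__":
    print(position_string(7, 117199533))
    print(browser_url(7, 117199533))
```

The same coordinate refers to different bases in different assemblies, so the assembly should match the one used for the BLAT search.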
The dbSNP database was accessed at the following URL: http://www.ncbi.nlm.nih.gov/projects/SNP/ to verify that the SNP ID was the correct one. The SNP ID was entered in the search box, which gave a lot of information on the population frequency, alleles and coding region. To find the position of the coding region for some of the loci, the nucleotide position was selected in the "Gene view" section. The function of the SNP was found by clicking on the "clinical association" link.
All of the SNPs in the loci are located in the CFTR gene and all of the sequences are localised to 7q31.2.
The SNP is located in an exon. It is a missense SNP, as the amino acid changes from valine to methionine. The position of the SNP in the chromosome is 117199533. The SNP ID is rs_213950. The substitution is in the positive sense. It is in exon 11 of the CFTR gene; this was found via dbSNP by clicking on the nucleotide position (84517). The gene encodes the cystic fibrosis transmembrane conductance regulator.
The SNP has been validated by cluster, frequency, submitter, 2-hit-2-allele and the HapMap project. It has been sequenced in the 1000 Genomes Project.
The ancestral allele for the SNP is A and the derived allele is G. Several allele frequencies were available from different populations. Some were taken from the same population, whereas others did not represent a population as they were taken from only one person. A representative allele frequency was therefore taken for each population. The frequencies of alleles A and G are 50.8% and 49.2% respectively in the European population, 40.5% and 59.5% in the Chinese population, 32.6% and 67.4% in the Japanese population, 97.5% and 2.5% in the Sub-Saharan African population, and 89.1% and 10.9% in the African American population. The African populations have a much higher frequency of allele A, whereas Europeans have roughly equal frequencies of alleles A and G and the Asian populations have a higher frequency of allele G. This suggests that genetic diversity began in Africa and has been retained in Africa.
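The comparison between populations can be made explicit with a small sketch that uses only the frequencies quoted above. It checks that each pair of frequencies sums to 100% and reports the major allele and the gap between the two alleles for each population; the names FREQUENCIES and summarise are hypothetical.

```python
# Allele frequencies (%) for alleles A and G, copied from the text above.
FREQUENCIES = {
    "European": (50.8, 49.2),
    "Chinese": (40.5, 59.5),
    "Japanese": (32.6, 67.4),
    "Sub-Saharan African": (97.5, 2.5),
    "African American": (89.1, 10.9),
}


def summarise(frequencies):
    """Map each population to (major allele, absolute difference in %)."""
    summary = {}
    for population, (a_pct, g_pct) in frequencies.items():
        # Guard against transcription errors in the quoted figures.
        if abs(a_pct + g_pct - 100.0) > 0.05:
            raise ValueError("frequencies do not sum to 100%: " + population)
        major = "A" if a_pct >= g_pct else "G"
        summary[population] = (major, round(abs(a_pct - g_pct), 1))
    return summary


if __name__ == "__main__":
    for population, (major, gap) in summarise(FREQUENCIES).items():
        print("{}: major allele {}, difference {} points".format(
            population, major, gap))
```

A small difference, as in the European sample, means the two alleles are close to equally common, while a large one, as in the Sub-Saharan African sample, means one allele dominates.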
The SNP is located in an intron. The position of the SNP in the chromosome is 117163616. The SNP ID is rs_7802924.
The SNP has been validated by cluster, frequency, the HapMap project and the 1000 Genomes Project.
The ancestral and derived alleles are A and G respectively.
There were three populations in total. However, two of them consisted of a single individual and hence do not represent a whole population. The allele frequency was therefore taken from a multi-population sample including Japanese, Chinese, African and other populations. The allele frequencies of A and G are 90.9% and 9.1% respectively. As the frequency of allele A is high, this suggests that it is highly conserved across multiple populations, although the sample size was small.
The SNP is located in an exon. The position of the SNP in the chromosome is 117282644. The SNP ID is rs_1800130.
The SNP has been validated by cluster, frequency, 2-hit-2-allele and the 1000 Genomes Project.
The ancestral and derived alleles are G and A respectively. There is a very high allele frequency of 100% in the European and Asian populations. This also suggests that most of the population with allele A moved out of Africa. However, the frequencies of allele A in the Sub-Saharan African and African American populations are 78.8% and 88.1% respectively, whereas the frequencies of allele G are 21.2% and 11.9%.
The SNP is located in an intron. The position of the SNP in the chromosome is 117173231. The SNP ID is rs_213943.
The SNP has been validated by cluster, frequency, submitter, 2-hit-2-allele and the 1000 Genomes Project.
The ancestral and derived alleles are C and T respectively. The frequencies of alleles C and T are 51.7% and 48.3% respectively in the European population, 39.5% and 60.5% in the Chinese population, 32.1% and 67.9% in the Japanese population, and 98.3% and 1.7% in the Sub-Saharan African population. Allele C is highly conserved in Africa.
I found the UCSC genome browser useful as it was easy to navigate and gave relevant information about a particular SNP.
However, the information for most of the loci was not present there, so I had to use another database to gather it. Moreover, some of the information, e.g. the ancestral allele, differed from other databases, so I had to check the information provided by the UCSC browser for all the loci.