Biological Background
Transcription factors (TFs) are regulatory proteins that control transcription, and thereby gene expression, by binding to DNA. TFs bind to specific sequence motifs in the DNA,
known as transcription factor binding sites (TFBSs), which are usually between 5 and 15 bp long. TFBSs are enriched in promoter regions. Single nucleotide polymorphisms (SNPs), the exchange
of a single base at a specific position in the genome, can strongly influence the gene expression level by changing the binding affinity of a TF for the sequence. A nucleotide substitution at a single
position of a TFBS can be sufficient to disrupt an existing TFBS or create a new one. Such SNPs are referred to as regulatory SNPs (rSNPs). They have recently gained much attention in the life
sciences because they can be causal for specific traits or diseases.
Workflow
Our workflow to identify rSNPs and their consequences on TF binding consists of four steps:
1. Extraction of the promoter region for each gene, covering the region from -7.5 kb to +2.5 kb relative to the transcription start site
2. Identification of the SNPs occurring within these promoter regions and extraction of their respective flanking sequences, defined
as the 25 bp upstream and downstream of the SNP. These SNPs are defined as rSNPs.
3. Applying the TFBS prediction tool MATCH™ to these flanking sequences, we predict putative TFBSs for both the reference
and the alternate allele of each SNP.
4. To determine the consequences of each SNP, we compare the predicted TFBSs between the reference and the alternate allele.
We divide the effect of an rSNP on the binding of a TF into four categories, where binding affinity is measured by the Core_Similarity_Score and Matrix_Similarity_Score calculated by MATCH™:
Gain of TFBS: The TFBS is predicted only for the 1 (alternate) allele of the SNP
Loss of TFBS: The TFBS is predicted only for the 0 (reference) allele of the SNP
Score-Change: The TFBS is predicted for both alleles, but the TF binding affinity differs
No Change: The TFBS is predicted for both alleles with the same TF binding affinity
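The coordinate arithmetic of steps 1 and 2 and the comparison in step 4 can be illustrated with a minimal Python sketch. It is not the code used to build the database. All function and constant names are hypothetical, and several details are assumptions that the text above does not specify: coordinates are treated as 1-based and inclusive, the promoter window is mirrored for genes on the reverse strand, windows are clipped at position 1, and two predictions count as "the same binding affinity" only when both reported scores are exactly equal.

```python
from typing import Optional, Tuple

UPSTREAM_BP = 7500    # promoter extends 7.5 kb upstream of the TSS
DOWNSTREAM_BP = 2500  # and 2.5 kb downstream of the TSS
FLANK_BP = 25         # flanking sequence on each side of the SNP

# (core similarity score, matrix similarity score) of one predicted TFBS
Scores = Tuple[float, float]


def promoter_window(tss: int, strand: str) -> Tuple[int, int]:
    """Return the (start, end) promoter interval around a TSS.

    Assumption: "upstream" follows the gene's strand, so the window
    is mirrored for genes on the reverse strand.
    """
    if strand == "+":
        start, end = tss - UPSTREAM_BP, tss + DOWNSTREAM_BP
    elif strand == "-":
        start, end = tss - DOWNSTREAM_BP, tss + UPSTREAM_BP
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return max(1, start), end  # clip at the chromosome start


def flank_window(snp_pos: int) -> Tuple[int, int]:
    """Return the interval covering the SNP plus 25 bp on each side."""
    return max(1, snp_pos - FLANK_BP), snp_pos + FLANK_BP


def classify_effect(ref: Optional[Scores], alt: Optional[Scores]) -> Optional[str]:
    """Classify the effect of an SNP on one TF.

    `ref` and `alt` hold the scores of the site predicted for the
    reference (0) and alternate (1) allele, or None if no site was
    predicted for that allele.
    """
    if ref is None and alt is None:
        return None  # no site for either allele, nothing to classify
    if ref is None:
        return "Gain of TFBS"
    if alt is None:
        return "Loss of TFBS"
    if ref == alt:  # assumption: exact equality of both reported scores
        return "No Change"
    return "Score-Change"
```

Under these assumptions, a forward-strand gene with its TSS at position 10,000 gets the promoter interval 2,500 to 12,500, and an SNP at position 100 gets the flanking interval 75 to 125 (51 bp including the SNP itself). A call such as classify_effect((1.0, 0.92), None) yields "Loss of TFBS", because a site is predicted for the reference allele only.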
Data sources
agReg-SNPdb was constructed using the following genome assembly versions downloaded from Ensembl (release 103):