Two co-inherited SNPs of the telomerase reverse transcriptase (TERT) gene are associated with Iraqi patients with lung cancer

Background: The telomerase reverse transcriptase (TERT) gene is an essential polymorphic locus linked to most malignant tumors. This study assessed the association between the TERT gene and non-small cell lung carcinoma (NSCLC) in Iraq. Methods: Genomic DNA was extracted from a total of 200 blood samples. Four specific PCR primer sets were designed to amplify fragments containing four high-frequency SNPs within the TERT gene: rs2735940, rs2736098, rs2736100, and rs10069690. Single-strand conformation polymorphism (SSCP) analysis followed by sequencing was used for genotyping and validating the amplified fragments. [...] collaborated with allele G of rs2735940 to generate the TG haplotype in patients. According to our findings, both the TERT-rs2735940: A/G and TERT-rs2736098: C/T SNPs were significantly associated with an elevated risk of NSCLC. Both SNPs showed the highest values of co-inheritance in patients. This co-inheritance is mainly represented by alleles rs2735940: A and rs2736098: C. The pathogenic T and G alleles generated a TG haplotype that was found only in patients' samples. Conclusion: This study suggests employing the haplotype TG as a promising biomarker for the early diagnosis of NSCLC. These findings need further validation by large-scale investigation with a larger sample size in the study population.


Introduction
Lung cancer is one of the most severe public health challenges and a major contributor to cancer-related morbidity and mortality worldwide. Lung cancer cases are being reported more frequently in the Middle East region (1). Recent years have shown a steady increase in the clinical risk factors for lung cancer and its associated consequences (2). Traditional methods of diagnosing lung cancer have been shown to be less effective, as reflected in the higher mortality rates reported when malignant lung tumors are detected in their advanced stages (3). Non-small cell lung cancer (NSCLC) is one of the most critical types of respiratory tract tumours, making up 80% of lung cancer cases worldwide. Unfortunately, it has been reported that surgical treatment may not be possible if NSCLC has progressed to stages III and IV (4). As a result, NSCLC is substantially more challenging to detect and cure early than other types of cancer (5). Consequently, people with lung cancer, who are typically diagnosed in advanced stages, have a poor prognosis (6). Given the sharp decline in lung cancer survival rates from early to advanced stages, more focus should be given to early diagnosis (7). It is therefore critical to detect this type of cancer early by integrating molecular research with traditional clinical evaluations (8,9).
Recently, the onset and progression of lung cancer have been linked to varying extents to a wide range of genetic variants (10,11). The telomerase reverse transcriptase (TERT) gene is one of the essential polymorphic loci linked to various human cancers (12). The TERT gene has 16 exons and is located on chromosome 5 (5p15.33). It encodes the catalytic component of telomerase, a ribonucleoprotein that plays a pivotal role in cancer initiation via telomere-dependent or telomere-independent pathways (13). The mechanisms for maintaining telomeres entail intricate cellular adjustments brought on by TERT, such as TERT structural variants (14), TERT gene amplifications (15), TERT epigenetic changes (16), alternative lengthening of telomeres (17), and TERT promoter mutations (18). Any alteration of these mechanisms is possibly associated with numerous types of carcinogenesis at a single-cell level (14). Telomere length is influenced by TERT-based telomerase activity, which can also act as an informative biomarker in the early diagnosis and prognosis of many malignancies (19). TERT aids telomerase in replenishing telomere sequences by extending the extreme ends of telomere strands to enable other polymerases to synthesize the complementary strand (20). It has been established that elevated TERT expression leads to the restoration of telomerase activity (21), implying that the transcriptional control of TERT plays a remarkable role in carcinogenesis (22). Due to its significance in numerous critical cellular processes, TERT polymorphism may be connected to the early onset and progression of malignancies.
The TERT gene has been linked to a diverse spectrum of malignancies, such as pancreatic cancer (23), prostate cancer (24), gastric cancer (25), thyroid cancer (26), esophageal cancer (27), bladder cancer (28), and melanoma (29). However, conflicting data have been reported on the possible association of the TERT gene with lung cancers (4), owing to the poor association found with squamous cell carcinoma and NSCLC (30,31). Later, however, several studies showed a strong connection between TERT variants and the risk of NSCLC (32-34). In Iraq, the prevalence of NSCLC has steadily increased. This is due to the exposure of Iraqi people to several disastrous wars and the deterioration of health care infrastructure (35). However, the significance of several TERT SNPs in the development of NSCLC has not been identified in this population. Taking these data into consideration, this study aimed to investigate the possible association of four high-frequency TERT SNPs (rs2735940, rs2736098, rs2736100, and rs10069690) with an increased risk of NSCLC in cancer subjects in Iraq.

Controls and Subjects
The experiments performed in this study were conducted following the Declaration of Helsinki for experiments involving people, and the biochemical research involving human subjects was approved by the Institutional Review Board (IRB) of the University of Kufa (IECIH/UOK 088/2020; CAAE 08802212). Signed written informed consent was obtained from all participants before they were involved in the study. In this case-control study, a total of 200 individuals were included: 100 patients with NSCLC and 100 controls. The details of the collected samples were described in (36). Briefly, the patients' samples were collected from the Middle-Euphrates Cancer Center (located in Najaf governorate) and Merjan Cancer Center (located in Babil governorate). The involved patients ranged in age from 24 to 80 (average age: 52.2). Each had previously received a diagnosis of NSCLC from personnel trained to diagnose malignancies at the facilities mentioned above. In this investigation, 100 healthy volunteers with no prior history of lung cancer and ages ranging from 18 to 78 (mean age: 44.4) made up the control population. All screened samples belonged to citizens of Iraq, and they were all examined from February to December 2021.

Genomic DNA extraction
Genomic DNA was extracted from peripheral blood samples using a Blood/Cell DNA Mini Kit (Cat. No. GB100, Geneaid Co., Taipei, Taiwan). A Nanodrop spectrophotometric technique (Biodrop Co., UK) was used to confirm the purity of the extracted genomic DNA. The integrity of the isolated genomic DNA was assessed by agarose gel electrophoresis following the recommended standard instructions (37).

PCR
Four sets of specific PCR primers were designed using the NCBI Primer-BLAST tool to amplify four distinct PCR fragments within the TERT gene (38). In the primer design, four high-frequency SNPs (rs2735940, rs2736098, rs2736100, and rs10069690) were targeted within amplicons of 220 bp, 223 bp, 201 bp, and 206 bp, respectively (Figure 1a). With these lengths, the amplicons met the features required to provide the best resolution in PCR-SSCP methods (39). The details of the PCR oligonucleotide sequences are displayed in Table I. The PCR experiments were conducted using a lyophilized AccuPower PCR PreMix (Cat# K-2012, Bioneer Co., Korea), with a final volume of 20 µL for each amplified fragment. After PCR, electrophoresis on agarose gels confirmed that the PCR products were of the expected sizes.

Genotyping analysis
The genotyping experiments were conducted using PCR-single strand conformation polymorphism (SSCP). This sensitive and inexpensive genotyping approach can discriminate between PCR products that differ in only one nucleotide using denaturation, chilling, and electrophoresis on polyacrylamide gels (40). Briefly, PCR amplicons were denatured at 94 °C for 7 to 8 min and then immediately chilled on ice for at least 10 min. The PCR products were then loaded onto 8% polyacrylamide gels and electrophoresed until the samples reached the bottom of the gels. Gels were stained with silver nitrate following the recommended procedure (41). Each detected SSCP banding pattern was subjected to Sanger dideoxy sequencing (Macrogen Inc., South Korea) to confirm the obtained electrophoretic genotypes following the recommended instructions. The DNA chromatogram of each genotype was visualized using SnapGene Viewer ver. 4.0.4 (Insightful Science, Canada). The alignment of the observed variations with the reference DNA sequences was conducted using the BioEdit suite, version 7.1 (DNASTAR; Madison, USA).

Statistical and functional analysis
Using the MedCalc online server, the odds ratios (ORs) and associated 95% confidence intervals (CIs) were computed to examine the genotype variations for the targeted SNPs between the patients and controls (42). Gen-Calculator software (www.genecalc.pl) was used to evaluate the likelihood that the cases and the controls would deviate from the Hardy-Weinberg equilibrium (HWE). The linkage disequilibrium (LD) plots of the identified SNPs were constructed with Haploview software ver. 4.2 to assess the likely prevalence of haplotype co-inheritance in both controls and patients (43). The p-values below 0.001 were regarded as significant. After retrieving the positions of the investigated SNPs from the reference genomic sequence, the possible connections between their haplotypes were assessed.
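As an illustration of the odds ratio calculation described above, the following minimal Python sketch computes an OR, its 95% CI, and a two-sided P-value from a 2x2 table using the standard log-odds (Woolf) approximation. It is assumed, not confirmed by the text, that the online calculator used in the study follows this approach. The function name `odds_ratio_ci` and the genotype counts in the example are hypothetical and are not data from this study.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed,
                  z_crit=1.959964):
    """Odds ratio, 95% CI and two-sided P-value from a 2x2 table.

    Uses the log-odds (Woolf) normal approximation. "Exposed" means carrying
    the genotype or allele of interest; "unexposed" is the reference category.
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci_low = math.exp(log_or - z_crit * se_log_or)
    ci_high = math.exp(log_or + z_crit * se_log_or)
    # Two-sided P-value from the standard normal distribution
    z = log_or / se_log_or
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, ci_low, ci_high, p_value


# Hypothetical counts (NOT taken from the study): 20 of 100 cases and
# 10 of 100 controls carry the genotype of interest.
or_, low, high, p = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.4f}, 95% CI {low:.4f}-{high:.4f}, P = {p:.4f}")
```

The same function applies to allele-level comparisons if the four cells hold allele counts rather than genotype counts.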

Genotyping analysis
After reviewing the SNP database for the TERT gene according to their frequency in the dbSNP web server, the four SNPs with the highest frequency were selected. These four SNPs, rs2735940, rs2736098, rs2736100, and rs10069690, were screened at four distinct positions of the TERT gene. Three different banding patterns were found for each selected SNP, demonstrating that rs2735940, rs2736098, rs2736100, and rs10069690 each have three genotypes (Figure 1b). The sequencing experiments verified the three expected genotypes for all investigated high-frequency SNPs. The electropherograms of the four investigated SNPs demonstrated these three genotypes (rs2735940: AA, AG, GG; rs2736098: CC, CT, TT; rs2736100: CC, CA, AA; and rs10069690: CC, CT, TT) (Figure 1c).
To determine whether the study population was in HWE, the genotype distributions of the identified polymorphic SNPs were assessed. According to the chi-square values with Yates' correction, all four polymorphic SNPs were confirmed to be compatible with HWE in both control and NSCLC groups at the significance level of 0.05 (Table II).
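The HWE check described above can be sketched as follows. This is a minimal illustration of a one-degree-of-freedom chi-square test with Yates' continuity correction, assuming a biallelic SNP; the function name `hwe_chi_square` and the genotype counts are hypothetical, and the implementation is not claimed to reproduce the software used in the study.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt, yates=True):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Returns (chi2, p_value). With yates=True, 0.5 is subtracted from each
    |observed - expected| difference (clamped at zero) before squaring.
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1 - p
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = 0.0
    for obs, exp in zip(observed, expected):
        if exp == 0:
            continue  # monomorphic locus: nothing to test
        diff = abs(obs - exp)
        if yates:
            diff = max(0.0, diff - 0.5)
        chi2 += diff * diff / exp
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (NOT from the study) for 100 individuals.
chi2, p_value = hwe_chi_square(60, 32, 8)
print(f"chi2 = {chi2:.4f}, P = {p_value:.4f}")
```

A P-value above 0.05 would be read as compatibility with HWE at the significance level used in this section.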

Association analysis
The allele and genotype distributions of the four investigated SNPs were compared between controls and patients to assess a possible association between the TERT gene and NSCLC (Table III).
For the rs2735940 (G/A) SNP, the heterozygous AG genotype was significantly more frequent in patients than in controls. Thus, individuals with the heterozygous AG genotype showed a significantly higher risk of developing NSCLC (P-value 0.0299; OR 2.3158; 95% CI 1.0853-4.9414). The data also indicated that individuals with allele G had a considerably higher risk of developing NSCLC (P-value 0.0097; OR 2.2195; 95% CI 1.2134-4.0599). For the rs2736098 (C/T) SNP, noticeable differences were identified in the distribution of the heterozygous CT genotype between the patients and controls. Individuals with the CT genotype exhibited a significantly higher risk of developing NSCLC (P-value 0.0363; OR 2.1583; 95% CI 1.0503-4.4351). The data also showed that individuals with allele T exhibited a significantly higher risk of developing NSCLC (P-value 0.0072; OR 2.1507; 95% CI 1.2303-3.7598). Regarding the other two SNPs under investigation (rs2736100 and rs10069690), no significant associations were identified in the distribution of their alleles and genotypes between the patients and controls. For rs2736100 (C/A), no genotype or allele showed an obvious preference for patients or controls, as no significant association was detected in the distribution between the two groups (P-value 0.8790 and 0.3861 for CA and AA genotypes, respectively). The same observation applied to the distribution of rs10069690 (C/T). Although the heterozygous CT genotype showed a higher frequency in patients (26%) than in controls (21%), the difference between the patients and controls was not statistically significant for this SNP (P-value 0.4051).
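The reported P-values can be cross-checked against the reported ORs and confidence intervals. The sketch below assumes that the intervals were computed on the log scale with a normal (Wald) approximation, which the text does not state explicitly; the function name `p_from_or_ci` is hypothetical. Under that assumption, the standard error of ln(OR) is recovered from the width of the interval and converted into a z statistic and a two-sided P-value.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z and the two-sided P-value from an OR and its 95% CI.

    Assumes a Wald interval on the log scale:
    CI = exp(ln(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


# Values as reported in the association analysis of this section.
reported = {
    "rs2735940 AG genotype": (2.3158, 1.0853, 4.9414),
    "rs2735940 allele G": (2.2195, 1.2134, 4.0599),
    "rs2736098 CT genotype": (2.1583, 1.0503, 4.4351),
}
for label, (or_, low, high) in reported.items():
    se, z, p = p_from_or_ci(or_, low, high)
    print(f"{label}: SE = {se:.4f}, z = {z:.3f}, P = {p:.4f}")
```

For these three comparisons, the recovered P-values agree with the reported values of 0.0299, 0.0097, and 0.0363 to about three decimal places, which is consistent with the assumed method.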

LD analysis
The observed haplotypes were evaluated, and linkage disequilibrium values were determined between the four polymorphic loci in the current population to assess their co-inheritance potential. In the controls, the observed D' (1.0) and LOD (0.96) values showed the presence of partial linkage between two polymorphic SNPs, namely rs2735940 and rs2736098, whereas the values for the other two SNPs showed no evidence of co-inheritance (Figure 2a). In the patients, LD plot analysis showed that rs2735940 and rs2736098 were highly co-inherited (D' 0.96, LOD 22.6) (Table IV).
The high LD value in patients indicated strong genetic collaboration between rs2735940 and rs2736098 in this group. The haplotype analysis further details the LD plot among the investigated polymorphic loci. Based on their positions within the genomic sequence of the TERT gene, the four SNPs were separated into two adjacent blocks. Block-1, spanning 6 kb, was formed by rs2736100 and rs10069690, whereas block-2, spanning 2 kb, was formed by the rs2735940 and rs2736098 SNPs. In controls, the most frequent haplotype combinations between the two blocks were CC: CA and CA: CA, with a multi-allelic D of 0.19. In patients, the most frequent haplotype combinations between the two blocks were CC: CA, CA: CA, and TC: CA, with a multi-allelic D of 0.15. The LD plot showed that the CA haplotype had the highest frequency, 0.80 in controls and 0.79 in patients (Figure 2b). This haplotype was generated by the collaboration of the rs2735940 (A/G) and rs2736098 (C/T) SNPs in block-2 in both populations. Interestingly, the combination of the pathogenic T allele of the rs2736098 SNP with the pathogenic G allele of the rs2735940 SNP created the TG haplotype. This haplotype was patient-specific and was not found in the control population.
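To make the D' statistic used above concrete, the following minimal sketch computes D, D', and r2 for two biallelic loci from the four haplotype frequencies. The function name `ld_stats` is hypothetical, and apart from the CA frequency of 0.80 reported for controls, the haplotype frequencies in the example are illustrative assumptions rather than study data. The sketch is not a substitute for the Haploview analysis, which additionally estimates haplotype phase and LOD scores.

```python
def ld_stats(f11, f12, f21, f22):
    """Pairwise LD between two biallelic loci from haplotype frequencies.

    f11: allele 1 at locus 1 with allele 1 at locus 2
    f12: allele 1 at locus 1 with allele 2 at locus 2
    f21: allele 2 at locus 1 with allele 1 at locus 2
    f22: allele 2 at locus 1 with allele 2 at locus 2
    Returns (D, D', r2).
    """
    total = f11 + f12 + f21 + f22
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))
    p1 = f11 + f12  # frequency of allele 1 at locus 1
    q1 = f11 + f21  # frequency of allele 1 at locus 2
    d = f11 - p1 * q1
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Locus 1 = rs2736098 (C/T), locus 2 = rs2735940 (A/G).
# CA = 0.80 follows the control frequency reported above; CG, TA and the
# absent TG haplotype (0.0) are illustrative assumptions.
d, d_prime, r2 = ld_stats(0.80, 0.12, 0.08, 0.0)
print(f"D = {d:.4f}, D' = {d_prime:.2f}, r2 = {r2:.4f}")
```

One property worth noting is that D' reaches 1 whenever one of the four possible haplotypes is absent, regardless of how weak the correlation (r2) is. Under the assumptions above, this is in line with a D' of 1.0 in controls, where the TG haplotype was not observed.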

Discussion
One of the main reasons for the high mortality rates associated with lung cancer in many parts of the world is its inadequate early detection (44). Thus, it is critical to develop useful genetic biomarkers for the early diagnosis of lung cancer (45). The TERT gene is one of these intriguing factors and has been shown to have a remarkable connection with the initiation and development of many cancers. Many TERT biomarkers have been discovered to be risk factors for NSCLC. However, in various populations affected by the steady increase in these cases, it remains unclear whether particular alleles of the TERT gene pose a risk for NSCLC. This study described the connection between TERT polymorphism and NSCLC in the Iraqi population by genotyping four high-frequency SNPs positioned in four regions within the TERT gene.
Of the four investigated SNPs, rs2736100 and rs10069690 did not show any significant association with NSCLC. This finding implies no evidence of a connection between these variants and lung cancer in the investigated population. However, our results do not align with other research that indicated a significant association of the rs2736100 and rs10069690 SNPs with lung cancer in Chinese and Korean populations (34, 46-51). This may be due to the small number of investigated samples, which represents the main limitation of this case-control study. However, genotyping 100 patients with NSCLC was demanding owing to many logistic issues related to the strict identification of this type of cancer, and excluding other types of lung cancer reduced the sample size.
In contrast to the rs2736100 and rs10069690 SNPs, this study found that both the rs2735940 and rs2736098 SNPs showed significant associations with the development of NSCLC in the investigated Iraqi subjects. This association may indicate potential roles of these SNPs in the progression of NSCLC. In agreement with our results, many reports have shown a significant association between the rs2735940 polymorphism and the development of lung cancer in several populations, including Chinese (52), Koreans (53), Icelanders (54), and African-Americans (55). In contrast, this SNP has also been reported to be associated with a decreased risk of lung cancer in American women (56). However, the inclusion of only one gender in a case-control investigation may change the genotype-phenotype association. In addition to lung cancer, the rs2735940 SNP has been associated with multiple malignancies, such as renal cell carcinoma (57), head and neck cancer (31), and gastric cancer (25). Furthermore, the rs2735940 SNP shows variable distributions across populations, which reflects their genetic diversity. According to dbSNP, allele G of rs2735940 is deposited with high frequencies in Latin Americans (G=0.54 to 0.63), East Asians (G=0.53), Asians (G=0.52), Africans (G=0.49), and Europeans (G=0.48).
Although the rs2736098 SNP is less known than the rs2735940 SNP, it has been associated with the risk of lung cancer (32,53). Moreover, it is linked with increased risks of other cancers (58), such as nasopharyngeal carcinoma (59), hepatocellular carcinoma (60), cervical carcinoma (61), and breast cancer (62). Allele T of rs2736098 is deposited in dbSNP with lower frequencies than allele G of rs2735940: 0.45, 0.41, 0.38, 0.27, 0.19 to 0.22, and 0.11 in South Asians, Asians, East Asians, Europeans, Latin Americans, and Africans, respectively. Notably, the A allele was deposited in the Qatari population at the same frequency as the G allele of the rs2735940 SNP (0.26). This point may support our finding of strong collaboration between both polymorphic loci in the studied area of the Middle East.
Both the rs2736098 and rs2735940 SNPs have been deposited in the ClinVar database because of their significance in the development of many different forms of tumors (63). Because of the synonymous effect of the rs2735940 SNP, this variant seems to exert its effect by modulating splicing patterns (64,65), regulating the velocity of mRNA translation (66), or influencing protein kinetics (67). In contrast, the effect of the 5′-UTR rs2736098 SNP depends on its position upstream of the open reading frame. It has been reported that the effects of 5′-UTR SNPs mainly involve altering mRNA translation and decay (68). Whatever the mechanism through which each SNP impacts TERT, the rs2736098 and rs2735940 SNPs showed a high degree of collaboration in the progression of NSCLC. This can be seen in the heterozygous genotypes rs2736098: CT and rs2735940: AG, as both showed a remarkable tendency to be co-inherited in the study population. Further analysis of haplotype distributions revealed that both SNPs are mainly co-inherited as the CA haplotype. In the patient population, the CA haplotype took part in more haplotype combinations than in the control group. This haplotype was generated by the contribution of the C and A alleles from the rs2736098 and rs2735940 SNPs, respectively. Most importantly, the pathogenic T allele of rs2736098 and the pathogenic G allele of rs2735940 collaborated to generate the TG haplotype, a haplotype distinct to patients. Given the absence of the TG haplotype in controls, it is important to determine the strength of its association with NSCLC in the studied population. This sort of cooperation may be attributed to the close positioning of both SNPs in the TERT gene sequence. To the best of our knowledge, no previous report has described this sort of co-inheritance between the rs2736098 and rs2735940 SNPs in the TERT gene. However, more pathological connections between the TERT gene and NSCLC need to be clarified to get further details on the mechanisms of this co-inheritance.
In conclusion, the genotyping experiments conducted on four variants within the TERT gene indicated the presence of a significant association between the polymorphisms of the rs2735940 and rs2736098 SNPs and NSCLC in the studied Iraqi subjects. Individuals with rs2735940: AG and rs2736098: CT genotypes exhibited higher risks of NSCLC. This research demonstrated that both AG and CT genotypes collaborated to generate a high level of co-inheritance represented by the haplotype TG. This haplotype is specific for patients and may be associated with the development of NSCLC. Accordingly, the haplotype TG can serve as a promising biomarker for the early diagnosis of NSCLC. However, large-scale screening experiments are suggested to provide another layer of confirmation of the current findings.

Figure 1. A schematic diagram of TERT genotyping using the PCR-SSCP-sequencing method. A - Design of four specific PCR primer pairs for the amplification of 220 bp, 223 bp, 201 bp, and 206 bp fragments flanking rs2735940, rs2736098, rs2736100, and rs10069690, respectively. B - PCR-SSCP genotyping, in which all targeted SNPs showed three polymorphic patterns of nucleic acid variations. C - Sequencing reactions of the targeted loci as positioned in the amplified PCR fragments.

Lawi et al.: Association of TERT gene with lung cancer

Table I
The specific PCR primers designed for the amplification of the TERT gene. PCR primers were designed based on the GenBank acc. no. NC_000005.10.

Table II
Hardy-Weinberg equilibrium (HWE) for rs2735940, rs2736098, rs2736100, and rs10069690 SNPs of the TERT gene in patients and control groups. The P-value with statistical significance is in bold. CI - confidence interval.

Table III
Association of rs2735940, rs2736098, rs2736100, and rs10069690 SNPs with the risk of NSCLC.
The P-values with statistical significance are shown in bold. CI - confidence interval.

Table IV
The computed linkage values of the four investigated SNPs in the TERT gene as determined by LD plot analysis.
In contrast, allele G of rs2735940 is deposited at lower frequencies in South Asians (G=0.32). However, the population of the Middle East has the lowest frequency of allele G, with only 0.26 in the Qatari population.