are thought to cause CC, and they express the E6, E7, E1 and E2 oncogenes in the early stages and L1, L2 and E4 in the late stages. E6 mediates proteasome-dependent or -independent turnover of p53 (a tumor suppressor) and inactivates MDM2, while E7 mediates phosphorylated pRb-mediated release of E2F, which enhances proliferation (Tomaić, 2016). Epigenetic alterations can also affect the expression of HPV and host genes in different stages of CC (Lu et al., 2012) and have been noted in genes such as MGMT, RASSF1A, FHIT, CDK, APC, E-cadherin and ER1, among others (Sen et al., 2018).
Long non-coding RNAs (lncRNAs) (> 200 nts) and microRNAs (miRNAs) (< 30 nts) are two specific types of non-coding RNAs that are transcribed from intergenic and intronic regions. Their aberrant expression is often detected in different cancer types, with an inherent tissue specificity that can be exploited for novel diagnostic or prognostic purposes (Richard Boland, 2017; Xue et al., 2017). Many lncRNAs are dysregulated in gynecological cancers (Hosseini et al., 2017), and lncRNAs such as MALAT1, EBIC, HOTAIR and GAS5 modulate CC carcinogenesis (Peng et al., 2016). Specific lncRNAs have often been found to affect metastatic ability by acting as sponges for miRNAs, thereby reducing the number of miRNAs available for target mRNAs (Olgun et al., 2018). A few reported sponges in CC include ANRIL sponging miR-186 (Zhang et al., 2018), GAS5 downregulating miR-196A and miR-205 (Yang et al., 2017) and DLEU1 interacting with miR-
Abbreviations list: FIGO, International Federation of Gynecology and Obstetrics; GO, gene ontology; HPV, human papillomavirus; CC, cervical cancer
E-mail address: [email protected] (D. Karunagaran).
S. Banerjee and D. Karunagaran
Hence, this study employed an integrated approach utilizing publicly available resources to identify robust mRNA and lncRNA staging markers, with the aim of minimizing diagnostic and computational cost. The role of experimentally validated interactions of miRNAs with mRNAs and lncRNAs was explored to understand their deregulation in FIGO stage-based CC progression through pathway, oncoprint, DNA methylation status and interacting miRNA family enrichment analyses, followed by the identification of prognostic markers. Cox regression analysis (for death) of the validated interacting miRNA partners of the selected mRNAs and lncRNAs was also performed. This approach can also be used in other cancer models for future implementation in clinics to maximize staging precision and five-year survival in CC patients. Many studies have found that mining a minimal number of genes from microarray data can differentiate normal and disease conditions with high sensitivity and specificity, but implementing such biomarker panels in clinics requires prior validation by other methods (Alcaraz et al., 2017). Hence, for validation, more than two datasets were considered in this study, and the biological role of the selected mRNAs was also evaluated from the literature.
2. Materials and methods
2.1. Selection of mRNAs from microarray data, followed by their oncoprint, family enrichment of interacting miRNA partners, DNA methylation status and survival analyses
Five Gene Expression Omnibus (GEO) datasets (GSE52903, GSE29817, GSE9750, GSE27469, GSE46857) were considered for comparing stage I vs stage II as well as stage II vs stage III, whereas for the stage III vs stage IV comparison only two GEO datasets (GSE29817 and GSE52903) were available. The differentially expressed (DE) genes were selected for further analysis using GEO2R. DE mRNAs between normal and CC samples in The Cancer Genome Atlas (TCGA) data were derived from GEPIA (Tang et al., 2017) and compared with the mRNAs selected from the GEO microarray data. For all selections, mRNAs with adjusted p < 0.05 and a log2FC (fold change) cutoff of 2.0 were considered. Common DE genes in each stage (both upregulated and downregulated mRNAs) were also considered for identification of validated interacting miRNAs. Family enrichment of the interacting miRNA partners was identified using the miRNet tool (Fan et al., 2016; Fan and Xia, 2018). During miRNet analysis, the organism name (H. sapiens), ID type (Gene official symbol) and tissue origin (Cervix) were provided along with the input mRNAs, and no degree or betweenness filter was used for network visualization. Survival analysis of the selected mRNAs was performed using PROGgene V2 (Goswami and Nakshatri, 2014), with tumor samples divided into high- and low-expression groups. Statistically significant combinatorial markers for CC (p < 0.05, log-rank test) were also identified using the selected mRNAs (duVerle et al., 2013). Oncoprint analysis and visualization of heatmap for the most significant DNA methylation status of the selected mRNAs