Introduction
Recent advances in DNA sequencing technology, collectively referred to as next-generation sequencing (NGS), have revolutionized genomics and biomedical research. The impact of the new technology is particularly evident in cancer research (Reference). Various types of mutations as well as large-scale chromosomal aberrations are being reported and cataloged, and the rate of data accumulation will likely accelerate for the foreseeable future. This certainly applies to lung cancer, which is currently the second most common cancer and the leading cause of cancer-related death in the United States (Reference).
Recently, an NGS-based transcriptome analysis of multiple lung adenocarcinoma patients was reported. The main findings include the identification, in 1-2% of patients, of fusion genes containing the tyrosine kinase domain of c-RET that could promote anchorage-independent growth of NIH3T3 cells (Reference). This finding was corroborated by our NGS study as well as by other concurrent studies that examined multiple samples via a targeted sequencing strategy or a combination of immunohistochemistry and fluorescence in situ hybridization (Reference). Given that these patients do not harbor mutations in EGFR or KRAS or express ALK fusion genes, c-RET fusion genes will likely define a new subclass of lung cancer (Reference).
Identifying mutations with a high likelihood of being the ‘driver’ in this way is an exemplary application of NGS, but the bigger challenge ahead is to move beyond simple cataloging of mutations and to establish means of integrating the diverse omics data generated by NGS. Such integration will allow cancer to be understood at the level of gene networks and signaling pathways, and will reveal targets of regulation and therapy even when a single dominant driver is not readily identifiable.
Here, we report a high-throughput sequencing study of primary lung tumors and adjacent normal tissues isolated from 6 Korean never-smoker female patients with non-small cell lung carcinoma (NSCLC). It is the first multi-dimensional study of NSCLC that covers exome-seq, RNA-seq, small RNA-seq, and MeDIP-seq. The NGS data are complemented by microarray-based gene expression profiling and an array-CGH study of DNA copy number variations (CNVs). Our study thus represents simultaneous probing of the genome, transcriptome, and epigenome of biological samples, revealing mutations, gene expression, and regulation, respectively. More importantly, we describe integrative analyses that combine different types of omics data, which facilitated identifying key regulators and elucidating the details of relevant cellular processes at the systems level. The results show that gene network modules highly relevant to cancer development, including the module governing the G2/M DNA damage checkpoint, are consistently perturbed in these patients. We also report that multiple microRNAs show consistent anti-correlation with their predicted and validated target genes within these gene network modules, indicating that microRNAs likely represent key regulators of NSCLC development.
Collaborations: Korean Lung Cancer Collaborative Research Project