
Genome Annotation

Though the sequencing of the human genome is complete, not all genes have been identified. Additionally, the regulatory sequences that dictate when and how each gene is transcribed into mRNA remain largely unknown.

Introduction

The human genome is composed of functional DNA sequences that encode macromolecules (genes), regulate transcription, or enable the division and sorting of chromosomes. It also contains so-called ‘junk’ DNA sequences that may or may not contain functional elements. Identifying functional elements such as promoters, regulatory elements, introns, coding exons, and non-coding exons is extremely difficult because they vary from gene to gene and can be spread over many thousands of nucleotides. Several LICR teams are using molecular biology and bioinformatics techniques to identify regulatory elements and genes, and thus to annotate the genome.

Whole Genome Promoter Mapping

The mapping of regulatory sequences, promoters in particular, is vital for identifying genes not discovered by the international human genome sequencing project. Because genes are defined by their ability to produce a functional product, promoters are the critical elements that distinguish genes from ‘junk DNA’. The same approach can also be used to identify the genes controlled by a newly discovered regulatory protein, which helps to define that protein’s cellular function.

LICR scientists from the San Diego Branch, working with industry collaborators, used DNA microarrays representing the entire genome to identify 10,567 active promoters in human fibroblast cells, close to half of which were previously unknown(1,2). This study marked the first step in decoding regulatory networks in the human genome. The team plans to use the same strategy to identify active regulatory elements in other human cells and tissues, as part of the international ENCODE consortium, which aims to determine the functional elements in the human genome.

Gene Identification

LICR investigators at the Lausanne and São Paulo Branches used bioinformatics to identify genes from expressed sequence tag (EST) databases such as that of the Cancer Genome Anatomy Project (CGAP), which was compiled with data from the US National Cancer Institute (NCI) and LICR’s Human Cancer Genome Project(3). The investigators assembled EST sequences in silico (on the computer) to predict complete mRNA transcripts, which were then compared to the human genome sequence(4,5).
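
As a rough illustration of what assembling ESTs in silico involves, the Python sketch below merges overlapping fragments into a longer predicted transcript using a simple greedy overlap rule. The sequences, the function names (overlap, greedy_assemble), and the min_len threshold are all hypothetical and chosen for illustration; this is not the method used by the LICR investigators, and real pipelines must also handle sequencing errors, reverse-complement reads, and spliced alignment to the genome.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(ests, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Toy greedy assembler: assumes error-free fragments on the same strand.
    """
    contigs = list(ests)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Made-up EST fragments that tile one short transcript.
ests = ["ATGGCCTTC", "CCTTCGAAGG", "GAAGGATAA"]
predicted_transcripts = greedy_assemble(ests)
print(predicted_transcripts)
```

In this toy case the three fragments collapse into a single predicted transcript. Comparing such a transcript with the genome sequence is a separate step, because introns interrupt the exons in genomic DNA.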

Differential Splicing in Cancer

A gene’s DNA sequence is transcribed into a primary RNA transcript, which is then processed to excise the ‘introns’ (RNA sequences that will not be translated into protein) and join the ‘exons’ (RNA sequences that will be translated). This RNA ‘splicing’ allows several different products to be generated from one gene sequence, because the gene’s exons can be omitted or joined in different ways. Investigators from the LICR São Paulo Branch have found that splice variants are expressed differentially in glioblastomas (brain tumors)(6) and colorectal cancers(7). Splice variants may therefore be involved in tumor onset and progression, be useful as cancer markers, or both(8).
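
The way one gene sequence can yield several products may be easier to see in a small Python sketch. It models a primary transcript as a list of exon and intron segments, removes the introns, and optionally skips exons to produce alternative transcripts. The sequences and the names (PRIMARY_TRANSCRIPT, splice, exon_skipping_variants) are invented for illustration, and exon skipping is only one of several splicing patterns that occur in cells.

```python
from itertools import combinations

# Toy primary transcript: exons separated by introns (made-up sequences).
PRIMARY_TRANSCRIPT = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUCCAG"),
    ("exon", "UUCGAA"),
    ("intron", "GUGAGUUUAG"),
    ("exon", "GGAUAA"),
]


def splice(segments, skipped_exons=()):
    """Join exons in order, dropping all introns and any skipped exons.

    skipped_exons holds exon indices, counting exons only (0-based).
    """
    mature = []
    exon_index = 0
    for kind, seq in segments:
        if kind == "exon":
            if exon_index not in skipped_exons:
                mature.append(seq)
            exon_index += 1
    return "".join(mature)


def exon_skipping_variants(segments, optional_exons):
    """Enumerate transcripts made by skipping any subset of optional exons."""
    variants = set()
    for k in range(len(optional_exons) + 1):
        for skipped in combinations(optional_exons, k):
            variants.add(splice(segments, skipped))
    return sorted(variants)


full_length = splice(PRIMARY_TRANSCRIPT)
variants = exon_skipping_variants(PRIMARY_TRANSCRIPT, optional_exons=[1])
print(full_length)
print(variants)
```

Here the middle exon is treated as optional, so the same gene yields two transcripts: one containing all three exons and one in which the first and last exons are joined directly.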

References

  1. Kim T.H., Barrera L.O., Qu C., Van Calcar S., Trinklein N.D., Cooper S.J., Luna R.M., Glass C.K., Rosenfeld M.G., Myers R.M., and Ren B. Direct isolation and identification of promoters in the human genome. Genome Res. (2005) 15(6):830-839.
  2. Kim T.H., Barrera L.O., Zheng M., Qu C., Singer M.A., Richmond T.A., Wu Y., Green R.D., and Ren B. A high-resolution map of active promoters in the human genome. Nature (2005) 436(7052):876-880.
  3. Strausberg R.L., Camargo A.A., Riggins G.J., Schaefer C.F., De Souza S.J., Grouse L.H., Lal A., Buetow K.H., Boon K., Greenhut S.F., and Simpson A.J. An international database and integrated analysis tools for the study of cancer gene expression. Pharmacogenomics.J (2002) 2(3):156-164.
  4. De Souza S.J., Camargo A.A., Briones M.R., Costa F.F., Nagai M.A., Verjovski-Almeida S., Zago M.A., Andrade L.E., Carrer H., El Dorry H.F., Espreafico E.M., Habr-Gama A., Giannella-Neto D., Goldman G.H., Gruber A., Hackel C., Kimura E.T., Maciel R.M., Marie S.K., Martins E.A., Nobrega M.P., Paco-Larson M.L., Pardini M.I., Pereira G.G., Pesquero J.B., Rodrigues V., Rogatto S.R., da Silva I.D., Sogayar M.C., de Fatima S.M., Tajara E.H., Valentini S.R., Acencio M., Alberto F.L., Amaral M.E., Aneas I., Bengtson M.H., Carraro D.M., Carvalho A.F., Carvalho L.H., Cerutti J.M., Correa M.L., Costa M.C., Curcio C., Gushiken T., Ho P.L., Kimura E., Leite L.C., Maia G., Majumder P., Marins M., Matsukuma A., Melo A.S., Mestriner C.A., Miracca E.C., Miranda D.C., Nascimento A.N., Nobrega F.G., Ojopi E.P., Pandolfi J.R., Pessoa L.G., Rahal P., Rainho C.A., da Ros N., de Sa R.G., Sales M.M., da Silva N.P., Silva T.C., da S.W., Jr., Simao D.F., Sousa J.F., Stecconi D., Tsukumo F., Valente V., Zalcbeg H., Brentani R.R., Reis F.L., Dias-Neto E., and Simpson A.J. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags. Proc.Natl.Acad.Sci.U.S.A (2000) 97(23):12690-12693.
  5. Imanishi T., Itoh T., Suzuki Y., O'Donovan C., Fukuchi S., Koyanagi K.O., Barrero R.A., Tamura T., Yamaguchi-Kabata Y., Tanino M., Yura K., Miyazaki S., Ikeo K., Homma K., Kasprzyk A., Nishikawa T., Hirakawa M., Thierry-Mieg J., Thierry-Mieg D., Ashurst J., Jia L., Nakao M., Thomas M.A., Mulder N., Karavidopoulou Y., Jin L., Kim S., Yasuda T., Lenhard B., Eveno E., Suzuki Y., Yamasaki C., Takeda J., Gough C., Hilton P., Fujii Y., Sakai H., Tanaka S., Amid C., Bellgard M., Bonaldo M.F., Bono H., Bromberg S.K., Brookes A.J., Bruford E., Carninci P., Chelala C., Couillault C., De Souza S.J., Debily M.A., Devignes M.D., Dubchak I., Endo T., Estreicher A., Eyras E., Fukami-Kobayashi K., Gopinath G.R., Graudens E., Hahn Y., Han M., Han Z.G., Hanada K., Hanaoka H., Harada E., Hashimoto K., Hinz U., Hirai M., Hishiki T., Hopkinson I., Imbeaud S., Inoko H., Kanapin A., Kaneko Y., Kasukawa T., Kelso J., Kersey P., Kikuno R., Kimura K., Korn B., Kuryshev V., Makalowska I., Makino T., Mano S., Mariage-Samson R., Mashima J., Matsuda H., Mewes H.W., Minoshima S., Nagai K., Nagasaki H., Nagata N., Nigam R., Ogasawara O., Ohara O., Ohtsubo M., Okada N., Okido T., Oota S., Ota M., Ota T., Otsuki T., Piatier-Tonneau D., Poustka A., Ren S.X., Saitou N., Sakai K., Sakamoto S., Sakate R., Schupp I., Servant F., Sherry S., Shiba R., Shimizu N., Shimoyama M., Simpson A.J., Soares B., Steward C., Suwa M., Suzuki M., Takahashi A., Tamiya G., Tanaka H., Taylor T., Terwilliger J.D., Unneberg P., Veeramachaneni V., Watanabe S., Wilming L., Yasuda N., Yoo H.S., Stodolsky M., Makalowski W., Go M., Nakai K., Takagi T., Kanehisa M., Sakaki Y., Quackenbush J., Okazaki Y., Hayashizaki Y., Hide W., Chakraborty R., Nishikawa K., Sugawara H., Tateno Y., Chen Z., Oishi M., Tonellato P., Apweiler R., Okubo K., Wagner L., Wiemann S., Strausberg R.L., Isogai T., Auffray C., Nomura N., Gojobori T., and Sugano S. Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS.Biol. (2004) 2(6):e162.
  6. Correa R.G., Sasahara R.M., Bengtson M.H., Katayama M.L., Salim A.C., Brentani M.M., Sogayar M.C., De Souza S.J., and Simpson A.J. Human semaphorin 6B [(HSA)SEMA6B], a novel human class 6 semaphorin gene: alternative splicing and all-trans-retinoic acid-dependent downregulation in glioblastoma cell lines. Genomics (2001) 73(3):343-348.
  7. Correa R.G., de Carvalho A.F., Pinheiro N.A., Simpson A.J., and De Souza S.J. NABC1 (BCAS1): alternative splicing and downregulation in colorectal tumors. Genomics (2000) 65(3):299-302.
  8. Caballero O.L., De Souza S.J., Brentani R.R., and Simpson A.J. Alternative spliced transcripts as cancer markers. Dis.Markers (2001) 17(2):67-75.
