Annual Report 2020
Division of Genome Analysis Platform Development
Yuichi Shiraishi, Naoko Iida, Ai Okada, Kenichi Chiba, Raúl Nicolás Mateos
Introduction
We have been constructing analysis workflows for efficient analysis of cancer genome and transcriptome sequencing data. As a platform covering the steps required for primary analysis of sequencing data, we have built a workflow that detects mutations such as point mutations, deletions, insertions, and translocations from genome sequencing, and transcriptional abnormalities such as fusion genes, changes in expression levels, and splicing abnormalities from transcriptome sequencing. In addition, we conduct integrated analysis of genomic and transcriptomic data. As high-throughput sequencing technology, methodologies, and software continue to advance worldwide, we keep improving the platform, in particular its speed and refinement.
Research activities
(1) Development of mutation detection analysis for hereditary tumors and AYA (adolescent and young adult) generation cancers
Although the development of high-throughput sequencing technology has enabled the comprehensive detection of mutations, detecting mutations with high sensitivity and accuracy while eliminating sequencing errors and artifacts remains an important and challenging task. We have developed practical filtering tools to detect mutations in hereditary tumors and AYA generation cancers.
1. We constructed a database of genomic breakpoints arising from structural variants using WGS data from the 1000 Genomes Project. We then developed a tool that detects structural variants specific to the target samples by filtering out the structural variants present in the database.
2. We developed a tool that produces mutation information (a VCF file) for efficient detection of disease-related point mutations and deletion/insertion mutations. GenomicsDBImport and GenotypeGVCFs were run on the output of HaplotypeCaller in the GATK software, multiple annotations were added, and the results were filtered appropriately. This tool was run on WGS data of hereditary tumors and AYA generation cancers, and candidate mutations associated with the disease have been detected.
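The database-based filtering in the first tool can be illustrated with a minimal Python sketch. It is not the division's implementation: the record layout (`StructuralVariant`), the function names, and the 10 bp matching margin are all assumptions made for illustration. A candidate is discarded when a database entry of the same type has both breakpoints within the margin.

```python
from collections import defaultdict
from typing import NamedTuple


class StructuralVariant(NamedTuple):
    """Hypothetical record: two breakpoints plus a type label."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    sv_type: str


def build_index(database):
    """Group database entries by (chrom1, chrom2, type) for faster lookup."""
    index = defaultdict(list)
    for sv in database:
        index[(sv.chrom1, sv.chrom2, sv.sv_type)].append((sv.pos1, sv.pos2))
    return index


def is_in_database(sv, index, margin):
    """True if some database entry has both breakpoints within `margin` bp."""
    for pos1, pos2 in index.get((sv.chrom1, sv.chrom2, sv.sv_type), []):
        if abs(sv.pos1 - pos1) <= margin and abs(sv.pos2 - pos2) <= margin:
            return True
    return False


def filter_sample_specific(candidates, database, margin=10):
    """Keep only candidates that have no match in the breakpoint database."""
    index = build_index(database)
    return [sv for sv in candidates if not is_in_database(sv, index, margin)]


# Toy data (invented for this example, not real calls).
database = [StructuralVariant("chr1", 1000, "chr1", 5000, "DEL")]
candidates = [
    StructuralVariant("chr1", 1003, "chr1", 4998, "DEL"),  # matches database
    StructuralVariant("chr2", 200, "chr9", 300, "TRA"),    # sample-specific
]
specific = filter_sample_specific(candidates, database)
```

With the toy data above, only the chr2/chr9 translocation remains in `specific`. A real tool would also have to handle breakpoint orientation and inserted sequence, which this sketch ignores.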
(2) Large-scale analysis of transcriptome data
We have conducted analyses aimed at knowledge discovery using the large-scale transcriptome data stored in the public NCBI Sequence Read Archive (SRA). In addition to alignment and expression analysis with the software STAR, we applied a tool that we developed to detect splicing abnormalities.
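As a rough illustration of one way to screen STAR output for candidate splicing abnormalities, the Python sketch below reads lines in the format of STAR's SJ.out.tab splice junction file and keeps unannotated junctions supported by enough uniquely mapped reads. It is not the division's tool: the function names and the read-count threshold of 3 are assumptions, and the example lines are invented.

```python
from typing import NamedTuple

# Strand codes used in column 4 of SJ.out.tab.
STRAND = {"0": ".", "1": "+", "2": "-"}


class Junction(NamedTuple):
    chrom: str
    start: int         # first base of the intron (1-based)
    end: int           # last base of the intron (1-based)
    strand: str
    annotated: bool    # column 6: 1 if the junction is in the annotation
    unique_reads: int  # column 7: uniquely mapping reads crossing the junction


def parse_sj_line(line):
    """Parse one tab-separated SJ.out.tab line into a Junction."""
    f = line.rstrip("\n").split("\t")
    return Junction(f[0], int(f[1]), int(f[2]), STRAND[f[3]],
                    f[5] == "1", int(f[6]))


def novel_junctions(lines, min_unique_reads=3):
    """Return unannotated junctions with at least `min_unique_reads` reads."""
    result = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        junction = parse_sj_line(line)
        if not junction.annotated and junction.unique_reads >= min_unique_reads:
            result.append(junction)
    return result


# Invented example lines in SJ.out.tab column order.
example_lines = [
    "chr1\t100\t200\t1\t1\t1\t50\t0\t30",  # annotated junction
    "chr1\t100\t350\t1\t1\t0\t7\t0\t25",   # unannotated, well supported
    "chr2\t10\t90\t2\t2\t0\t1\t0\t12",     # unannotated, weak support
]
novel = novel_junctions(example_lines)
```

Here `novel` contains only the chr1:100-350 junction. Interpreting such junctions as abnormalities would require further steps, such as comparison against control samples, which are outside this sketch.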
(3) Development of long-read sequencing data analysis pipeline
We developed an analysis workflow for data from long-read sequencers, which have attracted attention in recent years. We evaluated the performance of the mutation detection software "Medaka", the haplotype phasing tool "WhatsHap", and the structural variant detection software "NanomonSV". We packaged the tools using container technology to ensure the reproducibility of the analysis.
(4) Spatial gene expression analysis using the cloud system
We constructed an analysis environment for Space Ranger, a set of analysis pipelines that analyze gene expression from tissue sections while retaining cell location information. Because Space Ranger requires a high-specification machine, we built a system on AWS virtual machines that provides an environment with the necessary specifications only when it is needed.
Education
We supported the many researchers who use our analysis pipeline by answering their bioinformatics questions. We also hired postdocs and supported their research.
Future Prospects
We have established the technological basis for cancer genome and transcriptome analysis. To develop the application of whole-genome analysis to genomic medicine, we will construct various genome analysis flows. In addition, we will apply the established analysis flow to large-scale data, and conduct knowledge discovery from the obtained information.
List of papers published in 2020
Journal
1. Ishida Y, Kakiuchi N, Yoshida K, Inoue Y, Irie H, Kataoka TR, Hirata M, Funakoshi T, Matsushita S, Hata H, Uchi H, Yamamoto Y, Fujisawa Y, Fujimura T, Saiki R, Takeuchi K, Shiraishi Y, Chiba K, Tanaka H, Otsuka A, Miyano S, Kabashima K, Ogawa S. Unbiased Detection of Driver Mutations in Extramammary Paget Disease. Clin Cancer Res, 27:1756-1765, 2021
2. Nishimura A, Hirabayashi S, Hasegawa D, Yoshida K, Shiraishi Y, Ashiarai M, Hosoya Y, Fujiwara T, Harigae H, Miyano S, Ogawa S, Manabe A. Acquisition of monosomy 7 and a RUNX1 mutation in Pearson syndrome. Pediatr Blood Cancer, 68:e28799, 2021
3. Saito Y, Koya J, Araki M, Kogure Y, Shingaki S, Tabata M, McClure MB, Yoshifuji K, Matsumoto S, Isaka Y, Tanaka H, Kanai T, Miyano S, Shiraishi Y, Okuno Y, Kataoka K. Landscape and function of multiple mutations within individual oncogenes. Nature, 582:95-99, 2020
4. Ueno H, Yoshida K, Shiozawa Y, Nannya Y, Iijima-Yamashita Y, Kiyokawa N, Shiraishi Y, Chiba K, Tanaka H, Isobe T, Seki M, Kimura S, Makishima H, Nakagawa MM, Kakiuchi N, Kataoka K, Yoshizato T, Nishijima D, Deguchi T, Ohki K, Sato A, Takahashi H, Hashii Y, Tokimasa S, Hara J, Kosaka Y, Kato K, Inukai T, Takita J, Imamura T, Miyano S, Manabe A, Horibe K, Ogawa S, Sanada M. Landscape of driver mutations and their clinical impacts in pediatric B-cell precursor acute lymphoblastic leukemia. Blood Adv, 4:5165-5173, 2020
5. Yasuda T, Sanada M, Nishijima D, Kanamori T, Iijima Y, Hattori H, Saito A, Miyoshi H, Ishikawa Y, Asou N, Usuki K, Hirabayashi S, Kato M, Ri M, Handa H, Ishida T, Shibayama H, Abe M, Iriyama C, Karube K, Nishikori M, Ohshima K, Kataoka K, Yoshida K, Shiraishi Y, Goto H, Adachi S, Kobayashi R, Kiyoi H, Miyazaki Y, Ogawa S, Kurahashi H, Yokoyama H, Manabe A, Iida S, Tomita A, Horibe K. Clinical utility of target capture-based panel sequencing in hematological malignancies: A multicenter feasibility study. Cancer Sci, 111:3367-3378, 2020
6. Fukumoto K, Sakata-Yanagimoto M, Fujisawa M, Sakamoto T, Miyoshi H, Suehara Y, Nguyen TB, Suma S, Yanagimoto S, Shiraishi Y, Chiba K, Bouska A, Kataoka K, Ogawa S, Iqbal J, Ohshima K, Chiba S. VAV1 mutations contribute to development of T-cell neoplasms in mice. Blood, 136:3018-3032, 2020
7. Inagaki-Kawata Y, Yoshida K, Kawaguchi-Sakita N, Kawashima M, Nishimura T, Senda N, Shiozawa Y, Takeuchi Y, Inoue Y, Sato-Otsubo A, Fujii Y, Nannya Y, Suzuki E, Takada M, Tanaka H, Shiraishi Y, Chiba K, Kataoka Y, Torii M, Yoshibayashi H, Yamagami K, Okamura R, Moriguchi Y, Kato H, Tsuyuki S, Yamauchi A, Suwa H, Inamoto T, Miyano S, Ogawa S, Toi M. Genetic and clinical landscape of breast cancers with germline BRCA1/2 variants. Commun Biol, 3:578, 2020
8. Matsuo H, Yoshida K, Nakatani K, Harata Y, Higashitani M, Ito Y, Kamikubo Y, Shiozawa Y, Shiraishi Y, Chiba K, Tanaka H, Okada A, Nannya Y, Takeda J, Ueno H, Kiyokawa N, Tomizawa D, Taga T, Tawa A, Miyano S, Meggendorfer M, Haferlach C, Ogawa S, Adachi S. Fusion partner-specific mutation profiles and KRAS mutations as adverse prognostic factors in MLL-rearranged AML. Blood Adv, 4:4623-4631, 2020