
Research Projects

List of Projects

Bioinformatics for cancer genome medicine (precision medicine)
A new theory of personalized medicine: numerical simulation-based personalized medicine
Bioinformatics analysis for other laboratories

Bioinformatics for cancer genome medicine (precision medicine)

A product version of cisCall (Kato et al, 2018, Genome Medicine) and cisInter, both of which we developed, was incorporated into the OncoGuide™ NCC Oncopanel System. In December 2018, this system became the first in cancer genome medicine in Japan to be approved by the government as a Medical Device.

Furthermore, in June 2019, the system became the first in cancer genome medicine in Japan to be approved by the government for use under the National Health Insurance.
 

Cancer genomics has progressed greatly since 2008. This progress was driven by the emergence of next-generation sequencers, then the newest experimental instruments, which leading cancer genomics laboratories used to discover gene aberrations one after another. Could this technology be used not only for research but also for medicine? This question gave rise to the novel idea of clinical sequencing, now called “cancer genome medicine”. In 2012, this Center started a pioneering research project called TOP-GEAR to realize cancer genome medicine. The project kept pace with the rest of the world and was a leading project of cancer genome medicine in Japan. However, not many people at the Center favored it. From the responses at that time, the Principal Investigator (PI) of the Division of Bioinformatics infers two reasons: 1) it was then common to study a specific cancer type but uncommon to target solid cancers collectively, as the project did; 2) many people thought it sufficient to import new technologies from abroad and did not understand why new technologies for cancer genome medicine needed to be developed in Japan (e.g., there were no Japan-made next-generation sequencers).


The TOP-GEAR project comprises the first (2012-2015), second (2015-2018), and third (2018-2021) phases. In these phases, gene aberrations in cancer specimens were identified by next-generation sequencing (NGS), and patients were introduced to clinical trials of molecularly targeted drugs matched with the detected aberrations; in clinical trials, patients can use such drugs free of charge (Tanabe et al, 2016, Molecular Cancer; Sunami et al, 2019, Cancer Science). One of the project aims was to develop Japan-made technologies. Previous genetic tests usually examined only one type of aberration in a single gene, whereas NGS tests had the potential to examine several types of aberrations (SNVs/indels, copy number alterations, and fusions) in more than 100 genes in a single test. However, this potential could be realized only if the vast amounts of data generated by NGS were processed properly.


NGS generates only short nucleotide sequences, about 100 base pairs long, and these data must be processed properly by computer programs to extract information on which genes are altered in cancer specimens and how. When we started the project in 2012, bioinformatics tools had already been developed in the “research” field of cancer genomics. We used them; however, they did not produce good results. Though this may be too technical, in cancer genomics research, DNA is extracted from specimens that were frozen with liquid nitrogen for storage and is then used for NGS. Frozen specimens are not used daily in hospitals because cooling facilities and their maintenance are too expensive; however, chemically intact, “clean” DNA can be obtained from them. Meanwhile, in the clinical practice of cancer genome medicine, formalin-fixed paraffin-embedded (FFPE) specimens are used instead. FFPE specimens are stored at room temperature and do not require special facilities; therefore, they are used daily in hospitals, for example, for pathological diagnosis. However, DNA extracted from FFPE specimens is chemically degraded by the chemical processes involved. As a result, the NGS data from such DNA contain substantial noise; this is why previous bioinformatics tools, which were developed for research samples, could not detect aberrations precisely enough for clinical purposes.


To solve this problem, we developed cisCall, a bioinformatics tool for detecting gene aberrations that is specialized for cancer genome medicine (Kato et al, 2018, Genome Medicine).

linked to Fig1_cisCall.pdf

linked to Fig2_cisMuton.pdf

linked to Fig3_cisFusionCton.pdf

cisCall is equipped with algorithms based on robust methods (non-parametric statistics, computer-intensive statistics, and internal controls based on random sampling of the targeted data), which enable precise detection that is not disturbed by noise. With these methods, we reduced the errors and misses for all types of gene aberrations to less than 14% of those with previous tools. Furthermore, each previous tool handled only one type of gene aberration among SNVs/indels, copy number alterations, and gene fusions (because each type requires its own algorithm, which is laborious to develop), whereas cisCall detects all the aberration types necessary for cancer genome medicine. The publication of cisCall (Kato et al, 2018, Genome Medicine) was the first to publicly release a program that identifies all these types of gene aberrations from NGS data derived from FFPE specimens, which are needed for daily diagnosis in cancer genome medicine.
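The robust ideas named above (non-parametric statistics, resampling, internal controls) can be illustrated with a small sketch. This is not cisCall's algorithm, which is described in Kato et al, 2018; the function name `robust_deviation`, the read-depth values, and the choice of a median/MAD score are all assumptions made for illustration.

```python
import random
import statistics


def robust_deviation(target_depths, control_depths, n_resamples=1000, seed=0):
    """Score how far a target region's read depth departs from internal controls.

    Hypothetical helper: compares the median depth of the target with a null
    distribution of medians obtained by randomly resampling control depths.
    """
    rng = random.Random(seed)
    k = len(target_depths)
    observed = statistics.median(target_depths)

    # Null distribution: medians of k control depths drawn at random
    # (with replacement) from the internal controls.
    null_medians = [
        statistics.median(rng.choices(control_depths, k=k))
        for _ in range(n_resamples)
    ]
    center = statistics.median(null_medians)

    # Median absolute deviation (MAD): a spread estimate that a few
    # noisy values barely move, unlike the standard deviation.
    mad = statistics.median([abs(m - center) for m in null_medians])
    scale = max(1.4826 * mad, 1e-9)  # guard against a zero MAD
    return (observed - center) / scale


# Made-up depths; 400 and 5 mimic noisy outliers among the controls.
controls = [95, 98, 100, 102, 105, 97, 103, 99, 101, 100, 400, 5]
amplified = [198, 205, 210, 190, 202]
neutral = [99, 101, 100, 98, 102]

score_amplified = robust_deviation(amplified, controls)
score_neutral = robust_deviation(neutral, controls)
print(score_amplified, score_neutral)
```

Because both the center and the spread are medians, the two outlying control values have little influence on the score, which is the property that matters when the input data are noisy.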


It is then necessary to suggest molecularly targeted drugs that are appropriate for the detected gene aberrations. For this purpose, we developed cisInter, a program that automatically suggests such drugs and also generates a report listing the detected aberrations and drugs, a function that would not be necessary in the basic research field of cancer genomics. Although we did not publicly release cisInter because its outputs are in Japanese, we consider it to have been an advanced bioinformatics tool in cancer genome medicine at that time. We also developed a database (cisVids) that can take in the report outputs, as well as a web tool (cisMedi) to efficiently manage the Tumor Board/Expert Panel.


At the same time, the Center was establishing hospital facilities that meet an international quality standard (the U.S. CLIA standard), in which DNA is extracted and used for NGS and the resulting NGS data are then processed for cancer genome medicine. For these facilities, we implemented the tools we had developed, set up the information-technology infrastructure, designed the information logistics, and linked the information generated in the facilities to the electronic medical records in the Hospital.

linked to Fig4_Informatics.pdf

linked to Fig5_RunPhoto.pdf

Generally, scientists do not perform these types of tasks because they do not lead to publications; however, as no additional people could be spared, we accepted this and did them ourselves. Additionally, we participated in a project in which IBM Watson, an AI for cancer genome medicine and a popular topic at that time, was applied to the TOP-GEAR project (Itahashi et al, 2018, Frontiers in Medicine). To our knowledge, this paper is the first report in Japan that scientifically evaluated IBM Watson. Although we cannot say that all our attempts were successful, we studied, prepared, and executed almost all the information technologies that were necessary for cancer genome medicine.


Our cisCall (Kato et al, 2018, Genome Medicine) and cisInter were then transferred to industry and used in the first Advanced Medical Care in cancer genome medicine (Advanced Medical Care B “Multiplex Gene Panel Testing to Advancing Personalized Medicine”; April 2018). In December 2018, the OncoGuide™ NCC Oncopanel System for comprehensive genomic profile testing became the first system in cancer genome medicine in Japan to be approved by the government as a Medical Device. This system is a Combination Medical Device composed of analysis programs and reagents, and the programs are a product version of cisCall and cisInter. Then, in June 2019, cancer genomic profile testing using this system was approved for use under the National Health Insurance, which means that cancer genome medicine became publicly available in Japan. This marks the beginning of cancer genome medicine in Japan. With these achievements, we are proud to regard ourselves as the top laboratory studying bioinformatics for cancer genome medicine in Japan and one of the top-class laboratories worldwide.


We thus achieved application under the National Health Insurance. However, throughout the TOP-GEAR project the PI was concerned about two important issues: 1) even when the targets of molecularly targeted drugs are found, patients are not introduced to clinical trials if no matched trial is ongoing at the Center; 2) even when matched molecularly targeted drugs are found, our role ends there and we do not know what becomes of patients after the genomic testing. Regarding 1), matched clinical trials may be ongoing at other hospitals. Regarding 2), if such follow-up data were accumulated and analyzed, they could yield insights into which drug therapies are effective for which types of patients. Probably everyone engaged in cancer genome medicine shared these concerns.


As if in response to these concerns, C-CAT (Center for Cancer Genomics and Advanced Therapeutics) was established at the Center in 2018. The data of cancer genomic tests performed under the National Health Insurance in Japan are all sent to C-CAT, which documents the nationwide clinical trials of matched drug therapies individually for every patient (in “C-CAT Findings”) and sends these documents back to the hospitals that originally sent the data. Through this system, the above two issues are addressed. The PI concurrently serves as the Chief of the Section of Genomic Data Management in C-CAT; some members of the Division of Bioinformatics are also members of this section. The C-CAT system is a large-scale system spanning Japan, and the part of it related to genomic data is developed and managed by the Section of Genomic Data Management. The section also develops information technologies that are more practical than research-oriented, such as the “CATS (cancer genomic test standardized) format”, a file format specialized for cancer genomic profile testing. The homepage of the Section of Genomic Data Management is here.

 

Selected Papers:

  • Kuniko Sunami, Hitoshi Ichikawa, Takashi Kubo, Mamoru Kato, Yutaka Fujiwara, Akihiko Shimomura, Takafumi Koyama, Hiroki Kakishima, Mayuko Kitami, Hiromichi Matsushita, Eisaku Furukawa, Daichi Narushima, Momoko Nagai, Hirokazu Taniguchi, Noriko Motoi, Shigeaki Sekine, Akiko Maeshima, Taisuke Mori, Reiko Watanabe, Masayuki Yoshida, Akihiko Yoshida, Hiroshi Yoshida, Kaishi Satomi, Aoi Sukeda, Taiki Hashimoto, Toshio Shimizu, Satoru Iwasa, Kan Yonemori, Ken Kato, Chigusa Morizane, Chitose Ogawa, Noriko Tanabe, Kokichi Sugano, Nobuyoshi Hiraoka, Kenji Tamura, Teruhiko Yoshida, Yasuhiro Fujiwara, Atsushi Ochiai, Noboru Yamamoto, Takashi Kohno
    Feasibility and utility of a panel testing for 114 cancer-associated genes in a clinical setting: A hospital-based study
    Cancer Science, 2019, 110, 1480.1-11
  • Mamoru Kato, Hiromi Nakamura, Momoko Nagai, Takashi Kubo, Asmaa Elzawahry, Yasushi Totoki, Yuko Tanabe, Eisaku Furukawa, Joe Miyamoto, Hiromi Sakamoto, Shingo Matsumoto, Kuniko Sunami, Yasuhito Arai, Yutaka Suzuki, Teruhiko Yoshida, Katsuya Tsuchihara, Kenji Tamura, Noboru Yamamoto, Hitoshi Ichikawa, Takashi Kohno, and Tatsuhiro Shibata
    A computational tool to detect DNA alterations tailored to formalin-fixed paraffin-embedded samples in cancer clinical sequencing
    Genome Medicine, 2018, 10, 44.1-44.11
  • Kota Itahashi, Shunsuke Kondo, Takashi Kubo, Yutaka Fujiwara, Mamoru Kato, Hitoshi Ichikawa, Takahiko Koyama, Reitaro Tokumasu, Jia Xu, Claudia S. Huettner, Vanessa V. Michelini, Laxmi Parida, Takashi Kohno, and Noboru Yamamoto
    Evaluating clinical genome sequence analysis by Watson for Genomics
    Frontiers in Medicine, 2018, 5, 305.1-10
  • Yuko Tanabe, Hitoshi Ichikawa, Takashi Kohno, Hiroshi Yoshida, Takashi Kubo, Mamoru Kato, Satoru Iwasa, Atsushi Ochiai, Noboru Yamamoto, Yasuhiro Fujiwara, and Kenji Tamura
    Comprehensive screening of target molecules by next-generation sequencing in patients with malignant solid tumors: guiding entry into phase I clinical trials
    Molecular Cancer, 2016, 15, 73-77

 

Reviews:

  • Mamoru Kato
    NCC Oncopanel and multigene panels abroad
    Journal of Clinical and Experimental Medicine (Igaku-no Ayumi, in Japanese), 2020, 275, 419-423

  • Mamoru Kato
    Development of bioinformatics methods in cancer gene-panel test
    Precision Medicine (in Japanese), NTS, ISBN 978-4-86043-580-6, 2018, 71-80
  • Mamoru Kato
    Bioinformatics pipelines in genome medicine
    Experimental Medicine (Jikken Igaku, in Japanese), 2018, 36, 2645-2652
  • Mamoru Kato
    Cancer precision medicine
    Anti-Aging Medicine (Anti-Aging Igaku, in Japanese), 2017, 13, 663-669
  • Mamoru Kato
    Bioinformatics in cancer clinical sequencing – an emerging field of cancer personalized medicine
    Japanese Journal of Cancer and Chemotherapy (Gan To Kagaku Ryouhou, in Japanese), 2016, 43, 391-397

 

A new theory of personalized medicine: numerical simulation-based personalized medicine

This theme is closer to basic research than the above-mentioned bioinformatics for cancer genome medicine. We studied cancer-cell evolution and the resulting intra-tumor heterogeneity, in which cancer cells acquire diversity (heterogeneity) in the same manner as in Darwinian evolution and adapt to microenvironments in the body to proliferate. This viewpoint was suggested long ago. However, only recently has such evolution been observed commonly and at high resolution, thanks to the emergence of NGS, which made it easy to analyze DNA and RNA in large volumes. This kind of diversity is observed at the DNA level as well as at the RNA level, which is more phenotypic; however, the PI prefers to study diversity at the DNA level because it is generated by evolution through genetic inheritance.


For example, regarding resistance to a molecularly targeted drug (imatinib) in gastrointestinal stromal tumors (GISTs), we suggested the following. A small fraction of cancer cells with a genetic “general” resistance, for instance to apoptosis, exists from the beginning. When exposed to the molecularly targeted drug, these cells first endure through this general resistance; at some point they acquire a strong resistance “specific” to the drug (a KIT secondary mutation), which allows them to proliferate intensely (Takahashi et al, 2017, Genes, Chromosomes & Cancer).

linked to Fig1_GIST.01.pdf

We also analyzed the NGS data of DNA from single cancer cells (single-cell sequencing) with an original method. This method combines a recent population-genetic theory for analyzing the evolution of (ordinary) organisms, the multiple-merger coalescent model, with a recent statistical method for estimating parameter values, approximate Bayesian computation. In the context of breast cancer, the analysis suggested that cancer cells, like fish, evolve at high rates of birth and death (Kato et al, 2017, Royal Society Open Science).
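For readers unfamiliar with approximate Bayesian computation (ABC), the following is a minimal sketch of its simplest variant, rejection ABC. It does not reproduce the analysis in Kato et al, 2017: the forward model `count_altered_sites`, the observed value of 60, the uniform prior, and the tolerance are hypothetical and serve only to show the accept/reject idea.

```python
import random


def abc_rejection(observed, simulate, sample_prior, tolerance, n_draws, seed=0):
    """Rejection ABC: keep prior draws whose simulated summary lands
    within `tolerance` of the observed summary."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if abs(simulate(theta, rng) - observed) <= tolerance:
            accepted.append(theta)
    return accepted


def count_altered_sites(rate, rng, n_sites=200):
    """Toy forward model (an assumption): each site is altered
    independently with probability `rate`."""
    return sum(rng.random() < rate for _ in range(n_sites))


posterior = abc_rejection(
    observed=60,                            # made-up summary statistic
    simulate=count_altered_sites,
    sample_prior=lambda rng: rng.random(),  # uniform prior on [0, 1)
    tolerance=5,
    n_draws=5000,
)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), round(posterior_mean, 3))
```

The accepted values approximate the posterior distribution of the parameter without ever writing down a likelihood, which is why ABC suits models, such as coalescent or evolutionary simulations, that are easy to simulate but hard to treat analytically.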

linked to Fig2_MMC.01.pdf


 
While we studied the evolutionary processes of cancer cells, the PI began to ask “so what?”, because these processes varied widely from patient to patient. In one patient, a phylogenetic tree of cancer-cell evolution could be reconstructed in which one set of genes was impaired in a certain order so that cancer cells proliferated. In another patient, the tree showed a different set of genes damaged in a completely different order, again leading to cancer-cell proliferation. The histories were so individual-specific that the work resembled deciphering many personal autobiographies. The aim, of course, was to extract universal rules; however, such rules seemed too abstract to be practically useful. In fields that study evolution, such as species evolution and cosmic evolution, it is valuable to describe events, even one-time events, because this history was experienced by the ancestors or the constituent elements of everyone. The Cambrian explosion in species evolution and the Big Bang in cosmic evolution are thus meaningful to everyone although they are one-time events. In cancer-cell evolution, however, the events differ too much among patients and are all equally important; in the end, the scenario remains “this history for this patient” and “that history for that patient”. Because we also worked on the clinically practical application of cancer genome medicine described above, the PI questioned whether continuing this line of research would help patients, even if it yielded some insights.
   
 
A breakthrough came with the development and application of tugHall, a computer model that simulates cancer-cell evolution (Nagornov and Kato, 2020, Bioinformatics).

linked to Fig3_tugHall_Alg.pdf

Originally, we set out to develop a computer simulation model that considers gene functions in order to reveal intra-tumor heterogeneity. (Technically, we had previously studied a time-backward model, the multiple-merger coalescent model. Although that model was simple, its time-backward approach made it difficult to incorporate complex operations such as gene functions. To solve this problem, we turned to a time-forward model and developed tugHall.) tugHall is a computer simulation model that considers the individual functions of tumor-related genes such as TP53 and KRAS. In this model, individual gene functions act through the essence of cancer biology, the cancer hallmarks, to finally influence the proliferation of cancer cells. The model is not very complex: it has only 7 internal parameters. In addition, each gene taken into the model may add 6 free parameters; however, some of these can be removed based on biological knowledge.
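The gene-to-hallmark-to-proliferation chain and the parameter count can be sketched as follows. This is not tugHall's code or parameterization: the hallmark names, the weights, the rate formulas, and the reading that the 6 per-gene parameters come on top of the 7 internal ones are assumptions for illustration only.

```python
# Hypothetical weights: how strongly an impaired gene drives each hallmark.
GENE_HALLMARK_WEIGHTS = {
    "TP53": {"apoptosis_evasion": 0.6, "growth": 0.2},
    "KRAS": {"growth": 0.5},
}


def hallmark_scores(impaired_genes, weights=GENE_HALLMARK_WEIGHTS):
    """Sum the weights of the impaired genes per hallmark, capped at 1."""
    scores = {}
    for gene in impaired_genes:
        for hallmark, weight in weights.get(gene, {}).items():
            scores[hallmark] = min(1.0, scores.get(hallmark, 0.0) + weight)
    return scores


def cell_rates(scores, base_division=0.1, base_death=0.2):
    """Turn hallmark scores into per-step division and death probabilities
    (an assumed functional form, chosen only to keep the sketch short)."""
    division = base_division * (1.0 + scores.get("growth", 0.0))
    death = base_death * (1.0 - scores.get("apoptosis_evasion", 0.0))
    return division, death


def max_parameter_count(n_genes, internal=7, per_gene=6):
    """Upper bound on parameters before any are removed on biological grounds."""
    return internal + per_gene * n_genes


scores = hallmark_scores(["TP53", "KRAS"])
division, death = cell_rates(scores)
print(scores, division, death, max_parameter_count(4))
```

Under this reading, a model with four genes has at most 7 + 6 × 4 = 31 parameters, and fewer once biologically irrelevant gene-hallmark links are dropped.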


Importantly, the parameter values can actually be determined per patient from data obtained with current technologies such as NGS. (To do this, we used approximate Bayesian computation, the parameter-estimation method also used in the previous time-backward model study.) Then, by virtually nullifying values on a computer, such as those related to KRAS, it is possible to generate a situation in which the KRAS aberration is blocked by a molecularly targeted drug; in this way, we can predict whether the cancer cells will die out when a drug that blocks the KRAS aberration is used in a patient. The TCGA website has already accumulated NGS and NGS-derived data from many patients. We applied tugHall to the NGS-derived data of a 73-year-old male patient with colorectal cancer from the website (Nagornov and Kato, 2020, Bioinformatics). The representative colorectal cancer-related genes APC, KRAS, TP53, and PIK3CA were all impaired in the cancer tissue of this patient; however, tugHall predicted that blocking TP53, but not the other genes, could stop cancer-cell proliferation in this patient. This result suggests that drugs targeting APC, KRAS, or PIK3CA would not be effective for this patient, whereas a drug targeting TP53 would be.
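The idea of virtually nullifying a gene's contribution can be shown with a toy stochastic simulation. This is not tugHall and does not reproduce the published result: the function `final_population`, the birth-death model, and every number below are hypothetical, and the per-gene effects were deliberately chosen so that only the TP53 knockout pushes division below death.

```python
import random


def final_population(gene_effects, blocked=(), base_division=0.15, death=0.30,
                     n_start=200, n_steps=300, cap=5000, seed=0):
    """Simulate a clone in which each cell, per step, dies, divides, or rests.

    `gene_effects` maps an impaired gene to its (assumed) contribution to the
    division probability; genes listed in `blocked` contribute nothing,
    mimicking a drug that blocks the aberration.
    """
    rng = random.Random(seed)
    division = base_division + sum(
        effect for gene, effect in gene_effects.items() if gene not in blocked
    )
    if division + death > 1.0:
        raise ValueError("division + death probabilities must not exceed 1")

    n = n_start
    for _ in range(n_steps):
        change = 0
        for _ in range(n):
            u = rng.random()
            if u < death:
                change -= 1      # the cell dies
            elif u < death + division:
                change += 1      # the cell divides
        n += change
        if n == 0 or n >= cap:   # extinct, or clearly growing
            break
    return n


# Hypothetical per-patient effects (in practice these would be estimated).
effects = {"APC": 0.02, "KRAS": 0.03, "TP53": 0.20, "PIK3CA": 0.02}

untreated = final_population(effects)
outcomes = {gene: final_population(effects, blocked=(gene,)) for gene in effects}
for gene, size in outcomes.items():
    print(gene, "blocked ->", "extinct" if size == 0 else "still growing")
```

In this toy setting, a clone dies out only when the knockout makes cell division rarer than cell death, so the prediction depends entirely on the patient-specific parameter values, which is why estimating them per patient matters.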

linked to Fig4_tugHall_Effect.pdf

In fact, molecularly targeted drugs matched with detected gene aberrations do not always show positive effects in the practice of cancer genome medicine. For example, it is known that at most 40–60% of drug therapies (15–35% of second-line therapies) are effective in colorectal cancers.


We think that this is a major conceptual shift. So far, many computer simulation models of cancer-cell evolution have been studied to reveal how intra-tumor subpopulations of cancer cells develop and to estimate parameters such as the mutation rate and selection coefficient. In contrast, our study introduced the novel concept that a cancer-cell simulation model can be applied to predict the effects of molecularly targeted drugs for individual patients. This study provided only preliminary results, and we cannot yet say that the concept works in practice; nevertheless, we did demonstrate that it is methodologically possible. To our knowledge, this study (Nagornov and Kato, 2020, Bioinformatics) is the first to demonstrate a proof of concept that computer simulations of cancer-cell evolution can be applied to personalized medicine. The PI refers to this new type of personalized-medicine theory as “[numerical] simulation-based personalized medicine”. This term contrasts with “statistics-based personalized medicine”, which predicts drug effects based on survival analysis and multivariate analysis/AI. The PI considers simulation-based and statistics-based personalized medicine to be complementary to each other.

linked to Fig5_tugHall_Comp.pdf


 
We care about weather prediction in the typhoon season. Such prediction benefits from computer simulations called numerical weather prediction. Weather prediction was originally based on statistics: sunny or rainy weather is frequently observed on this or that date. More recently, numerical weather prediction, which uses fluid dynamics and physical process models, has been used in addition to statistics, resulting in more precise prediction. Historically, the concept of numerical weather prediction was proposed in the 1920s (using manual calculation at the time) and became computationally possible in the 1950s; however, it was not practically useful then because of the slow computation speed. Even numerical weather prediction thus probably required at least 50 years to reach the practical level. Similarly, the PI estimates that many breakthroughs will be required for simulation-based personalized medicine to reach the practical level.

 

Selected Papers:

  • Iurii S. Nagornov, Jo Nishino, and Mamoru Kato
    tugHall: A Tool to Reproduce Darwinian Evolution of Cancer Cells for Simulation-Based Personalized Medicine
    ISMCO 2020: Mathematical and Computational Oncology, Lecture Notes in Computer Science, vol 12508, 71-76.
  • Iurii S. Nagornov and Mamoru Kato
    tugHall: a simulator of cancer-cell evolution based on the hallmarks of cancer and tumor-related genes
    Bioinformatics, 2020, 36, 3597–3599.
  • Mamoru Kato, Daniel A. Vasco, Ryuichi Sugino, Daichi Narushima, and Alexander Krasnitz
    Sweepstake evolution revealed by population-genetic analysis of copy-number alterations in single genomes of breast cancer
    Royal Society Open Science, 2017, 4, 171060.1-171060.11
  • Tsuyoshi Takahashi, Asmaa Elzawahry, Sachiyo Mimaki, Eisaku Furukawa, Rie Nakatsuka, Hiromi Nakamura, Takahiko Nishigaki, Satoshi Serada, Tetsuji Naka, Seiichi Hirota, Tatsuhiro Shibata, Katsuya Tsuchihara, Toshirou Nishida, and Mamoru Kato
    Genomic and transcriptomic analysis of imatinib resistance in gastrointestinal stromal tumors
    Genes, Chromosomes & Cancer, 2017, 56, 303-313.

 

Materials:

 

Bioinformatics analysis for other laboratories

We perform bioinformatics analysis of data generated via collaborations with experimental laboratories. Such data analysis is the mainstream of bioinformatics studies and we can learn many aspects of biology and medicine through collaborations with experimental laboratories. The results are produced synergistically. Collaborations are thus one of the main pillars of our laboratory.


We have so far performed a variety of bioinformatics analyses through such collaborations, and some have been published (publications). When we previously tallied the statistics, about one-third of the collaborations had been published. We tracked these statistics for three years and, interestingly, found that this one-third fraction did not change greatly over the years. Put another way, we performed data analysis for about three times as many collaborations as were published.


These studies each have their own history of data-analysis struggles but are too numerous to introduce here. Hence, we present only some examples. One study, selected for the simplicity of its results, is the genomic (omics) analysis of bile-duct cancer (Nakamura et al, 2015, Nature Genetics), for which we collaborated with the Division of Cancer Genomics at the Center.

linked to Fig1_BileDuct.pdf

Through transcriptome analysis, we discovered four subgroups of bile-duct cancer cases and found a clear difference in prognosis among them. The subgroup with the worst prognosis tended to have a higher tumor mutation burden and increased expression of immune checkpoint genes. Immune-checkpoint inhibitors might work for such patients. Another study was conducted by Japan’s representative groups for liver cancer genomic analysis, a kind of All Japan Liver Cancer Genomic Group. We participated in the analysis of the whole-genome sequencing data and found that TP53 variants and, unexpectedly, variants in the 5’ UTR of HRASLS were associated with overall survival (Fujimoto et al, 2016, Nature Genetics).

 

linked to Fig2_Liver.pdf


These two studies were performed under the umbrella of the International Cancer Genome Consortium (ICGC). The ICGC is an international consortium that coordinates which cancer types each country should study, such as liver and bile-duct cancers for Japan, to avoid inefficient, redundant studies. It also promotes the sharing of the latest discoveries and knowledge among the participating countries. The accumulated results and experience finally allowed whole-genome analysis across a variety of cancer types, which was reported as the culmination of cancer genomics over recent years (The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium, 2020, Nature). This is the achievement of a research community comprising 1341 scientists, including the PI.

 

Selected Papers:

  • The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium
    Pan-cancer analysis of whole genomes
    Nature, 2020, 578, 82-93.
  • Akihiro Fujimoto, et al
    Whole genome mutational landscape and characterization of non-coding and structural mutations in liver cancer
    Nature Genetics, 2016, 48, 500-509.
  • Shinichi Yachida, et al
    Genomic sequencing identifies ELF3 as a driver of ampullary carcinoma
    Cancer Cell, 2016, 29, 229-240.
  • Hiromi Nakamura, et al
    Genomic spectra of biliary tract cancer
    Nature Genetics, 2015, 47, 1003-1010.
  • Yasushi Totoki, et al
    Trans-ancestry mutational landscape of hepatocellular carcinoma genomes
    Nature Genetics, 2014, 46, 1267-1273.