NBDC Research ID: hum0094.v11

 

SUMMARY

Aims: To identify oncogenic alterations in breast cancer, lung adenocarcinoma, colorectal cancer (CRC) and pulmonary pleomorphic carcinoma through genomic analysis, and to identify new biomarkers of desmoid tumor by investigating its molecular profiles in combination with clinicopathological characteristics.

Methods: Whole Genome Sequencing (WGS), Whole Exome Sequencing (WES), RNA-seq and Target Capture sequencing were performed with HiSeq 2000/2500. Methylation analysis was performed with the Infinium MethylationEPIC Kit. SNP array analysis was performed with HumanOmni2.5-8. Genomic DNA was purified from each sample and sequenced using 10x Genomics sequencing technologies to analyze genomic structural variations. Long-read sequencing was performed with Pacific Biosciences Single-Molecule Real-Time (SMRT) sequencing technology.

Participants/Materials: Surgically resected breast cancer, lung adenocarcinoma, CRC and desmoid tumor tissues and paired non-tumor tissues or peripheral blood cells (as normal tissues)

            WGS: Triple negative breast cancer: 16 cases (JGAS000095)

                       Lung adenocarcinoma: 7 cases (JGAS000360)

                       MSI-H CRC: 12 cases (JGAS000360)

                       Triple negative breast cancer: 11 cases (JGAS000360)

                       Colorectal cancer: 21 cases (JGAS000335)

             WES: Triple negative breast cancer: 36 cases (including 16 cases used for WGS, 23 cases used for RNA-seq) (JGAS000095)

                       Lung adenocarcinoma: 43 cases (JGAS000105) + 126 cases (JGAS000215)

                       MSI-H CRC: 149 cases (JGAS000113)

                       CRC with liver metastasis: 12 cases, CRC without liver metastasis: 16 cases (JGAS000128)

                       Desmoid tumor: 64 cases (JGAS000270)

                       Pulmonary pleomorphic carcinoma: 19 cases (JGAS000297)

                       Colorectal cancer: 128 cases (JGAS000335)

            RNA-seq: Triple negative breast cancer: 23 cases (JGAS000095)

                            Estrogen receptor-positive breast cancer: 17 cases (JGAS000095)

                            Human epidermal growth factor receptor 2 (HER2)-positive breast cancer: 15 cases (JGAS000095)

                            Lung adenocarcinoma: 43 cases (JGAS000105) + 126 cases (JGAS000215)

                            MSI-H CRC: 94 cases (JGAS000113)

                            CRC with liver metastasis: 12 cases, CRC without liver metastasis: 16 cases (JGAS000128)

                            Desmoid tumor: 55 cases (JGAS000270)

                            Pulmonary pleomorphic carcinoma: 33 cases (JGAS000297)

                            Colorectal cancer: 141 cases (JGAS000335)

                            Salivary duct carcinoma: 67 cases (JGAS000534)

            RNA access: MSI-H CRC: 18 cases (JGAS000113)

            Methylation array: MSI-H CRC: 93 cases (JGAS000113)

            SNP array: MSI-H CRC: 25 cases (JGAS000113)

            Target Capture: Pulmonary pleomorphic carcinoma: 20 cases (JGAS000297)

                                      Colorectal cancer: 67 cases (JGAS000335)

                                      Salivary duct carcinoma: 67 cases (JGAS000534)

            Iso-seq: Triple negative breast cancer: 14 cases (JGAS000095)

                          Estrogen receptor-positive breast cancer: 8 cases (JGAS000095)

URL: https://www.ncc.go.jp/en/ri/division/genetics/index.html

 

Dataset ID | Type of Data | Criteria | Release Date
JGAS000095 | NGS (WGS, Exome, RNA-seq) | Controlled-access (Type I) | 2017/09/04
JGAS000105 | NGS (Exome, RNA-seq) | Controlled-access (Type I) | 2018/01/22
JGAS000113 | NGS (Exome, RNA-seq, RNA access), Methylation array, SNP array | Controlled-access (Type I) | 2019/02/12
JGAS000128 | NGS (Exome, RNA-seq) | Controlled-access (Type I) | 2020/08/14
JGAS000215 | NGS (Exome, RNA-seq) | Controlled-access (Type I) | 2020/10/26
JGAS000270 | NGS (Exome, RNA-seq) | Controlled-access (Type I) | 2021/01/29
JGAS000297 | NGS (Exome, RNA-seq, Target Capture) | Controlled-access (Type I) | 2021/07/02
JGAS000360 | NGS (WGS) | Controlled-access (Type I) | 2021/10/22
JGAS000095 (Data addition) | NGS (Iso-seq, RNA-seq) | Controlled-access (Type I) | 2021/10/26
JGAS000335 | NGS (WGS, Exome, RNA-seq, Target Capture) | Controlled-access (Type I) | 2022/05/26
JGAS000534 | NGS (Target Capture, RNA-seq) | Controlled-access (Type I) | 2022/11/08

*Release Note

* To access the Controlled-access Data, data users need to submit an application for Using NBDC Human Data.

 

MOLECULAR DATA

WGS (JGAS000095)

Participants/Materials Triple negative breast cancer (ICD10: C50): 16 cases
Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2000/2500]
Library Source DNAs extracted from tumor tissues and paired non-tumor tissues or peripheral blood cells as non-tumor tissues
Cell Lines -
Library Construction (kit name) NEBNext® Ultra™ DNA Library Prep Kit for Illumina®
Fragmentation Methods Ultrasonic fragmentation (Covaris)
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 103 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000095
Total Data Volume 8.94 TB (bam [ref: hg19])
Comments (Policies) NBDC policy

 

Exome

Participants/Materials

Triple negative breast cancer (ICD10: C50): 36 cases

Lung adenocarcinoma (ICD10: C349): 43 cases

MSI-H CRC (ICD10: C18, 19, 20): 149 cases

CRC with liver metastasis: 12 cases, CRC without liver metastasis: 16 cases (ICD10: C18, 19, 20)

Lung adenocarcinoma (ICD10: C34): 126 cases

Desmoid tumor (ICD10: D48.1): 64 cases

Pulmonary pleomorphic carcinoma (NCBI MedGen UID: 87198): 19 cases [30 samples] (tumor tissue: 12 cases [12 samples], non-tumor tissue: 18 cases [18 samples], paired samples from 11 cases)

Colorectal cancer (ICD10: C189): 128 cases [141 samples]

Targets Exome
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2000/2500, NovaSeq 6000]
Library Source DNAs extracted from tumor tissues and paired non-tumor tissues or peripheral blood cells as non-tumor tissues
Cell Lines -
Library Construction (kit name) NEBNext® Ultra™ DNA Library Prep Kit for Illumina®, SureSelect Human All Exon V5, SureSelect Human All Exon V6
Fragmentation Methods Ultrasonic fragmentation (Covaris)
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)

Breast cancer and lung adenocarcinoma: 103 bp

MSI-H CRC: 104 bp or 135 bp

CRC: 100 bp or 103 bp

Lung adenocarcinoma (JGAD000301): 134 bp or 137 bp

Desmoid tumor: 134 bp or 137 bp

Pulmonary pleomorphic carcinoma: 133 bp

Colorectal cancer: 125 bp, 150 bp or 151 bp

Japanese Genotype-phenotype Archive Dataset ID

Triple negative breast cancer: JGAD000095

Lung adenocarcinoma: JGAD000110

MSI-H CRC: JGAD000122

CRC: JGAD000139

Lung adenocarcinoma: JGAD000301

Desmoid tumor: JGAD000376

Pulmonary pleomorphic carcinoma: JGAD000407

Colorectal cancer: JGAD000446

Total Data Volume

JGAD000095: 8.94 TB (bam [ref: hg19])

JGAD000110: 955.16 GB (bam [ref: hg19/hg38])

JGAD000122: 5.99 TB (bam [ref: hg38])

JGAD000139: 1.36 TB (bam [ref: hg38])

JGAD000301: 5 TB (bam [ref: hg38])

JGAD000376: 2.8 TB (bam [ref: hg38])

JGAD000407: 786.9 GB (bam [ref: hg19/hg38])

JGAD000446: 7.5 TB (bam [ref: hg38])

Comments (Policies)

NBDC policy (JGAD000095, JGAD000122, JGAD000139, JGAD000301, JGAD000376, JGAD000407, JGAD000446)

NBDC policy & Company User Limit (JGAD000110)

 

RNA-seq

Participants/Materials

Triple negative breast cancer (ICD10: C50): 23 cases (including 2 cases for JGAD000457)

Estrogen receptor-positive breast cancer (ICD10: C50): 17 cases

HER2-positive breast cancer (ICD10: C50): 15 cases

Lung adenocarcinoma (ICD10: C349): 43 cases

MSI-H CRC (ICD10: C18, 19, 20): 94 cases

CRC with liver metastasis: 12 cases, CRC without liver metastasis: 16 cases (ICD10: C18, 19, 20)

Lung adenocarcinoma (ICD10: C34): 126 cases

Desmoid tumor (ICD10: D48.1): 55 cases

Pulmonary pleomorphic carcinoma (NCBI MedGen UID: 87198): 33 cases [48 samples] (fresh frozen samples: 12 cases [12 samples], FFPE samples: 21 cases [36 samples])

Colorectal cancer (ICD10: C189): 141 cases [170 samples] (including 14 cases used for RNA-seq only)

Salivary duct carcinoma (ICD10: C089): 67 cases [fresh-frozen tissue samples: 19, FFPE samples: 48]

Targets RNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2000/2500]
Library Source RNAs extracted from tumor tissues
Cell Lines -
Library Construction (kit name)

NEBNext® Ultra™ Directional RNA Library Prep Kit for Illumina®

FFPE samples of pulmonary pleomorphic carcinoma: TruSeq RNA Access Library Prep kit

Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)

Breast cancer and lung adenocarcinoma: 103 bp

MSI-H CRC: 104 bp or 135 bp

CRC (JGAD000139): 133 bp

Lung adenocarcinoma (JGAD000301): 134 bp or 137 bp

Desmoid tumor: 134 bp or 137 bp

Pulmonary pleomorphic carcinoma: 132 bp or 133 bp

Breast cancer (JGAD000457): 100 bp

Colorectal cancer: 125 bp or 127 bp

Salivary duct carcinoma: 132 bp or 133 bp

Japanese Genotype-phenotype Archive Dataset ID

Breast cancers: JGAD000095, JGAD000457

Lung adenocarcinoma: JGAD000111

MSI-H CRC: JGAD000122

CRC: JGAD000139

Lung adenocarcinoma: JGAD000301 (fastq)

Desmoid tumor: JGAD000376

Pulmonary pleomorphic carcinoma: JGAD000407

Colorectal cancer: JGAD000446

Salivary duct carcinoma: JGAD000653

Total Data Volume

JGAD000095: 8.94 TB (fastq)

JGAD000111: 1.31 TB (fastq)

JGAD000122: 5.99 TB (fastq)

JGAD000139: 1.91 TB (fastq)

JGAD000301: 5 TB (fastq)

JGAD000376: 2.8 TB (fastq)

JGAD000407: 786.9 GB (fastq)

JGAD000446: 7.5 TB (fastq)

JGAD000457: 41.2 GB (fastq)

JGAD000653: 970.3 GB (fastq)

Comments (Policies)

NBDC policy (JGAD000095, JGAD000122, JGAD000139, JGAD000301, JGAD000376, JGAD000407, JGAD000446, JGAD000457, JGAD000653)

NBDC policy & Company User Limit (JGAD000111)

 

RNA access

Participants/Materials MSI-H CRC (ICD10: C18, 19, 20): 18 cases
Targets RNA access
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2000/2500]
Library Source RNAs extracted from tumor tissues
Cell Lines -
Library Construction (kit name) TruSeq® RNA Access Library Prep Kit
Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 104 bp or 135 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000122 (fastq)
Total Data Volume 5.99 TB (fastq)
Comments (Policies) NBDC policy

 

Methylation array

Participants/Materials

MSI-H CRC (ICD10: C18, 19, 20): 93 cases

normal tissues: 20 samples

Targets Methylation array
Target Loci for Capture Methods -
Platform Illumina [Infinium Human MethylationEPIC BeadChip]
Source DNAs extracted from tumor tissues and paired non-tumor tissues
Cell Lines -
Library Construction (kit name) Infinium Human MethylationEPIC BeadChip Kit
Algorithms for Calculating Methylation-rate (software) GenomeStudio (Illumina)
Filtering Methods Detection P-value >= 0.05
Normalization of microarray -
Probe Number 867,926 probes
Japanese Genotype-phenotype Archive Dataset ID JGAD000122
Total Data Volume 5.99 TB (tsv)
Comments (Policies) NBDC policy
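
The detection p-value filter listed above can be illustrated with a minimal sketch. It assumes that measurements with a detection p-value of 0.05 or higher are the ones excluded, which the record does not state explicitly. The column names (`probe_id`, `beta`, `detection_pval`) and the TSV layout are hypothetical and may not match the files in JGAD000122.

```python
import csv
import io

DETECTION_P_THRESHOLD = 0.05  # threshold stated in the Filtering Methods row


def filter_by_detection_p(rows, threshold=DETECTION_P_THRESHOLD):
    """Keep only measurements whose detection p-value is below the threshold.

    Assumption: values with detection p-value >= threshold are the ones
    filtered out (treated as not reliably detected).
    """
    kept = []
    for row in rows:
        if float(row["detection_pval"]) < threshold:
            kept.append(row)
    return kept


# Hypothetical TSV layout; the real column names in the dataset may differ.
example_tsv = (
    "probe_id\tbeta\tdetection_pval\n"
    "cg_example_1\t0.82\t0.001\n"
    "cg_example_2\t0.10\t0.05\n"
    "cg_example_3\t0.47\t0.2\n"
)

rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))
passed = filter_by_detection_p(rows)
print([row["probe_id"] for row in passed])
```

In this sketch a probe whose p-value equals the threshold exactly is excluded, which follows the ">=" in the Filtering Methods row under the stated assumption.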

 

SNP array

Participants/Materials: MSI-H CRC (ICD10: C18, 19, 20): 25 cases
Targets genome wide CNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmni2.5-8 BeadChip]
Source DNAs extracted from tumor tissues and paired non-tumor tissues
Cell Lines -
Library Construction (kit name) Infinium HumanOmni2.5-8 kit
Algorithm for detecting CNVs (software) GenomeStudio (Illumina)
Filtering Methods As described in 'Illumina Infinium Assay Ver.2.1.'
CNV number -
Japanese Genotype-phenotype Archive Dataset ID JGAD000122
Total Data Volume 5.99 TB (tsv)
Comments (Policies) NBDC policy

 

Target Capture

Participants/Materials

Pulmonary pleomorphic carcinoma (NCBI MedGen UID: 87198): 20 cases [50 samples] (tumor tissue: 20 cases [37 samples], non-tumor tissue: 13 cases [13 samples], paired samples from 13 cases)

Colorectal cancer (ICD10: C189): 31 cases [49 samples]

Salivary duct carcinoma (ICD10: C089): 67 cases [67 samples] (fresh-frozen tissue samples: 19, FFPE samples: 48, normal tissue surrounding tumors: 66 samples)

Targets Target Capture
Target Loci for Capture Methods

Todai OncoPanel

http://todaioncopanel.umin.jp/

Platform Illumina [HiSeq 2500]
Library Source DNAs extracted from tumor tissues and paired non-tumor tissues (FFPE)
Cell Lines -
Library Construction (kit name) SureSelectXT Custom kit
Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)

Pulmonary pleomorphic carcinoma: 132 bp or 133 bp

Colorectal cancer: 125 bp

Salivary duct carcinoma: 128 bp, 132 bp, 133 bp, 145 bp

Japanese Genotype-phenotype Archive Dataset ID

Pulmonary pleomorphic carcinoma: JGAD000407

Colorectal cancer: JGAD000446

Salivary duct carcinoma: JGAD000653

Total Data Volume

JGAD000407: 786.9 GB (bam [ref: hg19/hg38])

JGAD000446: 7.5 TB (bam [ref: hg38])

JGAD000653: 970.3 GB (bam [ref: hg38])

Comments (Policies) NBDC policy

 

WGS (JGAS000360)

Participants/Materials

Lung adenocarcinoma (ICD10: C34): 7 cases (tumor tissues and paired non-tumor tissues for 3 cases; tumor tissues only for 4 cases)

MSI-H CRC (ICD10: C18, C19, C20): 12 cases (tumor tissues and paired non-tumor tissues)

Triple negative breast cancer (ICD10: C50): 11 cases (tumor tissues only)

Targets WGS
Target Loci for Capture Methods -
Platform 10x Genomics [Chromium Controller]
Library Source DNAs extracted from tumor and non-tumor tissues
Cell Lines -
Library Construction (kit name) 10x Genomics Chromium Genome Sequencing Solution
Fragmentation Methods 10x Genomics Chromium Genome Sequencing Solution
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 125 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000474
Total Data Volume 3.3 TB (fastq, bed, vcf [ref: hg38])
Comments (Policies) NBDC policy

 

Iso-seq

Participants/Materials

Triple negative breast cancer (ICD10: C50): 14 cases

Estrogen receptor-positive breast cancer (ICD10: C50): 8 cases

Targets Iso-seq
Target Loci for Capture Methods -
Platform PacBio [Sequel]
Library Source RNAs extracted from tumor tissues
Cell Lines -
Library Construction (kit name) SMRTbell Template Prep Kit 1.0
Fragmentation Methods -
Spot Type Single-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) -
Japanese Genotype-phenotype Archive Dataset ID JGAD000457
Total Data Volume 41.2 GB (fastq)
Comments (Policies) NBDC policy

 

WGS (JGAS000335)

Participants/Materials

Colorectal cancer (ICD10: C189): 21 cases (including 3 cases used for WES, 18 cases used for TGS)

Targets WGS
Target Loci for Capture Methods -
Platform Illumina [NovaSeq 6000]
Library Source DNAs extracted from tumor and non-tumor tissues
Cell Lines -
Library Construction (kit name) TruSeq DNA PCR-free Library Prep Kit
Fragmentation Methods Ultrasonic fragmentation (Covaris)
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000446
Total Data Volume 7.5 TB (bam [ref: hg38])
Comments (Policies) NBDC policy

 

DATA PROVIDER

Principal Investigator: Hiroyuki Mano / Shinji Kohsaka

Affiliation: Division of Cellular Signaling, National Cancer Center Research Institute

Project / Group Name: -

URL: https://www.ncc.go.jp/en/ri/division/genetics/index.html

Funds / Grants (Research Project Number):

Name | Title | Project Number
Leading Advanced Projects for medical innovation, The Japan Agency for Medical Research and Development (AMED) | Project for novel therapeutic targets in cancer | -
KAKENHI Grant-in-Aid for Scientific Research (C) | Oncogenic mutations of RAC small GTPases in Human cancers | 26430106
Grant from The Princess Takamatsu Cancer Research Fund | Analysis of molecular mechanisms of oncogenic activity of RAC small GTPase | -
Project for Cancer Research and Therapeutic Evolution (P-CREATE), Japan Agency for Medical Research and Development (AMED) | Elucidation of initiation and progression mechanism of human epithelial tumors towards identification of novel therapeutic targets | JP17cm0106502
Program for an Integrated Database of Clinical and Genomic Information, Japan Agency for Medical Research and Development (AMED) | Establishment of infrastructure for genomic medicine and construction of knowledge database | JP18kk0205003
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Implementation of precision medicine by the application of an innovative method for functional evaluation of oncogenes. | JP18ck0106252
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Comprehensive exploration of biomarkers by high-throughput functional analysis for the strategic drug development | JP20ck0106536
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Informatics for analyzing de novo human genome assemblies | JP19km0405204

 

PUBLICATIONS

No. | Title | DOI | Dataset ID
1 | Integrative analysis of genomic alterations in triple-negative breast cancer in association with homologous recombination deficiency | doi: 10.1371/journal.pgen.1006853 | JGAD000095
2 | Inactivating mutations and hypermethylation of the NKX2-1/TTF-1 gene in non-terminal respiratory unit-type lung adenocarcinomas. | doi: 10.1111/cas.13313 | JGAD000110, JGAD000111
3 | Fusion Kinases Identified by Genomic Analyses of Sporadic Microsatellite Instability-High Colorectal Cancers | doi: 10.1158/1078-0432.CCR-18-1574 | JGAD000122
4 | Genomic profiles of colorectal carcinoma with liver metastases and newly identified fusion genes | doi: 10.1111/cas.14127 | JGAD000139
5 | Identification of Novel CD74-NRG2α Fusion From Comprehensive Profiling of Lung Adenocarcinoma in Japanese Never or Light Smokers | doi: 10.1016/j.jtho.2020.01.021 | JGAD000301
6 | Comprehensive molecular and clinicopathological profiling of desmoid tumors | doi: 10.1016/j.ejca.2020.12.001 | JGAD000376
7 | Comprehensive molecular profiling of pulmonary pleomorphic carcinoma. | doi: 10.1038/s41698-021-00201-3 | JGAD000407
8 | Multi-sample Full-length Transcriptome Analysis of 22 Breast Cancer Clinical Specimens with Long-Read Sequencing | doi: 10.1101/2020.07.15.199851 | JGAD000457
9 | Exploration of predictive biomarkers for postoperative recurrence of stage II/III colorectal cancer using genomic sequencing | doi: 10.1002/cam4.4710 | JGAD000446
10 | Identification of novel prognostic and predictive biomarkers in salivary duct carcinoma via comprehensive molecular profiling | doi: 10.1038/s41698-022-00324-1 | JGAD000653

 

USERS (Controlled-access Data)

Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use
Subhajyoti De | Rutgers Cancer Institute, Rutgers the State University of New Jersey | | | JGAD000095 | 2018/08/06-2021/05/31
Youping Deng | Department of Complementary and Integrative Medicine, University of Hawaii Manoa | | | JGAD000110, JGAD000111 | 2018/10/04-2025/07/10
Tetsuya Sato | Research Department, Miraca Research Institute G.K. | | Search for Lynch syndrome colorectal cancer-related genes using controlled-access clinical sequence data | JGAD000095, JGAD000122 | 2019/08/05-2020/03/31
Kouya Shiraishi | Division of Genome Biology, National Cancer Research Institute | | Elucidation of immune-system networks between host and tumor based on genomic analysis | JGAD000095, JGAD000110, JGAD000111, JGAD000122 | 2019/08/05-2023/03/31
Ruping Sun | Department of laboratory medicine and pathology, University of Minnesota | | Accounting for Copy Number States in Quantifying Tumor Heterogeneity | JGAD000095 | 2020/03/19-2022/08/01
Masaki Mandai | Kyoto University Faculty of Medicine, department of Gynecology and Obstetrics | | Integrated analyses of omics (genomics, transcriptomics, proteomics and metabolomics) associated with clinical variables for developing individualized treatment in gynecological malignancy | JGAD000095, JGAD000110, JGAD000111, JGAD000122 | 2020/03/13-2025/03/31
Hirofumi Nakaoka | Department of Cancer Genome Research, Sasaki Institute | Japan | Genetic analysis of lung cancer | JGAD000110, JGAD000301, JGAD000407 | 2022/08/06-2023/03/31
Michiaki Hamada | Faculty of Science and Engineering, Waseda University | Japan | Construction of RNA-targeted Drug Discovery Database | JGAD000095, JGAD000111, JGAD000122, JGAD000139, JGAD000301, JGAD000376, JGAD000407, JGAD000457 | 2022/12/26-2025/03/31
Nobuyuki Kakiuchi | The Hakubi Center for Advanced Research, Kyoto University | Japan | Comprehensive analysis of genetic aberrations in solid tumors | JGAD000122 | 2023/03/19-2024/03/31
Jin Wang | Sun Yat-sen University Cancer Center | China | Genomics study of desmoid tumor | JGAD000376 | 2023/09/06-2029/12/30
Charles Swanton | The Francis Crick Institute | United Kingdom of Great Britain and Northern Ireland | Studying lung cancer evolution in smokers and never-smokers | JGAD000110, JGAD000111, JGAD000301 | 2024/09/03-2026/05/07