NBDC Research ID: hum0158.v3
SUMMARY
Aims: To investigate genomic alterations of Japanese liver cancers
Methods: DNAs and RNAs were extracted from liver cancer tissues and paired non-cancer (normal) tissues or blood samples. NGS libraries were prepared for whole genome sequencing (WGS) and RNA-seq. Sequencing was performed on Illumina HiSeq or Genome Analyzer instruments.
Participants/Materials: DNAs and RNAs extracted from cancer tissues and normal tissues of Japanese liver cancer patients.
Dataset ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000151 | | Controlled-access (Type I) | 2018/10/22 |
JGAS000151 (Data addition) | NGS (WGS) | Controlled-access (Type I) | 2019/06/21 |
JGAS000151 (Data addition) | bam/gvcf data of NGS (WGS) | Controlled-access (Type I) | 2021/07/13 |
*Data users must submit an application for using NBDC Human Data to access the controlled-access data.
MOLECULAR DATA
Participants/Materials |
liver cancer (ICD10: C220, 221, 227): 258 cases + 5 cases
cancer tissues: 301 samples + 5 samples
paired non-cancer tissues: 265 samples (257 blood samples, 3 liver tissues + 5 blood samples) |
Targets | WGS |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2000, Genome Analyzer IIx, NovaSeq 6000] |
Library Source | DNAs extracted from cancer tissues and paired non-cancer tissues or blood samples from liver cancer patients |
Cell Lines | - |
Library Construction (kit name) | TruSeq DNA LT Sample Prep Kit, TruSeq Nano DNA Low Throughput Library Prep Kit, Paired-End DNA Sample Prep Kit, TruSeq Nano DNA Library Preparation Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100 bp |
QC |
Data with bad base quality and high %GC content were removed.
Alignment: Data matching any of the following conditions were removed.
- Low mapping rate
- Different insert size
- Gender information mismatch between meta-data and genotype data
- Suspected sex chromosome aberration
Genotyping: GATK’s best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR).
- DP/GQ (DP < 5, GQ < 20, DP > 60, GQ < 95)
- Heterozygosity (F >= 0.05)
- Hardy-Weinberg equilibrium (p < 10^-6)
- Repeat & Low Complexity
Principal Component Analysis (PCA): PCA was performed together with individuals included in the 1000 Genomes Project, and outliers from the Japanese cluster were removed.
After these filtering steps, variants located in the regions listed as the HighConfidenceRegion (Genome-In-A-Bottle project) were flagged. |
Deduplication | Picard 2.10.6 |
Calibration for re-alignment and base quality | GATK 3.7 |
Mapping Methods | BWA mem 0.7.12 |
Mapping Quality | Reads with MAPQ<20 were excluded at variant calling with GATK 3.7 HaplotypeCaller |
Reference Genome Sequence | GRCh37/hg19 (hs37d5) |
Coverage (Depth) | HiSeq 2000: 31.8x, Genome Analyzer IIx: 30.0x, NovaSeq 6000: 28.0x |
Detecting Methods for Variation | GATK 3.7 HaplotypeCaller |
SNV Numbers (after QC) |
76,768,387 (Autosomal Chromosomes) 2,898,518 (X Chromosome) |
INDEL Numbers (after QC) |
10,202,908 (Autosomal Chromosomes) 410,435 (X Chromosome) |
Japanese Genotype-phenotype Archive Dataset ID |
JGAD000228 (fastq)
JGAD000404 (bam/vcf files of non-tumor tissues derived from 220 liver cancer patients): Whole genome sequencing data included in JGAD000117 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out following the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical alliance Japan (GEM Japan, GEM-J). |
Total Data Volume | 48 TB (fastq) + 581 GB (fastq) + 25.2 TB (bam, vcf) |
Comments (Policies) | NBDC policy |
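The depth (DP) and genotype quality (GQ) thresholds listed under QC above can be illustrated with a small sketch. It is not the data provider's pipeline. The function name `genotype_passes` is hypothetical, and the way the four thresholds combine is an assumption: here a genotype is rejected when DP < 5, when GQ < 20, or when DP > 60 and GQ < 95 hold together. Treating a missing DP or GQ value as a failure is also an assumption.

```python
from typing import Optional


def genotype_passes(dp: Optional[int], gq: Optional[int]) -> bool:
    """Return True if a genotype call survives the DP/GQ filter.

    Assumed reading of the thresholds in the QC row (not confirmed
    by the source): reject when DP < 5, when GQ < 20, or when
    DP > 60 and GQ < 95 at the same time.
    """
    # Assumption: a call without DP or GQ cannot be evaluated, so it fails.
    if dp is None or gq is None:
        return False
    # Too little read support, or low confidence in the genotype.
    if dp < 5 or gq < 20:
        return False
    # Unusually deep coverage is kept only with a very high GQ.
    if dp > 60 and gq < 95:
        return False
    return True


if __name__ == "__main__":
    # Illustrative (DP, GQ) pairs, not values taken from the dataset.
    for dp, gq in [(30, 99), (4, 99), (30, 19), (70, 90), (70, 99)]:
        print(dp, gq, genotype_passes(dp, gq))
```

In practice DP and GQ would be read from the per-sample FORMAT fields of the gVCF/VCF files; the sketch takes them as plain integers so that it stays independent of any particular VCF library.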
Participants/Materials |
liver cancer (ICD10: C220, 221, 227): 238 cases
cancer tissues: 238 samples
paired non-cancer tissues: 201 samples |
Targets | RNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2000, Genome Analyzer IIx] |
Library Source | RNAs extracted from cancer tissues and paired non-cancer tissues from liver cancer patients |
Cell Lines | - |
Library Construction (kit name) | TruSeq RNA Sample Prep Kit v2 or TruSeq Stranded mRNA Library Prep Kit |
Fragmentation Methods | Heat treatment |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100 bp |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000229 |
Total Data Volume | 3 TB (fastq) |
Comments (Policies) | NBDC policy |
DATA PROVIDER
Principal Investigator: Hidewaki Nakagawa
Affiliation: RIKEN Center for Integrative Medical Sciences
Project / Group Name: -
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
PUBLICATIONS
No. | Title | DOI | Dataset ID |
---|---|---|---|
1 | Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer. | doi: 10.1038/ng.3547 | JGAD000228, JGAD000229 |
2 | Genomic and Transcriptomic Profiling of Combined Hepatocellular and Intrahepatic Cholangiocarcinoma Reveals Distinct Molecular Subtypes. | doi: 10.1016/j.ccell.2019.04.007 | JGAD000228, JGAD000229 |
USERS (Controlled-access Data)
Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
---|---|---|---|---|---|
Kengo Kinoshita | Tohoku Medical Megabank Organization | | Construction of Japanese whole genome database | JGAD000228 | 2019/06/24-2022/03/31 |
Teruhisa Hochin | Faculty of Information and Human Sciences, Kyoto Institute of Technology | | Analysis of mutation of human genome using Machine Learning | JGAD000228, JGAD000229 | 2019/07/09-2021/03/31 |
Kouya Shiraishi | Division of Genome Biology, National Cancer Research Institute | | Elucidation of immune-system networks between host and tumor based on genomic analysis | JGAD000228, JGAD000229 | 2019/08/05-2023/03/31 |
Osamu Ogasawara | Bioinformation and DDBJ Center, National Institute of Genetics | | Evaluation of human genome analysis workflow using JGA/AGD genome data. | JGAD000228, JGAD000229 | 2019/10/11-2024/03/31 |
Kichoon Lee | Department of Animal Sciences, The Ohio State University | | Investigation on allele-specific gene expressions in humans | JGAD000228, JGAD000229 | 2021/05/24-2021/12/31 |
Jinyan Huang | School of Medicine, Zhejiang University | | Comprehensive analysis of alternative splicing in malignant tumors | JGAD000228, JGAD000404 | 2022/03/07-2024/01/01 |
Michiaki Hamada | Faculty of Science and Engineering, Waseda University | Japan | Construction of RNA-targeted Drug Discovery Database | JGAD000229 | 2022/12/26-2025/03/31 |