NBDC Research ID: hum0103.v3
SUMMARY
Aims: To investigate genomic alterations of Japanese biliary tract cancers
Methods: DNA was extracted from biliary tract cancer tissues and paired non-cancer (normal) tissues. NGS libraries were prepared using the TruSeq DNA Sample Prep kit (Illumina) for whole genome sequencing (WGS) and the Nextera Rapid Capture kit (Illumina) or the Agilent SureSelect Exome V5 kit for whole exome sequencing (WES). Sequencing was performed on Illumina HiSeq 2000/2500 or NovaSeq 6000.
Participants/Materials: WGS and WES data of Japanese biliary tract cancer patients.
Dataset ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000109 | | Controlled-access (Type I) | 2018/02/27 |
JGAS000109 (Data addition) | NGS (WGS) | Controlled-access (Type I) | 2019/06/21 |
JGAS000109 (Data addition) | bam/gvcf data of NGS(WGS) | Controlled-access (Type I) | 2021/07/13 |
*Data users need to submit an application for using NBDC Human Data to access the controlled-access data.
MOLECULAR DATA
Participants/Materials | biliary tract cancer (ICD10: C22, 23, 24): 14 cases + 3 cases; cancer tissues: 23 samples + 6 samples; paired non-cancer tissues: 14 samples + 3 samples |
Targets | WGS |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2000/2500, NovaSeq 6000] |
Library Source | DNAs extracted from cancer and paired non-cancer (normal) tissues from biliary tract cancer patients |
Cell Lines | - |
Library Construction (kit name) | TruSeq DNA Sample Prep kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100 bp |
QC | Data with bad base quality and high %GC content were removed. Alignment: data matching any of the following conditions were removed: low mapping rate; different insert size; gender information mismatch between metadata and genotype data; suspected sex chromosome aberration. Genotyping: GATK's best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR): DP/GQ (DP < 5, GQ < 20, DP > 60, GQ < 95); Heterozygosity (F >= 0.05); Hardy-Weinberg equilibrium (p < 10^-6); Repeat & Low Complexity. Principal Component Analysis (PCA): PCA was performed together with individuals included in the 1000 Genomes Project, and outliers from the Japanese cluster were removed. After these filtering steps, variants located in the regions listed as the HighConfidenceRegion (Genome-In-A-Bottle project) were flagged. |
Deduplication | Picard 2.10.6 |
Calibration for re-alignment and base quality | GATK 3.7 |
Mapping Methods | BWA mem 0.7.12 |
Mapping Quality | Reads with MAPQ < 20 were excluded during variant calling with GATK 3.7 HaplotypeCaller |
Reference Genome Sequence | GRCh37/hg19 (hs37d5) |
Coverage (Depth) | HiSeq 2000/2500: 31.8x, NovaSeq 6000: 28.0x |
Detecting Methods for Variation | GATK 3.7 HaplotypeCaller |
SNV Numbers (after QC) | 76,768,387 (Autosomal Chromosomes); 2,898,518 (X Chromosome) |
INDEL Numbers (after QC) | 10,202,908 (Autosomal Chromosomes); 410,435 (X Chromosome) |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000117 (fastq) (included in EGAS00001000678 [EGAD00001000809]); JGAD000403 (bam, vcf): whole genome sequencing data included in JGAD000117 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical alliance Japan (GEM Japan, GEM-J). |
Total Data Volume | 2.4 TB (fastq) + 375 GB (fastq) + 1.6 TB (bam, vcf) |
Comments (Policies) | NBDC policy |
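The QC row above lists a Hardy-Weinberg equilibrium filter (p < 10^-6) without stating which test was used. As a minimal sketch only, the following Python assumes a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts at a biallelic site; the record does not confirm this test, and the function names `hwe_chi2_pvalue` and `passes_hwe` are hypothetical and not part of GATK or the data provider's pipeline.

```python
import math

# Threshold taken from the QC description: variants with p < 10^-6 are removed.
HWE_P_THRESHOLD = 1e-6


def hwe_chi2_pvalue(n_hom_ref, n_het, n_hom_alt):
    """Chi-square (1 d.f.) p-value for deviation from Hardy-Weinberg proportions.

    Assumption: a simple goodness-of-fit test on a biallelic site; the actual
    pipeline may have used a different (e.g. exact) test.
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        return 1.0  # no genotypes, nothing to test
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p                            # alternate allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site, the test is not informative
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe(n_hom_ref, n_het, n_hom_alt, threshold=HWE_P_THRESHOLD):
    """Return True if the variant would be kept under the p-value threshold."""
    return hwe_chi2_pvalue(n_hom_ref, n_het, n_hom_alt) >= threshold
```

For example, genotype counts of 25/50/25 match the expected proportions exactly and would be kept, while 50/0/50 (no heterozygotes at allele frequency 0.5) would be removed under this sketch.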
Participants/Materials | biliary tract cancer (ICD10: C22, 23, 24): 81 cases; cancer tissues: 138 samples; paired non-cancer tissues: 128 samples |
Targets | Exome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2000/2500] |
Library Source | DNAs extracted from cancer and paired non-cancer (normal) tissues from biliary tract cancer patients |
Cell Lines | - |
Library Construction (kit name) | Nextera Rapid Capture kit or SureSelect Exome V5 kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 125 bp |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000118 |
Total Data Volume | 1.7 TB (fastq) |
Comments (Policies) | NBDC policy |
DATA PROVIDER
Principal Investigator: Hidewaki Nakagawa
Affiliation: RIKEN Center for Integrative Medical Sciences
Project / Group Name: -
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
PUBLICATIONS
 | Title | DOI | Dataset ID |
---|---|---|---|
1 | Genomic characterization of biliary tract cancers identifies their driver genes and predisposing mutations. | doi: 10.1016/j.jhep.2018.01.009 | JGAD000117, JGAD000118 |
2 | | | |
USERS (Controlled-Access Data)
Principal Investigator | Affiliation | Research Title | Data in Use (Dataset ID) | Period of Data Use |
---|---|---|---|---|
Tatsuhiro Shibata | Division of Cancer Genomics, National Cancer Center Research Institute | | JGAD000117, JGAD000118 | 2018/09/12-2022/09/30 |
Kengo Kinoshita | Tohoku Medical Megabank Organization | Construction of Japanese whole genome database | JGAD000117 | 2019/06/24-2022/03/31 |