NBDC Research ID: hum0182.v2

SUMMARY

Aims: Identification of structural variations using long-read whole genome sequencing data

Methods: Whole genome sequencing with Nanopore sequencer, HiSeq 2000 and Genome Analyzer IIx (Illumina)

Participants/Materials: DNA samples from 2 Japanese individuals.

1) DNA extracted from normal blood cells of a liver cancer patient (ICGC: RK067 [hum0158])

2) HapMap sample (NA18943)

3) DNA extracted from normal blood cells of 174 liver cancer patients (ICGC: RK001-RK338 [hum0158])

Data Set ID	Type of Data	Criteria	Release Date
JGAS000180	NGS (WGS): RK067	Controlled-Access (Type I)	2019/06/20
DRA008482	NGS (WGS): NA18943	Unrestricted Access	2019/06/20
JGAS000180	NGS (WGS): RK001-RK338	Controlled-Access (Type I)	2020/05/12

*Release Note

* Data users need to submit an application for using NBDC Human Data to access the controlled-access data.

MOLECULAR DATA

JGAS000180 / DRA008482


Participants/Materials	1) RK067 (a liver cancer patient): 1 case 2) NA18943 (HapMap): 1 sample
Targets	WGS
Target Loci for Capture Methods	-
Platform	Nanopore [MinION]
Library Source	1) DNA extracted from blood sample (normal cell) of a liver cancer patient 2) HapMap DNA sample
Cell Lines	https://www.coriell.org/0/Sections/Search/Sample_Detail.aspx?Ref=NA18943&Product=DNA
Library Construction (kit name)	1D Ligation Sequencing Kit (Cat#SQK-LSK108)
Fragmentation Methods	g-TUBE (Covaris)
Spot Type	Single-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	1) 7463 bp 2) 3479 bp
Japan Genotype-Phenotype Archive Data Set ID / DDBJ Sequence Read Archive ID	1) JGAD000261 2) DRA008482
Total Data Volume	1) 128 GB (fastq) 2) 79.7 GB (fastq)
Comments (Policies)	NBDC policy

When research results that include data downloaded from NHA/DRA/JGA are published or presented, the data user must cite the papers related to the data or mention the data in the acknowledgments.

JGAS000180


Participants/Materials	3) RK001-RK338 (liver cancer patients): 174 cases
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2000, Genome Analyzer IIx]
Library Source	DNA extracted from blood sample (normal cell) of liver cancer patients
Cell Lines	-
Library Construction (kit name)	TruSeq DNA LT Sample Prep Kit, TruSeq Nano DNA Low Throughput Library Prep Kit, Paired-End DNA Sample Prep Kit, TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation (Covaris)
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	100 bp
QC/Filtering Methods	-
Deduplication	Picard
Mapping Methods	bwa
Reference Genome Sequence	hg19
Coverage (Depth)	30X
Detecting Methods for Variation	VCMM (Shigemizu et al. Sci Rep (2013))
Detecting Methods for Structural Variation	IMSindel and joint-call recovery method (Shigemizu et al. Sci Rep (2018), Wong et al. Genome Med (2019) [ref. 1])
SNV Numbers (after QC)	5,239,921
SV Numbers (after QC)	4,378
Japan Genotype-Phenotype Archive Data Set ID	JGAD000261
Total Data Volume	3 GB (VCF [ref: hg19])
Comments (Policies)	NBDC policy

DATA PROVIDER

Principal Investigators: Akihiro Fujimoto

Affiliation: Department of Drug Discovery Medicine, Graduate School of Medicine, Kyoto University

Project / Group Name: Japan Agency for Medical Research and Development (AMED)

Funds / Grants (Research Project Number):

Name	Title	Project Number
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED)	Development of advanced data analysis methods for genome sequencing	18km0405207h0003

PUBLICATIONS

	Title	DOI	Data Set ID
1	Identification of intermediate-sized deletions and inference of their impact on gene expression in a human population.	doi: 10.1186/s13073-019-0656-4	JGAD000261 DRA008482

USERS (Controlled-Access Data)

Principal Investigator:	Affiliation:	Data in Use (Data Set ID)	Period of Data Use