NBDC Research ID: hum0174.v6
SUMMARY
Aims: To build a database of genomic structural variants in the Japanese population
Methods: We sequenced genomic DNA using PacBio, 10X Genomics, and Nanopore sequencing technologies, and analyzed genomic structural variations.
Participants/Materials: Japanese (collected by Japanese B cell DNA bank)
| Dataset ID | Type of Data | Criteria | Release Date |
|---|---|---|---|
| JGAS000173 | NGS (WGS): Sequence raw data, Structural Variants data for each sample | Controlled-access (Type I) | 2020/10/06 |
| JGAS000173 (Data addition) | NGS (WGS) | Controlled-access (Type I) | 2020/11/27 |
| JGAS000580 | NGS (WGS) | Controlled-access (Type I) | 2023/06/29 |
| JGAS000286 | NGS (WGS): Sequence raw data, Structural Variants data for each sample | Controlled-access (Type I) | 2023/07/06 |
| JGAS000505 | NGS (WGS): Sequence raw data, haplotype data for each sample | Controlled-access (Type I) | 2023/07/10 |
| JGAS000596 | NGS (WGS) | Controlled-access (Type I) | 2023/12/28 |
* To access the controlled-access data, users must submit an application for using NBDC Human Data.
MOLECULAR DATA
| Participants/Materials: | Purified DNA from Japanese-origin B cell lines: 10 samples |
| Targets | WGS |
| Target Loci for Capture Methods | - |
| Platform | 1. PacBio [Sequel] 2. 10x Genomics [Chromium Controller] |
| Library Source | Purified DNA from Japanese-origin B cell lines |
| Cell Lines | the Health Science Research Resources Bank (HSRRB), the National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) |
| Library Construction (kit name) | 1. the library prep. kit for SMRT sequencing by Pacific Biosciences 2. 10X Genomics-Chromium system |
| Fragmentation Methods | 1. Megaruptor, g-tube 2. None |
| Spot Type | 1. Single-end 2. Paired-end |
| Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 1. 14000 bp 2. 151 bp |
| QC Methods | 1. Qubit, Pulsed-field gel electrophoresis, TapeStation, Bioanalyzer 2. qPCR, Bioanalyzer |
| Mapping Methods | 1. minimap2 2. longranger by 10X Genomics |
| Depth (average) | 1. 29x 2. 19x |
| Structural Variants Detection Methods | 1. Sniffles 2. longranger by 10X Genomics |
| Polymorphism Number (after QC) | 1. 16870/sample 2. 11700/sample |
| Japanese Genotype-phenotype Archive Dataset ID | JGAD000251 |
| Total Data Volume | 1 TB (fastq, bam [ref: unmapped], bed, vcf [ref: hg38]) |
| Comments (Policies) | NBDC policy |
| Participants/Materials: | Purified DNA from Japanese-origin B cell lines: 11 samples |
| Targets | WGS |
| Target Loci for Capture Methods | - |
| Platform | PacBio [Sequel] |
| Library Source | Purified DNA from Japanese-origin B cell lines |
| Cell Lines | the Health Science Research Resources Bank (HSRRB), the National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) |
| Library Construction (kit name) | the library prep. kit for SMRT sequencing by Pacific Biosciences |
| Fragmentation Methods | Megaruptor, g-tube |
| Spot Type | Single-end |
| Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 14000 bp |
| Japanese Genotype-phenotype Archive Dataset ID | JGAD000251 |
| Total Data Volume | 3.44 TB (bam) |
| Comments (Policies) | NBDC policy |
| Participants/Materials: | Purified DNA from Japanese-origin B cell lines: 1 sample |
| Targets | WGS |
| Target Loci for Capture Methods | MHC, LRC, Chr1, SMN1/SMN2 |
| Platform | Nanopore [PromethION] |
| Library Source | Purified DNA from Japanese-origin B cell lines |
| Cell Lines | the Health Science Research Resources Bank (HSRRB), the National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) |
| Library Construction (kit name) | Ultra-Long DNA Sequencing Kit (SQK-ULK001) |
| Fragmentation Methods | Transposase-based |
| Spot Type | Single-end |
| Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 56.2 Kbp ~ 63.8 Kbp (N50) |
| Mapping Methods | minimap2 (v2.24) with "-x map-ont" |
| Mapping Quality | - |
| Reference Genome Sequence | T2T-CHM13v2.0 |
| Coverage (Depth) | 81x ~ 104x (median) |
| Japanese Genotype-phenotype Archive Dataset ID | JGAD000706 |
| Total Data Volume | 1.4 GB (bam) |
| Comments (Policies) | NBDC policy |
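
The mapping row above names minimap2 (v2.24) with the "-x map-ont" preset and the T2T-CHM13v2.0 reference. As a rough illustration only, the sketch below assembles a comparable command line in Python. The file names, the thread count, and the `-a` (SAM output) flag are assumptions for illustration and are not stated in this record; the data provider's actual invocation may have differed.

```python
import shlex
from typing import List


def build_minimap2_command(reference: str, reads: str, threads: int = 8) -> List[str]:
    """Return an argument list for aligning Nanopore reads with minimap2.

    Only the "-x map-ont" preset comes from the record; the remaining
    options are illustrative assumptions.
    """
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return [
        "minimap2",
        "-a",             # assumption: emit SAM rather than the default PAF
        "-x", "map-ont",  # preset quoted in the Mapping Methods row
        "-t", str(threads),  # assumption: thread count is not given in the record
        reference,
        reads,
    ]


if __name__ == "__main__":
    # Hypothetical file names; replace with local paths.
    cmd = build_minimap2_command("chm13v2.0.fa", "sample.fastq.gz")
    print(shlex.join(cmd))
```

The function only builds the argument list, so it can be inspected or logged before being passed to `subprocess.run` on a machine where minimap2 is installed.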
| Participants/Materials: | Purified DNA from Japanese-origin B cell lines: 177 samples (CCS: 112 samples, CLR: 65 samples) |
| Targets | WGS |
| Target Loci for Capture Methods | - |
| Platform | PacBio [Sequel, Sequel II] |
| Library Source | Purified DNA from Japanese-origin B cell lines |
| Cell Lines | the Health Science Research Resources Bank (HSRRB), the National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) |
| Library Construction (kit name) | the library prep. kit for SMRT sequencing by Pacific Biosciences |
| Fragmentation Methods | Megaruptor, g-tube |
| Spot Type | Single-end |
| Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 14000 bp |
| QC Methods | Qubit, Pulsed-field gel electrophoresis, TapeStation, Bioanalyzer |
| Mapping Methods | minimap2 |
| Depth (average) | CCS: 9.5x, CLR: 36x |
| SNV Call | DeepVariant |
| SNV Haplotyping | WhatsHap |
| Structural Variants Detection Methods | pbsv |
| Diploid Assembly | HiCanu |
| Japanese Genotype-phenotype Archive Dataset ID | JGAD000392 |
| Total Data Volume | 31.8 TB (bam, vcf, fasta) |
| Comments (Policies) | NBDC policy |
| Participants/Materials: | Purified DNA from Japanese-origin B cell lines: 177 + 30 samples |
| Targets | WGS |
| Target Loci for Capture Methods | - |
| Platform | PacBio [Sequel II] |
| Library Source | Purified DNA from Japanese-origin B cell lines |
| Cell Lines | the Health Science Research Resources Bank (HSRRB), the National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN) |
| Library Construction (kit name) | the library prep. kit for SMRT sequencing by Pacific Biosciences |
| Fragmentation Methods | Megaruptor |
| Spot Type | Single-end |
| Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 14,949 bp |
| QC Methods | Qubit, NanoDrop, TapeStation, Femto Pulse, Pulsed-field gel electrophoresis |
| Mapping Methods | minimap2 (hg38-no_alt) |
| Depth (average) | CCS: 9.06x |
| SNV Call | DeepVariant |
| SNV Haplotyping | WhatsHap |
| Structural Variants Detection Methods | pbsv |
| Diploid Assembly | HiCanu |
| Japanese Genotype-phenotype Archive Dataset ID | JGAD000622 (177 samples), JGAD000725 (30 samples) |
| Total Data Volume | JGAD000622: 7.5 TB (bam/vcf/contig_fasta for 30 samples, fastq for 147 samples); JGAD000725: 705 GB (fastq) |
| Comments (Policies) | NBDC policy |
DATA PROVIDER
Principal Investigator: Shinichi Morishita
Affiliation: Graduate School of Frontier Sciences, the University of Tokyo
Project / Group Name: -
Funds / Grants (Research Project Number):
| Name | Title | Project Number |
|---|---|---|
| Advanced Genome Research and Bioinformatics Study to Facilitate Medical Innovation, Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Informatics for analyzing de novo human genome assemblies | JP16km0405204 |
| Biobank - Construction and Utilization biobank for genomic medicine REalization, Japan Agency for Medical Research and Development (AMED) | Informatics for analyzing de novo human genome assemblies | JP21tm0424219 |
PUBLICATIONS
| | Title | DOI | Dataset ID |
|---|---|---|---|
| 1 | Rapid and ongoing evolution of repetitive sequence structures in human centromeres. | doi: 10.1126/sciadv.abd9230 | JGAD000251 |
| 2 | JTK: targeted diploid genome assembler | doi: 10.1093/bioinformatics/btad398 | JGAD000706 |
| 3 | A landscape of complex tandem repeats within individual human genomes | doi: 10.1038/s41467-023-41262-1 | JGAD000392 JGAD000622 |
USERS (Controlled-access Data)
| Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
|---|---|---|---|---|---|
| Yuta Kochi | Department of Genomic Function and Diversity, Medical Research Institute, Tokyo Medical and Dental University | Japan | Genetic study of complex diseases through comprehensive analysis of functional genetic variations | JGAD000251 | 2023/04/10-2024/03/31 |
| Yukinori Okada | Department of Statistical Genetics, Osaka University Graduate School of Medicine | Japan | Elucidation of disease etiology by trans-layer omics analysis | JGAD000251 JGAD000392 JGAD000622 JGAD000725 | 2024/03/05-2029/03/31 |