NBDC Research ID: hum0248.v1
SUMMARY
Aims: To construct a Japanese reference genome sequence
Methods: Sequence data were obtained from each of three healthy Japanese males on the PacBio, Bionano, and Illumina HiSeq platforms. De novo assembly was performed for each individual, and the three assemblies were then integrated by meta-assembly. At sites that were polymorphic among the three assemblies, the variant carried by the majority of the assemblies was adopted. Finally, the meta-scaffolds were anchored with markers from genetic and radiation hybrid maps and integrated into a pseudo-chromosome sequence set.
Participants/Materials: three Japanese male individuals
URL: https://www.megabank.tohoku.ac.jp/english/timeline/20190225_01/
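The majority-vote step described in Methods can be illustrated with a minimal Python sketch. It is not the project's implementation (the record names meta-assembly software, not this code): it assumes the three assemblies have already been aligned to equal length, and the function names and the choice of "N" for sites without a majority are hypothetical.

```python
from collections import Counter


def majority_allele(alleles):
    """Return the allele carried by most assemblies, or None if there is a tie."""
    counts = Counter(alleles).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no majority, e.g. three different alleles
    return counts[0][0]


def majority_consensus(aligned_assemblies):
    """Column-wise majority vote over equal-length, pre-aligned sequences.

    Sites without a majority are written as "N" (an assumption of this sketch).
    """
    lengths = {len(seq) for seq in aligned_assemblies}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        majority_allele(column) or "N" for column in zip(*aligned_assemblies)
    )


# Toy example with three hypothetical aligned fragments.
print(majority_consensus(["ACGT", "ACGA", "TCGA"]))
```

With three assemblies, a tie arises only when all three carry different alleles; the record does not say how such sites were resolved.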
Data Set ID | Type of Data | Criteria | Release Date |
---|---|---|---|
AP023461-AP024084 | Japanese reference genome sequence | Un-restricted Access | 2020/10/30 |
*When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments.
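For scripted retrieval, the Data Set ID range can be expanded into individual accession numbers. The sketch below is illustrative only: the helper name is hypothetical, and it assumes that every accession between the two endpoints belongs to this data set, which the record does not state explicitly.

```python
import re


def expand_accession_range(spec):
    """Expand a range such as 'AP023461-AP024084' into a list of accessions."""
    match = re.fullmatch(r"([A-Z]+)(\d+)-([A-Z]+)(\d+)", spec)
    if match is None:
        raise ValueError(f"not an accession range: {spec!r}")
    prefix_a, start, prefix_b, end = match.groups()
    if prefix_a != prefix_b or len(start) != len(end) or int(start) > int(end):
        raise ValueError(f"inconsistent accession range: {spec!r}")
    width = len(start)  # keep the zero padding of the original IDs
    return [f"{prefix_a}{n:0{width}d}" for n in range(int(start), int(end) + 1)]


accessions = expand_accession_range("AP023461-AP024084")
print(len(accessions), accessions[0], accessions[-1])
```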
MOLECULAR DATA
Participants/Materials | 3 Japanese male individuals |
Targets | WGS |
Target Loci for Capture Methods | - |
Platform | PacBio [RS II]; Bionano [Irys, Saphyr]; Illumina [HiSeq 2500] |
Library Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Library Construction (kit name) | PacBio: DNA template prep kit 2.0. Bionano: DNA isolated in gel plugs and treated with Proteinase K and RNase; plugs solubilized with GELase; nick-label-repair with Nt.BspQI and Nb.BssSI for jg1a, or direct labeling and staining for jg1b and jg1c. Illumina: TruSeq DNA PCR-Free HT sample prep kit |
Fragmentation Methods | Illumina: Ultrasonic fragmentation (Covaris) |
Spot Type | Illumina: Paired-end, Mate pair |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | PacBio: 10 kb (mean: jg1a 10,589 bp; jg1b 10,066 bp; jg1c 9,226 bp). Bionano: >146 kb (mean: jg1a.BspQI 318,216 bp; jg1a.BssSI 228,101 bp; jg1b.DLS 169,138 bp; jg1c.DLS 146,026 bp). Illumina: 162 or 259 bp |
QC Methods | PacBio: QC with Falcon software (length_cutoff = 9000, length_cutoff_pr = 15000). Bionano: QC with BionanoSolve software with default settings. Illumina: NA |
Genome Sequence Construction Methods | 1. Highly contiguous de novo assembly: (1) PacBio long reads were de novo assembled to yield primary contigs; (2) Bionano raw data were also de novo assembled, independently of the PacBio assembly, to yield genome maps; (3) the PacBio-derived contigs were scaffolded with the Bionano genome maps. 2. Polishing of the hybrid scaffolds with Illumina paired-end short reads. 3. Integration and gap filling of the hybrid scaffolds of each individual with the aid of Illumina mate-pair short reads. 4. Meta-assembly with Metassembler software. 5. Anchoring of scaffolds to chromosomes with genetic and radiation hybrid maps |
Coverage (Depth) | PacBio: >122× (jg1a: 122×; jg1b: 123×; jg1c: 128×). Bionano: >123× (jg1a.BspQI: 123×; jg1a.BssSI: 140×; jg1b: 160×; jg1c: 175×). Illumina paired-end: >26× (jg1a.162PE: 29×; jg1a.259PE: 26×; jg1b.162PE: 31×; jg1b.259PE: 28×; jg1c.162PE: 31×; jg1c.259PE: 26×). Illumina mate-pair: >12× (jg1a: 13×; jg1b: 12×; jg1c: 12×) |
Variation Detection Methods | SNVs between hs37d5 and JG1 in the autosomes and X chromosome were called using minimap2 and paftools software |
Single Nucleotide Variants Number | 2,501,575 SNVs |
Structural Variants Detection Methods | Genome-to-genome alignment between GRCh38 and JG1 with minimap2 and paftools software |
Structural Variants Number | 8,697 insertions and 6,190 deletions >50 bp in length |
Mass Submission System ID | AP023461-AP024084 |
Total Data Volume | 821 MB (fasta) |
Comments (Policies) | NBDC policy |
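The variant rows above separate SNVs from insertions and deletions longer than 50 bp. A minimal Python sketch of that size-based classification follows; it is not the pipeline used for this data set. The input is assumed to be (reference allele, alternative allele) pairs with "-" standing for an empty allele, and the function names and category labels are hypothetical.

```python
from collections import Counter


def classify_variant(ref, alt, min_sv_len=50):
    """Classify one variant by allele lengths.

    '-' denotes an empty allele (an assumption about the input format).
    Only length differences strictly greater than min_sv_len count as
    structural variants, matching the '>50 bp' criterion in the table.
    """
    ref = "" if ref == "-" else ref
    alt = "" if alt == "-" else alt
    if len(ref) == 1 and len(alt) == 1:
        return "SNV" if ref.upper() != alt.upper() else "none"
    diff = len(alt) - len(ref)
    if diff > min_sv_len:
        return "insertion"
    if diff < -min_sv_len:
        return "deletion"
    return "small_indel_or_other"


def summarize(variants):
    """Count variant classes over an iterable of (ref, alt) pairs."""
    return Counter(classify_variant(ref, alt) for ref, alt in variants)


# Toy input, not real calls from the data set.
toy_calls = [("A", "G"), ("-", "T" * 120), ("C" * 75, "-"), ("-", "AT")]
print(summarize(toy_calls))
```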
DATA PROVIDER
Principal Investigator: Masayuki Yamamoto
Affiliation: Tohoku University School of Medicine, Tohoku Medical Megabank Organization
Project / Group Name: JRGA (Japanese Reference Genome Assembly)
URL: jMorp: https://jmorp.megabank.tohoku.ac.jp/
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
Japan Agency for Medical Research and Development (AMED) | Tohoku Medical Megabank Project (Tohoku University) Special account of the Great East Japan Earthquake disaster recovery | JP20km0105001 |
Japan Agency for Medical Research and Development (AMED) | Tohoku Medical Megabank Project (Tohoku University) General accounting | JP20km0105002 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Facilitation of R&D Platform for AMED Genome Medicine Support | JP20km0405001 |
KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area) | Constructive understanding of multi-scale dynamism of neuropsychiatric disorders | JP19H05200 |
KAKENHI Grant-in-Aid for Scientific Research (C) | NGS analysis of a large genome cohort by deep learning Research Project | JP19K06625 |
PUBLICATIONS
No. | Title | DOI | Data Set ID |
---|---|---|---|
1 | Construction and Integration of Three De Novo Japanese Human Genome Assemblies toward a Population-Specific Reference | doi:10.1101/861658 | AP023461-AP024084 |