NBDC Research ID: hum0386.v1



Aims: Long-read sequencing data have a high error rate, and standard methods for handling them have not been established, so useful bioinformatics analysis methods are needed. Mapping-based analysis and whole-genome-assembly-based variant detection both suffer from errors caused by repetitive sequences and from missing nucleotide sequence information. Insertions are harder to analyze than deletions because many of them derive from repetitive sequences and their sequence information is difficult to obtain. In this study, we developed a novel analysis method for long reads and applied it to whole-genome analysis of two samples to characterize the full set of human insertion sequences and the mechanisms that generate them.

Methods: We sequenced the genomic DNA of NA18943 from a B cell line using a single platform, MinION. The sequencing data totaled 231.6 Gbp (77× coverage).
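The coverage figure above follows from dividing the total sequenced bases by the genome size. The sketch below reproduces that arithmetic; the genome size of 3.0 Gbp is an assumption for illustration (the record does not state which value was used), and the function name `coverage` is hypothetical.

```python
def coverage(total_bases: float, genome_size: float) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


TOTAL_BASES = 231.6e9  # 231.6 Gbp, as stated in the Methods entry
GENOME_SIZE = 3.0e9    # ASSUMPTION: approximate human genome size, not given in the record

depth = coverage(TOTAL_BASES, GENOME_SIZE)
print(f"Estimated mean coverage: {depth:.1f}x")
```

Under this assumption the estimate rounds to the 77× reported in the record; a different reference genome size would shift the value slightly.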

Participants/Materials: NA18943 is a healthy sample used in the HapMap project. We performed whole-genome sequencing of NA18943 using an Oxford Nanopore sequencer.


Dataset ID | Type of Data | Criteria | Release Date
DRA015813 | NGS (WGS) | Unrestricted-access | 2023/04/11

*Release Note

*When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments.




Participants/Materials: NA18943 (HapMap): 1 sample
Targets: WGS
Target Loci for Capture Methods: -
Platform: Nanopore [MinION]
Library Source: HapMap DNA sample
Cell Lines: https://www.coriell.org/0/Sections/Search/Sample_Detail.aspx?Ref=NA18943&Product=DNA
Library Construction (kit name): 1D Ligation Sequencing Kit (Cat#SQK-LSK108)
Fragmentation Methods: g-TUBE (Covaris)
Spot Type: Single-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers): 3.3-11.1 kbp (5.1 kbp on average)
DDBJ Sequence Read Archive ID: DRA015813
Total Data Volume: 470 GB (fastq)
Comments (Policies): NBDC policy
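As a rough consistency check on the figures in this record, the total base count, average read length, and FASTQ volume can be related to one another. The sketch below is illustrative only: it assumes that "GB" means 10^9 bytes and that the 470 GB figure refers to uncompressed FASTQ, neither of which the record states, and the variable names are hypothetical.

```python
TOTAL_BASES = 231.6e9   # 231.6 Gbp, from the Methods entry
MEAN_READ_LEN = 5.1e3   # 5.1 kbp average read length, from this record
FASTQ_BYTES = 470e9     # ASSUMPTION: 470 GB = 470 * 10^9 bytes, uncompressed

# Approximate number of reads implied by total bases and mean read length.
approx_reads = TOTAL_BASES / MEAN_READ_LEN

# FASTQ stores one base character and one quality character per base,
# plus header and separator lines, so about 2 bytes per base is expected.
bytes_per_base = FASTQ_BYTES / TOTAL_BASES

print(f"Approximate read count: {approx_reads / 1e6:.1f} million")
print(f"FASTQ bytes per sequenced base: {bytes_per_base:.2f}")
```

Under these assumptions the record implies roughly 45 million reads and about 2 bytes per base, which is in line with what the FASTQ layout would predict.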



Principal Investigator: Akihiro Fujimoto

Affiliation: Graduate School of Medicine, The University of Tokyo

Project / Group Name: Department of Human Genetics

URL: http://www.humgenet.m.u-tokyo.ac.jp/index.en.html

Funds / Grants (Research Project Number):

Name | Title | Project Number
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Development of advanced data analysis methods for genome sequencing | JP20km0405207
KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas (Research in a proposed research area) | Deciphering Origin and Establishment of Japonesians mainly based on genome sequence data | 18H05511



No. | Title | DOI | Dataset ID
1 | Localized assembly for long reads enables genome-wide analysis of repetitive regions at single-base resolution in human genomes | doi: 10.1186/s40246-023-00467-7 | DRA015813