NBDC Research ID: hum0501.v1

 

SUMMARY

Aims: The number of basal divergences, the extent of admixture, and the degree of isolation among Indigenous North Eurasian and Native South American populations remain debated, with most insights derived from genome-wide genotyping data. This study aims to improve our understanding of the ancient dynamics that shaped the contemporary populations of North Eurasia and the Americas. Using large-scale whole-genome sequencing of 1,537 individuals across 139 ethnic groups from these regions, we identify population structures and clarify the prehistoric migrations and the role of past environments in the diversification of human populations.

Methods: Whole genome sequencing

Participants/Materials: 1,323 healthy individuals

URL: https://www.genomeasia100k.org/

Data Set ID    Type of Data    Criteria                      Release Date
JGAS000781     NGS (WGS)       Controlled-access (Type I)    2025/05/15

*Release Note

*Data users must submit an application for using NBDC Human Data to access the controlled-access data.

 

MOLECULAR DATA

 

JGAS000781

Participants/Materials

1,323 healthy individuals

   313 from ADRC (Asian DNA Repository Consortium)

   486 from RIMG (Research Institute of Medical Genetics SB RAMS in Russia)

   524 from GenomeAsia 100K Project (GenomeAsia 100K Consortium, 2019, Nature)

Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq X]
Library Source gDNA extracted from saliva or peripheral blood
Cell Lines -
Library Construction (kit name) TruSeq DNA Nano
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp x 2
QC

Variants with VQSLOD <0 were excluded.

Multi-allelic SNPs and indels were removed, leaving only biallelic SNPs.

Deduplication SAMBLASTER
Calibration for re-alignment and base quality -
Mapping Methods BWA-MEM v0.7.13
Mapping Quality MAPQ = ~60
Reference Genome Sequence GRCh37
Coverage (Depth) >20X
Detecting Methods for Variation GATK v3.5
SNP Numbers (after QC) 52,589,813
INDEL Numbers (after QC) -
Japanese Genotype-phenotype Archive Dataset ID JGAD000923
Total Data Volume 310.8 GB (vcf, tabix)
Comments (Policies) NBDC policy and Company User Limit
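
The two QC criteria listed above (variants with VQSLOD < 0 excluded; only biallelic SNPs retained) can be illustrated with a minimal Python sketch. This is not the data provider's pipeline. The function names `passes_qc` and `filter_vcf_lines` are hypothetical, the input is assumed to be uncompressed tab-separated VCF text with a `VQSLOD` key in the INFO column, and dropping records that lack `VQSLOD` is an assumption made here.

```python
from typing import Iterable, Iterator

BASES = {"A", "C", "G", "T"}


def passes_qc(vcf_line: str, min_vqslod: float = 0.0) -> bool:
    """Return True if a VCF data line is a biallelic SNP with VQSLOD >= min_vqslod."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt, info = fields[3], fields[4], fields[7]

    # Multi-allelic sites list several ALT alleles separated by commas.
    if "," in alt:
        return False

    # A SNP has single-base REF and ALT alleles; anything else
    # (indels, symbolic alleles) is rejected.
    if ref.upper() not in BASES or alt.upper() not in BASES:
        return False

    # Look up VQSLOD among the semicolon-separated INFO entries.
    vqslod = None
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "VQSLOD":
            vqslod = float(value)

    # Assumption: records without a VQSLOD annotation are dropped.
    if vqslod is None:
        return False
    return vqslod >= min_vqslod


def filter_vcf_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and only those data lines that pass QC."""
    for line in lines:
        if line.startswith("#") or passes_qc(line):
            yield line
```

For files of the size reported here, a dedicated VCF tool would be the practical choice; the sketch only makes the filtering logic explicit.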

 

DATA PROVIDER

Principal Investigator: Hie Lim Kim

Affiliation: Asian School of the Environment, Nanyang Technological University

Project / Group Name: -

Funds / Grants (Research Project Number):

Name Title Project Number
GenomeAsia 100K consortium

 

PUBLICATIONS

Title    DOI    Data Set ID
1 From North Asia to South America: Tracing the longest human migration through genomic sequencing    doi: 10.1126/science.adk5081    JGAD000923

 

USERS (Controlled-access Data)

Principal Investigator    Affiliation    Country/Region    Research Title    Data in Use (Dataset ID)    Period of Data Use