NBDC Research ID: hum0495.v2
SUMMARY
Aims: Atrial fibrillation (AF) is common in older adults, and AF-associated ischemic stroke can lead to reduced quality of life or a bedridden state. This study will collect multi-layered data—including genetic, clinical, physiological, and electrocardiographic information—from AF patients and healthy controls, and will develop AI-based algorithms to stratify the risks of AF onset and stroke. Our goal is to establish a foundation applicable to primary screening through health checkups and IoT devices, personalized preemptive medicine, and drug discovery or new drug development.
Methods: [hum0495.v1.gwas.v1] We performed whole exome sequencing and processed the sequencing data according to the best practices described in the Genome Analysis Toolkit (GATK). We also performed gene-based association tests, specifically burden tests, sequence kernel association test (SKAT), and SKAT-O.
[JGAS000866] Genotyping was performed using SNP arrays, followed by imputation with the 1000 Genomes reference panel, and GWAS was performed using covariate-adjusted logistic regression.
Participants/Materials: 1,176 paroxysmal atrial fibrillation (PAF) patients and 1,172 non-PAF patients
| Dataset ID | Type of Data | Criteria | Release Date |
|---|---|---|---|
| hum0495.v1.gwas.v1 | GWAS for PAF using whole exome sequencing data | Unrestricted-access | 2025/02/13 |
| JGAS000866 | GWAS for PAF | Controlled-access (Type I) | 2026/02/27 |
* To access Controlled-access Data, data users must submit an application for using NBDC Human Data.
* When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or acknowledge the data source in the acknowledgments.
MOLECULAR DATA
| Participants/Materials | PAF (ICD10: I48.0): 1,176 cases; non-PAF (control): 1,172 individuals |
| Targets | Exome / Genome wide SNPs |
| Target Loci for Capture Methods | - |
| Platform | Illumina [NovaSeq 6000] |
| Library Source | DNA extracted from peripheral blood cells |
| Cell Lines | - |
| Library Construction (kit name) | SureSelectXT Kit |
| Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
| Spot Type | Paired-end |
| Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
| Genotype Call Methods (software) | GATK HaplotypeCaller |
| Association Analysis & Meta Analysis (software) | Burden, SKAT, SKAT-O (R package SKAT) |
| Filtering Methods | Sample QC: (1) Samples with a call rate < 0.97 were excluded. (2) Samples with sex mismatches were excluded. (3) One sample from each pair of second-degree or closer relatives (kinship coefficient > 0.088) was removed. (4) Samples with outliers in sample size, heterozygosity, and missing rates were excluded. Variant QC: (1) genotype quality >= 20; (2) depth >= 10; (3) allele balance; (4) variant call rate >= 0.97; (5) Hardy-Weinberg equilibrium P-values > 1 × 10⁻⁸; (6) PCA |
| Marker Number (after QC) | 518,621 |
| NBDC Dataset ID | (Click the gwas number to download files) |
| Total Data Volume | 822 KB (tsv) |
| Comments (Policies) | NBDC policy |
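
The numeric Variant QC thresholds listed for the exome dataset above can be read as a simple keep/drop rule. The following is a minimal sketch only, not the data provider's pipeline: the class and function names are hypothetical, and treating each threshold as a per-variant summary value is an assumption, since the record does not say whether genotype quality and depth were applied per genotype or per variant. Allele balance and PCA are left out because no thresholds are given for them.

```python
from dataclasses import dataclass

# Thresholds copied from the Variant QC row; the constant names are hypothetical.
MIN_GENOTYPE_QUALITY = 20
MIN_DEPTH = 10
MIN_CALL_RATE = 0.97
MIN_HWE_P = 1e-8


@dataclass
class VariantStats:
    """Hypothetical per-variant summary values used only for this illustration."""
    genotype_quality: float
    depth: float
    call_rate: float
    hwe_p: float


def passes_variant_qc(v: VariantStats) -> bool:
    """Return True if the variant meets every numeric threshold listed in the record."""
    return (
        v.genotype_quality >= MIN_GENOTYPE_QUALITY
        and v.depth >= MIN_DEPTH
        and v.call_rate >= MIN_CALL_RATE
        and v.hwe_p > MIN_HWE_P  # strict inequality, as written in the record
    )
```

A variant with genotype quality 30, depth 25, call rate 0.99, and HWE P-value 0.5 would pass under this reading, while the same variant with genotype quality 19 would not.
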
| Participants/Materials | PAF (ICD10: I48.0): 1,038 cases; non-PAF (control): 744 individuals |
| Targets | Genome wide SNPs |
| Target Loci for Capture Methods | - |
| Platform | Illumina [Infinium Asian Screening Array] |
| Source | DNA extracted from peripheral blood cells |
| Cell Lines | - |
| Reagents (Kit, Version) | Infinium Asian Screening Array-24 v1.0 BeadChip |
| Genotype Call Methods (software) | Genotyping: GenomeStudio; haplotype phasing: SHAPEIT2; imputation: Minimac3; imputation reference: 1000 Genomes panel |
| Association Analysis (software) | PLINK v1.9 |
| Filtering Methods | Sample QC: We excluded samples with (1) sample call rate < 0.97; (2) sex mismatches; (3) excess heterozygosity > mean ± 3 SD; (4) relatedness with PI_HAT > 0.185; (5) outlier status relative to East Asian clusters in principal component analysis with 1000 Genomes Project samples. Variant QC: We excluded variants with (1) variant call rate < 0.95; (2) Hardy-Weinberg equilibrium P < 1.0 × 10⁻⁶; (3) minor allele frequency < 0.01; (4) non-autosomal location. Post-imputation QC: minor allele frequency < 0.01 and imputation score (Rsq) < 0.3 |
| Marker Number (after QC) | 8,094,202 SNVs |
| Japanese Genotype-phenotype Archive Dataset ID | JGAD001009 |
| Total Data Volume | 971.9 MB (csv) |
| Comments (Policies) | NBDC policy |
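
The heterozygosity criterion in the Sample QC row above ("mean ± 3 SD") can be illustrated with a short sketch. This is an assumption-laden reading rather than the provider's code: the function name is hypothetical, the rule is interpreted as flagging samples on either side of the mean, and the sample standard deviation is used because the record does not specify which estimator was applied.

```python
from statistics import mean, stdev
from typing import Dict, Set


def heterozygosity_outliers(rates: Dict[str, float], n_sd: float = 3.0) -> Set[str]:
    """Return IDs of samples whose heterozygosity rate lies outside mean ± n_sd * SD.

    `rates` maps a sample ID to its heterozygosity rate (hypothetical input layout).
    At least two samples are required, because the sample SD is undefined otherwise.
    """
    values = list(rates.values())
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        # All samples have identical rates, so nothing can be an outlier.
        return set()
    return {sample for sample, rate in rates.items() if abs(rate - mu) > n_sd * sd}
```

Note that in very small cohorts no sample can deviate by more than 3 SD from the mean, so a rule of this kind is only meaningful with a reasonably large number of samples.
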
DATA PROVIDER
Principal Investigator: Toshihiro Tanaka
Affiliation: Department of Human Genetics and Disease Diversity, Institute of Science Tokyo
Project / Group Name: -
Funds / Grants (Research Project Number):
| Name | Title | Project Number |
|---|---|---|
| Project for Medical Device and Healthcare, Japan Agency for Medical Research and Development (AMED) | Establishment of intelligent infrastructure of prevention and detection of atrial fibrillation | JP21he2102002 |
PUBLICATIONS
| | Title | DOI | Dataset ID |
|---|---|---|---|
| 1 | Rare genetic variants involved in increased risk of paroxysmal atrial fibrillation in a Japanese population | doi: 10.1038/s41598-025-97794-7 | hum0495.v1.gwas.v1 |
USERS (Controlled-access Data)
| Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
|---|---|---|---|---|---|