NBDC Research ID: hum0197.v6
SUMMARY
Aims: Elucidation of disease biology based on trans-omics analysis, GWAS in the Japanese and trans-ethnic populations
Methods: Metagenome shotgun sequencing, genome-wide association study (GWAS), small RNA-seq and eQTL analyses
Participants/Materials:
Metagenomic data of gut microbiome in the Japanese population (95 + 103 + 227 + 30 individuals)
Autoimmune pulmonary alveolar proteinosis cases: 198, Control participants: 395
Populations: Biobank Japan (n = 179,000), UK Biobank (n = 361,000), and FinnGen (n = 136,000); Phenotypes: 215
141 Japanese individuals
Data Set ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000205 | Metagenome | Controlled Access (Type I) | 2019/11/15 |
hum0197.v2.gwas.v1 | GWAS for autoimmune pulmonary alveolar proteinosis | Un-restricted Access | 2020/11/27 |
JGAS000260 | Metagenome | Controlled Access (Type I) | 2020/11/27 |
hum0197.v3.gwas.v1 | GWAS for 215 phenotypes | Un-restricted Access | 2021/03/22 |
JGAS000316 | Metagenome | Controlled Access (Type I) | 2021/10/12 |
JGAS000415 | Metagenome | Controlled Access (Type I) | 2021/12/10 |
hum0197.v5.gwas.v1 | GWAS for 10 phenotypes | Un-restricted Access | 2021/12/21 |
hum0197.v5.finemap.v1 | Fine-mapping for 79 phenotypes | Un-restricted Access | 2021/12/21 |
JGAS000504 | Read count data of miRNA | Controlled Access (Type I) | 2022/02/08 |
hum0197.v6.eqtl.v1 | eQTL data | Un-restricted Access | 2022/02/08 |
*Data users must submit an application for Using NBDC Human Data to access the Controlled-access Data.
*When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments.
MOLECULAR DATA
Participants/Materials: | 95 Japanese individuals |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Data set ID | JGAD000290 |
Total Data Volume | 477 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials | Autoimmune pulmonary alveolar proteinosis cases (ICD10: J840): 198; Control participants: 395 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) | GenomeStudio for genotyping, shapeit2 for haplotype phasing, and minimac3 for imputation |
Association Analysis (software) | PLINK2 |
Filtering Methods | Sample QC: We excluded samples with low genotyping call rates (call rate < 98%) and samples in close genetic relation (PI_HAT > 0.175). We included samples of estimated East Asian ancestry. Variant QC: We excluded variants with (1) genotyping call rate < 98%, (2) P value for Hardy–Weinberg equilibrium < 1.0 × 10⁻⁶, (3) minor allele count < 5, or (4) > 10% frequency difference with the imputation reference panel. |
Marker Number (after QC) | 12,153,232 autosomal variants and 242,876 X-chromosomal variants after QC. |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 390 MB for autosomes (txt.gz) and 19 MB for the X chromosome (txt.gz) |
Comments (Policies) | NBDC policy |
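The variant QC thresholds listed above can be sketched as a simple predicate. This is only an illustration, not the pipeline used to produce the data set: the `Variant` record and its field names are hypothetical, and the "> 10% frequency difference" criterion is assumed here to mean an absolute allele-frequency difference greater than 0.10.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant summary; field names are illustrative only."""
    variant_id: str
    call_rate: float           # fraction of samples with a genotype call
    hwe_p: float               # Hardy-Weinberg equilibrium P value
    minor_allele_count: int
    freq: float                # allele frequency in the study samples
    ref_panel_freq: float      # allele frequency in the imputation reference panel


def passes_variant_qc(v: Variant) -> bool:
    """Return True if the variant survives the four exclusion criteria."""
    if v.call_rate < 0.98:                      # (1) genotyping call rate < 98%
        return False
    if v.hwe_p < 1.0e-6:                        # (2) HWE P value < 1.0e-6
        return False
    if v.minor_allele_count < 5:                # (3) minor allele count < 5
        return False
    if abs(v.freq - v.ref_panel_freq) > 0.10:   # (4) assumed: absolute difference > 0.10
        return False
    return True
```

In practice these filters would normally be applied with the genotype QC tooling itself; the function only makes the combination of thresholds explicit.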
Participants/Materials: | 103 Japanese individuals |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Data set ID | JGAD000363 |
Total Data Volume | 408 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials | Biobank Japan (n = 179,000), UK Biobank (n = 361,000), FinnGen (n = 136,000); No. of phenotypes: 215 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array]; FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array or other genotyping arrays] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array; FinnGen: FinnGen1 ThermoFisher Array or other genotyping arrays |
Genotype Call Methods (software) | BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4; FinnGen: beagle4.1 |
Association Analysis (software) | For binary traits, SAIGE software was used with age, age², sex, age×sex, age²×sex, and top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM or PLINK software was used with the same covariates. |
Filtering Methods | BBJ: We included imputed variants with Rsq > 0.7. UK Biobank: We excluded the variants with (i) INFO score ≤ 0.8, (ii) MAF ≤ 0.0001 (except for missense and protein-truncating variants annotated by VEP, which were excluded if MAF ≤ 1 × 10⁻⁶), and (iii) HWE P value ≤ 1 × 10⁻¹⁰. FinnGen: We excluded variants with an imputation INFO score < 0.8 or MAF < 0.0001. |
Marker Number (after QC) | BBJ: 13,530,797 variants; UK Biobank: 13,791,467 variants; FinnGen: 16,859,359 variants |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | BBJ: ~1.5 GB for autosomes and ~33 MB for chrX; UK Biobank: ~1.5 GB for autosomes and ~15 MB for chrX; FinnGen: ~740 MB for autosomes and ~20 MB for chrX |
Comments (Policies) | NBDC policy |
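The covariate set named in the association analysis row (age, age², sex, age×sex, age²×sex, and the top 20 principal components) can be written out explicitly. The sketch below only shows how such a covariate row might be assembled per individual; the function name and the 0/1 coding of sex are assumptions, and the actual model fitting was done in SAIGE, BOLT-LMM, or PLINK.

```python
from typing import List, Sequence

N_PCS = 20  # top 20 principal components, as stated in the record


def build_covariates(age: float, sex: int, pcs: Sequence[float]) -> List[float]:
    """Assemble one individual's covariate row.

    sex is assumed to be coded 0/1; pcs must hold exactly 20 values.
    Order: age, age^2, sex, age*sex, age^2*sex, PC1..PC20.
    """
    if len(pcs) != N_PCS:
        raise ValueError(f"expected {N_PCS} principal components, got {len(pcs)}")
    age_sq = age * age
    return [age, age_sq, sex, age * sex, age_sq * sex, *pcs]
```

Each row therefore has 25 entries: five age and sex terms followed by the 20 principal components.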
Participants/Materials: | 227 Japanese individuals |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000, NovaSeq 6000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Data set ID | JGAD000427 |
Total Data Volume | 881.2 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials: | 30 Japanese individuals |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Data set ID | JGAD000532 |
Total Data Volume | 106.7 GB (fastq) |
Comments (Policies) | NBDC policy |
hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1
Participants/Materials | Biobank Japan (n = 179,000); No. of phenotypes: 79 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip |
Genotype Call Methods (software) | Eagle, Minimac3 |
Association Analysis (software) | GWAS: For binary traits, SAIGE software was used with age, age², sex, age×sex, age²×sex, and top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM was used with the same covariates. Fine-mapping: FINEMAP and SuSiE were used with GWAS summary statistics and in-sample dosage LD, allowing up to 10 causal variants per region. |
Filtering Methods | GWAS: We included imputed variants with Rsq > 0.7. For binary traits, variants with MAC < 10 were additionally excluded. Fine-mapping: We defined fine-mapping regions based on a 3 Mb window around each lead variant and merged regions if they overlapped. We excluded the major histocompatibility complex (MHC) region (chr 6: 25–36 Mb) from analysis due to extensive LD structure in the region. For each method, we only included variants from successfully fine-mapped regions while excluding those from failed regions (e.g., due to conversion failure or available memory restrictions). |
Marker Number (after QC) | 13,531,752 variants (ref: hg19) |
NBDC Data Set ID |
hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1 (Click the Data Set ID to download the file) |
Total Data Volume | 14 GB |
Comments (Policies) | NBDC policy |
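The region definition described in the filtering row above (a 3 Mb window around each lead variant, merging of overlapping windows, and exclusion of the MHC region) can be sketched as an interval merge. Two points are assumptions for illustration: the 3 Mb window is taken as ±1.5 Mb centered on the lead variant, and any merged region overlapping chr 6: 25–36 Mb is dropped entirely. The function name is hypothetical.

```python
from typing import Iterable, List, Tuple

HALF_WINDOW = 1_500_000                   # assumed: 3 Mb window = +/- 1.5 Mb
MHC = ("6", 25_000_000, 36_000_000)       # chr 6: 25-36 Mb, as stated in the record

Region = Tuple[str, int, int]


def define_regions(lead_variants: Iterable[Tuple[str, int]]) -> List[Region]:
    """Build merged fine-mapping regions from (chromosome, position) lead variants."""
    windows = sorted(
        (chrom, max(0, pos - HALF_WINDOW), pos + HALF_WINDOW)
        for chrom, pos in lead_variants
    )
    merged: List[List] = []
    for chrom, start, end in windows:
        last = merged[-1] if merged else None
        if last is not None and last[0] == chrom and start <= last[2]:
            last[2] = max(last[2], end)   # overlapping window: extend the region
        else:
            merged.append([chrom, start, end])
    # Drop regions that overlap the MHC (assumed handling of the exclusion).
    return [
        (chrom, start, end)
        for chrom, start, end in merged
        if not (chrom == MHC[0] and start < MHC[2] and end > MHC[1])
    ]
```

For example, lead variants at 10 Mb and 11 Mb on the same chromosome would fall into a single merged region under these assumptions, while a lead variant at chr 6: 30 Mb would produce no region.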
Participants/Materials: | 141 Japanese individuals |
Targets | small RNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | SMARTer smRNA-Seq Kit |
Fragmentation Methods | - |
Spot Type | Single-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100 bp |
Mapping Methods | bowtie (GRCh37) |
Detecting method for read count (software) | featureCounts + miRbase v22 |
QC | We performed adapter trimming using Cutadapt v1.8 and removed reads with a low quality score (Phred quality score < 20 in >20% of total bases) using fastp v0.20.0. Also, we removed reads with a length of >29 bp or <15 bp, which are not expected to be mature miRNAs. Mature miRNAs detected with ≥1 read in at least half of the individuals were included in the dataset. |
miRNA number | 343 |
Japanese Genotype-phenotype Archive Data set ID | JGAD000621 |
Total Data Volume | 54.7 KB (txt) |
Comments (Policies) | NBDC policy |
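Two of the QC rules above lend themselves to a short sketch: the read-length filter (reads longer than 29 bp or shorter than 15 bp are removed) and the inclusion rule for mature miRNAs (≥1 read in at least half of the individuals). The function names and the dictionary layout are hypothetical; adapter trimming and quality filtering were done with Cutadapt and fastp and are not reproduced here.

```python
from typing import Dict, List, Sequence

MIN_LEN, MAX_LEN = 15, 29  # reads of <15 bp or >29 bp are removed


def keep_read(sequence: str) -> bool:
    """Keep a trimmed read only if its length is within 15-29 bp."""
    return MIN_LEN <= len(sequence) <= MAX_LEN


def detected_mirnas(counts: Dict[str, Sequence[int]]) -> List[str]:
    """Return miRNAs with >=1 read in at least half of the individuals.

    counts maps a miRNA name to its per-individual read counts.
    """
    kept = []
    for name, per_individual in counts.items():
        n_detected = sum(1 for c in per_individual if c >= 1)
        if 2 * n_detected >= len(per_individual):  # "at least half"
            kept.append(name)
    return kept
```

The comparison `2 * n_detected >= n` avoids fractional thresholds when the number of individuals is odd, as it is for the 141 individuals in this data set.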
Participants/Materials | 141 Japanese individuals |
Targets | eQTL |
Target Loci for Capture Methods | - |
Platform | small RNA-seq: Illumina [HiSeq 2500]; WGS: Illumina [HiSeq X Ten] |
Library Source | Read count data of JGAS000504 and whole genome sequencing data using genomic DNA extracted from whole blood |
Cell Lines | - |
Reagents (Kit, Version) | small RNA-seq: See JGAS000504; WGS: TruSeq DNA PCR-Free Library Preparation Kit |
Genotype Call / Detecting read count Methods (software) | See JGAS000504 for read count data. WGS: Sequenced reads were aligned against the reference human genome with the decoy sequence (GRCh37, human_g1k_v37_decoy) using BWA-MEM v0.7.13. |
QC | See JGAS000504 for read count data. WGS: We removed variants (i) with low genotyping call rates (< 0.90), (ii) with ExcessHet > 60, or (iii) with Hardy–Weinberg P value < 1.0 × 10⁻¹⁰. Genotype refinement was performed using Beagle v5.1. |
Marker Number (after QC) | See JGAS000504 for read count data. WGS: 12,171,854 variants |
eQTL algorithm | We analyzed the association between genetic variants with minor allele frequency (MAF) ≥ 0.01 within a cis-window around each miRNA (±1 Mb of the mature miRNA) and normalized expression values using MatrixEQTL v2.3. |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 1.1 MB (txt) |
Comments (Policies) | NBDC policy |
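The variant selection step described in the eQTL algorithm row (MAF ≥ 0.01 and location within ±1 Mb of the mature miRNA) can be sketched as follows. This does not reimplement MatrixEQTL; it only shows which variants would enter the cis test for one miRNA. The tuple layouts and function name are hypothetical, and the window is assumed to extend 1 Mb upstream of the miRNA start and 1 Mb downstream of its end.

```python
from typing import Iterable, List, Tuple

CIS_WINDOW = 1_000_000  # +/- 1 Mb around the mature miRNA
MIN_MAF = 0.01          # minor allele frequency threshold

MiRNA = Tuple[str, int, int]             # (chromosome, start, end)
VariantRow = Tuple[str, str, int, float]  # (variant id, chromosome, position, MAF)


def cis_variants(mirna: MiRNA, variants: Iterable[VariantRow]) -> List[str]:
    """Return IDs of variants eligible for the cis-eQTL test of one miRNA."""
    chrom, start, end = mirna
    lower, upper = start - CIS_WINDOW, end + CIS_WINDOW
    return [
        variant_id
        for variant_id, v_chrom, pos, maf in variants
        if v_chrom == chrom and lower <= pos <= upper and maf >= MIN_MAF
    ]
```

The selected variants would then be tested against normalized expression values by the eQTL software.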
DATA PROVIDER
Principal Investigator: Yukinori Okada
Affiliation: Department of Statistical Genetics, Osaka University Graduate School of Medicine
Project / Group Name: -
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
Precursory Research for Innovative Medical care (PRIME), Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED) | Crosstalk among microbiome, host, disease, and drug discovery enhanced by statistical genetics | JP19gm6010001 |
FORCE, Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED) | Elucidation of disease-specific microbiota and personalized medicine by metagenome-wide association studies | JP20gm4010006 |
Practical Research Project for Rare / Intractable Diseases, Japan Agency for Medical Research and Development (AMED) | Biology and in silico drug repositioning of pulmonary alveolar proteinosis using trans-layer omics analysis | JP20ek0109413 |
Practical Research Project for Allergic Diseases and Immunology, Japan Agency for Medical Research and Development (AMED) | Nucleic genome drug discovery for autoimmune diseases through in-silico and patient-oriented screening utilizing large-scale disease genetics | JP19ek0410041 |
Practical Research Project for Allergic Diseases and Immunology, Japan Agency for Medical Research and Development (AMED) | Genomic prediction medicine of rheumatoid arthritis based on comprehensive immune-omics resources | JP21ek0410075 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Implementation of genomic prediction medicine based on statistical genetics | JP21km0405211 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Next-generation genomics analyses elucidates biology, personalized medicine, and drug discovery of psoriasis | JP21km0405217 |
KAKENHI Grant-in-Aid for Scientific Research (A) | Elucidation of disease biology and tissue specificity by trans-layer omics analysis and whole-genome sequencing | 19H01021 |
PUBLICATIONS
No. | Title | DOI | Data Set ID |
---|---|---|---|
1 | Metagenome-wide association study of gut microbiome revealed novel aetiology of rheumatoid arthritis in the Japanese population. | doi: 10.1136/annrheumdis-2019-215743 | JGAD000290 |
2 | Genetic determinants of risk in autoimmune pulmonary alveolar proteinosis. | doi: 10.1038/s41467-021-21011-y | hum0197.v2.gwas.v1 |
3 | A metagenome-wide association study of gut microbiome in patients with multiple sclerosis revealed novel disease pathology. | doi: 10.3389/fcimb.2020.585973 | JGAD000363 |
4 | A global atlas of genetic associations of 220 deep phenotypes | doi: 10.1101/2020.10.23.20213652 | hum0197.v3.gwas.v1 |
5 | Metagenome-wide association study revealed disease-specific landscape of the gut microbiome of systemic lupus erythematosus in Japanese | doi: 10.1136/annrheumdis-2021-220687 | JGAD000427 |
6 | Whole gut virome analysis of 476 Japanese revealed a link between phage and autoimmune disease | doi: 10.1136/annrheumdis-2021-221267 | JGAD000532 |
7 | Insights from complex trait fine-mapping across diverse populations | doi: 10.1101/2021.09.03.21262975 | hum0197.v5.gwas.v1, hum0197.v5.finemap.v1 |
8 | Genetic architecture of microRNA expression and its link to complex diseases in the Japanese population. | doi: 10.1093/hmg/ddab361 | JGAD000621, hum0197.v6.eqtl.v1 |
USERS (Controlled-Access Data)
Principal Investigator | Affiliation | Research Title | Data in Use (Data Set ID) | Period of Data Use |
---|---|---|---|---|