NBDC Research ID: hum0197.v18
SUMMARY
Aims: Elucidation of disease biology based on trans-omics analysis, GWAS in the Japanese and trans-ethnic populations, Elucidation of the mechanism of COVID-19 severity
Methods: Metagenome shotgun sequencing, genome-wide association study (GWAS), small RNA-seq and eQTL analyses
Participants/Materials:
Metagenomic data of gut microbiome in the Japanese population (95 + 103 + 227 + 30 + 136 individuals)
Autoimmune pulmonary alveolar proteinosis cases: 198, Control participants: 395
Populations: Biobank Japan (n = 179,000), UK biobank (n = 361,000), and FinnGen (n = 136,000), Phenotypes: 220
141 Japanese individuals
Metagenomic data of gut microbiome in Inflammatory Bowel Disease (35 Ulcerative Colitis and 39 Crohn's disease) and 40 Healthy controls
Intracranial germ cell tumors cases: 133, Control participants: 762
Populations: Biobank Japan (n = 161,801) and UK biobank (n = 377,583), Phenotypes: 9
PBMC from Japanese population (COVID-19: n = 30 + 43, Healthy controls: n = 31 + 44)
Microbial genome: Metagenome-Assembled Genome (MAG), Viral genome, CRISPR spacers
Metagenomic data of gut microbiome in the Japanese population (88 + 5 individuals) and healthy individuals (n = 73)
BioBank Japan (n=180,215), UK Biobank (n=377,441), and large-scale meta-analysis including the summary statistics of other cohorts [FinnGen, Breast Cancer Association Consortium (BCAC), and Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL)] for breast and prostate cancer (n=648,746 and 482,080), Phenotypes: 15
Hunner-type interstitial cystitis cases: 144, Control participants: 41,516
524 Japanese individuals for gut microbiome–host genome association analysis, 362 Japanese individuals for plasma metabolite–host genome association analysis
Dataset ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000205 | Metagenome | Controlled-access (Type I) | 2019/11/15 |
hum0197.v2.gwas.v1 | GWAS for autoimmune pulmonary alveolar proteinosis | Unrestricted-access | 2020/11/27 |
JGAS000260 | Metagenome | Controlled-access (Type I) | 2020/11/27 |
hum0197.v3.gwas.v1 | GWAS for 215 phenotypes | Unrestricted-access | 2021/03/22 |
JGAS000316 | Metagenome | Controlled-access (Type I) | 2021/10/12 |
JGAS000415 | Metagenome | Controlled-access (Type I) | 2021/12/10 |
hum0197.v5.gwas.v1 | GWAS for 10 phenotypes | Unrestricted-access | 2021/12/21 |
hum0197.v5.finemap.v1 | Fine-mapping for 79 phenotypes | Unrestricted-access | 2021/12/21 |
JGAS000504 | Read count data of miRNA | Controlled-access (Type I) | 2022/02/08 |
hum0197.v6.eqtl.v1 | eQTL data | Unrestricted-access | 2022/02/08 |
JGAS000530 | Metagenome | Controlled-access (Type I) | 2022/05/23 |
JGAS000531 | Metagenome | Controlled-access (Type I) | 2022/06/03 |
hum0197.v9.gwas.GCT.v1 | GWAS for intracranial germ cell tumors | Unrestricted-access | 2022/06/10 |
hum0197.v10.gwas.v1 | GWAS for 9 phenotypes | Unrestricted-access | 2022/06/16 |
JGAS000543 | Raw sequencing data of single-cell RNA-seq | Controlled-access (Type I) | 2022/07/21 |
hum0197.v12 | MAG, Viral genome and CRISPR spacers of Microbial genome | Unrestricted-access | 2022/12/01 |
JGAS000543 (data addition) | clinical data | Controlled-access (Type I) | 2023/02/14 |
JGAS000593 | Raw sequencing data of single-cell RNA-seq, clinical data | Controlled-access (Type I) | 2023/02/14 |
hum0197.v3.gwas.v1 (data addition) | GWAS for 5 phenotypes | Unrestricted-access | 2023/02/16 |
JGAS000600 | Metagenome | Controlled-access (Type I) | 2023/03/29 |
hum0197.v16.gwas.v1 | GWAS for 15 phenotypes | Unrestricted-access | 2023/06/06 |
hum0197.v17.hic-gwas.v1 | GWAS for Hunner-type interstitial cystitis | Unrestricted-access | 2023/06/27 |
hum0197.v18.gwas.v1 | GWAS for gut microbiome, GWAS for plasma metabolite, GWAS for KEGG Gene Ortholog and KEGG Pathway | Unrestricted-access | 2023/10/02 |
* Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data.
* When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments.
MOLECULAR DATA
JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531
Participants/Materials: |
95+103+227+30+136 Japanese individuals Inflammatory Bowel Disease 35 Ulcerative Colitis 39 Crohn's disease 40 Healthy controls |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000, NovaSeq 6000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Dataset ID |
JGAD000290 (95 Japanese individuals) JGAD000363 (103 Japanese individuals) JGAD000427 (227 Japanese individuals) JGAD000532 (30 Japanese individuals) JGAD000649 (Inflammatory Bowel Disease) JGAD000650 (136 Japanese individuals) |
Total Data Volume |
JGAD000290:477 GB(fastq) JGAD000363:408 GB(fastq) JGAD000427:881.2 GB(fastq) JGAD000532:106.7 GB(fastq) JGAD000649:374.6 GB (fastq) JGAD000650:541.4 GB(fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
Autoimmune pulmonary alveolar proteinosis cases (ICD10: J840): 198 Control participants: 395 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) | GenomeStudio for genotyping, shapeit2 for haplotype phasing, and minimac3 for imputation |
Association Analysis (software) | PLINK2 |
Filtering Methods |
Sample QC: We excluded samples with low genotyping call rates (call rate < 98%) and samples in close genetic relation (PI_HAT > 0.175). We included samples of estimated East Asian ancestry. Variant QC: We excluded variants with (1) genotyping call rate < 98%, (2) P value for Hardy–Weinberg equilibrium < 1.0 × 10^−6, (3) minor allele count < 5, or (4) > 10% frequency difference with the imputation reference panel. |
Marker Number (after QC) | 12,153,232 autosomal variants and 242,876 X-chromosomal variants after QC. |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 390MB for autosome (txt.gz) and 19MB for X chromosome (txt.gz) |
Comments (Policies) | NBDC policy |
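The variant QC criteria listed for this GWAS can be read as a single pass/fail predicate. The following is a minimal sketch under stated assumptions, not the pipeline used by the data provider (the record names GenomeStudio, shapeit2, minimac3, and PLINK2). The `Variant` record and its field names are hypothetical, and the "> 10% frequency difference" criterion is assumed to mean an absolute allele frequency difference above 0.10.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-variant summary; field names are illustrative only."""
    call_rate: float         # fraction of samples with a genotype call
    hwe_p: float             # Hardy-Weinberg equilibrium P value
    minor_allele_count: int  # minor allele count in the study samples
    freq: float              # allele frequency in the study samples
    ref_panel_freq: float    # allele frequency in the imputation reference panel


def passes_variant_qc(v: Variant) -> bool:
    """Return True if the variant survives the four exclusion criteria."""
    if v.call_rate < 0.98:                      # (1) genotyping call rate < 98%
        return False
    if v.hwe_p < 1.0e-6:                        # (2) HWE P value < 1.0e-6
        return False
    if v.minor_allele_count < 5:                # (3) minor allele count < 5
        return False
    if abs(v.freq - v.ref_panel_freq) > 0.10:   # (4) assumed absolute difference
        return False
    return True
```

A variant is kept only when none of the four exclusion rules fires; the thresholds are the ones quoted in the Filtering Methods row.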
Participants/Materials | Biobank Japan (n = 179,000), UK biobank (n = 361,000), FinnGen (n = 136,000), no. Phenotypes: 220 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform |
BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array] FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array or other genotyping arrays] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) |
BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array FinnGen: FinnGen1 ThermoFisher Array or other genotyping arrays |
Genotype Call Methods (software) |
BBJ: Eagle, Minimac3 UK Biobank: IMPUTE4 FinnGen: beagle4.1 |
Association Analysis (software) |
For binary traits, SAIGE software was used with age, age^2, sex, age×sex, age^2×sex, and top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM or plink software was used with the same covariates. |
Filtering Methods |
BBJ: We included imputed variants with Rsq > 0.7. UK Biobank: We excluded the variants with (i) INFO score ≤ 0.8, (ii) MAF ≤ 0.0001 (except for missense and protein-truncating variants annotated by VEP, which were excluded if MAF ≤ 1 × 10^−6), and (iii) P_HWE ≤ 1 × 10^−10. FinnGen: We excluded variants with an imputation INFO score < 0.8 or MAF < 0.0001. |
Marker Number (after QC) |
BBJ: 13,530,797 variants UK Biobank: 13,791,467 variants FinnGen: 16,859,359 variants |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume |
BBJ: ~1.5G for autosome and ~33M for chrX UK Biobank: ~1.5G for autosome and ~15M for chrX FinnGen: ~740M for autosome and ~20M for chrX |
Comments (Policies) | NBDC policy |
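The covariate set described for the binary-trait association tests (age, age squared, sex, the two age-by-sex interaction terms, and the top 20 principal components) can be illustrated by building one covariate row per individual. This is a sketch only: `build_covariates` is a hypothetical helper, the column names are invented for illustration, and the 0/1 coding of sex is an assumption not stated in the record.

```python
from typing import Dict, Sequence

N_PCS = 20  # number of principal components stated in the record


def build_covariates(age: float, sex: int, pcs: Sequence[float]) -> Dict[str, float]:
    """Return one covariate row: age, age^2, sex, age*sex, age^2*sex, PC1..PC20.

    `sex` is assumed to be coded 0/1; the actual coding is not given in the record.
    """
    if len(pcs) != N_PCS:
        raise ValueError(f"expected {N_PCS} principal components, got {len(pcs)}")
    age2 = age * age
    row = {
        "age": float(age),
        "age2": float(age2),
        "sex": float(sex),
        "age_x_sex": float(age * sex),
        "age2_x_sex": float(age2 * sex),
    }
    for i, pc in enumerate(pcs, start=1):
        row[f"PC{i}"] = float(pc)
    return row
```

Each row has 25 entries (5 demographic terms plus 20 principal components), which would then be passed to the association software alongside the genotypes.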
hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1
Participants/Materials | Biobank Japan (n = 179,000), no. Phenotypes: 79 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip |
Genotype Call Methods (software) | Eagle, Minimac3 |
Association Analysis (software) |
GWAS: For binary traits, SAIGE software was used with age, age^2, sex, age×sex, age^2×sex, and top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM was used with the same covariates. Fine-mapping: FINEMAP and SuSiE were used with GWAS summary statistics and in-sample dosage LD, allowing up to 10 causal variants per region. |
Filtering Methods |
GWAS: We included imputed variants with Rsq > 0.7. For binary traits, variants with MAC < 10 were additionally excluded. Fine-mapping: We defined fine-mapping regions based on a 3 Mb window around each lead variant and merged regions if they overlapped. We excluded the major histocompatibility complex (MHC) region (chr 6: 25–36 Mb) from analysis due to extensive LD structure in the region. For each method, we only included variants from successfully fine-mapped regions while excluding those from failed regions (e.g., due to conversion failure or available memory restrictions). |
Marker Number (after QC) | 13,531,752 variants (ref: hg19) |
NBDC Dataset ID |
hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1 (Click the Dataset ID to download the file) |
Total Data Volume | 14 GB |
Comments (Policies) | NBDC policy |
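The fine-mapping region definition in the Filtering Methods row (a 3 Mb window around each lead variant, merging of overlapping windows, and exclusion of the MHC region) can be sketched as an interval-merging step. The code below is an illustration under assumptions, not the provider's implementation: the window is assumed to be centered on the lead variant (±1.5 Mb), windows that touch are treated as overlapping, and any merged region that overlaps chr6: 25–36 Mb is dropped whole. `define_regions` and its input format are hypothetical.

```python
from typing import List, Tuple

WINDOW_BP = 3_000_000                  # total window width stated in the record
MHC = ("6", 25_000_000, 36_000_000)    # chr6: 25-36 Mb, excluded from analysis

Region = Tuple[str, int, int]          # (chromosome, start, end)


def define_regions(leads: List[Tuple[str, int]]) -> List[Region]:
    """Build windows around lead variants, merge overlaps, drop MHC regions."""
    half = WINDOW_BP // 2              # assumed: window centered on the lead variant
    windows = sorted((chrom, max(0, pos - half), pos + half) for chrom, pos in leads)

    merged: List[Region] = []
    for chrom, start, end in windows:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))  # extend previous region
        else:
            merged.append((chrom, start, end))

    mhc_chrom, mhc_start, mhc_end = MHC
    return [
        r for r in merged
        if not (r[0] == mhc_chrom and r[1] < mhc_end and r[2] > mhc_start)
    ]
```

For example, lead variants at chr1:10,000,000 and chr1:12,000,000 produce overlapping windows that are merged into a single region spanning 8.5 Mb to 13.5 Mb, while a lead variant at chr6:30,000,000 yields no region because its window falls inside the MHC.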
Participants/Materials: | 141 Japanese individuals |
Targets | small RNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | SMARTer smRNA-Seq Kit |
Fragmentation Methods | - |
Spot Type | Single-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100 bp |
Mapping Methods | bowtie (GRCh37) |
Detecting method for read count (software) | featureCounts + miRbase v22 |
QC | We performed adapter trimming using Cutadapt v1.8 and removed reads with a low quality score (Phred quality score < 20 in >20% of total bases) using fastp v0.20.0. Also, we removed reads with a length of >29 bp or <15 bp, which are not expected to be mature miRNAs. Mature miRNAs detected with ≥1 read in at least half of the individuals were included in the dataset. |
miRNA number | 343 |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000621 |
Total Data Volume | 54.7 KB (txt) |
Comments (Policies) | NBDC policy |
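Two of the QC rules in this block are simple thresholds: reads are kept only if their length is 15 to 29 bp, and a mature miRNA enters the dataset only if it has at least one read in at least half of the individuals. The sketch below illustrates those two rules in isolation; it does not reproduce the Cutadapt, fastp, bowtie, or featureCounts steps, and the function names and the miRNA identifiers in the usage example are hypothetical.

```python
from typing import Dict, List

MIN_LEN, MAX_LEN = 15, 29  # reads < 15 bp or > 29 bp are removed


def keep_read(length: int) -> bool:
    """Length filter for reads expected to be mature miRNAs."""
    return MIN_LEN <= length <= MAX_LEN


def detected_mirnas(counts: Dict[str, List[int]]) -> List[str]:
    """Return miRNAs with >= 1 read in at least half of the individuals.

    `counts` maps a miRNA name to its per-individual read counts.
    """
    kept = []
    for name, per_sample in counts.items():
        n_detected = sum(1 for c in per_sample if c >= 1)
        # "at least half" is tested without division: 2 * detected >= total
        if per_sample and 2 * n_detected >= len(per_sample):
            kept.append(name)
    return kept
```

With four individuals, a miRNA with counts `[0, 1, 3, 0]` is kept (detected in two of four), while one with counts `[0, 0, 0, 5]` is dropped.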
Participants/Materials | 141 Japanese individuals |
Targets | eQTL |
Target Loci for Capture Methods | - |
Platform |
small RNA-seq: Illumina [HiSeq 2500] WGS: Illumina [HiSeq X Ten] |
Library Source | read count data of JGAS000504 and whole genome sequencing data using genomic DNA extracted from whole blood |
Cell Lines | - |
Reagents (Kit, Version) |
small RNA-seq: See JGAS000504 WGS: TruSeq DNA PCR-Free Library Preparation Kit |
Genotype Call / Detecting read count Methods (software) |
See JGAS000504 for read count data. WGS: Sequenced reads were aligned against the reference human genome with the decoy sequence (GRCh37, human_g1k_v37_decoy) using BWA-MEM v0.7.13. |
QC |
See JGAS000504 for read count data. WGS: We removed variants (i) with low genotyping call rates (< 0.90), (ii) with ExcessHet > 60, or (iii) with Hardy–Weinberg P value < 1.0 × 10^−10. Genotype refinement was performed using Beagle v5.1. |
Marker Number (after QC) |
See JGAS000504 for read count data. WGS: 12,171,854 variants |
eQTL algorithm | We analyzed the association between genetic variants with minor allele frequency (MAF) ≥ 0.01 within a cis-window around each miRNA (±1 Mb of the mature miRNA) and normalized expression values using MatrixEQTL v2.3. |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 1.1 MB (txt) |
Comments (Policies) | NBDC policy |
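The variant selection described in the eQTL algorithm row (MAF ≥ 0.01 and location within ±1 Mb of the mature miRNA) can be sketched as a filter over variant records. This is an illustration only and does not call MatrixEQTL; `SnpRecord` and `cis_variants` are hypothetical names, and the window is assumed to extend 1 Mb upstream of the mature miRNA start and 1 Mb downstream of its end.

```python
from typing import List, NamedTuple

CIS_WINDOW_BP = 1_000_000  # +/- 1 Mb around the mature miRNA
MIN_MAF = 0.01             # minor allele frequency threshold


class SnpRecord(NamedTuple):
    """Hypothetical variant record used only for this illustration."""
    chrom: str
    pos: int
    maf: float


def cis_variants(
    variants: List[SnpRecord], mirna_chrom: str, mirna_start: int, mirna_end: int
) -> List[SnpRecord]:
    """Select variants in the cis-window of one miRNA that pass the MAF filter."""
    lo = mirna_start - CIS_WINDOW_BP
    hi = mirna_end + CIS_WINDOW_BP
    return [
        v for v in variants
        if v.chrom == mirna_chrom and lo <= v.pos <= hi and v.maf >= MIN_MAF
    ]
```

Each selected variant would then be tested against the normalized expression values of that miRNA.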
Participants/Materials |
Intracranial germ cell tumors cases (ICD10: C719): 133 Control participants: 762 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) |
GenomeStudio for genotyping shapeit2 for haplotype phasing minimac3 for imputation |
Association Analysis (software) | PLINK2 |
Filtering Methods |
Sample QC: We excluded individuals (i) with genotyping call rate < 0.97, (ii) in close kinship (PI_HAT > 0.17), or (iii) of estimated non-East Asian ancestry. Variant QC: We excluded variants with (i) genotyping call rate < 0.99, (ii) minor allele count < 5, (iii) P value for Hardy–Weinberg equilibrium < 1.0 × 10^−5 in controls, or (iv) > 10% allele frequency difference with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%. |
Marker Number (after QC) |
7,803,874 autosomal variants 181,867 X-chromosomal variants |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 248 MB (txt) |
Comments (Policies) | NBDC policy |
Participants/Materials |
Biobank Japan (n=161,801), UK biobank (n=377,583), no. Phenotypes: 9 Patients: Autoimmune [Rheumatoid arthritis (ICD10: M05), Graves' disease (ICD10: E050), type I diabetes mellitus (ICD10: E10)] Allergy [asthma (ICD10: J45), Atopic dermatitis (ICD10: L20), Pollinosis (ICD10: J301)] Controls: non-autoimmune + non-allergy individuals (There is overlap among patients in each disease category) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform |
BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) |
BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array |
Genotype Call Methods (software) |
BBJ: Eagle, Minimac3 UK Biobank: IMPUTE4 |
Association Analysis (software) |
SAIGE software was used with age, sex, and top five principal components as covariates. RE2C software was used for the multi-trait meta-analysis adjusting for sample overlap between GWAS summary data. |
Filtering Methods |
We excluded the variants with Rsq < 0.7 and MAF < 0.005. |
Marker Number (after QC) |
BBJ: 8,374,220 autosomal variants for individual trait / 8,369,174 autosomal variants for meta-analysis UKB: 10,864,380 autosomal variants for individual trait / 10,858,065 autosomal variants for meta-analysis BBJ + UK Biobank: 5,965,154 autosomal variants for meta-analysis |
NBDC Dataset ID |
(Click the Dataset ID to download the files) |
Total Data Volume |
BBJ: ~ 760MB for individual trait / ~ 430MB for multi-trait meta-analysis UK Biobank: ~ 1.1GB for individual trait / ~ 550MB for multi-trait meta-analysis BBJ+UK Biobank: ~ 310MB for multi-trait meta-analysis |
Comments (Policies) | NBDC policy |
Participants/Materials |
COVID-19 (ICD10: U071) : 30 + 43 cases Healthy controls : 31 + 44 individuals |
Targets | scRNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | Chromium Next GEM Single Cell 5’ Library & Gel Bead Kit v1.1, Chromium Next GEM Chip G Single Cell Kit, Single Index Kit T Set A |
Fragmentation Methods | Enzymatic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 91 bp |
NBDC Dataset ID | |
Total Data Volume | 1.3 + 2.0 TB (fastq, xlsx [clinical data]) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
Japanese gut microbiome JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, Public data (DRA006684) |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, DRA006684 |
Cell Lines | - |
MAG methods | De novo assembly with metaspades was performed. Then, binning with dastools (metabat2, maxbin2, concoct) was applied. |
DDBJ Sequence Read Archive ID |
JGA MAG: 20220531NSUB000031HIGH_JGA_JMAG_GENOME_*.acclist.txt DRA014186 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531) DRA014188 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531) DRA014191 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531) DRA014192 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531) TPA MAG: EMNX01000001-EMNX01000025, EMNY01000001-EMNY01000068, EMNZ01000001-EMNZ01000149, EMOA01000001-EMOA01000067 DRA014184 (DRA006684) |
Total Data Volume |
JGA MAG: 153 GB (fasta) DRA014186: 11.5 GB (fasta) DRA014188: 11.9 GB (fasta) DRA014191: 12.2 GB (fasta) DRA014192: 5.75 GB (fasta) TPA MAG: 11.9 MB (fasta) DRA014184: 3.65 GB (fasta) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
Japanese gut microbiome JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, Public data (DRA006684) |
Targets | NGS (WGS) |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, DRA006684 |
Cell Lines | - |
Virus genome construction | De novo assembly with metaspades was performed. Then, viral contigs were detected with virfinder and virsorter. |
DDBJ Sequence Read Archive ID |
JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531: BRDB01000001-BRDB01028816 DRA006684: EMNW01000001-EMNW01002579 |
Total Data Volume |
JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531: 1.09 GB (fasta) DRA006684: 98.3 MB (fasta) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
Japanese gut microbiome JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, Public data (DRA006684) |
Targets | NGS (WGS) |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, DRA006684 |
Cell Lines | - |
CRISPR construction | MINCED was applied to the MAGs. |
DDBJ Sequence Read Archive ID |
DRA014186 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531) DRA014184 (DRA006684) |
Total Data Volume |
DRA014184: 17.9 MB (fasta) DRA014186: 1.43 MB (fasta) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
88 Japanese individuals (shotgun sequencing) 73 healthy individuals (shotgun sequencing) - DNA extraction was performed with phenol-chloroform extraction: 73 samples - DNA extraction with DNeasy PowerSoil Pro kit: 47 samples 5 Japanese individuals (deep shotgun sequencing) |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000, NovaSeq 6000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000729 |
Total Data Volume | 2.6 TB(fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
BioBank Japan (n=180,215), UK Biobank (n=377,441) large-scale meta-analysis including the summary statistics of other cohorts (FinnGen, BCAC, and PRACTICAL) for breast and prostate cancer (n=648,746 and 482,080) Patients: biliary tract (ICD10: C22.1, 23-24), breast (ICD10: C509), cervical (ICD10: C53), colorectal (ICD10: C18-20), endometrial (ICD10: C54), esophageal (ICD10: C15), gastric (ICD10: C16), hepatocellular (ICD10: C22.0), lung (ICD10: C34), non-Hodgkin's lymphoma (ICD10: C82-83), ovarian (ICD10: C56), pancreatic (ICD10: C25), and prostate (ICD10: C61) cancer Controls: individuals without cancer (There is overlap among patients in each disease category) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform |
BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array] FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array or other genotyping arrays] BCAC: Illumina [iCOGS OncoArray] PRACTICAL: Illumina [iCOGS OncoArray] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) |
BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array FinnGen: FinnGen1 ThermoFisher Array or other genotyping arrays BCAC: Infinium OncoArray-500K v1.0 BeadChip Kit PRACTICAL: Infinium OncoArray-500K v1.0 BeadChip Kit |
Genotype Call Methods (software) |
BBJ: Eagle, Minimac3 UK Biobank: IMPUTE4 FinnGen: beagle4.1 BCAC: IMPUTE2 PRACTICAL: IMPUTE2 |
Association Analysis (software) |
SAIGE software was used with age, sex, and top five principal components as covariates. RE2C software was used for the multi-trait meta-analysis adjusting for sample overlap between GWAS summary data. |
Filtering Methods |
Sample QC and Variant QC for each dataset: refer to ReadMe file We excluded the variants with Rsq < 0.7 and MAF < 0.01. |
Marker Number (after QC) |
BBJ: 13MN (7,398,798) , each cancer (7,442,557 (7,420,485-7,444,681)) UK Biobank: 13MN (9,602,853), each cancer (9,620,786 (9,620,343-9,620,935)) BBJ + UK Biobank: 13MN (5,374,018), each cancer (5,696,155 (5,677,934-5,698,357)) BBJ + UK Biobank + FinnGen + BCAC (breast cancer): 5,104,756 BBJ + UK Biobank + FinnGen + PRACTICAL (prostate cancer): 5,105,796 BBJ + UK Biobank + FinnGen + BCAC + PRACTICAL (breast cancer + prostate cancer): 5,100,089 *mean (min-max) for each cancer |
NBDC Dataset ID |
(Click the Dataset ID to download the files) |
Total Data Volume |
BBJ: 13MN (287 MB), each cancer (625 (605-633) MB) UK Biobank: 13MN (362 MB), each cancer (841 (814-859) MB) BBJ + UK Biobank: 13MN (202 MB), each cancer (260 (255-264) MB) BBJ + UK Biobank + FinnGen + BCAC (breast cancer): 242 MB BBJ + UK Biobank + FinnGen + PRACTICAL (prostate cancer): 243 MB BBJ + UK Biobank + FinnGen + BCAC + PRACTICAL (breast cancer + prostate cancer): 253 MB *mean (min-max) for each cancer |
Comments (Policies) | NBDC policy |
Participants/Materials |
Hunner-type interstitial cystitis cases (ICD10: N301): 144 Control participants: 41,516 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) |
GenomeStudio for genotyping shapeit4 for haplotype phasing minimac4 for imputation |
Association Analysis (software) | SAIGE |
Filtering Methods |
Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Japanese ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10^−10, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%. |
Marker Number (after QC) | 7,909,790 variants (hg19) |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 700 MB (txt) |
Comments (Policies) | NBDC policy |
Participants/Materials |
524 Japanese individuals (423 species in the gut microbiome) 306 Japanese individuals (306 plasma metabolites) 524 Japanese individuals (KEGG Gene Ortholog and KEGG Pathway) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform |
SNP array: Illumina [Infinium Asian Screening Array] Whole genome sequencing: Illumina [HiSeq X Ten] Metagenome shotgun sequencing: Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) |
SNP array: Infinium Asian Screening Array Whole genome sequencing: TruSeq DNA PCR-Free Library Preparation Kit Metagenome shotgun sequencing: KAPA Hyper Prep Kit |
Genotype Call Methods (software) |
SNP array: Genotyping: GenomeStudio Haplotype phasing: shapeit4 Imputation: minimac4 WGS: BWA-MEM v0.7.13 + GATK v3.8-0 |
Association Analysis (software) | PLINK2 |
Filtering Methods |
SNP array data: Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Asian ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10^−10, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 1%. WGS: We excluded variants with genotype call rate < 90%, ExcessHet > 60, or Hardy–Weinberg P < 1.0 × 10^−10. After imputation with Beagle v5.1, we excluded imputed variants with minor allele frequency < 1%. |
Marker Number (after QC) |
Gut microbiome/KEGG (SNP array): 7,213,470 variants (hg19) Metabolome (WGS): 6,840,258 variants (GRCh37) |
NBDC Dataset ID |
hum0197.v18.gwas.v1 (Gut microbiome, Plasma metabolites, KEGG) (Click the link above to download the files) |
Total Data Volume |
Gut microbiome: 206 GB Metabolome: 90.7 GB KEGG: 300 MB |
Comments (Policies) | NBDC policy |
DATA PROVIDER
Principal Investigator: Yukinori Okada
Affiliation: Department of Statistical Genetics, Osaka University Graduate School of Medicine
Project / Group Name: -
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
Precursory Research for Innovative Medical care (PRIME), Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED) | Crosstalk among microbiome, host, disease, and drug discovery enhanced by statistical genetics | JP19gm6010001 |
FORCE, Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED) | Elucidation of disease-specific microbiota and personalized medicine by metagenome-wide association studies | JP20gm4010006 |
Practical Research Project for Rare / Intractable Diseases, Japan Agency for Medical Research and Development (AMED) | Biology and in silico drug repositioning of pulmonary alveolar proteinosis using trans-layer omics analysis | JP20ek0109413 |
Practical Research Project for Allergic Diseases and Immunology, Japan Agency for Medical Research and Development (AMED) | Nucleic genome drug discovery for autoimmune diseases through in-silico and patient-oriented screening utilizing large-scale disease genetics | JP19ek0410041 |
Practical Research Project for Allergic Diseases and Immunology, Japan Agency for Medical Research and Development (AMED) | Genomic prediction medicine of rheumatoid arthritis based on comprehensive immune-omics resources | JP21ek0410075 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Implementation of genomic prediction medicine based on statistical genetics | JP21km0405211 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Next-generation genomics analyses elucidates biology, personalized medicine, and drug discovery of psoriasis | JP21km0405217 |
KAKENHI Grant-in-Aid for Scientific Research (A) | Elucidation of disease biology and tissue specificity by trans-layer omics analysis and whole-genome sequencing | 19H01021 |
KAKENHI Grant-in-Aid for Scientific Research (A) | Elucidation of immune and allergic disease dynamics by integrative sequencing analysis | 22H00476 |
PUBLICATIONS
No. | Title | DOI | Dataset ID |
---|---|---|---|
1 | Metagenome-wide association study of gut microbiome revealed novel aetiology of rheumatoid arthritis in the Japanese population. | doi: 10.1136/annrheumdis-2019-215743 | JGAD000290 |
2 | Genetic determinants of risk in autoimmune pulmonary alveolar proteinosis. | doi: 10.1038/s41467-021-21011-y | hum0197.v2.gwas.v1 |
3 | A metagenome-wide association study of gut microbiome in patients with multiple sclerosis revealed novel disease pathology. | doi: 10.3389/fcimb.2020.585973 | JGAD000363 |
4 | A global atlas of genetic associations of 220 deep phenotypes | doi: 10.1101/2020.10.23.20213652 | hum0197.v3.gwas.v1 |
5 | Metagenome-wide association study revealed disease-specific landscape of the gut microbiome of systemic lupus erythematosus in Japanese | doi: 10.1136/annrheumdis-2021-220687 | JGAD000427 |
6 | Whole gut virome analysis of 476 Japanese revealed a link between phage and autoimmune disease | doi: 10.1136/annrheumdis-2021-221267 | JGAD000532 |
7 | Insights from complex trait fine-mapping across diverse populations | doi: 10.1101/2021.09.03.21262975 | hum0197.v5.gwas.v1, hum0197.v5.finemap.v1 |
8 | Genetic architecture of microRNA expression and its link to complex diseases in the Japanese population. | doi: 10.1093/hmg/ddab361 | JGAD000621, hum0197.v6.eqtl.v1 |
9 | Multi-trait and cross-population genome-wide association studies across autoimmune and allergic diseases identify shared and distinct genetic components. | doi: 10.1136/annrheumdis-2022-222460 | hum0197.v10.gwas.v1 |
10 | DOCK2 is involved in the host genetics and biology of severe COVID-19 | doi: 10.1038/s41586-022-05163-5 | JGAD000662 |
11 | Prokaryotic and viral genomes recovered from 787 Japanese gut metagenomes revealed microbial features linked to diets, populations, and diseases | doi: 10.1016/j.xgen.2022.100219 | hum0197.v12 |
12 | Reconstruction of the personal information from human genome reads in gut metagenome sequencing data | doi: 10.1038/s41564-023-01381-3 | JGAD000729 |
13 | Pan-cancer and cross-population genome-wide association studies dissect shared genetic backgrounds underlying carcinogenesis | doi: 10.1038/s41467-023-39136-7 | hum0197.v16.gwas.v1 |
14 | Genome-wide association analysis identifies susceptibility loci within the major histocompatibility complex region for Hunner-type interstitial cystitis | doi: 10.1016/j.xcrm.2023.101114 | hum0197.v17.hic-gwas.v1 |
15 | Analysis of gut microbiome, host genetics, and plasma metabolites reveals gut microbiome-host interactions in the Japanese population | doi: 10.1016/j.celrep.2023.113324 | hum0197.v18.gwas.v1 |
USERS (Controlled-access Data)
Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
---|---|---|---|---|---|
Ilana Brito | Meinig School of Biomedical Engineering, Cornell University | United States of America | Comparative metagenomics of lupus patients' microbiomes | JGAD000290, JGAD000363, JGAD000427, JGAD000532 | 2022/05/12-2024/05/04 |
Yongxin Li | Department of Chemistry, The University of Hong Kong | Hong Kong | Comparison of gut bacterial diversity and composition in MS/EAE | JGAD000363 | 2022/09/19-2024/07/01 |
Tina Fuchs | Institute for Clinical Chemistry, Medical Faculty Mannheim, Heidelberg University | Germany | Investigating the clonality of VIREM cells in COVID-19 patients | JGAD000662, JGAD000772 | 2024/02/26-2024/12/31 |