About NBDC Human Database
An enormous amount of human data is being generated with advances in next-generation sequencing and other analytical technologies. We therefore need rules and mechanisms for organizing and storing such data and for effectively utilizing them to make progress in the life sciences.
To promote the sharing and utilization of human data while protecting personal information, the Database Center for Life Science (DBCLS) of the Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS-DS) created a platform for sharing various data generated from human specimens. The data are made publicly accessible in cooperation with the DNA Data Bank of Japan.
You can apply to use or submit human data through this website.
Violators of the guidelines who have not submitted a report on the deletion of Controlled-access data shall be disclosed here.
NBDC Research ID: hum0197.v24
Aims: Elucidation of disease biology based on trans-omics analysis, GWAS in the Japanese and trans-ethnic populations, Elucidation of the mechanism of COVID-19 severity, Improving the performance of type 2 diabetes polygenic predictions, Elucidation of the genetic architecture of recurrent pregnancy loss, Elucidation of the association between Jomon component in the Japanese population and phenotypes and diseases, Elucidation of the entire HPV integration in HPV-associated Oropharyngeal Cancer
Methods: Metagenome shotgun sequencing, genome-wide association study (GWAS), small RNA-seq and eQTL analyses, whole genome sequencing (WGS), single-cell RNA sequencing, proteomics
Participants/Materials:
Metagenomic data of gut microbiome in the Japanese population (95 + 103 + 227 + 30 + 136 individuals)
Autoimmune pulmonary alveolar proteinosis cases: 198, Control participants: 395
Populations: Biobank Japan (n = 179,000), UK biobank (n = 361,000), and FinnGen (n = 136,000), Phenotypes: 220
141 Japanese individuals
Metagenomic data of gut microbiome in Inflammatory Bowel Disease (35 Ulcerative Colitis and 39 Crohn's disease) and 40 Healthy controls
Intracranial germ cell tumors cases: 133, Control participants: 762
Populations: Biobank Japan (n = 161,801) and UK biobank (n = 377,583), Phenotypes: 9
Peripheral blood mononuclear cell (PBMC) from Japanese population (COVID-19: n = 30 + 43, Healthy controls: n = 31 + 44)
Microbial genome: Metagenome-Assembled Genome (MAG), Viral genome, CRISPR spacers
Metagenomic data of gut microbiome in the Japanese population (88 + 5 individuals) and healthy individuals (n = 73)
BioBank Japan (n=180,215), UK Biobank (n=377,441), and large-scale meta-analysis including the summary statistics of other cohorts [FinnGen, Breast Cancer Association Consortium (BCAC), and Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL)] for breast and prostate cancer (n=648,746 and 482,080), Phenotypes: 15
Hunner-type interstitial cystitis cases: 144, Control participants: 41,516
524 Japanese individuals for gut microbiome–host genome association analysis, 362 Japanese individuals for plasma metabolite–host genome association analysis
The weights of variants existing in the target cohorts, Tohoku Medical Megabank and the second cohort of BBJ, calculated from GWAS results on 27,642 type 2 diabetes cases and 70,242 controls from BioBank Japan and UK Biobank
Recurrent pregnancy loss cases: 1,728, Control participants: 24,315
Autoimmune diseases cases: 2,238, Healthy controls: 2,919
The first cohort of BioBank Japan (n = 171,287)
HPV-associated oropharyngeal cancer cases: 32, Non-HPV-associated oropharyngeal cancer cases: 17, Healthy controls: 2
Neuromyelitis optica spectrum disorders (NMOSD) cases: 240, Control participants: 50,578
Single-cell eQTL summary statistics of 40 immune cell types for Japanese population (COVID-19: n = 88, Healthy controls: n = 146)
Plasma proteomics data from 83 COVID-19 patients and 144 healthy controls of Japanese
Single-cell RNA-sequencing for PBMC from 15 COVID-19 patients and 72 healthy controls of Japanese
Dataset ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000205 | Metagenome | Controlled-access (Type I) | 2019/11/15 |
hum0197.v2.gwas.v1 | GWAS for autoimmune pulmonary alveolar proteinosis | Unrestricted-access | 2020/11/27 |
JGAS000260 | Metagenome | Controlled-access (Type I) | 2020/11/27 |
hum0197.v3.gwas.v1 | GWAS for 215 phenotypes | Unrestricted-access | 2021/03/22 |
JGAS000316 | Metagenome | Controlled-access (Type I) | 2021/10/12 |
JGAS000415 | Metagenome | Controlled-access (Type I) | 2021/12/10 |
hum0197.v5.gwas.v1 | GWAS for 10 phenotypes | Unrestricted-access | 2021/12/21 |
hum0197.v5.finemap.v1 | Fine-mapping for 79 phenotypes | Unrestricted-access | 2021/12/21 |
JGAS000504 | Read count data of miRNA | Controlled-access (Type I) | 2022/02/08 |
hum0197.v6.eqtl.v1 | eQTL data | Unrestricted-access | 2022/02/08 |
JGAS000530 | Metagenome | Controlled-access (Type I) | 2022/05/23 |
JGAS000531 | Metagenome | Controlled-access (Type I) | 2022/06/03 |
hum0197.v9.gwas.GCT.v1 | GWAS for intracranial germ cell tumors | Unrestricted-access | 2022/06/10 |
hum0197.v10.gwas.v1 | GWAS for 9 phenotypes | Unrestricted-access | 2022/06/16 |
JGAS000543 | Raw sequencing data of single-cell RNA-seq | Controlled-access (Type I) | 2022/07/21 |
hum0197.v12 | MAG, Viral genome and CRISPR spacers of Microbial genome | Unrestricted-access | 2022/12/01 |
JGAS000543 (data addition) | clinical data | Controlled-access (Type I) | 2023/02/14 |
JGAS000593 | Raw sequencing data of single-cell RNA-seq, clinical data | Controlled-access (Type I) | 2023/02/14 |
hum0197.v3.gwas.v1 (data addition) | GWAS for 5 phenotypes | Unrestricted-access | 2023/02/16 |
JGAS000600 | Metagenome | Controlled-access (Type I) | 2023/03/29 |
hum0197.v16.gwas.v1 | GWAS for 15 phenotypes | Unrestricted-access | 2023/06/06 |
hum0197.v17.hic-gwas.v1 | GWAS for Hunner-type interstitial cystitis | Unrestricted-access | 2023/06/27 |
hum0197.v18.gwas.v1 | GWAS for gut microbiome, GWAS for plasma metabolite, GWAS for KEGG Gene Ortholog and KEGG Pathway | Unrestricted-access | 2023/10/02 |
hum0197.v19.prs.v1 | The weights of variants calculated from GWAS results on type 2 diabetes | Unrestricted-access | 2024/05/29 |
hum0197.v20.gwas.v1 | GWAS for recurrent pregnancy loss | Unrestricted-access | 2024/05/30 |
hum0197.v21.gwas-ehhv6.v1 | GWAS for autoimmune diseases | Unrestricted-access | 2024/10/28 |
JGAS000741 | | Controlled-access (Type I) | 2024/10/28 |
hum0197.v21.gwas-jomon.v1 | GWAS for the individual Jomon proportions | Unrestricted-access | 2024/10/28 |
JGAS000751 | NGS (WGS, RNA-seq) | Controlled-access (Type I) | 2024/11/11 |
hum0197.v23.gwas-nmosd.v1 | Summary statistics of the genome-wide meta-analysis of NMOSD | Unrestricted-access | 2024/12/18 |
E-GEAD-887 | Integrated single-cell object data of PBMCs from 25 NMOSD patients | Unrestricted-access | 2024/12/18 |
E-GEAD-1054 | Single-cell eQTL summary statistics of 40 immune cell types | Unrestricted-access | 2025/05/07 |
JGAS000783 | scRNA-seq, plasma proteomics data | Controlled-access (Type I) | 2025/05/07 |
*Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data.
*When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments.
JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531
Participants/Materials: | 95 + 103 + 227 + 30 + 136 Japanese individuals; Inflammatory Bowel Disease: 35 Ulcerative Colitis, 39 Crohn's disease, 40 Healthy controls |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000, NovaSeq 6000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000290 (95 Japanese individuals); JGAD000363 (103 Japanese individuals); JGAD000427 (227 Japanese individuals); JGAD000532 (30 Japanese individuals); JGAD000649 (Inflammatory Bowel Disease); JGAD000650 (136 Japanese individuals) |
Total Data Volume | JGAD000290: 477 GB (fastq); JGAD000363: 408 GB (fastq); JGAD000427: 881.2 GB (fastq); JGAD000532: 106.7 GB (fastq); JGAD000649: 374.6 GB (fastq); JGAD000650: 541.4 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials | Autoimmune pulmonary alveolar proteinosis cases (ICD10: J840): 198; Control participants: 395 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) | GenomeStudio for genotyping, shapeit2 for haplotype phasing, and minimac3 for imputation |
Association Analysis (software) | PLINK2 |
Filtering Methods | Sample QC: We excluded samples with low genotyping call rates (call rate < 98%) and samples in close genetic relation (PI_HAT > 0.175). We included samples of estimated East Asian ancestry. Variant QC: We excluded variants with (1) genotyping call rate < 98%, (2) P value for Hardy–Weinberg equilibrium < 1.0 × 10^−6, (3) minor allele count < 5, or (4) > 10% frequency difference from the imputation reference panel. |
Marker Number (after QC) | 12,153,232 autosomal variants and 242,876 X-chromosomal variants after QC. |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 390MB for autosome (txt.gz) and 19MB for X chromosome (txt.gz) |
Comments (Policies) | NBDC policy |
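The variant QC criteria listed under Filtering Methods for this dataset can be illustrated with a minimal Python sketch. The `Variant` record and its field names are hypothetical and do not correspond to any file format distributed here, and "> 10% frequency difference" is assumed to mean an absolute allele frequency difference above 0.10.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Per-variant summary values; all field names are hypothetical."""
    variant_id: str
    call_rate: float          # fraction of samples with a genotype call
    hwe_p: float              # Hardy-Weinberg equilibrium P value
    minor_allele_count: int
    freq: float               # allele frequency in the study samples
    ref_panel_freq: float     # frequency of the same allele in the reference panel


def passes_variant_qc(v: Variant) -> bool:
    """Return True if the variant survives all four exclusion criteria."""
    if v.call_rate < 0.98:                      # (1) genotyping call rate < 98%
        return False
    if v.hwe_p < 1.0e-6:                        # (2) HWE P value < 1.0e-6
        return False
    if v.minor_allele_count < 5:                # (3) minor allele count < 5
        return False
    if abs(v.freq - v.ref_panel_freq) > 0.10:   # (4) assumed absolute difference
        return False
    return True


# Toy records, one per exclusion reason plus one that passes.
variants = [
    Variant("var_ok", 0.995, 0.40, 120, 0.21, 0.19),
    Variant("var_low_call", 0.970, 0.40, 120, 0.21, 0.19),
    Variant("var_hwe", 0.995, 1e-8, 120, 0.21, 0.19),
    Variant("var_rare", 0.995, 0.40, 3, 0.002, 0.001),
    Variant("var_freq_diff", 0.995, 0.40, 120, 0.35, 0.19),
]
kept = [v.variant_id for v in variants if passes_variant_qc(v)]
print(kept)
```

Only the first toy record survives; each of the others trips exactly one criterion.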
Participants/Materials | Biobank Japan (n = 179,000), UK biobank (n = 361,000), FinnGen (n = 136,000), no. Phenotypes: 220 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array]; FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array or other genotyping arrays] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array; FinnGen: FinnGen1 ThermoFisher Array or other genotyping arrays |
Genotype Call Methods (software) | BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4; FinnGen: beagle4.1 |
Association Analysis (software) | For binary traits, SAIGE software was used with age, age^2, sex, age × sex, age^2 × sex, and the top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM or plink software was used with the same covariates. |
Filtering Methods | BBJ: We included imputed variants with Rsq > 0.7. UK Biobank: We excluded the variants with (i) INFO score ≤ 0.8, (ii) MAF ≤ 0.0001 (except for missense and protein-truncating variants annotated by VEP, which were excluded if MAF ≤ 1 × 10^−6), and (iii) P_HWE ≤ 1 × 10^−10. FinnGen: We excluded variants with an imputation INFO score < 0.8 or MAF < 0.0001. |
Marker Number (after QC) | BBJ: 13,530,797 variants; UK Biobank: 13,791,467 variants; FinnGen: 16,859,359 variants |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | BBJ: ~1.5 GB for autosome and ~33 MB for chrX; UK Biobank: ~1.5 GB for autosome and ~15 MB for chrX; FinnGen: ~740 MB for autosome and ~20 MB for chrX |
Comments (Policies) | NBDC policy |
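The covariate set named under Association Analysis (age, age squared, sex, the two interaction terms, and the top 20 principal components) can be sketched as a per-individual covariate vector. This is only an illustration of how the terms relate to each other, not the authors' pipeline: the function name and the 0/1 sex coding are assumptions, and the association testing performed by SAIGE or BOLT-LMM is not reproduced.

```python
def build_covariates(age: float, sex: int, pcs: list[float]) -> list[float]:
    """Return [age, age^2, sex, age*sex, age^2*sex, PC1..PCn] for one individual.

    `sex` is assumed to be coded 0/1; `pcs` holds the top principal components.
    """
    age_sq = age ** 2
    return [age, age_sq, sex, age * sex, age_sq * sex, *pcs]


# Example: a 50-year-old individual with sex coded as 1 and 20 placeholder PCs.
row = build_covariates(50.0, 1, [0.0] * 20)
print(len(row), row[:5])
```

The vector has 25 entries: five age and sex terms followed by the 20 principal components.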
hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1
Participants/Materials | Biobank Japan (n = 179,000), no. Phenotypes: 79 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip |
Genotype Call Methods (software) | Eagle, Minimac3 |
Association Analysis (software) | GWAS: For binary traits, SAIGE software was used with age, age^2, sex, age × sex, age^2 × sex, and the top 20 principal components as covariates. For quantitative traits (biomarkers), BOLT-LMM was used with the same covariates. Fine-mapping: FINEMAP and SuSiE were used with GWAS summary statistics and in-sample dosage LD, allowing up to 10 causal variants per region. |
Filtering Methods | GWAS: We included imputed variants with Rsq > 0.7. For binary traits, variants with MAC < 10 were additionally excluded. Fine-mapping: We defined fine-mapping regions based on a 3 Mb window around each lead variant and merged regions if they overlapped. We excluded the major histocompatibility complex (MHC) region (chr 6: 25–36 Mb) from analysis due to the extensive LD structure in the region. For each method, we only included variants from successfully fine-mapped regions while excluding those from failed regions (e.g., due to convergence failure or available memory restrictions). |
Marker Number (after QC) | 13,531,752 variants (ref: hg19) |
NBDC Dataset ID | hum0197.v5.gwas.v1 / hum0197.v5.finemap.v1 (Click the Dataset ID to download the file) |
Total Data Volume | 14 GB |
Comments (Policies) | NBDC policy |
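The region definition described for fine-mapping (a 3 Mb window around each lead variant, merging of overlapping windows, and exclusion of the MHC region at chr 6: 25–36 Mb) can be sketched as follows. The sketch assumes the window is centred on the lead variant (±1.5 Mb) and that any merged region overlapping the MHC interval is dropped as a whole; neither detail is stated above, and the function name is hypothetical.

```python
MHC = ("6", 25_000_000, 36_000_000)   # chr 6: 25-36 Mb, as stated in the table
WINDOW = 3_000_000                     # total window width (assumed centred)


def finemap_regions(leads):
    """Build merged fine-mapping regions from (chrom, position) lead variants."""
    half = WINDOW // 2
    # One window per lead variant, sorted by chromosome and start position.
    intervals = sorted((c, max(0, p - half), p + half) for c, p in leads)
    merged = []
    for chrom, start, end in intervals:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    # Drop any region that overlaps the MHC interval.
    return [
        r for r in merged
        if not (r[0] == MHC[0] and r[1] < MHC[2] and r[2] > MHC[1])
    ]


leads = [("1", 10_000_000), ("1", 11_000_000), ("1", 50_000_000), ("6", 30_000_000)]
regions = finemap_regions(leads)
print(regions)
```

In this toy input the two nearby lead variants on chromosome 1 collapse into one region, and the chromosome 6 lead variant is discarded because its window lies inside the MHC interval.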
Participants/Materials: | 141 Japanese individuals |
Targets | small RNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | SMARTer smRNA-Seq Kit |
Fragmentation Methods | - |
Spot Type | Single-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100 bp |
Mapping Methods | bowtie (GRCh37) |
Detecting method for read count (software) | featureCounts + miRbase v22 |
QC | We performed adapter trimming using Cutadapt v1.8 and removed reads with a low quality score (Phred quality score < 20 in >20% of total bases) using fastp v0.20.0. Also, we removed reads with a length of >29 bp or <15 bp, which are not expected to be mature miRNAs. Mature miRNAs detected with ≥1 read in at least half of the individuals were included in the dataset. |
miRNA number | 343 |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000621 |
Total Data Volume | 54.7 KB (txt) |
Comments (Policies) | NBDC policy |
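Two of the QC rules described for this dataset, the read-length filter (reads longer than 29 bp or shorter than 15 bp removed) and the detection rule (mature miRNAs with at least one read in at least half of the individuals), can be sketched in Python. The function names and the toy count table are hypothetical; adapter trimming and quality filtering with Cutadapt and fastp are not reproduced.

```python
def keep_read(length: int) -> bool:
    """Keep reads of 15-29 bp, the range expected for mature miRNAs."""
    return 15 <= length <= 29


def expressed_mirnas(counts: dict[str, list[int]]) -> list[str]:
    """Return miRNAs detected (>= 1 read) in at least half of the individuals.

    `counts` maps a miRNA name to its per-individual read counts.
    """
    kept = []
    for name, per_individual in counts.items():
        detected = sum(1 for c in per_individual if c >= 1)
        if detected * 2 >= len(per_individual):
            kept.append(name)
    return kept


# Toy table with four individuals and placeholder miRNA names.
toy_counts = {
    "miR-a": [3, 0, 1, 0],   # detected in 2 of 4 individuals: kept
    "miR-b": [0, 0, 0, 5],   # detected in 1 of 4 individuals: dropped
    "miR-c": [1, 1, 1, 1],   # detected in all individuals: kept
}
selected = expressed_mirnas(toy_counts)
print(selected)
```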
Participants/Materials | 141 Japanese individuals |
Targets | eQTL |
Target Loci for Capture Methods | - |
Platform | small RNA-seq: Illumina [HiSeq 2500]; WGS: Illumina [HiSeq X Ten] |
Library Source | read count data of JGAS000504 and whole genome sequencing data using genomic DNA extracted from whole blood |
Cell Lines | - |
Reagents (Kit, Version) | small RNA-seq: See JGAS000504; WGS: TruSeq DNA PCR-Free Library Preparation Kit |
Genotype Call / Detecting read count Methods (software) | See JGAS000504 for read count data. WGS: Sequenced reads were aligned against the reference human genome with the decoy sequence (GRCh37, human_g1k_v37_decoy) using BWA-MEM v0.7.13. |
QC | See JGAS000504 for read count data. WGS: We removed the variants (i) with low genotyping call rates (< 0.90), (ii) with ExcessHet > 60, or (iii) with Hardy–Weinberg P value < 1.0 × 10^−10. Genotype refinement was performed using Beagle v5.1. |
Marker Number (after QC) | See JGAS000504 for read count data. WGS: 12,171,854 variants |
eQTL algorithm | We analyzed the association between genetic variants with minor allele frequency (MAF) ≥ 0.01 within a cis-window around each miRNA (±1 Mb of the mature miRNA) and normalized expression values using MatrixEQTL v2.3. |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 1.1 MB (txt) |
Comments (Policies) | NBDC policy |
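The candidate selection described under eQTL algorithm (variants with MAF ≥ 0.01 within ±1 Mb of the mature miRNA) can be sketched as below. This covers only the choice of variants to test, not the association model fitted by MatrixEQTL. The tuple layout, function name, and toy coordinates are hypothetical.

```python
def cis_variants(mirna_chrom, mirna_start, mirna_end, variants,
                 window=1_000_000, min_maf=0.01):
    """Select IDs of variants in the cis-window of one miRNA.

    `variants` is an iterable of (variant_id, chrom, pos, allele_freq).
    """
    selected = []
    for vid, chrom, pos, af in variants:
        maf = min(af, 1.0 - af)  # minor allele frequency from either allele
        in_window = (chrom == mirna_chrom
                     and mirna_start - window <= pos <= mirna_end + window)
        if in_window and maf >= min_maf:
            selected.append(vid)
    return selected


toy_variants = [
    ("v_near", "1", 5_200_000, 0.30),      # inside the window, common: kept
    ("v_far", "1", 7_500_000, 0.30),       # more than 1 Mb away: dropped
    ("v_rare", "1", 5_100_000, 0.005),     # MAF below 0.01: dropped
    ("v_other_chr", "2", 5_200_000, 0.30), # different chromosome: dropped
    ("v_flipped", "1", 4_100_000, 0.98),   # MAF = 0.02 after flipping: kept
]
tested = cis_variants("1", 5_000_000, 5_000_022, toy_variants)
print(tested)
```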
Participants/Materials | Intracranial germ cell tumors cases (ICD10: C719): 133; Control participants: 762 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) | GenomeStudio for genotyping, shapeit2 for haplotype phasing, and minimac3 for imputation |
Association Analysis (software) | PLINK2 |
Filtering Methods | Sample QC: We excluded individuals (i) with genotyping call rate < 0.97, (ii) in close kinship (PI_HAT > 0.17), or (iii) of estimated non-East Asian ancestry. Variant QC: We excluded variants with (i) genotyping call rate < 0.99, (ii) minor allele count < 5, (iii) P value for Hardy–Weinberg equilibrium < 1.0 × 10^−5 in controls, and (iv) > 10% allele frequency difference with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%. |
Marker Number (after QC) | 7,803,874 autosomal variants; 181,867 X-chromosomal variants |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 248 MB (txt) |
Comments (Policies) | NBDC policy |
Participants/Materials | Biobank Japan (n=161,801), UK biobank (n=377,583), no. Phenotypes: 9; Patients: Autoimmune [Rheumatoid arthritis (ICD10: M05), Graves' disease (ICD10: E05), type I diabetes mellitus (ICD10: E10)], Allergy [asthma (ICD10: J45), Atopic dermatitis (ICD10: L20), Pollinosis (ICD10: J301)]; Controls: non-autoimmune + non-allergy individuals (There is overlap among patients in each disease category) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array |
Genotype Call Methods (software) | BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4 |
Association Analysis (software) | SAIGE software was used with age, sex, and top five principal components as covariates. RE2C software was used for the multi-trait meta-analysis adjusting for sample overlap between GWAS summary data. |
Filtering Methods | We excluded the variants with Rsq < 0.7 and MAF < 0.005. |
Marker Number (after QC) | BBJ: 8,374,220 autosomal variants for individual trait / 8,369,174 autosomal variants for meta-analysis; UKB: 10,864,380 autosomal variants for individual trait / 10,858,065 autosomal variants for meta-analysis; BBJ + UK Biobank: 5,965,154 autosomal variants for meta-analysis |
NBDC Dataset ID |
(Click the Dataset ID to download the files) |
Total Data Volume | BBJ: ~760 MB for individual trait / ~430 MB for multi-trait meta-analysis; UK Biobank: ~1.1 GB for individual trait / ~550 MB for multi-trait meta-analysis; BBJ + UK Biobank: ~310 MB for multi-trait meta-analysis |
Comments (Policies) | NBDC policy |
JGAS000543 / JGAS000593 / JGAS000783
Participants/Materials | COVID-19 (ICD10: U071): 30 + 43 + 15 cases; Healthy controls: 31 + 44 + 72 individuals |
Targets | scRNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | Chromium Next GEM Single Cell 5’ Library & Gel Bead Kit v1.1, Chromium Next GEM Chip G Single Cell Kit, Single Index Kit T Set A |
Fragmentation Methods | Enzymatic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 91 bp |
NBDC Dataset ID | |
Total Data Volume | 1.3 + 2.0 + 2.1 TB (fastq, xlsx [clinical data]) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
Japanese gut microbiome JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, Public data (DRA006684) |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, DRA006684 |
Cell Lines | - |
MAG methods | De novo assembly with metaspades was performed. Then, binning with dastools (metabat2, maxbin2, concoct) was applied. |
DDBJ Sequence Read Archive ID | JGA MAG: 20220531NSUB000031HIGH_JGA_JMAG_GENOME_*.acclist.txt; DRA014186 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014188 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014191 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014192 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); TPA MAG: EMNX01000001-EMNX01000025, EMNY01000001-EMNY01000068, EMNZ01000001-EMNZ01000149, EMOA01000001-EMOA01000067; DRA014184 (DRA006684) |
Total Data Volume | JGA MAG: 153 GB (fasta); DRA014186: 11.5 GB (fasta); DRA014188: 11.9 GB (fasta); DRA014191: 12.2 GB (fasta); DRA014192: 5.75 GB (fasta); TPA MAG: 11.9 MB (fasta); DRA014184: 3.65 GB (fasta) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
Japanese gut microbiome JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, Public data (DRA006684) |
Targets | NGS (WGS) |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, DRA006684 |
Cell Lines | - |
Virus genome construction | De novo assembly with metaspades was performed. Then, viral contigs were detected with virfinder and virsorter. |
DDBJ Sequence Read Archive ID | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531: BRDB01000001-BRDB01028816; DRA006684: EMNW01000001-EMNW01002579 |
Total Data Volume | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531: 1.09 GB (fasta); DRA006684: 98.3 MB (fasta) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
Japanese gut microbiome JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, Public data (DRA006684) |
Targets | NGS (WGS) |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531, DRA006684 |
Cell Lines | - |
CRISPR construction | MINCED was applied to the MAGs. |
DDBJ Sequence Read Archive ID | DRA014186 (JGAS000205 / JGAS000260 / JGAS000316 / JGAS000415 / JGAS000530 / JGAS000531); DRA014184 (DRA006684) |
Total Data Volume | DRA014184: 17.9 MB (fasta); DRA014186: 1.43 MB (fasta) |
Comments (Policies) | NBDC policy |
Participants/Materials: |
88 Japanese individuals (shotgun sequencing) 73 healthy individuals (shotgun sequencing) - DNA extraction was performed with phenol-chloroform extraction: 73 samples - DNA extraction with DNeasy PowerSoil Pro kit: 47 samples 5 Japanese individuals (deep shotgun sequencing) |
Targets | Metagenome |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq 3000, NovaSeq 6000] |
Library Source | DNA extracted from gut microbiome |
Cell Lines | - |
Library Construction (kit name) | KAPA Hyper Prep Kit |
Fragmentation Methods | Ultrasonic fragmentation (Covaris) |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000729 |
Total Data Volume | 2.6 TB(fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials | BioBank Japan (n=180,215), UK Biobank (n=377,441); large-scale meta-analysis including the summary statistics of other cohorts (FinnGen, BCAC, and PRACTICAL) for breast and prostate cancer (n=648,746 and 482,080); Patients: biliary tract (ICD10: C22.1, 23-24), breast (ICD10: C509), cervical (ICD10: C53), colorectal (ICD10: C18-20), endometrial (ICD10: C54), esophageal (ICD10: C15), gastric (ICD10: C16), hepatocellular (ICD10: C22.0), lung (ICD10: C34), non-Hodgkin's lymphoma (ICD10: C82-83), ovarian (ICD10: C56), pancreatic (ICD10: C25), and prostate (ICD10: C61) cancer; Controls: individuals without cancer (There is overlap among patients in each disease category) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array]; FinnGen: Thermo Fisher Scientific [FinnGen1 ThermoFisher Array or other genotyping arrays]; BCAC: Illumina [iCOGS OncoArray]; PRACTICAL: Illumina [iCOGS OncoArray] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array; FinnGen: FinnGen1 ThermoFisher Array or other genotyping arrays; BCAC: Infinium OncoArray-500K v1.0 BeadChip Kit; PRACTICAL: Infinium OncoArray-500K v1.0 BeadChip Kit |
Genotype Call Methods (software) | BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4; FinnGen: beagle4.1; BCAC: IMPUTE2; PRACTICAL: IMPUTE2 |
Association Analysis (software) | SAIGE software was used with age, sex, and top five principal components as covariates. RE2C software was used for the multi-trait meta-analysis adjusting for sample overlap between GWAS summary data. |
Filtering Methods | Sample QC and Variant QC for each dataset: refer to the ReadMe file. We excluded the variants with Rsq < 0.7 and MAF < 0.01. |
Marker Number (after QC) | BBJ: 13MN (7,398,798), each cancer (7,442,557 (7,420,485-7,444,681)); UK Biobank: 13MN (9,602,853), each cancer (9,620,786 (9,620,343-9,620,935)); BBJ + UK Biobank: 13MN (5,374,018), each cancer (5,696,155 (5,677,934-5,698,357)); BBJ + UK Biobank + FinnGen + BCAC (breast cancer): 5,104,756; BBJ + UK Biobank + FinnGen + PRACTICAL (prostate cancer): 5,105,796; BBJ + UK Biobank + FinnGen + BCAC + PRACTICAL (breast cancer + prostate cancer): 5,100,089. *mean (min-max) for each cancer |
NBDC Dataset ID |
(Click the Dataset ID to download the files) |
Total Data Volume | BBJ: 13MN (287 MB), each cancer (625 (605-633) MB); UK Biobank: 13MN (362 MB), each cancer (841 (814-859) MB); BBJ + UK Biobank: 13MN (202 MB), each cancer (260 (255-264) MB); BBJ + UK Biobank + FinnGen + BCAC (breast cancer): 242 MB; BBJ + UK Biobank + FinnGen + PRACTICAL (prostate cancer): 243 MB; BBJ + UK Biobank + FinnGen + BCAC + PRACTICAL (breast cancer + prostate cancer): 253 MB. *mean (min-max) for each cancer |
Comments (Policies) | NBDC policy |
Participants/Materials | Hunner-type interstitial cystitis cases (ICD10: N301): 144; Control participants: 41,516 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) | GenomeStudio for genotyping, shapeit4 for haplotype phasing, and minimac4 for imputation |
Association Analysis (software) | SAIGE |
Filtering Methods |
Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Japanese ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10^−10, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%. |
Marker Number (after QC) | 7,909,790 variants (hg19) |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 700 MB (txt) |
Comments (Policies) | NBDC policy |
Participants/Materials | 524 Japanese individuals (423 species in the gut microbiome); 306 Japanese individuals (306 plasma metabolites); 524 Japanese individuals (KEGG Gene Ortholog and KEGG Pathway) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | SNP array: Illumina [Infinium Asian Screening Array]; Whole genome sequencing: Illumina [HiSeq X Ten]; Metagenome shotgun sequencing: Illumina [HiSeq 2500/3000, NovaSeq 6000] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | SNP array: Infinium Asian Screening Array; Whole genome sequencing: TruSeq DNA PCR-Free Library Preparation Kit; Metagenome shotgun sequencing: KAPA Hyper Prep Kit |
Genotype Call Methods (software) | SNP array: Genotyping: GenomeStudio, Haplotype phasing: shapeit4, Imputation: minimac4; WGS: BWA-MEM v0.7.13 + GATK v3.8-0 |
Association Analysis (software) | PLINK2 |
Filtering Methods | SNP array data: Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Asian ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10^−10, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 1%. WGS: We excluded variants with genotype call rate < 90%, ExcessHet > 60, or Hardy–Weinberg P < 1.0 × 10^−10. After imputation with Beagle v5.1, we excluded imputed variants with minor allele frequency < 1%. |
Marker Number (after QC) | Gut microbiome/KEGG (SNP array): 7,213,470 variants (hg19); Metabolome (WGS): 6,840,258 variants (GRCh37) |
NBDC Dataset ID |
hum0197.v18.gwas.v1 (Gut microbiome, Plasma metabolites, KEGG) (Click the link above to download the files) |
Total Data Volume | Gut microbiome: 206 GB; Metabolome: 90.7 GB; KEGG: 300 MB |
Comments (Policies) | NBDC policy |
Participants/Materials | BioBank Japan: Type 2 diabetes (ICD10: E11): 27,642 cases, Control participants: 70,242; UK Biobank: Type 2 diabetes (ICD10: E11): 27,642 cases, Control participants: 70,242 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | BBJ: Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip]; UK Biobank: Applied Biosystems [UK BiLEVE Axiom Array, UK Biobank Axiom Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | BBJ: HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip; UK Biobank: UK BiLEVE Axiom Array, UK Biobank Axiom Array |
Genotype Call Methods (software) | BBJ: Eagle, Minimac3; UK Biobank: IMPUTE4 |
Association Analysis (software) | plink2 |
Filtering Methods | Variants with imputation quality of Rsq < 0.3 or minor allele frequency (MAF) < 1% were excluded. The details are described below. |
Marker Number (after QC) | BBJ second cohort: 728,824 variants; ToMMo: 855,161 variants |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 180 MB (txt) |
Comments (Policies) | NBDC policy |
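Per-variant weights of this kind are typically applied as a weighted sum of effect-allele dosages. The sketch below shows that standard calculation under stated assumptions: the layout of the distributed weight file is not described on this page, so the dictionaries, variant IDs, and function name here are hypothetical, and variants absent from the target genotypes are simply skipped.

```python
def polygenic_score(weights: dict[str, float], dosages: dict[str, float]) -> float:
    """Sum weight x effect-allele dosage over variants present in both inputs.

    `weights` maps variant ID -> per-allele weight.
    `dosages` maps variant ID -> effect-allele dosage (0 to 2) for one individual.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Toy example: the third variant is missing from the genotypes and is skipped.
toy_weights = {"v1": 0.5, "v2": -0.25, "v3": 0.125}
toy_dosages = {"v1": 2.0, "v2": 1.0}
score = polygenic_score(toy_weights, toy_dosages)
print(score)
```

Here the score is 0.5 × 2.0 + (−0.25) × 1.0 = 0.75. In practice the effect allele in the weight file must be matched to the allele counted in the dosage data before summing.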
Participants/Materials | Recurrent pregnancy loss cases (ICD10: N96): 1,728; Control participants: 24,315 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) | Genotyping: GenomeStudio; Haplotype phasing: shapeit4; Imputation: minimac4 |
Association Analysis (software) | SAIGE |
Filtering Methods |
Sample QC: We excluded individuals with low genotyping call rates (call rate < 98%). We included individuals of the estimated Japanese ancestry using PCA. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10^−10, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%. |
Marker Number (after QC) | 8,717,430 variants (hg19) |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 465 MB (txt) |
Comments (Policies) | NBDC policy |
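The variant QC thresholds listed under Filtering Methods above can be read as a set of per-variant predicates. The following is a minimal Python sketch of that reading, not the pipeline used for this dataset. The `Variant` record and its field names are hypothetical, and the sketch assumes the post-imputation rule drops a variant when either criterion (Rsq < 0.7 or MAF < 0.5%) is met.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Variant:
    # Hypothetical per-variant summary; field names are illustrative only.
    call_rate: float              # fraction of samples with a genotype call
    minor_allele_count: int
    hwe_p: float                  # Hardy-Weinberg equilibrium P-value
    af: float                     # allele frequency in the study samples
    panel_afs: Tuple[float, ...]  # allele frequencies in the reference panels


def passes_pre_imputation_qc(v: Variant) -> bool:
    """Apply the four genotyped-variant exclusion rules listed above."""
    if v.call_rate < 0.99:
        return False
    if v.minor_allele_count < 5:
        return False
    if v.hwe_p < 1.0e-10:
        return False
    # Exclude when the frequency differs by more than 5% from any panel.
    if any(abs(v.af - panel_af) > 0.05 for panel_af in v.panel_afs):
        return False
    return True


def passes_post_imputation_qc(rsq: float, maf: float) -> bool:
    """Assumption: a variant is dropped if Rsq < 0.7 or MAF < 0.5%."""
    return rsq >= 0.7 and maf >= 0.005
```

Keeping the two stages as separate functions mirrors the order in the table: genotyped variants are filtered first, and imputed variants are filtered afterwards.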
Participants/Materials |
Autoimmune diseases (ICD10: L400, M0690, M329, J840, G35): 2,238 cases Control participants: 2,919 |
Targets | WGS |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000/HiSeq X Ten] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Library Construction (kit name) | TruSeq DNA PCR-free Library Prep kit |
Fragmentation Methods | Ultrasonic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp x 2 |
Methods for removing host sequence/detecting viral sequence (software) |
https://github.com/shohei-kojima/integrated_HHV6_recon https://github.com/shohei-kojima/human_anellovirus_detection |
QC | We conducted principal component analysis (PCA) against HapMap3 data using SNP data of the same individuals to confirm the East Asian genetic background. |
Reference sequence for viral genome | Refer to the GitHub repositories of the software. |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000876 |
Total Data Volume | 181.3 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
Systemic lupus erythematosus (ICD10: M329): 8 cases eHHV-6B-positive: 3 cases eHHV-6B-negative: 5 cases |
Targets | scRNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | Chromium Next GEM Single Cell 5’ Library & Gel Bead Kit v1.1, Chromium Next GEM Chip G Single Cell Kit, Single Index Kit T Set A |
Fragmentation Methods | Enzymatic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 91 bp |
NBDC Dataset ID | JGAD000876 |
Total Data Volume | 181.3 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
Autoimmune diseases (ICD10: L400, M0690, M329, J840, G35): 238 cases eHHV-6B-positive: 22 cases eHHV-6B-negative: 216 cases |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000/HiSeq X Ten] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | TruSeq DNA PCR-free Library Prep kit |
Genotype Call Methods (software) | The FASTQ reads were aligned to T2T-CHM13v2.0 with BWA-MEM (v0.7.27), followed by GATK4 MarkDuplicates and Base Quality Score Recalibration (v4.2.6.1) according to the GATK Best Practice. Then, we performed per-sample SNP and indel calling using GATK4 HaplotypeCaller and joint genotyping using GATK4 GenomicsDBImport and GenotypeGVCFs. We conducted LD-based genotype refinement for low-confidence genotypes and missing sites in WGS data using BEAGLE v5.4 with default settings. |
Association Analysis (software) | PLINK v2.0 software was used with top two principal components and sex as covariates. |
Filtering Methods |
Sample QC: Individuals were excluded if they showed conflicting sex assignments between genetically inferred sex by variants and WGS coverage, deviating heterozygosity rate (±3 standard deviations), or cryptic relatedness (pi-hat > 0.2). We included samples of the estimated Japanese ancestry using PCA. Four cases were excluded. Variant QC: We excluded (1) non-autosomal variants, (2) multi-allelic sites and spanning deletions, and (3) variants with P-value for Hardy–Weinberg equilibrium < 1e-10 in cases and < 1e-6 in controls. |
Marker Number (after QC) | 6,464,509 SNPs |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 416 MB (tsv) |
Comments (Policies) | NBDC policy |
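Two of the sample QC rules above (heterozygosity rate deviating by more than 3 standard deviations, and pi-hat > 0.2) are simple threshold checks once the per-sample statistics are available. The sketch below is illustrative only: the input dictionaries, function names, and sample IDs are hypothetical, the statistics are assumed to have been computed beforehand by other tools, and the source does not state which member of a related pair was removed.

```python
from statistics import mean, stdev


def heterozygosity_outliers(het_rate, n_sd=3.0):
    """Return sample IDs whose heterozygosity rate lies more than
    n_sd standard deviations from the cohort mean."""
    mu = mean(het_rate.values())
    sd = stdev(het_rate.values())
    return {s for s, h in het_rate.items() if abs(h - mu) > n_sd * sd}


def related_pairs(pi_hat, threshold=0.2):
    """Return sample pairs whose pi-hat exceeds the threshold."""
    return {pair for pair, value in pi_hat.items() if value > threshold}


# Hypothetical inputs: 19 typical samples and one clear outlier.
rates = {f"s{i}": 0.30 for i in range(19)}
rates["s19"] = 0.50
flagged = heterozygosity_outliers(rates)
```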
Participants/Materials | The first cohort of Biobank Japan (n = 171,287) |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpressExome BeadChip, HumanOmniExpress BeadChip, HumanExome BeadChip |
Genotype Call Methods (software) | Eagle, Minimac3 |
Association Analysis (software) |
1) GCTA-fastGWA with the adjustment of covariates: age, age^2, sex, the top 20 PCs, 45 disease status, geographic regions, and PCA clusters. 2) Fixed-effect meta-analysis of Mainland summary data including individuals from the Mainland and EA_admix clusters (n = 151,075) and of Ryukyu summary data including individuals from the Ryukyu, Ryukyu admix, and Hokkaido_sub clusters (n = 10,080) using METAL. |
Filtering Methods |
Sample QC: We excluded (i) individuals with lower call rates (< 99%) and (ii) closely related individuals with genetic relatedness ≥ 0.178 calculated from a genetic relationship matrix (GRM) by GCTA (version 1.93.3beta2). We included samples of the estimated Japanese ancestry using PCA. Variant QC: We excluded variants with (i) call rate < 99%, (ii) P value for Hardy-Weinberg equilibrium (HWE) < 1.0 × 10^−6, (iii) number of heterozygotes < 5, and (iv) a concordance rate < 99.5% or a non-reference concordance rate between GWAS array and whole genome sequencing. After association test: Double genomic control correction method using METAL was conducted. Computing Z score for each variant by considering the sign of the beta coefficient and the associated p-value, we retained the variants with positive Z score. |
Marker Number (after QC) | 3,454,970 SNPs |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 65 MB (txt) |
Comments (Policies) | NBDC policy |
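The last filtering step above derives a Z score from the sign of the beta coefficient and the P-value. One common way to do this, assuming a two-sided P-value from a normal test statistic, is to take the normal quantile of p/2 as the magnitude and attach the sign of beta. The sketch below shows that conversion with the Python standard library; the function and field names are hypothetical, and the source does not confirm that this exact formula was used.

```python
import math
from statistics import NormalDist


def z_score(beta, p_value):
    """Signed Z score from an effect size and a two-sided P-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    # inv_cdf(p/2) is the lower-tail quantile; its absolute value is |Z|.
    magnitude = abs(NormalDist().inv_cdf(p_value / 2.0))
    return math.copysign(magnitude, beta)


def keep_positive(records):
    """Keep records (dicts with 'beta' and 'p') whose Z score is positive."""
    return [r for r in records if z_score(r["beta"], r["p"]) > 0]
```

Working from p/2 in the lower tail, rather than 1 − p/2, avoids rounding to 1.0 for very small P-values.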
Participants/Materials |
HPV-associated oropharyngeal cancer (ICD10: C10): 14 cases tumor tissues: 14 samples non-tumor tissues (blood): 14 samples |
Targets | WGS |
Target Loci for Capture Methods | - |
Platform | MGI [DNBSEQ-T7] |
Library Source | DNAs extracted from tumor tissues and blood samples |
Cell Lines | - |
Library Construction (kit name) | TruSeq DNA PCR-free Library Prep kit |
Fragmentation Methods | Ultrasonic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp x 2 |
NBDC Dataset ID | JGAD000890 |
Total Data Volume | 1.5 TB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
HPV-associated oropharyngeal cancer (ICD10: C10): 19 cases tumor tissues: 19 samples Non-HPV-associated oropharyngeal cancer (ICD10: C10): 17 cases tumor tissues: 17 samples Healthy controls: 2 individuals normal tonsils: 2 samples |
Targets | RNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq X Ten] |
Library Source | RNAs extracted from tumor tissues and normal tonsils |
Cell Lines | - |
Library Construction (kit name) | TruSeq Stranded mRNA Library Prep Kit |
Fragmentation Methods | chemical fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp x 2 |
NBDC Dataset ID | JGAD000890 |
Total Data Volume | 1.5 TB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
Systemic lupus erythematosus (ICD10: M329): 8 cases eHHV-6B-positive: 3 cases eHHV-6B-negative: 5 cases |
Targets | scRNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | Chromium Next GEM Single Cell 5’ Library & Gel Bead Kit v1.1, Chromium Next GEM Chip G Single Cell Kit, Single Index Kit T Set A |
Fragmentation Methods | Enzymatic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 91 bp |
NBDC Dataset ID | JGAD000876 |
Total Data Volume | 181.3 GB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
NMOSD (ICD10: G36.0): 240 cases control participants: 50,578 |
Targets | genome wide SNPs |
Target Loci for Capture Methods | - |
Platform | Illumina [Infinium Asian Screening Array] |
Library Source | DNAs extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Infinium Asian Screening Array |
Genotype Call Methods (software) |
Genotyping: GenomeStudio; Haplotype phasing: shapeit4; Imputation: minimac4 |
Association Analysis (software) | SAIGE |
Filtering Methods |
Sample QC: We excluded samples with low genotyping call rates (call rate < 98%) or potential sex chromosome aneuploidy. We included only the individuals of the estimated East Asian ancestry using the principal component (PC) analysis, and then further restricted to those in Japanese Hondo (the main island of Japan) clusters. Variant QC: We excluded variants with (1) genotyping call rate < 99%, (2) minor allele count < 5, (3) P-value for Hardy–Weinberg equilibrium < 1.0 × 10^−10, and (4) > 5% allele frequency difference compared with the imputation reference panel or the allele frequency panel of Tohoku Medical Megabank Project and in-house reference panel. Post-imputation QC: We excluded imputed variants with Rsq < 0.7 and minor allele frequency < 0.5%. |
Marker Number (after QC) | 8,894,915 variants |
NBDC Dataset ID |
(Click the Dataset ID to download the file) |
Total Data Volume | 159 MB (txt) |
Comments (Policies) | NBDC policy |
Participants/Materials | NMOSD (ICD10: G36.0): 25 cases |
Targets | scRNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000] |
Library Source | RNAs extracted from PBMC |
Cell Lines | - |
Library Construction (kit name) | Chromium Next GEM Single Cell 5’ Library & Gel Bead Kit v1.1, Chromium Next GEM Chip G Single Cell Kit, Single Index Kit T Set A |
Fragmentation Methods | Enzymatic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 90 bp |
Mapping Methods | Cell Ranger 6.0.0 |
QC Methods | GRCh38 |
Gene Number | Cell Ranger 6.0.0 |
Genomic Expression Archive Data set ID | E-GEAD-887 |
Total Data Volume | 1,477 MB (rds) |
Comments (Policies) | NBDC policy |
Participants/Materials |
COVID-19 (ICD-10: U071): 88 cases; healthy controls: 146 individuals |
Targets | eQTL |
Target Loci for Capture Methods | - |
Platform |
single cell RNA-seq: Illumina [NovaSeq 6000] WGS: Illumina [HiSeq X, NovaSeq 6000] (EGAS00001008016) |
Library Source | Single-cell RNA-seq data of 40 immune cell types and whole genome sequencing data using genomic DNA extracted from whole blood |
Cell Lines | - |
Reagents (Kit, Version) |
single-cell RNA-seq: See JGAS000543 / JGAS000593 / JGAS000783 WGS: TruSeq DNA PCR-Free Library Preparation Kit |
Genotype Call / Detecting read count Methods (software) |
Read count data: Cell Ranger (version 5.0.0, 10x Genomics, GRCh38) |
WGS: Mapping: BWA-MEM (version 0.7.17, alt-aware mode, GRCh38); Duplicated reads removal: Picard (MarkDuplicates); Contamination estimation: VerifyBamID2; Base quality score recalibration: GATK (version 4.2.6.1); Variant call: HaplotypeCaller; Joint-calling: GenotypeGVCFs; SNV/indel Variant Quality Score Recalibration: GATK Best Practice |
Filtering Methods |
Read count exclusion criteria: UMI < 1st percentile or > 99th percentile; gene expression < 200; reads from mitochondrial genes or Hemoglobin genes > 10%; putative doublets removal for each sample: Scrublet (v0.2.1), scds (v1.10.0) |
WGS exclusion criteria: DP < 5 or GQ < 20 in the autosomes, female X chromosomes, or pseudoautosomal region in the male X chromosomes; DP < 2 or GQ < 10 in the non-pseudoautosomal region in the male X chromosomes; heterozygous calls in the non-pseudoautosomal region in the male X chromosomes; genotype call rate < 0.90; Hardy–Weinberg P value < 1.0 × 10^−10; alleles < 3; indels larger than 100 bp; spanning deletions. In addition, we removed variants showing a significant difference in allele frequency (χ^2 > 60 by the chi-squared test) compared with the following representative reference datasets of Japanese ancestry: previously published Japanese WGS data (n = 1,037) and the public allele frequency panel of Tohoku Medical Megabank Project. Finally, we imputed missing genotypes at the QC-passing variants using SHAPEIT4 software version 4.2.1. |
Marker Number (after QC) |
Read count data: Genes expressed in 1% or more of cells (for each cell type) |
WGS: 6,001,990 SNPs |
eQTL algorithm | tensorQTL |
Genomic Expression Archive ID | E-GEAD-1054 |
Total Data Volume | 2.1 TB (txt) |
Comments (Policies) | NBDC policy |
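The genotype-level part of the WGS exclusion criteria above depends on both the chromosomal region and the sex of the sample. The sketch below encodes only the DP/GQ and heterozygosity rules for a single genotype call. The region labels (`"autosome"`, `"x_par"`, `"x_nonpar"`) and the function name are hypothetical, and the variant-level rules (call rate, Hardy–Weinberg, allele frequency comparison) are not covered.

```python
def keep_genotype(dp, gq, region, sex, is_het):
    """Return True if a single genotype call survives the DP/GQ rules.

    region: "autosome", "x_par" (pseudoautosomal) or "x_nonpar"
            (hypothetical labels for illustration)
    sex:    "male" or "female"
    """
    male_nonpar = sex == "male" and region == "x_nonpar"
    if male_nonpar:
        # Hemizygous region: relaxed thresholds, no heterozygous calls.
        if is_het:
            return False
        return dp >= 2 and gq >= 10
    # Autosomes, female X chromosomes, and male pseudoautosomal regions.
    return dp >= 5 and gq >= 20
```

The relaxed thresholds for the male non-pseudoautosomal X region reflect that only one copy is sequenced there, so the expected depth is roughly half that of diploid regions.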
Participants/Materials |
COVID-19 (ICD-10: U071): 83 cases; Healthy controls: 144 individuals |
Targets | Protein expression (2932 proteins) |
Target Loci for Capture Methods | - |
Platform | Olink [Olink Explore 3072] |
Library Source | Plasma |
Cell Lines | - |
Library Construction (kit name) | Olink Explore 3072 |
Fragmentation Methods | - |
Spot Type | - |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | - |
Detecting Methods for Proteins (software) | Olink Analyze R package |
Normalization Methods | Normalized Protein eXpression (NPX) transformation |
Validation Methods | - |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000925 |
Total Data Volume | 2.1 TB (txt) |
Comments (Policies) | NBDC policy |
Principal Investigator: Yukinori Okada
Affiliation: Department of Statistical Genetics, Osaka University Graduate School of Medicine
Project / Group Name: -
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
Precursory Research for Innovative Medical care (PRIME), Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED) | Crosstalk among microbiome, host, disease, and drug discovery enhanced by statistical genetics | JP19gm6010001 |
FORCE, Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED) | Elucidation of disease-specific microbiota and personalized medicine by metagenome-wide association studies | JP20gm4010006 |
Practical Research Project for Rare / Intractable Diseases, Japan Agency for Medical Research and Development (AMED) | Biology and in silico drug repositioning of pulmonary alveolar proteinosis using trans-layer omics analysis | JP20ek0109413 |
Practical Research Project for Allergic Diseases and Immunology, Japan Agency for Medical Research and Development (AMED) | Nucleic genome drug discovery for autoimmune diseases through in-silico and patient-oriented screening utilizing large-scale disease genetics | JP19ek0410041 |
Practical Research Project for Allergic Diseases and Immunology, Japan Agency for Medical Research and Development (AMED) | Genomic prediction medicine of rheumatoid arthritis based on comprehensive immune-omics resources | JP21ek0410075 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Implementation of genomic prediction medicine based on statistical genetics | JP21km0405211 |
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Next-generation genomics analyses elucidates biology, personalized medicine, and drug discovery of psoriasis | JP21km0405217 |
KAKENHI Grant-in-Aid for Scientific Research (A) | Elucidation of disease biology and tissue specificity by trans-layer omics analysis and whole-genome sequencing | 19H01021 |
KAKENHI Grant-in-Aid for Scientific Research (A) | Elucidation of immune and allergic disease dynamics by integrative sequencing analysis | 22H00476 |
No. | Title | DOI | Dataset ID |
---|---|---|---|
1 | Metagenome-wide association study of gut microbiome revealed novel aetiology of rheumatoid arthritis in the Japanese population. | doi: 10.1136/annrheumdis-2019-215743 | JGAD000290 |
2 | Genetic determinants of risk in autoimmune pulmonary alveolar proteinosis. | doi: 10.1038/s41467-021-21011-y | hum0197.v2.gwas.v1 |
3 | A metagenome-wide association study of gut microbiome in patients with multiple sclerosis revealed novel disease pathology. | doi: 10.3389/fcimb.2020.585973 | JGAD000363 |
4 | A global atlas of genetic associations of 220 deep phenotypes | doi: 10.1101/2020.10.23.20213652 | hum0197.v3.gwas.v1 |
5 | Metagenome-wide association study revealed disease-specific landscape of the gut microbiome of systemic lupus erythematosus in Japanese | doi: 10.1136/annrheumdis-2021-220687 | JGAD000427 |
6 | Whole gut virome analysis of 476 Japanese revealed a link between phage and autoimmune disease | doi: 10.1136/annrheumdis-2021-221267 | JGAD000532 |
7 | Insights from complex trait fine-mapping across diverse populations | doi: 10.1101/2021.09.03.21262975 | hum0197.v5.gwas.v1 hum0197.v5.finemap.v1 |
8 | Genetic architecture of microRNA expression and its link to complex diseases in the Japanese population. | doi: 10.1093/hmg/ddab361 | JGAD000621 hum0197.v6.eqtl.v1 |
9 | Multi-trait and cross-population genome-wide association studies across autoimmune and allergic diseases identify shared and distinct genetic components. | doi: 10.1136/annrheumdis-2022-222460 | hum0197.v10.gwas.v1 |
10 | DOCK2 is involved in the host genetics and biology of severe COVID-19 | doi: 10.1038/s41586-022-05163-5 | JGAD000662 |
11 | Prokaryotic and viral genomes recovered from 787 Japanese gut metagenomes revealed microbial features linked to diets, populations, and diseases | doi: 10.1016/j.xgen.2022.100219 | hum0197.v12 |
12 | Reconstruction of the personal information from human genome reads in gut metagenome sequencing data | doi: 10.1038/s41564-023-01381-3 | JGAD000363 JGAD000427 JGAD000532 JGAD000650 JGAD000729 |
13 | Pan-cancer and cross-population genome-wide association studies dissect shared genetic backgrounds underlying carcinogenesis | doi: 10.1038/s41467-023-39136-7 | hum0197.v16.gwas.v1 |
14 | Single-cell analyses and host genetics highlight the role of innate immune cells in COVID-19 severity | doi: 10.1038/s41588-023-01375-1 | JGAD000662 JGAD000722 |
15 | Genome-wide association analysis identifies susceptibility loci within the major histocompatibility complex region for Hunner-type interstitial cystitis | doi: 10.1016/j.xcrm.2023.101114 | hum0197.v17.hic-gwas.v1 |
16 | Analysis of gut microbiome, host genetics, and plasma metabolites reveals gut microbiome-host interactions in the Japanese population | doi: 10.1016/j.celrep.2023.113324 | hum0197.v18.gwas.v1 |
17 | Body mass index stratification optimizes polygenic prediction of type 2 diabetes in cross-biobank analyses | doi: 10.1038/s41588-024-01782-y | hum0197.v19.prs.v1 |
18 | Common and rare genetic variants predisposing females to unexplained recurrent pregnancy loss | doi: 10.1038/s41467-024-49993-5 | hum0197.v20.gwas.v1 |
19 | Blood DNA virome associates with autoimmune diseases and COVID-19 | doi: 10.1038/s41588-024-02022-z | hum0197.v21.gwas-ehhv6.v1 JGAD000876 |
20 | Genetic legacy of ancient hunter-gatherer Jomon in Japanese populations | doi: 10.1038/s41467-024-54052-0 | hum0197.v21.gwas-jomon.v1 |
21 | Intratumor Heterogeneity of HPV Integration in HPV-associated Head and Neck Cancer | doi: 10.1038/s41467-025-56150-z | JGAD000890 |
22 | Contribution of germline and somatic mutations to risk of neuromyelitis optica spectrum disorder | doi: 10.1016/j.xgen.2025.100776 | hum0197.v23.gwas-nmosd.v1 E-GEAD-887 |
23 | Deciphering state-dependent immune features from multi-layer human omics data at single-cell resolution | - | E-GEAD-1054 JGAS000783 |
Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
---|---|---|---|---|---|
Ilana Brito | Meinig School of Biomedical Engineering, Cornell University | United States of America | Comparative metagenomics of lupus patients' microbiomes | JGAD000290, JGAD000363, JGAD000427, JGAD000532 | 2022/05/12-2024/05/04 |
Yongxin Li | Department of Chemistry, The University of Hong Kong | Hong Kong | Comparison of gut bacterial diversity and composition in MS/EAE | JGAD000363 | 2022/09/19-2024/07/01 |
Tina Fuchs | Institute for Clinical Chemistry, Medical Faculty Mannheim, Heidelberg University | Germany | Investigating the clonality of VIREM cells in COVID-19 patients | JGAD000662, JGAD000772 | 2024/02/26-2024/12/31 |
Koichi Matsuda | Department of Computational Biology and Medical Sciences, Graduate school of Frontier Sciences, The University of Tokyo | Japan | Disease Cohort Research Network for Disease Marker Exploratory Studies | JGAD000290, JGAD000363, JGAD000427, JGAD000532, JGAD000649, JGAD000650, JGAD000662, JGAD000722, JGAD000729 | 2024/06/17-2029/03/31 |
hum0197 Release Note
Research ID | Release Date | Type of Data |
---|---|---|
hum0197.v24 | 2025/05/07 | Single-cell eQTL summary statistics of 40 immune cell types, scRNA-seq and plasma proteomics data for COVID-19 |
hum0197.v23 | 2024/12/18 | Summary statistics of the genome-wide meta-analysis of NMOSD (240 cases and 50,578 controls), Integrated single-cell object data of PBMCs from 25 NMOSD patients. |
hum0197.v22 | 2024/11/11 | NGS (WGS, RNA-seq) |
hum0197.v21 | 2024/10/28 | GWAS for autoimmune diseases, The presence or absence of endogenous herpesvirus 6 and anellovirus load calculated from NGS (WGS) for autoimmune diseases, scRNA-seq for eHHV-6-positive/negative SLE, GWAS for the individual Jomon proportions |
hum0197.v20 | 2024/05/30 | GWAS for recurrent pregnancy loss |
hum0197.v19 | 2024/05/29 | The weights of variants calculated from GWAS results on type 2 diabetes |
hum0197.v18 | 2023/10/02 | GWAS for gut microbiome, plasma metabolite, KEGG Gene Ortholog and KEGG Pathway |
hum0197.v17 | 2023/06/27 | GWAS for Hunner-type interstitial cystitis |
hum0197.v16 | 2023/06/06 | GWAS for 15 phenotypes |
hum0197.v15 | 2023/03/29 | Metagenome |
hum0197.v14 | 2023/02/16 | GWAS for 5 phenotypes |
hum0197.v13 | 2023/02/14 | Raw sequencing data of single-cell RNA-seq, clinical data |
hum0197.v12 | 2022/12/01 | Microbial genome: MAG, Viral genome, CRISPR spacers |
hum0197.v11 | 2022/07/21 | Raw sequencing data of single-cell RNA-seq |
hum0197.v10 | 2022/06/16 | GWAS for 9 phenotypes |
hum0197.v9 | 2022/06/10 | GWAS for Intracranial germ cell tumors |
hum0197.v8 | 2022/06/03 | Metagenome |
hum0197.v7 | 2022/05/23 | Metagenome |
hum0197.v6 | 2022/02/08 | Read count data of miRNA, eQTL summary data |
hum0197.v5 | 2021/12/21 | GWAS for 10 phenotypes, Fine-mapping for 79 phenotypes |
hum0197.v4 | 2021/12/10 | Metagenome |
hum0197.v3 | 2021/03/22 | GWAS for 215 phenotypes |
hum0197.v2 | 2020/11/27 | GWAS for autoimmune pulmonary alveolar proteinosis, Metagenome |
hum0197.v1 | 2019/11/15 | Metagenome |
- Single-cell eQTL summary statistics for 40 immune cell types using scRNA-seq analysis data from 234 Japanese individuals (88 COVID-19 cases and 146 healthy controls) and plasma proteome data (matrix data for each subject x each protein) from 227 individuals (83 COVID-19 cases and 144 healthy controls) are provided as text files.
- RNAs extracted from PBMCs of 87 Japanese individuals (15 COVID-19 patients and 72 healthy controls) were subjected to single-cell RNA-seq analysis. Fastq files are provided.
- DNAs extracted from peripheral blood of 240 NMOSD patients and 50,578 controls were used for whole genome sequencing. Summary statistics of the genome-wide meta-analysis performed on WGS are provided (txt file).
- RNAs extracted from PBMCs of NMOSD patients were subjected to single-cell RNA-seq analysis. Integrated data are provided (rds file).
- DNAs extracted from tumor tissues and blood samples of 14 HPV-associated oropharyngeal cancer patients were subjected to whole-genome sequencing. Fastq files are provided.
- RNAs extracted from tumor tissues of 19 HPV-associated oropharyngeal cancer patients and 17 Non-HPV-associated oropharyngeal cancer patients and normal tonsils of two healthy controls were subjected to bulk RNA sequencing. Fastq files are provided.
- DNAs extracted from peripheral blood of patients with autoimmune diseases and control participants were used for whole genome sequencing. The presence or absence of endogenous herpesvirus 6 and anellovirus load calculated from WGS are provided (tsv file).
- DNAs extracted from peripheral blood cells of patients with eHHV-6B-positive/-negative autoimmune diseases were genotyped, imputed, and genome-wide association study was performed (tsv file).
- RNAs extracted from PBMCs of eHHV-6B-positive/-negative SLE patients were subjected to single-cell RNA-seq analysis. (fastq files).
- DNAs extracted from the Japanese populations were genotyped, imputed, and genome-wide association study for the individual Jomon proportions was performed (text file).
DNAs extracted from peripheral blood cells of patients with recurrent pregnancy loss were genotyped, imputed, and genome-wide association study was performed (text file).
The weights of variants existing in the target cohorts, Tohoku Medical Megabank and the second cohort of BBJ, were calculated from GWAS results on 27,642 type 2 diabetes cases and 70,242 controls from BioBank Japan and 27,642 type 2 diabetes cases and 70,242 controls from UK Biobank (text file).
DNAs extracted from the Japanese populations were genotyped, imputed, and genome-wide association studies for gut microbiome, plasma metabolite, KEGG Gene Ortholog and KEGG Pathway were performed (text file).
DNAs extracted from peripheral blood cells of patients with Hunner-type interstitial cystitis were genotyped, imputed, and genome-wide association study was performed (text file).
DNAs extracted from the Japanese and trans-ethnic populations were genotyped, imputed, and genome-wide association studies and meta analyses for 15 phenotypes were performed (text file).
Metagenome analyses of the gut microbiome in the Japanese population and healthy individuals were performed by utilizing whole-genome shotgun sequencing. Fastq files are provided.
- Shotgun sequencing data for 88 Japanese individuals
- Shotgun sequencing data for 73 healthy individuals (DNA extraction was performed with phenol-chloroform extraction). Among these individuals, 47 were also subjected to DNA extraction with DNeasy PowerSoil Pro kit and sequenced.
- Deep shotgun sequencing data for 5 Japanese individuals
DNAs extracted from the Japanese and trans-ethnic populations were genotyped, imputed, and genome-wide association studies and meta analyses for 5 phenotypes were performed (text file).
RNAs extracted from PBMCs of 43 COVID-19 patients and 44 healthy controls were subjected to single-cell RNA-seq analysis. Fastq files are provided. Clinical data for hum0197.v11 and hum0197.v13 are provided as xlsx files.
Assembled microbial genome sequences obtained from metagenome analysis at hum0197.v7 and hum0197.v8 are provided (fasta files).
RNAs extracted from PBMCs of 30 COVID-19 patients and 31 healthy controls were subjected to single-cell RNA-seq analysis. Fastq files are provided.
DNAs extracted from the Japanese and British populations were genotyped and imputed, and genome-wide association studies and meta analyses for 9 phenotypes were performed (text file).
DNAs extracted from patients with intracranial germ cell tumors were genotyped and imputed, and a genome-wide association study was performed (text file).
Metagenome analysis of the gut microbiome in the Japanese population was performed by utilizing whole-genome shotgun sequencing. Fastq files are provided.
A metagenome-wide association study of gut microbiome in Inflammatory Bowel Disease (35 Ulcerative Colitis and 39 Crohn's disease) and 40 Healthy controls was performed by utilizing whole-genome shotgun sequencing. Fastq files are provided.
- RNAs extracted from PBMCs of 141 Japanese individuals were subjected to small RNA-seq analysis. Read count data of miRNAs is provided as a txt file.
- gDNAs extracted from whole blood of 141 Japanese individuals were subjected to whole genome sequencing analysis. miRNA-eQTL mapping was performed. eQTL summary data is provided as a txt file.
- DNAs extracted from the Japanese populations were genotyped and imputed, and genome-wide association studies for 10 phenotypes were performed (text file).
- DNAs extracted from the Japanese populations were genotyped and imputed, and fine-mapping studies for 79 phenotypes were performed (text file).
A metagenome-wide association study of gut microbiome in the Japanese population was performed by utilizing whole-genome shotgun sequencing. Fastq files are provided.
DNAs extracted from the Japanese and trans-ethnic populations were genotyped, imputed, and genome-wide association studies and meta analyses for 215 phenotypes were performed (text file).
- DNAs extracted from autoimmune pulmonary alveolar proteinosis patients were genotyped, imputed, and genome-wide association study was performed (text file).
- A metagenome-wide association study of gut microbiome in the Japanese population was performed by utilizing whole-genome shotgun sequencing. Fastq files are provided.
A metagenome-wide association study of gut microbiome in the Japanese population was performed by utilizing whole-genome shotgun sequencing. Fastq files are provided.
Note:
hum0362 Release Note
Research ID | Release Date | Type of Data |
---|---|---|
hum0362.v1 | 2025/04/01 | NGS (RNA-seq) |
RNAs extracted from frozen tissues from 97 thymoma samples and 1 normal thymus sample were used for RNA-sequencing analysis. Fastq files are provided.
Note:
NBDC Research ID: hum0362.v1
Aims: This study has the following four objectives:
1) We identify driver mutations, classify the disease by gene profiling, and evaluate biomarkers that predict treatment efficacy and genomic changes during recurrence or metastasis.
2) We identify biomarkers that predict therapeutic efficacy and toxicity by analyzing polymorphisms of genes related to pharmacokinetics and pharmacology in non-tumor tissues and genetic mutations related to familial tumors and by performing association analysis between germline gene profiling and clinical information.
3) We compare "Clinical Liquid Sequencing" using plasma-derived free DNA with tissue biopsy ("Clinical Sequencing") to evaluate its efficacy.
4) In thymic tumors, we will examine immune cell profiling. The relationship between thymic tumors and autoimmune diseases will also be examined.
Methods: RNA-seq analysis
Participants/Materials: 98 thymoma cases (97 thymoma samples and one normal thymus sample)
Dataset ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000552 | NGS (RNA-seq) | Controlled-access (Type I) | 2025/04/01 |
* Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data.
Participants/Materials |
Thymoma (ICD10: C37): 98 cases - thymoma: 97 samples - normal thymus: 1 sample |
Targets | RNA-seq |
Target Loci for Capture Methods | - |
Platform | Illumina [NovaSeq 6000] |
Library Source | RNAs extracted from frozen tissues |
Cell Lines | - |
Library Construction (kit name) | TruSeq Stranded Total RNA Library Prep Gold Kit |
Fragmentation Methods | Heat treatment |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 100bp x 2 |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000672 |
Total Data Volume | 249.2 GB (fastq) |
Comments (Policies) | NBDC policy |
Principal Investigator: Shinichi Yachida
Affiliation: Affiliated institution and division, Osaka University Cancer Genome Informatics
Project / Group Name: The study on genomic profiling using clinical specimens (tissue, blood, etc.) from patients with lung and thymic tumors
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
KAKENHI Grant-in-Aid for Scientific Research (B) | Metagenomic analysis of familial adenomatous polyposis and its implications for understanding the pathogenesis of colorectal cancer | JP20H033620 |
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Research aimed at elucidating alterations in the intestinal microenvironment associated with colorectal cancer development in high-risk individuals, and the development of personalized preventive strategies based on these findings | JP21ck0106546 |
Project for Cancer Research and Therapeutic Evolution, Japan Agency for Medical Research and Development (AMED) | Research and development aimed at predicting the occurrence of adverse effects associated with anticancer drugs through comprehensive analysis of the gut microbiota | JP21cm0106477 |
Research Program on the challenges of Global Health issues, Japan Agency for Medical Research and Development (AMED) | Investigation of infection-associated cancers conducted within the framework of the Japan–U.S. Medical Cooperation Program | JP20jk0210009 |
Project for Promotion of Cancer Research and Therapeutic Evolution, Japan Agency for Medical Research and Development (AMED) | Development of biomarkers for early detection and susceptibility assessment of young-onset colorectal cancer through comprehensive analysis of the intestinal microenvironment | JP22ama221404 |
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Collection and analysis of whole-genome sequencing data and clinical information on treatment-resistant gastrointestinal cancers, such as esophageal cancer, with the aim of facilitating drug development and establishing a nationwide infrastructure for genome-informed medical care in Japan | JP22ck0106690 |
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Advancement of preventive, diagnostic, and therapeutic approaches through comprehensive genomic analysis of rare malignancies, including sarcomas and brain tumors | JP21ck0106693 |
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Development of novel strategies for prevention, diagnosis, and treatment through comprehensive genomic analysis of rare cancers, including sarcomas and brain tumors | JP24ck0106877 |
Practical Research for Innovative Cancer Control, Japan Agency for Medical Research and Development (AMED) | Implementation of advanced cancer precision medicine through the timely and accurate return of patient-specific data, including whole-genome information, alongside the development of novel therapeutics | JP24ck0106872 |
the National Cancer Center Research and Development Fund | - | 2023-A-06 |
Japan Health Research Promotion Bureau Research Fund | - | JH2022-B-01 |
Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University | - | - |
Joint Research Project of the Institute of Medical Science, the University of Tokyo | - | - |
the Takeda Science Foundation | - | - |
the Yasuda Medical Foundation | - | - |
the Mitsubishi Foundation | - | - |
the Princess Takamatsu Cancer Research Fund | - | - |
Takara Bio Inc. | - | - |
No. | Title | DOI | Dataset ID |
---|---|---|---|
1 | Characterizing the Tumor Immune Environment in Thymic Epithelial Tumors Using T-cell Receptor Repertoire Analysis and Gene Expression Profiling | doi: 10.1016/j.xjon.2025.03.008 | JGAD000672 |
Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
---|---|---|---|---|---|
NBDC Research ID: hum0434.v1
Aims: The aim of this study is to identify somatic gene mutations associated with various diseases involving T cells or NK cells, and to perform integrated analyses with clinical data in order to elucidate gene sets that contribute to disease onset or treatment resistance. These findings are expected to provide fundamental data for the development of novel diagnostic criteria and therapeutic agents.
Methods: whole exome sequencing and target capture sequencing
Participants/Materials:
- whole exome sequencing: CD4-positive cells and CD8-positive cells isolated from 10 patients with pure red cell aplasia were used.
- targeted capture sequencing: Peripheral blood mononuclear cells (PBMCs) from 53 patients with pure red cell aplasia, 10 patients with aplastic anemia, 2 healthy controls, and 55 patients with large granular lymphocytic leukemia were used. For 4 patients with pure red cell aplasia, PBMCs were collected at 2 different time points (Case 1: onset and 9y later, Case 2: onset and 5y later, Case 3: onset and 6y later, Case 4: onset and 3y later).
Dataset ID | Type of Data | Criteria | Release Date |
---|---|---|---|
JGAS000658 | | Controlled-access (Type I) | 2025/04/17 |
JGAS000709 | NGS (Target Capture) | Controlled-access (Type I) | 2025/04/17 |
* Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data. Learn more
Participants/Materials | pure red cell aplasia (ICD10: D600): 10 cases - CD4-positive cells: 10 samples - CD8-positive cells: 10 samples |
Targets | Exome |
Target Loci for Capture Methods | - |
Platform | Thermo Fisher Scientific [Ion Torrent S5] |
Library Source | DNAs extracted from CD4-positive cells and CD8-positive cells of patients |
Cell Lines | - |
Library Construction (kit name) | Ion AmpliSeq Exome RDY Kit |
Fragmentation Methods | - |
Spot Type | Single-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 200 bp |
Mapping Methods | Torrent Suite software program |
Mapping Quality | MAPQ = 6.5 |
Reference Genome Sequence | hg19 |
Coverage (Depth) | CD4-positive cells: 84x; CD8-positive cells: 136x |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000788 |
Total Data Volume | 701.6 GB (bam) |
Comments (Policies) | NBDC policy |
Participants/Materials | [JGAS000658] pure red cell aplasia (ICD10: D600): 53 cases (for 4 of these patients, PBMCs were collected at 2 different time points; C1: onset and 9y later, C2: onset and 5y later, C3: onset and 6y later, C4: onset and 3y later) - aplastic anemia (ICD10: D61.3): 10 cases - healthy controls: 2 - PBMCs: 69 samples - buccal swab, CD4-positive cells, and neutrophils from 1 of the pure red cell aplasia cases above: 3 samples in total. [JGAS000709] large granular lymphocytic leukemia (ICD10: C917): 55 cases - PBMCs: 55 samples |
Targets | Target Capture |
Target Loci for Capture Methods | 52 genes (Gene list) |
Platform | Thermo Fisher Scientific [Ion GeneStudio S5] |
Library Source | DNAs extracted from PBMCs, buccal swab, CD4-positive cells and neutrophils |
Cell Lines | - |
Library Construction (kit name) | Ion AmpliSeq Library Kit Plus |
Fragmentation Methods | - |
Spot Type | Single-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 200 bp |
Japanese Genotype-phenotype Archive Dataset ID | JGAD000788, JGAD000842 |
Total Data Volume | JGAD000788: 701.6 GB (fastq); JGAD000842: 169 GB (fastq) |
Comments (Policies) | NBDC policy |
Principal Investigator: Fumihiro Ishida
Affiliation: Department of Biomedical Laboratory Sciences, Shinshu University School of Medicine
Project / Group Name: -
Funds / Grants (Research Project Number):
Name | Title | Project Number |
---|---|---|
KAKENHI Grant-in-Aid for Scientific Research (C) | Mutational profiles of T cells in bone marrow failure syndrome as clinical markers | 20K08709 |
KAKENHI Grant-in-Aid for Young Scientists | Elucidation of immune abnormalities in thymoma and related autoimmune diseases through comprehensive genetic analysis of T cells | 21K16302 |
No. | Title | DOI | Dataset ID |
---|---|---|---|
1 | Mutational heterogeneities in STAT3 and clonal hematopoiesis-related genes in acquired pure red cell aplasia | doi: 10.1007/s00277-025-06356-4 | JGAD000788, JGAD000842 |
Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use |
---|---|---|---|---|---|