About NBDC Human Database

An enormous amount of human data is being generated with advances in next-generation sequencing and other analytical technologies. We therefore need rules and mechanisms for organizing and storing such data and for effectively utilizing them to make progress in the life sciences.

To promote sharing and utilization of human data while considering the protection of personal information, the Database Center for Life Science (DBCLS) of the Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS-DS) created a platform for sharing various data generated from human specimens, which are available for public access in cooperation with the DNA Data Bank of Japan (DDBJ).

You can apply to use or submit human data through this website.

Violators of the guidelines who have not submitted a report on the deletion of Controlled-access data shall be disclosed here.

NBDC Research ID: hum0014.v34

 

SUMMARY

Aims: Identify disease-related genes and mobile element variations in the Japanese population

Methods: Genomic DNA samples were genotyped by the following methods: Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip, HumanOmniExpress-12 BeadChip, HumanExome BeadChip, OmniExpressExome BeadChip (Illumina), high-density oligonucleotide arrays (Perlegen Sciences), or Invader (Hologic Japan). Genome-wide association studies (GWAS) for myocardial infarction (MI), type II diabetes mellitus (T2DM), atopic dermatitis (AD), atrial fibrillation (AF), body mass index (BMI), primary open-angle glaucoma (POAG), 58 quantitative traits, age at menarche / menopause, smoking behaviour, height, 42 diseases (among them, the samples of 4 diseases partially overlapped with those of the previous release), dietary habits, and coronary artery disease were performed using about 500K to 2,700K variants. Meta-analyses for T2DM with diabetic nephropathy and for T2DM were also performed. SNP array analysis for 51 diseases registered in BioBank Japan was performed. Whole-genome sequencing analyses for 1,026 + 1,007 patients who were registered in BioBank Japan from 2003 to 2007, 1,765 myocardial infarction patients, 199 dementia patients, 256 + 2,067 gastric cancer patients, 617 colorectal cancer patients, and 2,162 diabetes patients were performed with Illumina HiSeq 2500/X Five.
Target sequencing analyses were performed with Illumina HiSeq 2500 for the following: 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls; 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls; 23 genes related to clonal hematopoiesis in 11,234 subjects extracted from approximately 200,000 subjects registered in BioBank Japan between fiscal years 2003 and 2007; 27 cancer-predisposing genes in 1,009 pancreatic cancer patients, 12,606 colorectal cancer patients, 740 renal cell cancer patients, 1,982 lymphoma patients, 10,366 gastric cancer patients, and 23,780 + 5,996 + 37,592 controls; and 13 renal cell carcinoma-related genes in 740 renal cell cancer patients and 5,996 controls. SNP array analysis for 11,234 subjects was also performed. A new reference panel was built with WGS data of the BioBank Japan project (N=1,037) and the 1KGP p3v5 ALL (N=2,504). Sex-stratified genome-wide association studies using a Cox proportional hazard model under the assumption of the additive genetic model were performed. Associations of genetic variants estimated by saddlepoint approximation using SPACox software were also evaluated. A mobile element variation (MEV) search tool, MEGAnE, was applied to 4,880 whole genomes sequenced in BBJ, and 24,933 MEVs were found. A genome-wide association study for atrial fibrillation was performed in 9,826 cases and 140,446 controls. A subsequent cross-ancestry meta-analysis with a European GWAS (60,620 cases and 970,216 controls; http://csg.sph.umich.edu/willer/public/afib2018) and a Finnish GWAS (7,244 cases and 56,378 controls; FinnGen; https://www.finngen.fi/en) was performed (77,690 cases and 1,167,040 controls in total). A polygenic risk score was constructed based on the cross-ancestry meta-analysis of atrial fibrillation.
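
The totals quoted for the cross-ancestry atrial fibrillation meta-analysis can be checked by summing the three contributing studies. The following is a minimal sketch in Python; the dictionary keys are illustrative labels chosen here, not dataset identifiers.

```python
# Case/control counts as quoted in the Methods text for the AF meta-analysis.
AF_STUDIES = {
    "BBJ": (9_826, 140_446),
    "European GWAS": (60_620, 970_216),
    "FinnGen": (7_244, 56_378),
}


def meta_totals(studies):
    """Return (total cases, total controls) summed over all studies."""
    cases = sum(n_cases for n_cases, _ in studies.values())
    controls = sum(n_controls for _, n_controls in studies.values())
    return cases, controls


total_cases, total_controls = meta_totals(AF_STUDIES)
print(total_cases, total_controls)  # 77690 1167040
```

The sums agree with the 77,690 cases and 1,167,040 controls stated for the meta-analysis.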

Participants/Materials: Participants for the Tailor-made Medical Treatment Program (BioBank Japan: BBJ)

URL: https://biobankjp.org/cohort_3rd/english/index.html

 

Dataset ID / Type of Data / Criteria / Release Date
hum0014.v1.freq.v1 GWAS for MI Unrestricted-access 2014/09/30
hum0014.v2.jsnp.934ctrl.v1 Genotype frequencies in 934 healthy individuals (JSNP data) Unrestricted-access 2015/12/28
35 Diseases Genotype frequencies in each disease (JSNP data) Unrestricted-access 2015/12/28
hum0014.v2.jsnp.182ec.v1 Genotype frequencies in 182 esophageal cancer patients (JSNP data) Unrestricted-access 2015/12/28
hum0014.v2.jsnp.92als.v1 Genotype frequencies in 92 amyotrophic lateral sclerosis (ALS) patients (JSNP data) Unrestricted-access 2015/12/28
hum0014.v3.T2DM-1.v1 GWAS for T2DM [1] Unrestricted-access 2016/01/28
hum0014.v3.T2DM-2.v1 GWAS for T2DM [2] Unrestricted-access 2016/01/28
hum0014.v4.AD.v1 GWAS for AD Unrestricted-access 2016/02/02
hum0014.v5.AF.v1 GWAS for AF Unrestricted-access 2016/05/18
JGAS000101 Genotype and phenotype data for 8180 AF patients Controlled-access (Type I) 2016/05/18
hum0014.v6.158k.v1 GWAS for BMI Unrestricted-access 2017/09/08
JGAS000114 BMI data for 158,284 individuals; Genotype data for 182,505 individuals Controlled-access (Type I) 2017/09/08
hum0014.v7.POAG.v1 GWAS for POAG Unrestricted-access 2018/04/04
hum0014.v8.58qt.v1 GWAS for 58 quantitative traits Unrestricted-access 2018/05/01
JGAS000114 58 quantitative traits data for 200,849 individuals Controlled-access (Type I) 2018/05/01

hum0014.v9.Men.v1 / hum0014.v9.MP.v1 GWAS for age at menarche and menopause Unrestricted-access 2018/08/07
JGAS000114 WGS for 1,026 individuals Controlled-access (Type I) 2018/08/13
JGAS000140 target sequencing of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls Controlled-access (Type I) 2018/10/16
hum0014.v12.T2DMwN.v1 meta analysis of 2 GWASs for T2DM with diabetic nephropathy Unrestricted-access 2018/12/10
hum0014.v13.T2DMmeta.v1 meta analysis of 4 GWASs for T2DM Unrestricted-access 2019/01/25
hum0014.v14.smok.v1 GWAS for smoking behaviour Unrestricted-access 2019/03/26
JGAS000114 a reference panel from WGS data of the biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504) Controlled-access (Type I) 2019/09/27
hum0014.v15.ht.v1 GWAS for height Unrestricted-access 2019/09/27
JGAS000203 target sequencing of 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls Controlled-access (Type I) 2019/10/07
hum0014.v17 GWAS for 40 diseases Unrestricted-access 2019/10/08
hum0014.v18 GWAS for Breast cancer Unrestricted-access 2019/11/26
hum0014.v19 GWAS for dietary habits Unrestricted-access 2020/04/20
hum0014.v20.cad.v1 GWAS for coronary artery disease Unrestricted-access 2020/08/17
hum0014.v21 GWAS for coronary artery disease Unrestricted-access 2020/08/25
JGAS000293 target sequencing of 23 genes related to clonal hematopoiesis and SNP array in 11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 to 2007 Controlled-access (Type I) 2021/05/21
JGAS000114 (Data addition) bam/gvcf data of WGS (JGAD000220) Controlled-access (Type I) 2021/07/13
JGAS000327 target sequencing of 27 cancer-predisposing genes in 1,005 pancreatic cancer patients Controlled-access (Type I) 2021/11/26
JGAS000346 target sequencing of 27 cancer-predisposing genes in 12,503 colorectal cancer patients and 23,705 controls Controlled-access (Type I) 2021/11/26
JGAS000381 WGS for 1,765 myocardial infarction patients and 199 dementia patients Controlled-access (Type I) 2022/01/25
JGAS000414 target sequencings of 27 cancer-predisposing genes and 13 renal cell carcinoma-related genes in 740 renal cell cancer patients and 5,996 controls Controlled-access (Type I) 2022/04/01
hum0014.v27.surv.v1 GWAS for survival time in 137,693 individuals from BBJ 1st cohort Unrestricted-access 2022/12/31
hum0014.v28.MEs.v1 mobile element variations in 4,880 individuals from BBJ 1st cohort Unrestricted-access 2023/04/05
hum0014.v29.AF.v1 GWAS for 9,826 AF patients and 140,446 controls from BBJ 1st cohort; GWAS meta-analysis for 77,690 AF patients and 1,167,040 controls Unrestricted-access 2023/04/05
JGAS000347 target sequencing of 27 cancer-predisposing genes in 1,982 lymphoma patients Controlled-access (Type I) 2023/04/20
JGAS000592 target sequencing of 27 cancer-predisposing genes in 10,366 gastric cancer patients Controlled-access (Type I) 2023/04/20
JGAS000592 target sequencing of 27 cancer-predisposing genes in 37,592 controls Controlled-access (Type I) 2023/04/20
JGAD000690 Processed data of JGAD000220 (WGS for 1,026 individuals) by JGA (CRAM, gVCF) Controlled-access (Type I) 2023/08/31
JGAD000758 Processed data (joint call) of JGAD000220 (WGS for 1,026 individuals) by JGA (aggregate VCF) Controlled-access (Type I) 2023/08/31
JGAD000679 Processed data of JGAD000220 (reference panel) by JGA (data for the TogoImputation reference panel) Controlled-access (Type I) 2023/09/01
JGAS000647 WGS for 1,007 individuals Controlled-access (Type I) 2024/01/11
JGAS000698 WGS for 256 gastric cancer patients Controlled-access (Type I) 2024/05/27
JGAS000703 SNP array for 269,000 patients (51 diseases) in BBJ 1st and 2nd cohort Controlled-access (Type I) 2024/05/27
JGAS000699 WGS for 617 colorectal cancer patients Controlled-access (Type I) 2024/05/27
JGAS000700 low-depth WGS for 2,162 diabetes patients Controlled-access (Type I) 2024/05/27
JGAS000701 low-depth WGS for 2,067 gastric cancer patients Controlled-access (Type I) 2024/05/27
JGAD000867 Processed data of JGAD000220 (reference panel) by JGA (data for the TogoImputation reference panel) Controlled-access (Type I) 2024/09/19
JGAD000868 Processed data of JGAD000495 (reference panel) by JGA (data for the TogoImputation reference panel) Controlled-access (Type I) 2024/09/19

* Release Note

* Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data. Learn more

* When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments. Learn more

 

MOLECULAR DATA

hum0014.v1.freq.v1

Participants/Materials 1666 MI patients and 3198 controls
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Illumina Human610-Quad Beadchip
Genotype Call Methods (software) GenCall software (GenomeStudio)
Filtering Methods sample call rate ≧ 0.98, SNP call rate ≧ 0.99, HWE P ≧ 1 x 10^-6
Marker Number (after QC) 455,781 SNPs (hg18/NCBI36)
NBDC Dataset ID

hum0014.v1.freq.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 71.3 MB (xlsx)
Comments (Policies) NBDC policy
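
The genotyping filters listed for this dataset (sample call rate ≧ 0.98, SNP call rate ≧ 0.99, HWE P ≧ 1 x 10^-6) can be illustrated with a short Python sketch. This is not the pipeline used to produce the dataset: the genotype coding (0, 1, 2, or None for a missing call), the function names, and the use of a 1-df chi-square test for Hardy-Weinberg equilibrium are all assumptions made for illustration. The record does not state which HWE test was used.

```python
import math


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square test is used here for brevity; an exact test
    may be preferred in practice.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def call_rate(calls):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    return sum(g is not None for g in calls) / len(calls)


def qc_filter(genotypes, min_sample_cr=0.98, min_snp_cr=0.99, min_hwe_p=1e-6):
    """Apply sample-level, then SNP-level filters.

    genotypes: non-empty dict of sample_id -> list of calls, one per SNP.
    Returns (kept sample ids, kept SNP indices).
    """
    kept_samples = [s for s, calls in genotypes.items()
                    if call_rate(calls) >= min_sample_cr]
    n_snps = len(next(iter(genotypes.values())))
    kept_snps = []
    for j in range(n_snps):
        column = [genotypes[s][j] for s in kept_samples]
        if not column or call_rate(column) < min_snp_cr:
            continue
        counts = [column.count(g) for g in (0, 1, 2)]
        if hwe_chi2_p(*counts) >= min_hwe_p:
            kept_snps.append(j)
    return kept_samples, kept_snps
```

Samples are filtered before SNPs so that SNP call rates are computed only over retained samples; the record does not state the order actually used, so this ordering is also an assumption.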

 

hum0014.v2.jsnp.934ctrl.v1

Participants/Materials 934 Japanese healthy individuals (JSNP)
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanHap550v3 Genotyping BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Illumina HumanHap550v3 Genotyping BeadChip
Genotype Call Methods (software) GenCall software (GenomeStudio)
Filtering Methods sample call rate < 0.98, SNP call rate < 0.99, HWE P < 1 x 10^-6
Marker Number (after QC) 515,286 SNPs
NBDC Dataset ID

hum0014.v2.jsnp.934ctrl.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 32.2 MB (zip [xls])
Comments (Policies) NBDC policy

 

35 Diseases (JSNP)

Participants/Materials

Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer)

Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans)

Cerebrovascular disorders (Brain infarction, Intracranial aneurysm)

Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma)

Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis)

Eye diseases (Cataract, Glaucoma)

Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease)

 

about 190 patients in each disease set

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Perlegen Sciences [high-density oligonucleotide arrays]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) -
Genotype Call Methods (software) -
Filtering Methods -
Marker Number (after QC) About 200,000 SNPs (b129)
NBDC Dataset ID

Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer)

Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans)

Cerebrovascular disorders (Brain infarction, Intracranial aneurysm)

Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma)

Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis)

Eye diseases (Cataract, Glaucoma)

Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease)

(Click the disease names to download the file)

Dictionary file

Comments (Policies) NBDC policy

*Chromosomal position of each SNP is based on dbSNP build 129. If you need other mapping information, please contact us.

 

hum0014.v2.jsnp.182ec.v1

Participants/Materials 182 esophageal cancer patients (JSNP)
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanHap550v3 Genotyping BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Illumina HumanHap550v3 Genotyping BeadChip
Genotype Call Methods (software) GenCall software (GenomeStudio)
Filtering Methods sample call rate < 0.98, SNP call rate < 0.99, HWE P < 1 x 10^-6
Marker Number (after QC) 503,734 SNPs
NBDC Dataset ID

hum0014.v2.jsnp.182ec.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 6.6 MB (zip [txt])
Comments (Policies) NBDC policy

 

hum0014.v2.jsnp.92als.v1

Participants/Materials 92 ALS patients (JSNP)
Targets large-scale case-control association study
Target Loci for Capture Methods -
Platform Hologic Japan [Invader]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Invader assay system (Third Wave Technologies)
Genotype Call Methods (software) ABI PRISM SDS versions 2.0 - 2.2
Filtering Methods SNP call rate ≥ 0.95, HWE P ≥ 1.0 x 10^-2
Marker Number (after QC) 48,939 SNPs
NBDC Dataset ID

hum0014.v2.jsnp.92als.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 3.2 MB (zip [txt])
Comments (Policies) NBDC policy

 

hum0014.v3.T2DM-1.v1

Participants/Materials

9817 T2DM patients

6763 controls (healthy individuals and patients with Intracranial aneurysm, Esophageal cancer, Uterine cancer, Pulmonary emphysema, or Glaucoma [without T2DM])

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [OmniExpressExome Beadchip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Illumina OmniExpressExome Beadchip kit
Genotype Call Methods (software) GenCall software (GenomeStudio)
Filtering Methods sample call rate < 0.98, SNP call rate < 0.99, MAF < 0.01, HWE P < 1 x 10^-6 in control
Marker Number (after QC) 552,915 SNPs (hg19)
NBDC Dataset ID

hum0014.v3.T2DM-1.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 84.0 MB (xlsx)
Comments (Policies) NBDC policy

 

hum0014.v3.T2DM-2.v1

Participants/Materials

5646 T2DM patients

19,420 controls (patients with Colorectal cancer, Breast cancer, Prostate cancer, Lung cancer, Gastric cancer, Arteriosclerosis obliterans, Cardiac arrhythmias, Brain infarction, Myocardial infarction, Gallbladder cancer and Cholangiocarcinoma, Pancreatic cancer, Drug eruptions, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Liver cancer, Liver cirrhosis, Osteoporosis, or Uterine myoma [without T2DM])

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [Human610-Quad BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Illumina Human610-Quad Beadchip kit
Genotype Call Methods (software) GenCall software (GenomeStudio)
Filtering Methods sample call rate < 0.98, SNP call rate < 0.99, MAF < 0.01, HWE P < 1 x 10^-6 in control
Marker Number (after QC) 479,088 SNPs (hg18)
NBDC Dataset ID

hum0014.v3.T2DM-2.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 72.6 MB (xlsx)
Comments (Policies) NBDC policy

 

hum0014.v4.AD.v1

Participants/Materials

1472 AD patients

7966 controls (healthy individuals and patients with Intracranial aneurysm, Esophageal cancer, Uterine cancer, Pulmonary emphysema, or Glaucoma [without AD and Bronchial asthma])

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress BeadChip
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

Association Analysis (software) mach2dat [GWAS]
Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 in the control samples

Imputation QC: HWE P < 1 x 10^-6 or MAF < 0.01 in the reference panel; differences of MAF between the GWAS dataset and the reference panel > 0.16

Marker Number (after QC) About 7,700,000 SNPs (hg19)
NBDC Dataset ID

hum0014.v4.AD.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume

ADGWAS_auto.txt (525 MB)

ADGWAS_X_females.txt (17 MB)

ADGWAS_X_males.txt (15 MB)

Comments (Policies) NBDC policy

 

JGAS000101 / hum0014.v5.AF.v1

Participants/Materials

8180 atrial fibrillation patients and 28,612 controls

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source DNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

GenCall software (GenomeStudio)

Association Analysis (software) mach2dat [GWAS]
Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 in the control samples

Imputation QC: HWE P < 1 x 10^-6 or MAF < 0.01 in the reference panel; differences of MAF between the GWAS dataset and the reference panel > 0.16; R square < 0.9

Marker Number (after QC) About 5,000,000 SNVs
Phenotype Data Gender, Age

NBDC Dataset ID /

Japanese Genotype-phenotype

Archive Dataset ID

[GWAS stats]

hum0014.v5.AF.v1

(Click the Dataset ID to download the file)

Dictionary file

[Individual datasets]

Phenotype: JGAD000101

Genotype: JGAD000102

Total Data Volume

GWAS: 473 MB (txt)

Individual phenotype-genotype data: 1 GB (txt)

Comments (Policies) NBDC policy
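
The imputation QC criteria listed for this dataset are exclusion rules, and they can be sketched as a single predicate. The following Python sketch is illustrative only: the `ImputedVariant` class and its field names are hypothetical, and only the thresholds are taken from the Filtering Methods entry.

```python
from dataclasses import dataclass


@dataclass
class ImputedVariant:
    # Hypothetical record layout, not the format of the distributed files.
    variant_id: str
    ref_hwe_p: float  # HWE P-value in the reference panel
    ref_maf: float    # minor allele frequency in the reference panel
    gwas_maf: float   # minor allele frequency in the GWAS dataset
    rsq: float        # imputation quality (R square)


def passes_imputation_qc(v, min_rsq=0.9, max_maf_diff=0.16,
                         min_maf=0.01, min_hwe_p=1e-6):
    """Return True if the variant survives all exclusion criteria."""
    # Excluded: HWE P < 1e-6 or MAF < 0.01 in the reference panel.
    if v.ref_hwe_p < min_hwe_p or v.ref_maf < min_maf:
        return False
    # Excluded: MAF difference between GWAS dataset and reference panel > 0.16.
    if abs(v.gwas_maf - v.ref_maf) > max_maf_diff:
        return False
    # Excluded: R square < 0.9.
    return v.rsq >= min_rsq
```

For example, `passes_imputation_qc(ImputedVariant("rs_example", 0.5, 0.20, 0.25, 0.95))` returns `True`, while the same variant with `rsq=0.85` is excluded.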

 

JGAS000114 / hum0014.v6.158k.v1

Participants/Materials 182,505 individuals (158,284 individuals for BMI study)
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source DNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

GenCall software (GenomeStudio)

Association Analysis (software) mach2qtl (v1.1.3)
Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6

QC for reference panel:

After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded.

QC after imputation:

Variants with imputation quality of Rsq < 0.7 were excluded.

Marker Number (after QC) About 6,000,000 and 150,000 SNVs on autosomes and X-chromosome, respectively.

NBDC Dataset ID /

Japanese Genotype-phenotype

Archive Dataset ID

[GWAS]

hum0014.v6.158k.v1

(Click the Dataset ID to download the file)

Dictionary file

[Individual datasets]

Phenotype data (BMI): JGAD000124

Genotype data: JGAD000123

Detailed information on genotyping array

Probe information (BLAST)

Total Data Volume

GWAS: 406 MB (zip)

Phenotype data (BMI): 3.32 MB (txt.gz)

Genotype data: 26.3GB (csv.gz)

Comments (Policies) NBDC policy

 

hum0014.v7.POAG.v1

Participants/Materials

3980 POAG patients (Male: 1,997, Female: 1,983)

18,815 controls (Male: 7,817, Female: 10,998)

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac (ver. 0.1.1) [imputation (1000 genomes Phase I v3)]

Association Analysis (software) mach2dat (ver. 1.0.19)
Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6

QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded.

QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. We also excluded variants with |beta| > 4 in the uploaded files.

Marker Number (after QC)

autosomes: 5,961,428 SNPs (hg19)

male X-chromosome: 147,351 SNPs (hg19)

female X-chromosome: 147,353 SNPs (hg19)

NBDC Dataset ID

hum0014.v7.POAG.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 113 MB (txt.zip)
Comments (Policies) NBDC policy

 

JGAS000114 / hum0014.v8.58qt.v1

Participants/Materials 162,255 individuals for 58 quantitative traits
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source DNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

GenCall software (GenomeStudio)

Association Analysis (software) mach2qtl (v1.1.3)
Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6

QC for reference panel:

After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded.

QC after imputation:

Variants with imputation quality of Rsq < 0.7 were excluded.

Marker Number (after QC) 5,961,600 and 147,353 SNVs on autosomes and X-chromosome, respectively.

NBDC Dataset ID /

Japanese Genotype-phenotype

Archive Dataset ID

Metabolic Total cholesterol JGAD000144
High density lipoprotein cholesterol JGAD000145
Low density lipoprotein cholesterol JGAD000146
Triglyceride JGAD000147
Blood sugar JGAD000148
Hemoglobin A1c JGAD000149
Protein Total protein JGAD000150
Albumin JGAD000151
Non-albumin protein JGAD000152
Albumin/globulin ratio JGAD000153
Kidney-related Blood urea nitrogen JGAD000154
Serum creatinine JGAD000155
Estimated glomerular filtration rate JGAD000156
Uric acid JGAD000157
Electrolyte Sodium JGAD000158
Potassium JGAD000159
Chlorine JGAD000160
Calcium JGAD000161
Phosphorus JGAD000162
Liver-related Total bilirubin JGAD000163
Zinc sulfate turbidity test JGAD000164
Aspartate aminotransferase JGAD000165
Alanine aminotransferase JGAD000166
Alkaline phosphatase JGAD000167
Gamma-glutamyl transferase JGAD000168
Other biochemical Activated partial thromboplastin time JGAD000169
Prothrombin time JGAD000170
Fibrinogen JGAD000171
Creatine kinase JGAD000172
Lactate dehydrogenase JGAD000173
C-reactive protein JGAD000174
Hematological White blood cell count JGAD000175
Neutrophil count JGAD000176
Eosinophil count JGAD000177
Basophil count JGAD000178
Monocyte count JGAD000179
Lymphocyte count JGAD000180
Red blood cell count JGAD000181
Hemoglobin JGAD000182
Hematocrit JGAD000183
Mean corpuscular volume JGAD000184
Mean corpuscular hemoglobin JGAD000185
Mean corpuscular hemoglobin concentration JGAD000186
Platelet count JGAD000187
Blood pressure Systolic blood pressure JGAD000188
Diastolic blood pressure JGAD000189
Mean arterial pressure JGAD000190
Pulse pressure JGAD000191
Echocardiographic Interventricular septum thickness JGAD000192
Posterior wall thickness JGAD000193
Left ventricular internal dimension in diastole JGAD000194
Left ventricular internal dimension in systole JGAD000195
Left ventricular mass JGAD000196
Left ventricular mass index JGAD000197
Relative wall thickness JGAD000198
Fractional shortening JGAD000199
Ejection fraction JGAD000200
E/A ratio JGAD000201

(Click the trait names to download the gwas summary statistics)

Dictionary file

Total Data Volume

GWAS: 123 MB (zip) on average

Phenotype data (58 quantitative traits): 2.4 MB (txt.gz) on average

Comments (Policies) NBDC policy

 

hum0014.v9.Men.v1 / hum0014.v9.MP.v1

Participants/Materials

67,029 females with information on age at menarche

43,861 females with information on age at menopause

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

GenCall software (GenomeStudio)

Association Analysis (software) mach2qtl (v1.1.3)
Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6

QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded.

QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. We also excluded variants with |beta| > 4 in the uploaded files.

Marker Number (after QC) 9,296,729 SNPs (hg19)
NBDC Dataset ID

menarche: hum0014.v9.Men.v1

menopause: hum0014.v9.MP.v1

(Click the Dataset ID to download the file)

menarche: Dictionary file

menopause: Dictionary file

Total Data Volume

menarche: 181 MB (txt.gz)

menopause: 186 MB (txt.gz)

Comments (Policies) NBDC policy

 

JGAS000114 (JGAD000220 / JGAD000410 / JGAD000690 / JGAD000758)

Participants/Materials 1,026 individuals
Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 160 bp
QC

Data with bad base quality and high %GC content were removed.

Alignment:

Data matching any of the following conditions were removed.

- Low mapping rate

- Different insert size

- Gender information mismatch between meta-data and genotype data

- Suspected sex chromosome aberration

Genotyping:

GATK’s best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR)

- DP/GQ (DP < 5, GQ < 20, DP > 60, GQ < 95)

- Heterozygosity (F>=0.05)

- Hardy-Weinberg equilibrium (p < 10^-6)

- Repeat & Low Complexity

Principal Component Analysis (PCA):

PCA was performed together with individuals included in the 1000 Genomes Project, and outliers from the Japanese cluster were removed.

 

After these filtering steps, variants located in the regions listed as the HighConfidenceRegion (Genome-In-A-Bottle project) were flagged.

Deduplication Picard 2.10.6
Calibration for re-alignment and base quality GATK 3.7
Mapping Methods BWA mem 0.7.12
Mapping Quality Reads with MAPQ<20 were excluded at variant calling with GATK 3.7 HaplotypeCaller
Reference Genome Sequence GRCh37/hg19 (hs37d5)
Coverage (Depth) 31.8x
Detecting Methods for Variation GATK 3.7 HaplotypeCaller
SNV Numbers (after QC)

76,768,387 (Autosomal Chromosomes)

2,898,518 (X Chromosome)

INDEL Numbers (after QC)

10,202,908 (Autosomal Chromosomes)

410,435 (X Chromosome)

Japanese Genotype-phenotype Archive Dataset ID

JGAD000220 (fastq)

JGAD000410 (bam, vcf): Whole-genome sequencing data included in JGAD000117 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical Alliance Japan (GEM Japan, GEM-J). Learn more

Dataset ID of the Processed data by JGA

JGAD000690

JGAD000758 (joint call)

The way to Process

Total Data Volume

JGAD000220: 73 TB (fastq)

JGAD000410: 49 TB (bam, vcf)

JGAD000690: 52.1 TB (bam, bai, vcf, document)

JGAD000758: 203.8 GB (vcf_aggregate, tabix)

Comments (Policies) NBDC policy

* Summarized data are available at the JENGER site.

 

JGAS000140

Participants/Materials 7,104 breast cancer patients and 23,731 controls
Targets Target Capture
Target Loci for Capture Methods 11 hereditary breast cancer genes (ATM, BRCA1, BRCA2, CDH1, CHEK2, NBN, NF1, PALB2, PTEN, STK11, TP53)
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) 1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) *1
Fragmentation Methods -
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID JGAD000209
Total Data Volume 1 TB (fastq)
Comments (Policies) NBDC policy

*1 Hum Mol Genet. 25:5027-5034 (2016)

 

hum0014.v12.T2DMwN.v1

Participants/Materials

[GWAS-1]

   - 2,380 T2DM with diabetic nephropathy patients

   - 5,234 T2DM without diabetic nephropathy patients

[GWAS-2]

   - 429 T2DM with diabetic nephropathy patients

   - 358 T2DM without diabetic nephropathy patients

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [OmniExpressExome Beadchip, Human610-Quad BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version)

Illumina OmniExpressExome Beadchip kit

Illumina Human610-Quad Beadchip kit

Genotype Call Methods (software)

MACH and Minimac (1000 Genomes phased JPT, CHB and Han Chinese South data n = 275, March 2012)

GenCall software (GenomeStudio)

Association Analysis (software) mach2dat
Filtering Methods

sample call rate < 0.98, SNV call rate < 0.99, MAF < 0.1%, HWE P < 1 x 10^-6 in control

Marker Number (after QC) 7,521,072 SNPs (hg19)
NBDC Dataset ID

hum0014.v12.T2DMwN.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 310 MB (csv.zip)
Comments (Policies) NBDC policy

 

hum0014.v13.T2DMmeta.v1

Participants/Materials

[GWAS-1]

   - 9,804 T2DM patients (ICD-10: E11)

   - 6,728 controls

[GWAS-2]

   - 5,639 T2DM patients (ICD-10: E11)

   - 19,407 controls

[GWAS-3]

   - 18,688 T2DM patients (ICD-10: E11)

   - 121,950 controls

[GWAS-4]

   - 2,483 T2DM patients (ICD-10: E11)

   - 7,065 controls

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome, Human610-Quad BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome, Human610-Quad BeadChip kit
Genotype Call Methods (software)

minimac [imputation(1000 genomes Phase 3)]

GenCall software (GenomeStudio)

Association Analysis (software) mach2dat (v1.0.24)
Filtering Methods

Genotyping QC:

exclusion criteria of GWAS1, GWAS3, GWAS4

(i) hetero count < 5

(ii) HWE P < 1.0 × 10^-6 on each chip

(iii) genotype concordance rate < 0.99 with in-house WGS data

(iv) SNV call rate < 0.99

 

exclusion criteria of GWAS2

(i) SNV call rate < 0.99

(ii) MAF < 0.01

(iii) differential missingness P < 1.0 × 10^-6

(iv) HWE P < 1.0 × 10^-6

 

Imputation QC:

HWE P < 1 × 10^-6 or MAF < 0.01 in the reference panel

Imputation quality (Rsq) < 0.3 in more than two GWAS

Marker Number (after QC) 12,557,761 SNPs (hg19)
NBDC Dataset ID

hum0014.v13.T2DMmeta.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 257 MB (txt)
Comments (Policies) NBDC policy
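
The imputation QC rule for hum0014.v13.T2DMmeta.v1 ("Rsq < 0.3 in more than two GWAS") counts low-quality results across the four GWAS. A minimal sketch, assuming "more than two" means three or more of the per-GWAS Rsq values fall below the threshold; the function name and input layout are hypothetical.

```python
def excluded_by_rsq(rsq_per_gwas, threshold=0.3, max_low=2):
    """Return True if the variant has Rsq < threshold in more than max_low GWAS.

    rsq_per_gwas: iterable of imputation quality values, one per GWAS.
    """
    n_low = sum(1 for rsq in rsq_per_gwas if rsq < threshold)
    return n_low > max_low
```

Under this reading, a variant with Rsq values of 0.2, 0.25, 0.1 and 0.9 is excluded (three low values), whereas 0.2, 0.25, 0.5 and 0.9 is retained (two low values).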

 

hum0014.v14.smok.v1

Participants/Materials 165,436 individuals whose smoking status is available
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

GenCall software (GenomeStudio)

Association Analysis (software)

BOLT-LMM (v2.2)

ProbABEL (v0.4.5; for X chromosome)

Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6

QC for reference panel:

After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded.

QC after imputation:

Variants with imputation quality of Rsq < 0.7 and MAF < 0.01 were excluded.

Marker Number (after QC)

autosomes: 5,961,480 SNVs (hg19)

male X-chromosome (Age of smoking initiation): 163,412 SNVs (hg19)

female X-chromosome (Age of smoking initiation): 146,130 SNVs (hg19)

male X-chromosome (Cigarettes per day): 166,111 SNVs (hg19)

female X-chromosome (Cigarettes per day): 146,114 SNVs (hg19)

male X-chromosome (Smoking initiation [Ever vs never smokers]): 166,138 SNVs (hg19)

female X-chromosome (Smoking initiation [Ever vs never smokers]): 146,146 SNVs (hg19)

male X-chromosome (Smoking cessation [Former vs current smokers]): 166,142 SNVs (hg19)

female X-chromosome (Smoking cessation [Former vs current smokers]): 146,118 SNVs (hg19)

NBDC Dataset ID

hum0014.v14.asi.v1.zip (Age of smoking initiation)

hum0014.v14.cpd.v1.zip (Cigarettes per day)

hum0014.v14.ens.v1.zip (Smoking initiation [Ever vs never smokers])

hum0014.v14.fcs.v1.zip (Smoking cessation [Former vs current smokers])

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 1.9 GB (txt.gz)
Comments (Policies) NBDC policy

 

JGAS000114 reference panel

Participants/Materials

- WGS data (JGAD000220) of the biobank Japan project (N=1,037)

- WGS data of 1KGP p3v5 ALL (N=2,504) (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/)

Targets

a reference panel from WGS data

(variants on autosomal chromosomes and X-chromosome)

Target Loci for Capture Methods -
QC*

We set exclusion criteria for genotypes as follows:

(1) DP < 5, (2) GQ < 20, or (3) DP > 60 and GQ < 95, and regarded these genotypes as missing.

Variants with call rates < 90% were excluded before variant quality score recalibration (VQSR).

After VQSR, we excluded variants located in low-complexity regions (LCR), as defined by the mdust software.

Finally, we used BEAGLE to impute missing genotypes.

 

Deduplication* Picard (version 1.106)
Calibration for re-alignment and base quality* GATK (ver.3.2-2)
Mapping Methods* BWA-MEM (version 0.7.5a)
Mapping Quality* MAPQ < 20 were excluded (HaplotypeCaller)
Reference Genome Sequence* GRCh37/hg19, hs37d5
Coverage (Depth)* aimed at 30x depth
Detecting Methods for Variation* GATK HaplotypeCaller (version 3.2-2)
Method for merging vcf files

autosomal chromosomes: Impute2

X-chromosome: Beagle (male), Impute2 (female)

Variant Numbers in reference panel 61,608,817 variants (autosomal chromosomes: 59,387,070; X-chromosome: 2,221,747)
Japanese Genotype-phenotype Archive Dataset ID JGAD000220
Dataset ID of the Processed data by JGA

JGAD000679

The way to Process

Total Data Volume about 15 GB (vcf.gz)
Comments (Policies) NBDC policy

* These processes were performed only for biobank Japan project data.
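
The genotype-level exclusion criteria listed under QC for the JGAS000114 reference panel (DP < 5, GQ < 20, or DP > 60 and GQ < 95) and the subsequent 90% call-rate cut-off can be sketched as follows. This is an illustration only: the function names and the (DP, GQ) tuple layout are hypothetical, and the actual pipeline operated on VCF files with GATK and BEAGLE.

```python
def genotype_is_masked(dp, gq):
    """Return True if a genotype would be regarded as missing."""
    return dp < 5 or gq < 20 or (dp > 60 and gq < 95)


def variant_call_rate(genotypes):
    """Fraction of genotypes that remain after masking.

    genotypes: list of (DP, GQ) pairs, one per sample.
    """
    kept = sum(1 for dp, gq in genotypes if not genotype_is_masked(dp, gq))
    return kept / len(genotypes)


def variant_is_excluded(genotypes, min_call_rate=0.90):
    """Variants with call rate below 90% are excluded before VQSR."""
    return variant_call_rate(genotypes) < min_call_rate
```

Note that the third condition only masks very deep genotypes when their quality is also below 95, so a genotype with DP 70 and GQ 99 is kept.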

 

hum0014.v15.ht.v1

Participants/Materials 159,095 individuals (Male: 86,257, Female: 72,838)
Targets genome wide variants
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software) Minimac3 [imputation reference panel using WGS data of the biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504)]
Association Analysis (software) BOLT-LMM (ver2.2), mach2qtl
Filtering Methods

Sample QCs: Exclusion criteria:

1) call rate < 98%,

2) closely related samples (PI_HAT > 0.175), and

3) outlier from Japanese cluster determined by PCA using GCTA.

QC after imputation: Variants with imputation quality of Rsq < 0.3 were excluded.

Marker Number (after QC)

autosomes: 27,211,524 variants (hg19)

male X-chromosome: 684,533 variants (hg19)

female X-chromosome: 684,533 variants (hg19)

NBDC Dataset ID

hum0014.v15.ht.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume about 663 MB (txt.gz)
Comments (Policies) NBDC policy

 

JGAS000203

Participants/Materials 7,636 prostate cancer patients (ICD10:C61) and 12,366 controls
Targets Target Capture
Target Loci for Capture Methods 8 hereditary prostate cancer genes (ATM, BRCA1, BRCA2, BRIP1, CHEK2, HOXB13, NBN, PALB2)
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) 1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) *1
Fragmentation Methods -
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID JGAD000288
Total Data Volume 2.2 TB (fastq)
Comments (Policies) NBDC policy

 

hum0014.v17 / hum0014.v18 / hum0014.v21

Participants/Materials

42 diseases (ICD10 code)

Arrhythmia (I499), Bronchial asthma (J459), Atopic dermatitis (L209),

Gallbladder/Cholangiocarcinoma (C23, C240), Cataract (H269),

Cerebral aneurysm (I671), Cervical cancer (C539),

Chronic hepatitis B (B181), Chronic hepatitis C (B182),

Chronic obstructive pulmonary disease (J449), Liver cirrhosis (K746),

Colorectal cancer (C189, C20), Heart failure (I509, I500),

Drug eruption (L270), Uterine cancer (C549), Endometriosis (N809),

Epilepsy (G409), Esophageal cancer (C159), Gastric cancer (C169),

Glaucoma (H409), Graves' disease (E050), Hematopoietic tumor (C81-96),

Liver cancer (C220), Interstitial lung disease/Pulmonary fibrosis (J849, J841),

Cerebral infarction (I639), Keloid (L910), Lung cancer (C349),

Nephrotic syndrome (N049), Osteoporosis (M8199), Ovarian cancer (C56),

Pancreas cancer (C259), Periodontitis (K054),

Peripheral artery disease (I709), Hay fever (J301), Prostate cancer (C61),

Pulmonary tuberculosis (A169), Rheumatoid arthritis (M0690),

Diabetes mellitus (E14), Urolithiasis (N209), Uterine fibroids (D259), Breast cancer (C509),

Coronary artery disease (I200, I209, I219)

Targets genome wide variants
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

Minimac3 [imputation (1000 genomes Phase 3 v5)]

GenCall software (GenomeStudio)

Association Analysis (software) SAIGE (v0.29.4.2)
Filtering Methods

QC after imputation:

Exclusion criteria: Variants with imputation quality of Rsq < 0.7

Marker Number (after QC)

autosomes: 8,712,794 variants (hg19)

X-chromosome: 207,198 variants (hg19)

NBDC Dataset ID Arrhythmia hum0014.v17.AR.v1
Bronchial asthma hum0014.v17.BA.v1
Atopic dermatitis* hum0014.v17.AD.v1
Gallbladder/Cholangiocarcinoma hum0014.v17.GCc.v1
Cataract hum0014.v17.Cat.v1
Cerebral aneurysm hum0014.v17.CA.v1
Cervical cancer hum0014.v17.CeC.v1
Chronic hepatitis B hum0014.v17.CHB.v1
Chronic hepatitis C hum0014.v17.CHC.v1
Chronic obstructive pulmonary disease hum0014.v17.COPD.v1
Liver cirrhosis hum0014.v17.Cir.v1
Colorectal cancer hum0014.v17.CC.v1
Heart failure* hum0014.v17.HF.v1
Drug eruption hum0014.v17.DE.v1
Uterine cancer hum0014.v17.UC.v1
Endometriosis hum0014.v17.EM.v1
Epilepsy hum0014.v17.Ep.v1
Esophageal cancer hum0014.v17.EC.v1
Gastric cancer hum0014.v17.GC.v1
Glaucoma* hum0014.v17.Gla.v1
Graves' disease hum0014.v17.GD.v1
Hematopoietic tumor hum0014.v17.HT.v1
Liver cancer hum0014.v17.LiC.v1
Interstitial lung disease/Pulmonary fibrosis hum0014.v17.IP.v1
Cerebral infarction hum0014.v17.CI.v1
Keloid hum0014.v17.Kel.v1
Lung cancer hum0014.v17.LuC.v1
Nephrotic syndrome hum0014.v17.NS.v1
Osteoporosis hum0014.v17.OP.v1
Ovarian cancer hum0014.v17.OC.v1
Pancreas cancer hum0014.v17.PaC.v1
Periodontitis hum0014.v17.PD.v1
Peripheral artery disease hum0014.v17.PAD.v1
Hay fever hum0014.v17.Hay.v1
Prostate cancer hum0014.v17.PrC.v1
Pulmonary tuberculosis hum0014.v17.PT.v1
Rheumatoid arthritis hum0014.v17.RA.v1
Diabetes mellitus* hum0014.v17.DM.v1
Urolithiasis hum0014.v17.Uro.v1
Uterine fibroids hum0014.v17.UF.v1
Breast cancer hum0014.v18.BC.v1
Coronary artery disease hum0014.v21.CAD.v1

(Click the Dataset ID to download the file)

Dictionary file

Sample size file

Total Data Volume

autosomes: about 0.8-1.3 GB each

X-chromosome: about 20-30 MB each

Comments (Policies) NBDC policy

* Data of 4 diseases were partially overlapped with those of previous releases (Glaucoma [hum0014.v7.POAG.v1], Atrial fibrillation [hum0014.v5.AF.v1], Atopic dermatitis [hum0014.v4.AD.v1], and Diabetes mellitus [hum0014.v3.T2DM-2.v1]).

 

hum0014.v19

Participants/Materials 165,084 individuals whose dietary habits status is available (13 dietary traits)
Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

GenCall software (GenomeStudio)

MACH

minimac (v.0.1.1) [imputation (1000 genomes Phase I v3)]

Association Analysis (software)

BOLT-LMM (v2.2) for autosomes

ProbABEL (v0.4.5) for X chromosome

Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, MAF < 0.005

QC for reference panel:

Variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded from the reference panel.

QC after imputation:

Variants with imputation quality of Rsq < 0.7 and MAF < 0.01 were excluded.

Marker Number (after QC)

autosomes: 5,961,480 variants (hg19)

X-chromosome: 148,568 variants for female, 170,117 variants for male (hg19)

NBDC Dataset ID Ever versus never drinker hum0014.v19.drink.v1.zip
Drinks per week hum0014.v19.dpw.v1.zip
Coffee consumption hum0014.v19.cafe.v1.zip
Tea consumption hum0014.v19.tea.v1.zip
Milk consumption hum0014.v19.milk.v1.zip
Yogurt consumption hum0014.v19.ygt.v1.zip
Cheese consumption hum0014.v19.cheese.v1.zip
Natto consumption hum0014.v19.natto.v1.zip
Tofu consumption hum0014.v19.tofu.v1.zip
Fish consumption hum0014.v19.fish.v1.zip
Small fish consumption hum0014.v19.sfish.v1.zip
Vegetable consumption hum0014.v19.vege.v1.zip
Meat consumption hum0014.v19.meat.v1.zip

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 6.3 GB (txt.zip)
Comments (Policies) NBDC policy

 

hum0014.v20.cad.v1

Participants/Materials

25,892 coronary artery disease patients (ICD10: I20-25) and 142,336 controls

Targets genome wide variants
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

GenCall software (GenomeStudio)

minimac3 (BBJ-CAD reference panel)

Association Analysis (software) PLINK2
Filtering Methods QC after imputation: Variants with imputation quality of Rsq < 0.3 and MAF < 0.0002 were excluded.
Marker Number (after QC) autosomes: 19,707,525 variants (hg19)
NBDC Dataset ID

hum0014.v20.gwas.v1 (summary statistics)

hum0014.v20.prs.v1 (polygenic risk score)

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume about 413 MB (txt.gz)
Comments (Policies) NBDC policy

 

JGAS000293 (Target Capture)

Participants/Materials 11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 and 2007
Targets Target Capture
Target Loci for Capture Methods

23 genes related to clonal hematopoiesis

ASXL1, CBL, CEBPA, DDX41, DNMT3A, ETV6, EZH2, GATA2, GNAS, GNB1, IDH1, IDH2, JAK2, KRAS, MYD88, NRAS, PPM1D, RUNX1, SF3B1, SRSF2, TET2, TP53, U2AF1

Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) Library was constructed as described in Momozawa, Y., et al. Low-frequency coding variants in CETP and CFB are associated with susceptibility of exudative age-related macular degeneration in the Japanese population. Hum Mol Genet 25, 5027-5034 (2016).
Fragmentation Methods -
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp
QC We selected samples in which a depth of ≥20 was achieved in ≥98% of the regions.
Mapping Methods bwa
Mapping Quality ≥40
Reference Genome Sequence hg19
Coverage (Depth) x800
Detecting Methods for Variation Genomon pipeline
Filtering Methods Genomon pipeline
Japanese Genotype-phenotype Archive Dataset ID JGAD000399
Total Data Volume 5.3 MB (txt)
Comments (Policies) NBDC policy
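
The sample-level QC rule for JGAS000293 (Target Capture), a depth of at least 20 in at least 98% of the regions, can be illustrated with a short check. The sketch assumes one depth value per target region; whether the original criterion was evaluated per region or per base is not stated in this record, and the function name is hypothetical.

```python
def sample_passes_depth_qc(region_depths, min_depth=20, min_fraction=0.98):
    """Return True if enough regions reach the minimum depth.

    region_depths: list of depth values, one per target region (assumed layout).
    """
    covered = sum(1 for depth in region_depths if depth >= min_depth)
    return covered / len(region_depths) >= min_fraction
```

With 100 regions, a sample with 98 regions at depth 800 and 2 regions at depth 10 passes, while one with only 97 well-covered regions does not.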

 

JGAS000293 (SNP array)

Participants/Materials 11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 and 2007
Targets SNP array
Target Loci for Capture Methods -
Platform Illumina [human OmniExpress, human OmniExpressExome]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) Illumina Infinium OmniExpress, Infinium OmniExpressExome v.1.0, or v.1.2
Genotype Call Methods (software) GenCall software (GenomeStudio)
Algorithm for detecting chromosome abnormality (software)

Haplotype-based detection of allelic imbalances.

Terao, C., et al. Chromosomal alterations among age-related haematopoietic clones in Japan. Nature (2020).

Filtering Methods SNPs examined in all three versions of the array
Marker Numbers (after QC) 515,355
Japanese Genotype-phenotype Archive Dataset ID JGAD000400
Total Data Volume 5.3 MB (csv)
Comments (Policies) NBDC policy

 

JGAS000327 / JGAS000346 / JGAS000414 / JGAS000347 / JGAS000592

Participants/Materials

1,005 pancreatic cancer patients (ICD10: C25)

12,503 colorectal cancer patients (ICD10: C18, C19, C20)

740 renal cell cancer patients (ICD10: C64)

1,982 lymphoma patients (ICD10: C81, C82, C83, C84, C85, C86, C88, C91)

10,366 gastric cancer patients (ICD10: C16)

23,705 + 5,996 + 37,592 controls

Targets Target Capture
Target Loci for Capture Methods 27 cancer-predisposing genes (APC, ATM, BARD1, BMPR1A, BRCA1, BRCA2, BRIP1, CDK4, CDKN2A, CDH1, CHEK2, EPCAM, HOXB13, NBN, NF1, MLH1, MSH2, MSH6, MUTYH, PALB2, PMS2, PTEN, RAD51C, RAD51D, SMAD4, STK11, TP53)
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) 1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) *1
Fragmentation Methods -
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID

pancreatic cancer: JGAD000438

colorectal cancer: JGAD000458

23,705 controls: JGAD000459

renal cell cancer and 5,996 controls: JGAD000531

lymphoma: JGAD000460

gastric cancer: JGAD000720

37,592 controls: JGAD000721

Total Data Volume

JGAD000438: 78 GB (fastq)

JGAD000458: 956 GB (fastq)

JGAD000459: 1.9 TB (fastq)

JGAD000531: 961.8 GB (fastq)

JGAD000460: 126 GB (fastq)

JGAD000720, JGAD000721: 3.4 TB (fastq)

Comments (Policies) NBDC policy

 

JGAS000414

Participants/Materials

740 renal cell cancer patients (ICD10:C64)

5,996 controls

Targets Target Capture
Target Loci for Capture Methods 13 renal cell carcinoma-related genes (VHL, BAP1, FH, FLCN, MET, TSC1, TSC2, MITF, SDHA, SDHB, SDHC, SDHD, CDC73)
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) 1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) *1
Fragmentation Methods -
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID JGAD000531
Total Data Volume JGAD000531: 961.8 GB (fastq)
Comments (Policies) NBDC policy

 

JGAS000381

Participants/Materials 1,765 myocardial infarction patients (ICD10: I21) and 199 dementia patients (ICD10: F00-03)
Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Five]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq DNA PCR-Free Prep kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 151 bp
QC

Data with bad base quality and high %GC content were removed.

Alignment:

Data matching any of the following conditions were removed.

- Low mapping rate

- Different insert size

- Gender information mismatch between meta-data and genotype data

- Suspected sex chromosome aberration

Genotyping:

GATK's best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR)

- DP/GQ (DP < 5, GQ < 20, DP > 60, GQ < 95)

- Heterozygosity (F>=0.05)

- Hardy-Weinberg equilibrium (p < 10^-6)

- Repeat & Low Complexity

Principal Component Analysis (PCA):

PCA was performed with individuals included in the 1000 genomes project and outliers from Japanese cluster were removed.

After these filtering steps, variants located in the regions listed as the HighConfidenceRegion (Genome-In-A-Bottle project) were flagged.

Deduplication Picard 2.10.6
Calibration for re-alignment and base quality GATK 3.7
Mapping Methods BWA mem 0.7.12
Mapping Quality Reads with MAPQ<20 were excluded at variant calling with GATK 3.7 HaplotypeCaller
Reference Genome Sequence GRCh37/hg19 (hs37d5)
Coverage (Depth) myocardial infarction: 15.0x, dementia: 30.0x
Detecting Methods for Variation GATK 3.7 HaplotypeCaller
SNV Numbers (after QC)

76,768,387 (Autosomal Chromosomes)

2,898,518 (X Chromosome)

INDEL Numbers (after QC)

10,202,908 (Autosomal Chromosomes)

410,435 (X Chromosome)

Japanese Genotype-phenotype Archive Dataset ID

JGAD000495 (fastq)

JGAD000496 (bam, vcf): Whole genome sequencing analyzed data included in the JGAD000117 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical alliance Japan (GEM Japan, GEM-J). Learn more.

Total Data Volume 188.4 TB (fastq, bam, vcf)
Comments (Policies) NBDC policy

 

hum0014.v27.surv.v1

Participants/Materials

137,693 individuals from BBJ 1st cohort

Targets genome wide variants
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source gDNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

minimac [imputation (1000 genomes Phase I v3)]

GenCall software (GenomeStudio)

Association Analysis (software)

mach2qtl (v1.1.3)

SPACox

Filtering Methods

Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6

QC for reference panel:

After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded.

QC after imputation:

Variants with imputation quality of Rsq < 0.7 were excluded.

Marker Number (after QC) 6,108,833 variants (hg19)
NBDC Dataset ID

hum0014.v27.surv.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 789 MB (txt.gz)
Comments (Policies) NBDC policy

 

hum0014.v28.MEs.v1

Participants/Materials

WGS data of 4,880 individuals from BBJ 1st cohort

     - WGS data (JGAD000220) of the biobank Japan project (N=1,037)

     - WGS data (JGAD000495) of 1,765 myocardial infarction patients and 199 dementia patients

     - WGS data (AGDD_000005) of 225 gastric cancer patients

- 1,007 individuals from Asian Genome Project

- 617 colorectal cancer patients

- One individual excluded from AGDD_000005 by QC

- 10 individuals excluded from JGAD000220 by QC

Targets mobile element variations
Target Loci for Capture Methods -
Platform Illumina [HiSeq X/2500]
Library Source gDNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq DNA PCR-Free Sample Prep Kit, TruSeq Nano DNA HT Sample Prep Kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 151 bp (HiSeq X), 126 bp (HiSeq 2500)
QC -
Mapping Methods BWA-MEM
Mapping Quality -
Reference Genome Sequence GRCh37 (hs37d5)
Coverage (Depth) ≥15× (≥25× for 1,235 individuals)
Detecting Methods for mobile element MEGAnE *2
Mobile element Number

24,933 for 4,880 individuals

10,997 for 1,235 individuals

NBDC Dataset ID

hum0014.v28.MEs.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume 1.1 MB (txt.gz)
Comments (Policies) NBDC policy

*2 doi: 10.1101/2022.03.25.485726

 

hum0014.v29.AF.v1

Participants/Materials

77,690 atrial fibrillation patients and 1,167,040 controls

    BBJ: 9,826 atrial fibrillation patients and 140,446 controls

    European: 60,620 atrial fibrillation patients and 970,216 controls

    FinnGen: 7,244 atrial fibrillation patients and 56,378 controls

Targets genome wide SNVs
Target Loci for Capture Methods -
Platform Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source

DNA extracted from peripheral blood cells

European GWAS: http://csg.sph.umich.edu/willer/public/afib2018

FinnGen GWAS: https://www.finngen.fi/en

Cell Lines -
Reagents (Kit, Version) HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)

GenCall software (GenomeStudio), minimac [imputation (1000 genomes Phase I v3)]

Association Analysis (software) PLINK2
Filtering Methods

BBJ GWAS: variants with imputation quality (Rsq) < 0.3 or MAF < 0.001 were excluded

meta-analysis: variants with MAF < 1% were excluded

Calculation Methods for Polygenic Risk Score pruning and thresholding method
Meta Analysis Methods MANTRA, METAL
Marker Number (after QC)

BBJ GWAS: 16,817,144 SNPs

meta-analysis: 5,158,449 SNPs

NBDC Dataset ID

hum0014.v29.AF.v1

(Click the Dataset ID to download the file)

Dictionary file

Total Data Volume

summary statistics of BBJ: 424 MB (txt)

summary statistics of meta analysis: 78 MB (txt)

polygenic risk score: 59 KB (txt)

Comments (Policies) NBDC policy
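
As a quick consistency check on the Participants/Materials entry for hum0014.v29.AF.v1, the per-cohort counts can be summed and compared with the stated totals. The numbers are copied from this record; the dictionary layout is only for illustration.

```python
# (cases, controls) per cohort, as listed under Participants/Materials.
cohorts = {
    "BBJ": (9_826, 140_446),
    "European": (60_620, 970_216),
    "FinnGen": (7_244, 56_378),
}

total_cases = sum(cases for cases, _ in cohorts.values())
total_controls = sum(controls for _, controls in cohorts.values())

print(total_cases, total_controls)  # 77690 1167040
```

The sums agree with the 77,690 atrial fibrillation patients and 1,167,040 controls given for the meta-analysis.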

 

JGAS000647

Participants/Materials 1,007 individuals from BBJ 1st cohort
Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Five]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp
QC

- Autosomal Chromosomes, X PAR, X NonPAR female

    - Genotypes with DP < 2 or GQ < 20 were set to missing

    - call rate < 90% were excluded

- X NonPAR male, Y

    - Genotypes with DP < 1 or GQ < 20 were set to missing

    - call rate < 90% were excluded

- X NonPAR

    HWE_P of female

Deduplication Picard 2.10.10
Calibration for re-alignment and base quality GATK 3.8
Mapping Methods BWA-MEM (version 0.7.13)
Mapping Quality Reads with MAPQ<20 were excluded at variant calling with GATK 3.8 HaplotypeCaller
Reference Genome Sequence hs37d5
Coverage (Depth) 19.93455
Detecting Methods for Variation GATK Haplotype Caller (version 3.8)
SNV Numbers (after QC)

Autosomal Chromosomes: 71,643,487

X PAR: 82,997

X nonPAR: 2,618,495

Y Chromosome: 171,271

*Record numbers including AC=0

Japanese Genotype-phenotype Archive Dataset ID JGAD000777
Total Data Volume 41.6 TB (fastq, vcf)
Comments (Policies) NBDC policy
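
The QC entry for JGAS000647 uses a depth threshold that depends on the region and sex: DP < 2 for autosomes, X PAR and female X NonPAR, and DP < 1 for male X NonPAR and Y, each combined with GQ < 20. A minimal sketch of that rule follows; the region labels and function names are hypothetical.

```python
def min_depth(region, sex):
    """Minimum DP for a genotype to be retained.

    region: one of "autosome", "X_PAR", "X_nonPAR", "Y" (hypothetical labels).
    sex: "male" or "female".
    """
    haploid = region == "Y" or (region == "X_nonPAR" and sex == "male")
    return 1 if haploid else 2


def genotype_set_missing(dp, gq, region, sex):
    """Return True if the genotype would be set to missing."""
    return dp < min_depth(region, sex) or gq < 20
```

For instance, a genotype with DP 1 and GQ 30 would be set to missing on an autosome but retained on the male X NonPAR region.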

 

JGAS000698

Participants/Materials 256 gastric cancer (ICD10: C16) patients registered in BBJ 1st cohort
Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 2 x 125 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000831
Total Data Volume 21.5 TB (fastq)
Comments (Policies) NBDC policy

 

JGAS000703

Participants/Materials 269,000 patients (51 diseases) registered in BBJ 1st and 2nd cohort
Targets SNP array
Target Loci for Capture Methods -
Platform Illumina [HumanHap550, Human610-Quad v1, HumanExome-12 v1, Infinium OmniExpress-24, Infinium HumanOmniExpressExome-8, Infinium HumanOmniExpressExome-8 v1.2, Infinium HumanOmniExpressExome-8 v1.3, Infinium HumanOmniExpressExome-8 v1.4, Infinium HumanOmniExpressExome-8 v1.6]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Reagents (Kit, Version) HumanHap550 kit, Human610-Quad kit v1, HumanExome-12 kit v1, Infinium OmniExpress-24 kit, Infinium HumanOmniExpressExome-8 kit, Infinium HumanOmniExpressExome-8 kit v1.2, Infinium HumanOmniExpressExome-8 kit v1.3, Infinium HumanOmniExpressExome-8 kit v1.4, Infinium HumanOmniExpressExome-8 kit v1.6
Genotype Call Methods (software) -
Filtering Methods -
Marker Numbers (after QC) -
Japanese Genotype-phenotype Archive Dataset ID JGAD000836
Total Data Volume 3.1 TB (idat)
Comments (Policies) NBDC policy

 

JGAS000699

Participants/Materials 617 colorectal cancer (ICD10: C18, C19, C20) patients registered in BBJ 1st and 2nd cohort
Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 2 x 125 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000832
Total Data Volume 25.5 TB (fastq)
Comments (Policies) NBDC policy

 

JGAS000700 / JGAS000701

Participants/Materials

2,162 diabetes (ICD10: E11) patients registered in BBJ 1st cohort

2,067 gastric cancer (ICD10: C16) patients registered in BBJ 1st cohort

Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2500]
Library Source DNA extracted from peripheral blood cells
Cell Lines -
Library Construction (kit name) TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 2 x 125 bp
Japanese Genotype-phenotype Archive Dataset ID

diabetes: JGAD000833

gastric cancer: JGAD000834

Total Data Volume

JGAD000833: 20.0 TB (fastq)

JGAD000834: 18.7 TB (fastq)

Comments (Policies) NBDC policy

 

TogoImputation reference panel (JGAD000867 / JGAD000868)

Participants/Materials

[JGAD000867]

- WGS data (JGAD000220) of the biobank Japan project (N=1,026)

[JGAD000868]

- WGS data (JGAD000495) of the biobank Japan project (N=1,964)

Targets

a reference panel from WGS data

(variants on autosomal chromosomes and X-chromosome)

Target Loci for Capture Methods -
QC

Germline whole genome sequencing data were processed, and the aggregate VCF was calculated. Variants were then filtered based on the following conditions:

(1) Variants that did not pass the VQSR filter were excluded

(2) Multi-allelic sites were excluded

(3) Variants with a call rate below 95% were excluded

(4) Variants deviating from Hardy-Weinberg equilibrium (P < 1e-10) were excluded

(5) Variants with a minor allele count (MAC) less than 2 were excluded

 

Deduplication GATK MarkDuplicates (version 4.1.0.0)
Calibration for re-alignment and base quality -
Mapping Methods bwa mem (version 0.7.15)
Mapping Quality -
Reference Genome Sequence GRCh38
Coverage (Depth)

 

Mean ± Standard deviation

JGAD000867: 28.70 ± 4.16

JGAD000868: 19.61 ± 5.41

Detecting Methods for Variation

GATK HaplotypeCaller -ERC GVCF (version 4.1.0.0)

The ploidy for variant call was set as follows:

Autosomes and pseudoautosomal regions (PARs): ploidy=2

Non-PARs on the X chromosome: ploidy=2 (female) and ploidy=1 (male)

Non-PARs on the Y chromosome: ploidy=1 (male)

Method for phasing vcf files After quality control, phasing was performed using Beagle program version 5.2 (21Apr21.304).
Variant Numbers in reference panel

JGAD000867: 17,167,510 variants (autosomal chromosomes: 16,677,000 variants, X-chromosome: 490,510 variants)

JGAD000868: 21,596,248 variants (autosomal chromosomes: 21,157,732 variants, X-chromosome: 438,516 variants)

Japanese Genotype-phenotype Archive Dataset ID

JGAD000220

JGAD000495

Dataset ID of the Processed data by JGA

JGAD000867

JGAD000868

The way to Process

Total Data Volume

JGAD000867: 3.3 GB (vcf.gz)

JGAD000868: 6.2 GB (vcf.gz)

Comments (Policies) NBDC policy
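
The five variant filters and the ploidy settings described for the TogoImputation reference panel can be summarised in a short sketch. This is not the pipeline used to build the panel (which relied on GATK and Beagle); the `Variant` fields, region labels and function names are hypothetical and only mirror the conditions listed in this record.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    passed_vqsr: bool         # condition (1)
    n_alt_alleles: int        # condition (2): more than 1 means multi-allelic
    call_rate: float          # condition (3)
    hwe_p: float              # condition (4)
    minor_allele_count: int   # condition (5)


def keep_variant(v):
    """Return True if the variant survives all five filters."""
    return (
        v.passed_vqsr
        and v.n_alt_alleles == 1
        and v.call_rate >= 0.95
        and v.hwe_p >= 1e-10
        and v.minor_allele_count >= 2
    )


def ploidy(region, sex):
    """Ploidy used for variant calling, following the settings listed above.

    region: "autosome", "PAR", "X_nonPAR" or "Y_nonPAR" (hypothetical labels).
    """
    if region in ("autosome", "PAR"):
        return 2
    if region == "X_nonPAR":
        return 2 if sex == "female" else 1
    if region == "Y_nonPAR" and sex == "male":
        return 1
    # The record does not specify any other combination.
    raise ValueError(f"no ploidy setting listed for {region!r} / {sex!r}")
```

A singleton variant (MAC of 1) is removed by condition (5) even if it passes every other filter.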

 

DATA PROVIDER

Principal Investigator: Michiaki Kubo

Affiliation: RIKEN Center for Integrative Medical Sciences

Project / Group Name: Tailor-made Medical Treatment Program (Bio Bank Japan: BBJ)

URL: https://biobankjp.org/english/index.html

Funds / Grants (Research Project Number) :

Name Title Project Number
Ministry of Education, Culture, Sports, Science and Technology in Japan Tailor-made Medical Treatment Program (the 3rd phase) -
Tailor-Made Medical Treatment with the BioBank Japan Project (BBJ), Japan Agency for Medical Research and Development (AMED) Generating large-scale data of genetic polymorphism to identify disease-related genes 17km0305002h0005
Project for Cancer Research and Therapeutic Evolution (P-CREATE), Japan Agency for Medical Research and Development (AMED) Exploration of special and temporal diversity in genome and epigenome of hematological malignancies based on large-scale sequencing analyses. JP19cm0106501
Core Research and Evolutional Science and Technology, Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED-CREST) Research on altered tissue functions caused by clonal expansion and remodeling of apparently normal tissues related to normal aging or exposure to chronic inflammation and other lifestyles JP19gm1110011
KAKENHI Grant-in-Aid for Scientific Research (S) Comprehensive studies on the molecular basis of early development and clonal evolution in cancer using advanced genomics. 19H05656
Program for Promoting Platform of Genomics based Drug Discovery, Project for Genome and Health Related Data, Japan Agency for Medical Research and Development (AMED) Development of a large-scale database for effective drug treatment for breast, colorectal, and pancreas cancers JP19kk0305010
KAKENHI Grant-in-Aid for Early-Career Scientists Genome-wide association study integrating mobile genetic elements 22K15385
KAKENHI Grant-in-Aid for Scientific Research (B) Elucidation of genetic factors that define myocardial vulnerability as a basis for the development of heart failure 21H02919
KAKENHI Grant-in-Aid for Scientific Research (S) Genome immunity: elucidation of the antiviral activity of endogenous bornaviruses and their utilization as functional resources 20H05682
KAKENHI Grant-in-Aid for Scientific Research (B) Integration and reactivation of human herpesvirus 6: association with diseases 21H02972
Biobank - Construction and Utilization biobank for genomic medicine REalization (B-Cure), Japan Agency for Medical Research and Development (AMED) Management of the Japanese biobank JP19km0605001
Practical Research Project for Life-Style related Diseases including Cardiovascular Diseases and Diabetes Mellitus, Japan Agency for Medical Research and Development (AMED) Multi-layered and integrated research for prevention of atrial fibrillation and serious complications JP22ek0210164
Biobank - Construction and Utilization biobank for genomic medicine REalization, Japan Agency for Medical Research and Development (AMED) Understanding pathogenesis of atrial fibrillation and implementation of precision medicine by WGS and multi-omics JP21tm0724601
Biobank - Construction and Utilization biobank for genomic medicine REalization, Japan Agency for Medical Research and Development (AMED) Implementation of next-generation precision medicine for cardiovascular disease by multi-omics JP20km0405209
Practical Research Project for Rare / Intractable Diseases, Japan Agency for Medical Research and Development (AMED) Understanding pathology and implementation of precision medicine for intractable cardiovascular disease by multi-omics analysis JP20ek0109487

 

PUBLICATIONS

Title | DOI | Dataset ID
1 A genome-wide association study identifies PLCL2 and AP3D1-DOT1L-SF3A2 as new susceptibility loci for myocardial infarction in Japanese. doi:10.1038/ejhg.2014.110 hum0014.v1.freq.v1
2 A functional variant in ZNF512B is associated with susceptibility to amyotrophic lateral sclerosis in Japanese. doi:10.1093/hmg/ddr268 hum0014.v2.jsnp.92als.v1
3 Functional variants in ADH1B and ALDH2 coupled with alcohol and smoking synergistically enhance esophageal cancer risk. doi: 10.1053/j.gastro.2009.07.070 hum0014.v2.jsnp.182ec.v1
4 SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. doi: 10.1038/ng.208 T2DM (JSNP)
5 Common variants in a novel gene, FONG on chromosome 2q33.1 confer risk of osteoporosis in Japanese. doi: 10.1371/journal.pone.0019641 Osteoporosis (JSNP)
6 Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes. doi: 10.1038/ncomms10531

hum0014.v3.T2DM-1.v1

hum0014.v3.T2DM-2.v1

7 Multi-ancestry genome-wide association study of 21,000 cases and 95,000 controls identifies new risk loci for atopic dermatitis. doi: 10.1038/ng.3424 hum0014.v4.AD.v1
8 Genome-wide association study identifies eight new susceptibility loci for atopic dermatitis in the Japanese population. doi: 10.1038/ng.2438 hum0014.v4.AD.v1
9 Identification of six new genetic loci associated with atrial fibrillation in the Japanese population. doi: 10.1038/ng.3842 hum0014.v5.AF.v1
10 Genome-wide association study identifies 112 new loci for body mass index in the Japanese population. doi:10.1038/ng.3951

hum0014.v6.158k.v1

JGAD000123

JGAD000124

11 Genome-wide association study identifies seven novel susceptibility loci for primary open-angle glaucoma. doi: 10.1093/hmg/ddy053 hum0014.v7.POAG.v1
12 Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. doi:10.1038/s41588-018-0047-6

hum0014.v8.58qt.v1

JGAD000144-JGAD000201

13 Elucidating the genetic architecture of reproductive ageing in the Japanese population doi: 10.1038/s41467-018-04398-z

hum0014.v9.Men.v1

hum0014.v9.MP.v1

14 Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese. doi: 10.1038/s41467-018-03274-0 JGAD000220
15 Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls. doi: 10.1038/s41467-018-06581-8 JGAD000209
16 A Variant within the FTO confers susceptibility to diabetic nephropathy in Japanese patients with type 2 diabetes doi: 10.1371/journal.pone.0208654 hum0014.v12.T2DMwN.v1
17 Identification of 28 new susceptibility loci for type 2 diabetes in the Japanese population doi: 10.1038/s41588-018-0332-4 hum0014.v13.T2DMmeta.v1
18 GWAS of smoking behaviour in 165,436 Japanese people reveals seven new loci and shared genetic architecture. doi: 10.1038/s41562-019-0557-y

hum0014.v14.asi.v1

hum0014.v14.cpd.v1

hum0014.v14.ens.v1

hum0014.v14.fcs.v1

19 Characterizing rare and low-frequency height-associated variants in the Japanese population doi: 10.1038/s41467-019-12276-5

JGAD000220 (fastq)

JGAD000220 (reference panel)

hum0014.v15.ht.v1

20 Germline pathogenic variants in 7,636 Japanese patients with prostate cancer and 12,366 controls. doi: 10.1093/jnci/djz124 JGAD000288
21 Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases doi: 10.1038/s41588-020-0640-3

hum0014.v17

hum0014.v18

hum0014.v21

22 GWAS of 165,084 Japanese individuals identified nine loci associated with dietary habits doi: 10.1038/s41562-019-0805-1 hum0014.v19
23 Population-specific and transethnic genome-wide analyses identify distinct and shared genetic risk loci for coronary artery disease. doi: 10.1038/s41588-020-0705-3 hum0014.v20.cad.v1
24 Genetic characterization of pancreatic cancer patients and prediction of carrier status of germline pathogenic variants in cancer-predisposing genes doi: 10.1016/j.ebiom.2020.103033 JGAD000438
25 Population-based Screening for Hereditary Colorectal Cancer Variants in Japan doi: 10.1016/j.cgh.2020.12.007

JGAD000458

JGAD000459

26 Genome-wide association study reveals BET1L associated with survival time in the 137,693 Japanese individuals doi: 10.1038/s42003-023-04491-0 hum0014.v27.surv.v1
27 Cross-ancestry genome-wide analysis of atrial fibrillation unveils disease biology and enables cardioembolic risk prediction doi: 10.1038/s41588-022-01284-9 hum0014.v29.AF.v1
28 Association between germline pathogenic variants in cancer-predisposing genes and lymphoma risk doi: 10.1111/cas.15522

JGAD000460

JGAD000721

29 Helicobacter pylori, Homologous-Recombination Genes, and Gastric Cancer doi: 10.1056/NEJMoa2211807

JGAD000720

JGAD000721

30 Germ line DDX41 mutations define a unique subtype of myeloid neoplasms doi: 10.1182/blood.2022018221

JGAD000399

JGAD000400

31 Combined landscape of single-nucleotide variants and copy number alterations in clonal hematopoiesis doi: 10.1038/s41591-021-01411-9

JGAD000399

JGAD000400

32 Characterizing rare and low-frequency height-associated variants in the Japanese population doi: 10.1038/s41467-019-12276-5 JGAD000777
33 Chromosomal alterations among age-related haematopoietic clones in Japan doi: 10.1038/s41586-020-2426-2

JGAD000777

JGAD000836

34 Detection of trait-associated structural variations using short-read sequencing doi: 10.1016/j.xgen.2023.100328 JGAD000831

 

USERS (Controlled-access Data)

Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use
Mark Daly Broad Institute of MIT and Harvard JGAD000101, JGAD000102,
JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2018/09/11-2023/07/31
Yukinori Okada Department of Statistical Genetics, Osaka University Graduate School of Medicine JGAD000101, JGAD000102,
JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2018/09/20-2021/03/31
Shigeo Kamitsuji Statistical Analysis Division, StaGen Co., Ltd. JGAD000123 2018/10/04-2019/03/31
Katsushi Tokunaga Department of Human Genetics, Graduate School of Medicine, The University of Tokyo JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2018/11/13-2026/11/08
Tatsuhiko Tsunoda Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University Research on big data analysis for precision medicine JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2018/12/18-2021/06/19
Liming Liang Harvard T.H. Chan School of Public Health, Department of Epidemiology JGAD000123 2019/01/21-2021/12/31
Masao Nagasaki Center for Genomic Medicine, Graduate School of Medicine Center for the Promotion of Interdisciplinary Education and Research, Kyoto University Development and application of bioinformatics methods to facilitate the detection of genes associated with multifactorial disorders based on large-scale whole genome sequencing data of Japanese individuals JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2019/01/31-2027/03/31
Seishi Ogawa Department of Pathology and Tumor Biology, Graduate School of Medicine, Kyoto University JGAD000209 2019/02/04-2021/03/31
Shigeo Kamitsuji Statistical Analysis Division, StaGen Co., Ltd. JGAD000123 2019/03/13-2022/03/31
Takashi Kohno National Cancer Research Institute, Division of genome biology JGAD000123, JGAD000124,
JGAD000220
2019/04/15-2019/12/31
Shigeo Horie Department of Urology, Juntendo University, Graduate School of Medicine JGAD000123, JGAD000220 2019/05/14-2024/03/31
Tatsuhiko Tsunoda Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University Research on sequence, image data analysis for precision medicine JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2019/06/06-2023/08/31
Kengo Kinoshita Tohoku Medical Megabank Organization Construction of Japanese whole genome database JGAD000220 2019/06/24-2022/03/31
Kouya Shiraishi Division of Genome Biology, National Cancer Research Institute Elucidation of immune-system networks between host and tumor based on genomic analysis JGAD000124 2019/08/05-2023/03/31
Shigeo Kamitsuji Statistical Analysis Division, StaGen Co., Ltd. Mendelian randomization study using genetic markers of uric acid levels as an instrumental variable JGAD000123, JGAD000124,
JGAD000146, JGAD000148,
JGAD000149, JGAD000155,
JGAD000156, JGAD000157,
JGAD000174, JGAD000188
2019/08/16-2024/03/31
Shigeo Kamitsuji Statistical Analysis Division, StaGen Co., Ltd. Mendelian randomization study using 58 clinical laboratory tests and SNP genotype data. JGAD000123, JGAD000124,
JGAD000144-JGAD000201
2019/08/22-2024/03/31
Osamu Ogasawara Bioinformation and DDBJ Center, National Institute of Genetics Evaluation of human genome analysis workflow using JGA/AGD genome data. JGAD000123, JGAD000220 2019/10/11-2024/03/31
Seishi Ogawa Department of Medical Science, Kyoto University Comprehensive analysis of genetic alterations in hematological malignancies JGAD000102, JGAS000123,
JGAD000220
2019/11/14-2024/03/31
Yasushi Okazaki Diagnostics and Therapeutics of Intractable Diseases, Juntendo University Graduate School of Medicine Identification of disease biomarkers by disease cohort research network -Whole genome sequencing of epilepsy- JGAD000123 2020/06/04-2023/03/31
Chihiro Hata Bioinformation and DDBJ Center, National Institute of Genetics Identification of hypomorphic mutations in Japanese breast cancer patients JGAD000209 2020/06/04-2023/03/31
Yosuke Kawai Genome Medical Science Project, National Center for Global Health and Medicine Large scale genome analysis of modern human genomes to infer the origin of Yaponesians JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2020/06/19-2022/03/31
Nakao Iwata Department of Psychiatry, Fujita Health University School of Medicine Research for investigating susceptibility of mental state, mental disorders, drug efficacy and side effects through genetic analysis JGAD000123, JGAD000124 2020/08/17-2022/12/31
Atray Dixit Coral Genomics, Inc. Derivation and Evaluation of Functional Response Scores JGAD000101, JGAD000123,
JGAD000124, JGAD000144-JGAD000201, JGAD000220
2020/08/24-2021/07/21
Hae Kyung Im Biological Sciences Division, University of Chicago Predicted Gene Expression: High Power, Mechanism, and Direction of Effect JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2020/09/15-2023/06/23
Hongyu Zhao Department of Biostatistics, Yale School of Public Health Leveraging multi-ethnic data and functional annotations in causal variant identification, genetic correlation estimation, and genetic risk prediction JGAD000101, JGAD000102,
JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2020/09/24-2024/03/01
Charleston Chiang Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California Investigating the evolution of complex genetic architecture in participants of Biobank Japan JGAD000101, JGAD000102,
JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2020/09/28-2025/07/01
Shigeo Kamitsuji Statistical Analysis Division, StaGen Co., Ltd. Identifying the genetic risk factors for Stent Thrombosis by genome-wide association study JGAD000123, JGAD000124,
JGAD000145, JGAD000146,
JGAD000149, JGAD000151,
JGAD000155, JGAD000156,
JGAD000158, JGAD000159,
JGAD000163,
JGAD000165-JGAD000167,
JGAD000170,
JGAD000172-JGAD000175,
JGAD000182,
JGAD000187-JGAD000189,
JGAD000192-JGAD000196,
JGAD000200, JGAD000201
2020/10/26-2025/03/31
Ali Torkamani Scripps Research Institute Genomics Deep Learning JGAD000101, JGAD000102,
JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2020/11/16-2023/07/10
Kazuhiro Nakayama Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo Investigation of genome variation influencing activity of brown adipose tissues JGAD000123, JGAD000124 2020/11/26-2023/09/18
Keishi Fujio Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo Integrative analysis of immune-cell eQTL data and large-scaled GWAS data in Japanese JGAD000101, JGAD000102,
JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2020/12/14-2025/03/31
Masataka Kikuchi Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo Japan Imputation analysis using a Japanese reference panel JGAD000220 2020/12/14-2027/06/30
Fumihiko Matsuda Center for Genomic Medicine, Kyoto University Elucidation of Japanese genetic diversity JGAD000220 2021/01/05-2025/03/31
Emiko Noguchi Department of Medical Genetics, Faculty of Medicine, University of Tsukuba Exploratory study of genetic factors in allergic diseases JGAD000220 2021/03/16-2032/03/31
Noriko Sato Department of Molecular Epidemiology, Medical Research Institute, Tokyo Medical and Dental University Analysis of genetic and environmental risks of obesity and diabetes based on regional cohort longitudinal data JGAD000220 2021/04/09-2023/03/31
Takashi Kohno Division of Genome Biology, National Cancer Center Research Institute Identification of genetic risk factors in AYA(Adolescence and Young Adult) cancer JGAD000209, JGAD000220 2021/05/20-2025/03/31
Yasunobu Nagata Department of Hematology, Nippon Medical School Identification of the mechanisms for pathogenesis of hematologic tumors based on novel genetic abnormalities JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2021/05/26-2026/03/31
Atsushi Kawakami Department of Immunology and Rheumatology, Nagasaki University Hospital An exploratory study to determine the genetic polymorphisms or mutations associated with type 1 diabetes and interstitial lung disease induced by immune checkpoint inhibitor; nivolumab JGAD000220 2021/06/14-2022/03/30
Akihiro Fujimoto Department of Human Genetics, Graduate School of Medicine, The University of Tokyo Comprehensive analysis of mutations and genetic diversity by analyzing whole-genome sequence data JGAD000220, JGAD000410 2021/09/16-2024/11/30
Yoshihiro Asano Department of Cardiovascular Medicine Graduate School of Medicine, Osaka University Sensitive gene analysis of hereditary cardiovascular disease JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220 2021/07/16-2022/05/31
Hironori Masuko Department of Pulmonary Medicine, University of Tsukuba Search for susceptibility genes for chronic inflammatory airway diseases JGAD000123 2021/08/11-2023/03/31
Takashi Kohno Division of Genome Biology, National Cancer Center Research Institute Identification of genetic risk factors in AYA(Adolescence and Young Adult) cancer JGAD000209, JGAD000220 2021/09/27-2025/03/31
Fumihiko Matsuda Center for Genomic Medicine, Kyoto University Development of personalized medicine JGAD000123, JGAD000220 2021/09/27-2025/03/31
Takashi Matsuda Advanced Informatics & Analytics, Astellas Pharma Inc. Investigation of the correlation between Liver cancer/Hepatitis B and polymorphism JGAD000102, JGAD000123 2021/11/05-2022/07/31
Emiko Noguchi Department of Medical Genetics, Faculty of Medicine, University of Tsukuba Identification of the pathogenic factors for food allergy JGAD000220 2021/12/03-2025/03/31
Yosuke Kawai Division of Molecular Pathology, The Institute of Medical Science, The University of Tokyo Population Genetic Analysis of the Origin of Japanese Populations JGAD000123 2021/12/08-2023/03/31
Toshiharu Ninomiya Department of Epidemiology and Public Health, Graduate School of Medical Sciences, Kyushu University Japan Prospective Studies Collaboration for Aging and Dementia (JPSC-AD) JGAD000220 2021/12/08-2026/02/28
Joshua Chiou Internal Medicine Research Unit, Pfizer Evaluating GWAS associations from Biobank Japan to Support Confidence in Rationale for Therapeutic Targets JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220 2022/01/27-2022/12/31
Gil McVean Genomics plc United Kingdom of Great Britain and Northern Ireland Development of polygenic risk scores in diverse ancestries for diseases, traits and conditions JGAD000101, JGAD000102 2022/07/19-2025/07/01
Gil McVean Genomics plc United Kingdom of Great Britain and Northern Ireland Using large-scale reference panels for imputation and ancestry analysis to support target discovery and polygenic risk score models JGAD000220, JGAD000410 2022/08/04-2025/08/01
Hirofumi Nakaoka Department of Cancer Genome Research, Sasaki Institute Japan Analysis of hypomorphic variants in breast cancer-associated genes by using large-scale sequencing data sets JGAD000209 2022/08/17-2024/03/31
Emiko Noguchi Department of Medical Genetics, Faculty of Medicine, University of Tsukuba Japan Research on genetic predisposition to inflammatory lung disease JGAD000220 2022/09/19-2027/03/31
Nuria Lopez-Bigas Institute for Research in Biomedicine (IRB Barcelona) Spain Study of the genetic basis of clonal hematopoiesis JGAD000399, JGAD000400 2022/11/06-2025/08/01
Keiko Yamazaki Department of Public Health, Graduate School of Medicine, Chiba University Japan Prediction of effectiveness to molecular target drugs in Japanese patients with inflammatory bowel disease JGAD000220 2022/11/09-2025/03/31
Ryosuke Kitoh Department of Otorhinolaryngology-Head and Neck Surgery, Shinshu University School of Medicine Japan Genome-wide association study of the sudden sensorineural hearing loss JGAD000123 2022/12/19-2027/03/31
Shigeo Kamitsuji Statistical Analysis Division, StaGen Co., Ltd. Japan Genome-Wide Association Study to identify genetic factors for strabismus in Japanese population JGAD000123 2023/02/13-2027/02/28
Kei Yura Graduate School of Humanities and Sciences, Ochanomizu University Japan Phenotype Prediction of Cancer Suppressor Gene BRCA1 variants JGAD000220 2023/03/16-2025/03/31
Yoshihiro Onouchi Department of Public Health, Graduate School of Medicine, Chiba University Japan A Multicenter Study to Identify Genetic Factors in Kawasaki Disease JGAD000220 2023/03/30-2025/03/31
Yoshihiro Onouchi Department of Public Health, Graduate School of Medicine, Chiba University Japan A study of the genetic background of differences in antibody response to COVID-19 vaccine JGAD000220 2023/04/19-2025/12/31
Masaki Kato Kansai Medical University Japan Exploratory and validation study of genetic and biological factors for the development of precision medicine algorithms for psychiatric disorders JGAD000101, JGAD000102,
JGAD000123, JGAD000220
2023/08/25-2028/06/30
Hiroki Kimura Department of Psychiatry, Nagoya University Graduate School of Medicine Japan Research on elucidation of susceptibility to brain and mental illness (vulnerability to disease onset) and efficacy and side effects of drugs (treatment responsiveness) through genetic analysis JGAD000220 2023/11/17-2025/10/28
Chikashi Terao Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences Japan Research on personalized medicine based on genomics information JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2023/11/21-2026/03/31
Yasuhiro Mochida Kidney Disease and Transplant center, Shonan Kamakura General Hospital Japan Association between Clonal hematopoiesis of indeterminate potential and Chronic Kidney Disease in Japanese cohort study JGAD000399, JGAD000400 2024/02/07-2027/03/31
Hiroyuki Mishima Department of Human Genetics, Atomic Bomb Disease Institute, Nagasaki University Japan Development of Methods to Mitigate Batch Effects in Human Whole Genome Sequencing JGAD000220 2024/04/17-2027/03/31
Chikashi Terao Clinical Research Center, Shizuoka General Hospital Japan Investigation of Genetic Factors Associated with Human Phenotypic Traits JGAD000220,
JGAD000495,
JGAD000777
2024/04/17-2028/12/03
Koichi Matsuda Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo Japan Disease Cohort Research Network for Disease Marker Exploratory Studies JGAD000209, JGAD000220,
JGAD000288, JGAD000438,
JGAD000458-JGAD000460,
JGAD000495, JGAD000531,
JGAD000720, JGAD000721,
JGAD000777,
JGAD000831-JGAD000834
2024/06/17-2029/03/31
Masao Nagasaki Division of Biomedical Information Analysis, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University Japan Development and application of bioinformatics methods to facilitate the detection of genes associated with multifactorial disorders based on large-scale whole genome sequencing data of Japanese individuals JGAD000123, JGAD000124,
JGAD000144-JGAD000201,
JGAD000220
2024/06/24-2027/03/31
Kouya Shiraishi Department of Clinical Genomics, National Cancer Center Research Institute Japan Search for genes involved in susceptibility to lung cancer JGAD000123 2024/06/24-2025/12/31
Chikashi Terao Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences Japan Research on personalized medicine based on genomics information JGAD000836 2024/07/08-2026/03/31
Chikashi Terao Clinical Research Center, Shizuoka General Hospital Japan Investigation of Genetic Factors Associated with Human Phenotypic Traits JGAD000836 2024/07/16-2028/12/03
Taisei Mushiroda Laboratory for Pharmacogenomics, RIKEN Center for Integrative Medical Sciences Japan Search of genomic biomarkers associated with drug-induced eruptions JGAD000220 2024/08/01-2028/03/31
Masahiro Nakatochi Public Health Informatics, Department of Integrated Sciences, Nagoya University Graduate School of Medicine Japan Exploration of genetic factors involved in the onset, progression, and prognosis of amyotrophic lateral sclerosis JGAD000679 2024/08/27-2030/03/31
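The "Data in Use (Dataset ID)" column above mixes single IDs with ranges such as JGAD000144-JGAD000201. The following sketch expands such a field into individual dataset IDs. It assumes IDs of the form `JGAD` followed by six digits; `expand_dataset_ids` is a hypothetical helper, not part of any NBDC or JGA tool.

```python
import re
from typing import List

_DATASET_ID = re.compile(r"^JGAD(\d{6})$")


def expand_dataset_ids(field: str) -> List[str]:
    """Expand a comma-separated field of JGAD IDs and ID ranges into a flat list."""
    ids: List[str] = []
    for token in (t.strip() for t in field.split(",")):
        if not token:
            continue  # tolerate trailing commas at line ends
        if "-" in token:
            first, last = (_DATASET_ID.match(part.strip()) for part in token.split("-", 1))
            if first is None or last is None:
                raise ValueError(f"unrecognised range: {token!r}")
            low, high = int(first.group(1)), int(last.group(1))
            ids.extend(f"JGAD{n:06d}" for n in range(low, high + 1))
        elif _DATASET_ID.match(token):
            ids.append(token)
        else:
            raise ValueError(f"unrecognised dataset ID: {token!r}")
    return ids


field = "JGAD000123, JGAD000124,\nJGAD000144-JGAD000201,\nJGAD000220"
expanded = expand_dataset_ids(field)
print(len(expanded), expanded[2], expanded[-2])
```

The range JGAD000144-JGAD000201 expands to 58 IDs, which matches the 58 quantitative traits released in hum0014.v8.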

hum0014 Release Note

 

Research ID | Release Date | Type of Data
hum0014.v34 2024/09/19

Processed data of JGAD000220 (WGS for 1,026 individuals) by JGA (data for the use of TogoImputation)

Processed data of JGAD000495 (WGS for 1,964 individuals) by JGA (data for the use of TogoImputation)

hum0014.v33 2024/05/27

WGS for 256 gastric cancer patients

SNP array for 269,000 patients (51 diseases) in BBJ 1st and 2nd cohort

WGS for 617 colorectal cancer patients

WGS for 2,162 diabetes patients

WGS for 2,067 gastric cancer patients

hum0014.v32 2024/01/11 WGS for 1,007 individuals
hum0014.v31 2023/09/01

Processed data of JGAD000220 (WGS for 1,026 individuals) by JGA (CRAM, gVCF)

Processed data (joint call) of JGAD000220 (WGS for 1,026 individuals) by JGA (aggregate VCF)

Processed data of JGAD000220 (reference panel) by JGA (data for the use of TogoImputation)

hum0014.v30 2023/04/20 target sequencing of 27 cancer-predisposing genes in 1,982 lymphoma patients, 10,366 gastric cancer patients and 37,592 controls
hum0014.v29 2023/04/05

GWAS for 9,826 AF patients and 140,446 controls from BBJ 1st cohort

GWAS meta-analysis for 77,690 AF patients and 1,167,040 controls

hum0014.v28 2023/04/05 mobile element variations in 4,880 individuals from the BBJ 1st cohort
hum0014.v27 2022/12/31 GWAS for survival time in 137,693 individuals from the BBJ 1st cohort
hum0014.v26 2022/04/01 target sequencing of 27 cancer-predisposing genes and 13 renal cell carcinoma-related genes in 740 renal cell cancer patients and 5,996 controls
hum0014.v25 2022/01/25 WGS for 1,765 myocardial infarction patients and 199 dementia patients
hum0014.v24 2021/11/26 target sequencing of 27 cancer-predisposing genes in 1,005 pancreatic cancer patients, 12,503 colorectal cancer patients and 23,705 controls
hum0014.v23 2021/07/13 bam/gvcf data of WGS (JGAD000220)
hum0014.v22 2021/05/21 targeted sequencing of 23 genes related to clonal hematopoiesis and SNP-array in 11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 to 2007
hum0014.v21 2020/08/25 GWAS for coronary artery disease
hum0014.v20 2020/08/17 GWAS for coronary artery disease
hum0014.v19 2020/04/20 GWAS for dietary habits
hum0014.v18 2019/11/26 GWAS for Breast cancer
hum0014.v17 2019/10/08 GWAS for 40 diseases
hum0014.v16 2019/10/07 target sequencing of 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls
hum0014.v15 2019/09/27

reference panel constructed using WGS data of the BioBank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504)

GWAS for height

hum0014.v14 2019/03/26 GWAS for smoking behavior
hum0014.v13 2019/01/25 meta analysis of 4 GWASs for T2DM
hum0014.v12 2018/12/10 meta analysis of 2 GWASs for T2DM with diabetic nephropathy
hum0014.v11 2018/10/16 target sequencing of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls
hum0014.v10 2018/08/13 WGS for 1,026 individuals
hum0014.v9 2018/08/07 GWAS for age at menarche and menopause
hum0014.v8 2018/05/01 GWAS for 58 quantitative traits
hum0014.v7 2018/04/04 GWAS for POAG
hum0014.v6 2017/09/08

GWAS for BMI

Genotype data for 182,505 individuals

hum0014.v5 2017/05/18

GWAS for AF

Genotype data for 8,180 AF patients

hum0014.v4 2016/02/02 GWAS for AD
hum0014.v3 2016/01/28 GWASs for T2DM
hum0014.v2 2015/12/28 Genotype frequencies of 934 healthy individuals, about 190 patients from each of 35 diseases, 182 esophageal cancer patients, and 92 ALS patients.
hum0014.v1 2014/09/30 GWAS for MI

 

hum0014.v34

The phased vcf files, tbi index files, bref3 files, and a config file generated by processing JGAD000220 (whole genome sequencing data for a total of 1,026 patients) and JGAD000495 (whole genome sequencing data for a total of 1,964 patients) in a certain workflow are provided. These files can be used with TogoImputation. If you plan to use the data, please indicate both the original data (JGAD000220 and JGAD000495) and the processed data (JGAD000867 and JGAD000868) on the application form for data use.

 

hum0014.v33

Whole genome sequencing analysis for 256 gastric cancer patients registered in BBJ 1st cohort was performed. Fastq files are provided.

SNP array analysis for 269,000 patients (51 diseases) registered in BBJ 1st and 2nd cohort was performed. IDAT files are provided.

Whole genome sequencing analysis for 617 colorectal cancer patients registered in BBJ 1st and 2nd cohort was performed. Fastq files are provided.

Whole genome sequencing analysis for 2,162 diabetes patients registered in BBJ 1st cohort was performed. Fastq files are provided.

Whole genome sequencing analysis for 2,067 gastric cancer patients registered in BBJ 1st cohort was performed. Fastq files are provided.

 

hum0014.v32

Whole genome sequencing analysis was performed for a total of 1,007 patients who were registered in Bio Bank Japan between 2003 and 2007. Fastq and vcf files are provided.

 

hum0014.v31

The alignment results [CRAM] and per-sample variant call results [gVCF] generated by processing JGAD000220 (whole genome sequencing data for a total of 1,026 patients who were registered in BioBank Japan between 2003 and 2007) in a certain workflow are provided. If you plan to use the data, please indicate both the original data (JGAD000220) and the processed data (JGAD000690) on the application form for data use.

The per-dataset variant call results [aggregated VCF, tabix] generated by processing JGAD000220 (whole genome sequencing data for a total of 1,026 patients who were registered in BioBank Japan between 2003 and 2007) in a certain workflow are provided. If you plan to use the data, please indicate both the original data (JGAD000220) and the processed data (JGAD000758) on the application form for data use.

The tbi index file, the bref3 file, and a config file generated by processing JGAD000220 (JGAS000114 reference panel) in a certain workflow are provided. These files can be used with TogoImputation. If you plan to use the data, please indicate both the original data (JGAD000220) and the processed data (JGAD000679) on the application form for data use.

 

hum0014.v30

Target capture sequencing analyses of 27 cancer-predisposing genes in 1,982 lymphoma patients, 10,366 gastric cancer patients and 37,592 controls were performed. Fastq files are provided.

 

hum0014.v29

A GWAS summary statistics file for 9,826 AF cases and 140,446 controls from the BBJ 1st cohort was added (txt).

A summary statistics file of the cross-ancestry meta-analysis for 77,690 AF cases and 1,167,040 controls and a polygenic risk score file calculated from the meta-analysis were added (txt).

 

hum0014.v28

Files for mobile element variations in 4,880 individuals from the BBJ 1st cohort were added (txt).

 

hum0014.v27

GWAS analysis files for 137,693 individuals from the BBJ 1st cohort were added (txt).

Sex-stratified genome-wide association studies were performed using a Cox proportional hazards model under an additive genetic model. Associations of genetic variants estimated by saddlepoint approximation using the SPACox software were also evaluated.
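To illustrate what a Cox model under an additive genetic model estimates, the sketch below fits a single log hazard ratio per copy of the effect allele by Newton-Raphson on the partial likelihood. It is a toy example under strong assumptions: no covariates, Breslow handling of ties, invented data, and no saddlepoint approximation, so it does not reproduce SPACox or the analysis behind these files. In a sex-stratified analysis, the same fit would be run separately for males and females.

```python
import math
from typing import Sequence


def partial_loglik(beta: float, times: Sequence[float], events: Sequence[int],
                   dosages: Sequence[float]) -> float:
    """Cox partial log-likelihood for one covariate (Breslow handling of ties)."""
    ll = 0.0
    for t_i, d_i, x_i in zip(times, events, dosages):
        if d_i:
            s0 = sum(math.exp(beta * x) for t, x in zip(times, dosages) if t >= t_i)
            ll += beta * x_i - math.log(s0)
    return ll


def fit_additive_cox(times: Sequence[float], events: Sequence[int],
                     dosages: Sequence[float], max_iter: int = 50,
                     tol: float = 1e-9) -> float:
    """Estimate the log hazard ratio per allele copy (dosage coded 0, 1, 2)."""
    beta = 0.0
    for _ in range(max_iter):
        score = info = 0.0
        for t_i, d_i, x_i in zip(times, events, dosages):
            if not d_i:
                continue  # censored samples only contribute through risk sets
            risk = [x for t, x in zip(times, dosages) if t >= t_i]
            w = [math.exp(beta * x) for x in risk]
            s0 = sum(w)
            s1 = sum(wi * x for wi, x in zip(w, risk))
            s2 = sum(wi * x * x for wi, x in zip(w, risk))
            score += x_i - s1 / s0                 # first derivative
            info += s2 / s0 - (s1 / s0) ** 2       # negative second derivative
        if info <= 0.0:
            break
        step = score / info
        current = partial_loglik(beta, times, events, dosages)
        # Step-halving keeps the partial likelihood from decreasing.
        while abs(step) > tol and partial_loglik(beta + step, times, events, dosages) < current:
            step /= 2.0
        beta += step
        if abs(step) <= tol:
            break
    return beta


# Invented toy data: six individuals, all with an observed event.
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 1, 1, 1, 1]
toy_dosages = [2, 1, 2, 0, 1, 0]
beta_hat = fit_additive_cox(toy_times, toy_events, toy_dosages)
print(f"log HR per allele: {beta_hat:.3f}, HR: {math.exp(beta_hat):.3f}")
```

In this toy data, carriers of the effect allele tend to have earlier events, so the estimated hazard ratio is above 1.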

 

hum0014.v26

Target capture sequencing analyses of 27 cancer-predisposing genes and 13 renal cell carcinoma-related genes in 740 renal cell cancer patients and 5,996 controls were performed. Fastq files are provided.

 

hum0014.v25

Whole genome sequencing analysis for 1,765 myocardial infarction patients and 199 dementia patients was performed. The whole genome sequencing data were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical alliance Japan (GEM Japan, GEM-J). Fastq files, Bam files and gvcf files are provided.

 

hum0014.v24

A target capture sequencing analysis of 27 cancer-predisposing genes in 1,005 pancreatic cancer patients, 12,503 colorectal cancer patients and 23,705 controls was performed. Fastq files are provided.

 

hum0014.v23

The whole genome sequencing data included in JGAD000220 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. Bam files and gvcf files were added. This project is an initiative of the GEnome Medical alliance Japan (GEM Japan, GEM-J).

 

hum0014.v22

Target capture sequencing of 23 genes related to clonal hematopoiesis and SNP-array analysis were performed in 11,234 subjects selected from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 and 2007. A list of somatic SNVs/indels detected by targeted sequencing (txt) and a table of somatic copy-number alterations detected by analysis of SNP-array data (csv) are provided.

 

hum0014.v21

GWAS analysis files for coronary artery disease patients were added.

 

hum0014.v20

A total of 25,892 patients with coronary artery disease and 142,336 controls were genotyped using the Illumina HumanOmniExpress, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac3 algorithm with the BBJ-CAD reference panel. A genome-wide association study was then performed using plink2 for 20 million variants.

 

hum0014.v19

A total of 165,084 participants whose dietary habit status was available were genotyped using the Illumina HumanOmniExpress BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (about 6,000,000 imputed SNVs) was then performed (BOLT-LMM and ProbABEL program).

 

hum0014.v18

GWAS analysis files for breast cancer patients were added.

 

hum0014.v17

A total of 212,453 patients with 40 diseases were genotyped using the HumanOmniExpress/HumanExome/OmniExpressExome BeadChip (Illumina). Genotype imputation was carried out using the Minimac3 algorithm with the 1000 Genomes Phase 3 v5 release as a reference. Genome-wide association studies (8,919,992 imputed SNVs) were then performed (SAIGE v0.29.4.2).

 

hum0014.v16

A target capture sequencing analysis of 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls was performed. Fastq files are provided.

 

hum0014.v15

- A new reference panel was built with WGS data of the Biobank Japan project (N=1,037) and the 1KGP p3v5 ALL (N=2,504).

- A total of 159,095 individuals were genotyped using the Illumina HumanOmniExpress BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the reference panel built from WGS data of the Biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504). A genome-wide association study (about 20,000,000 imputed variants) for height was then performed (BOLT-LMM [ver2.2] and mach2qtl).

 

hum0014.v14

A total of 165,436 participants whose smoking status was available were genotyped using the Illumina HumanOmniExpress BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (about 6,000,000 imputed SNVs) was then performed (BOLT-LMM and ProbABEL program).

 

hum0014.v13

Results of a meta-analysis of the following 4 GWASs (txt file):

     - 9,804 T2DM patients and 6,728 controls

     - 5,639 T2DM patients and 19,407 controls

     - 18,688 T2DM patients and 121,950 controls

     - 2,483 T2DM patients and 7,065 controls

Genotypes were determined using Illumina [HumanOmniExpress, HumanExome, OmniExpressExome, Human610-Quad BeadChip].
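
The release note does not state which meta-analysis model was used. As a hedged illustration only, the sketch below combines per-study effect estimates for a single variant with a fixed-effect, inverse-variance weighted model, one common choice for GWAS meta-analysis. The four (beta, standard error) pairs are hypothetical and are not taken from the studies listed above.

```python
import math
from typing import List, Tuple


def inverse_variance_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Combine (beta, standard_error) pairs; return pooled beta, pooled SE, and z-score."""
    weights = [1.0 / (se ** 2) for _, se in estimates]  # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical per-study estimates for one variant (not real data).
studies = [(0.10, 0.05), (0.08, 0.04), (0.12, 0.02), (0.05, 0.10)]
pooled_beta, pooled_se, z_score = inverse_variance_meta(studies)
print(pooled_beta, pooled_se, z_score)
```

Studies with smaller standard errors receive larger weights, so the pooled standard error is always smaller than that of the most precise single study.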

 

hum0014.v12

Results of a meta-analysis of the following 2 GWASs (csv file):

     - 2,380 T2DM with diabetic nephropathy patients and 5,234 T2DM without diabetic nephropathy patients

     - 429 T2DM with diabetic nephropathy patients and 358 T2DM without diabetic nephropathy patients

Genotypes were determined using the OmniExpressExome BeadChip or Human610-Quad BeadChip [Illumina].

 

hum0014.v11

A target capture sequencing analysis of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls was performed. Fastq files are provided.

 

hum0014.v10

Whole genome sequencing analysis for a total of 1,026 patients, who were registered in Bio Bank Japan from 2003 to 2007, was performed. Fastq files are provided.

 

hum0014.v9

A total of 67,029 females with information on age at menarche and 43,861 females with information on age at menopause were genotyped using the Illumina HumanOmniExpress BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (about 10,000,000 imputed SNVs) was then performed (mach2dat program).

 

hum0014.v8

A total of 162,255 patients, who were registered in Bio Bank Japan from 2003 to 2007, were genotyped using the Illumina HumanOmniExpress-12 BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. Genome-wide association studies (about 6,000,000 imputed SNVs) were then performed (mach2qtl program).

 

hum0014.v7

A total of 3,980 patients with primary open-angle glaucoma and 18,815 controls were genotyped using the Illumina HumanOmniExpress BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (about 6,000,000 imputed SNVs) was then performed (mach2dat program).

 

hum0014.v6

A total of 158,284 patients, who were registered in Bio Bank Japan from 2003 to 2007, were genotyped using the Illumina HumanOmniExpress-12 BeadChip, HumanExome, or OmniExpressExome BeadChip. Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (about 6,000,000 imputed SNVs) was then performed (mach2qtl program).

 

hum0014.v5

A total of 8,180 patients with atrial fibrillation, who were registered in Bio Bank Japan from 2003 to 2007, and 28,612 controls were genotyped using the Illumina HumanOmniExpress-12 BeadChip, HumanExome, or OmniExpressExome BeadChip (900,000 SNVs). Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (5,000,000 imputed SNVs) was then performed (mach2dat program).

 

hum0014.v4

A total of 1,472 patients with atopic dermatitis and 7,966 controls were genotyped using the Illumina HumanOmniExpress-12 BeadChip (606,164 SNVs). Genotype imputation was carried out using the Minimac algorithm with the 1000 Genomes Phase I v3 release as a reference. A genome-wide association study (7,700,000 imputed SNVs) was then performed using logistic regression tests with imputed SNP dosage data (mach2dat program).

 

hum0014.v3

GWASs for T2DM were performed using 552,915 SNPs (9,817 T2DM patients vs 6,763 controls) and 479,088 SNPs (5,646 T2DM patients vs 19,420 controls). Allele frequencies of T2DM patients and controls were compared.

Genotypes were determined using the OmniExpressExome BeadChip or Human610-Quad BeadChip [Illumina], respectively.
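
A comparison of allele frequencies between patients and controls is often carried out as an allelic chi-square test on a 2x2 table of allele counts. The release note does not name the test that was used, so the sketch below is an assumption-laden illustration with hypothetical counts rather than a description of the original analysis.

```python
import math
from typing import Tuple


def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> Tuple[float, float]:
    """Pearson chi-square (1 degree of freedom) on a 2x2 table of allele counts."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = numerator / denominator  # assumes no row or column total is zero
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one SNP (not real data).
chi2_stat, p_val = allelic_chi_square(case_alt=60, case_ref=40, ctrl_alt=40, ctrl_ref=60)
print(chi2_stat, p_val)
```

Each individual contributes two alleles to the table, so the counts are twice the number of genotyped people per group.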

 

hum0014.v2

Genotype frequencies of 934 healthy individuals, about 190 patients from each of 35 diseases, 182 esophageal cancer patients, and 92 ALS patients.

35 diseases:

Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer)

Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans)

Cerebrovascular disorders (Brain infarction, Intracranial aneurysm)

Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma)

Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis)

Eye diseases (Cataract, Glaucoma)

Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease)

Illumina [Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip], Perlegen Sciences [high-density oligonucleotide arrays], or Hologic Japan [Invader] were used for the genotyping.

 

hum0014.v1

A Genome-Wide Association Study (GWAS) for MI was performed using 455,781 SNPs (Illumina Human610-Quad BeadChip and HumanHap550v3 Genotyping BeadChip). Allele frequencies of 1,666 MI patients and 3,198 controls were compared.

 

 

Note:

hum0201 Release Note

Research ID | Release Date | Type of Data
hum0201.v8 | 2024/08/28 | NGS (Exome, RNA-seq, scRNA-seq, ATAC-seq, ChIP-seq)
hum0201.v7 | 2022/12/27 | Processed data of JGAD000335 by JGA
hum0201.v6 | 2022/08/05 | NGS (scRNA-seq, RNA-seq, Exome)
hum0201.v5 | 2022/06/06 | NGS (RNA-seq)
hum0201.v4 | 2021/11/19 | NGS (RNA-seq, ChIP-seq)
hum0201.v3 | 2020/11/20 | NGS (RNA-seq)
hum0201.v2 | 2020/10/06 | NGS (WGS, RNA-seq)
hum0201.v1 | 2019/12/20 | NGS (Exome)

 

hum0201.v8

DNAs/RNAs extracted from pancreatic cancer organoids or genetically engineered pancreatic duct organoids were used for the whole exome, RNA, scRNA, ATAC and ChIP sequencing analyses. Fastq and bed files are provided.

 

hum0201.v7

The alignment results [CRAM], variant call results per sample [gVCF], and variant call results per dataset [aggregated VCF], generated by processing JGAD000335 with a specific workflow, are provided. If you plan to use the data, please indicate both the original data (JGAD000335) and the processed data (JGAD000687) on the application form for data use.

 

hum0201.v6

RNAs extracted from normal colonic tissue from a neoplastic disease patient were used for the single cell RNA sequencing analysis. DNAs and RNAs extracted from organoids established from healthy normal colonic tissues of the patients were used for the whole exome and RNA sequencing analyses. Fastq files are provided.

 

hum0201.v5

RNAs extracted from organoids derived from colon cancer tissues from neoplastic disease patients were used for the RNA sequencing analysis. Fastq files are provided.

 

hum0201.v4

DNAs and RNAs extracted from organoids established from colon cancer tissues and a normal epithelial tissue of the patients were used for RNA sequencing and ChIP-seq analyses. Fastq files are provided.

 

hum0201.v3

Gene expression of 2D normal duodenum organoids with or without medium rotation was analyzed. Total RNA was extracted from organoids with or without a four-day medium rotation, and after 0, 1, 2 or 4 days of medium rotation, and sequenced with HiSeq X Ten. Fastq files are provided.

 

hum0201.v2

DNAs and RNAs extracted from peripheral blood cells and organoids established from normal and cancer tissues of the patients were used for the whole genome and RNA sequencing analyses. Fastq files are provided.

 

hum0201.v1

DNAs extracted from inflammatory/tumor tissues from gastrointestinal inflammatory disease patients, peripheral blood cells from patients/controls, and the organoids established from epithelial tissues from patients were used for the whole exome sequencing analysis. Fastq files are provided.

 

Note:

NBDC Research ID: hum0201.v8

 

SUMMARY

Aims: Search for differences in genomics and traits among normal cells, digestive non-tumor cells (inflammatory cells), and tumor cells

Methods:

JGAS000199: The organoids were established from epithelial tissues from patients, and cultured. DNAs were extracted from the organoids and peripheral blood cells. Whole exome sequencing analysis was performed to identify the somatic mutations.

JGAS000237: Organoids were established from tissue samples (normal and cancer) obtained from patients with any type of cancer. The organoids were cultured, and DNA / RNA was extracted from the organoids for whole genome / RNA sequencing analysis to identify the somatic mutations. Blood samples were substituted when normal tissue samples were not available.

JGAS000256: RNA was extracted from the 2D-cultured normal duodenal organoids with or without medium rotation. RNA sequencing was performed to identify the changes in gene expression after culture medium rotation.

JGAS000378: RNA sequencing and ChIP-seq analyses for the organoids derived from colon cancer tissues and a normal epithelial tissue from patients

JGAS000350: RNA sequencing analysis for the organoids derived from colon cancer tissues

JGAS000550: single cell RNA sequencing analysis for normal colonic tissue and RNA sequencing or Whole Exome sequencing analyses for the organoids derived from healthy normal colonic tissues in the neoplastic disease patient

JGAS000719: Organoids were derived from surgically resected specimens, endoscopic ultrasound-guided fine needle aspiration samples, brushing samples, pleural effusion, and ascites of patients with pancreatic cancer. Genetically engineered human pancreatic duct organoids were made by introducing cancer gene mutations in duct organoids using CRISPR-Cas9. DNAs/RNAs were extracted from the organoids for whole exome sequencing, RNA sequencing, single cell RNA sequencing, ATAC sequencing, and ChIP sequencing analyses.

Participants/Materials:

JGAS000199: Controls who underwent colonoscopy, gastrointestinal inflammatory disease patients, and neoplastic disease patients

JGAS000237: Neoplastic disease patients

JGAS000256: Normal duodenum tissue from a neoplastic disease patient

JGAS000378: Colon cancer tissues and a normal epithelial tissue from neoplastic disease patients

JGAS000350: Neoplastic disease patient

JGAS000550: Neoplastic disease patient

JGAS000719: Pancreatic cancer patients

 

Dataset ID | Type of Data | Criteria | Release Date
JGAS000199 | NGS (Exome) | Controlled-access (Type I) | 2019/12/20
JGAS000237 | NGS (WGS, RNA-seq) | Controlled-access (Type I) | 2020/10/06
JGAS000256 | NGS (RNA-seq) | Controlled-access (Type I) | 2020/11/20
JGAS000378 | NGS (RNA-seq, ChIP-seq) | Controlled-access (Type I) | 2021/11/19
JGAS000350 | NGS (RNA-seq) | Controlled-access (Type I) | 2022/06/06
JGAS000550 | NGS (scRNA-seq, RNA-seq, Exome) | Controlled-access (Type I) | 2022/08/05
JGAD000687 | Processed data of JGAD000335 by JGA | Controlled-access (Type I) | 2022/12/27
JGAS000719 | NGS (Exome, RNA-seq, scRNA-seq, ATAC-seq, ChIP-seq) | Controlled-access (Type I) | 2024/08/28

*Release Note

*Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data. Learn more

 

MOLECULAR DATA

Exome (JGAS000199/JGAS000550/JGAS000719)

Participants/Materials

gastrointestinal inflammatory disease (ICD10: K51): 29 cases

gastrointestinal inflammatory disease with neoplastic disease (ICD10: K51, C18, C19, C20): 26 cases

organoids derived from healthy normal colonic tissues with neoplastic disease (ICD10: C18): 2 cases

controls who underwent colonoscopy: 16 individuals

pancreatic cancer (ICD10: C25): 50 cases

 (organoids derived from tumor tissues)

Targets Exome
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2500/4000, NovaSeq 6000]
Library Source DNAs extracted from peripheral blood cells and organoids established from normal or inflammatory/tumor tissues of the patients or controls
Cell Lines -
Library Construction (kit name) SureSelect Human All Exon kit
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 200 bp
Japanese Genotype-phenotype Archive Dataset ID

JGAD000284

JGAD000669

JGAD000852

Total Data Volume

JGAD000284: 488 GB (fastq)

JGAD000669: 59.5 GB (fastq)

JGAD000852: 846.3 GB (fastq)

Comments (Policies) NBDC policy

 

WGS (JGAS000237)

Participants/Materials

Neoplastic disease: 23 cases (55 samples)

    Small cell lung cancer (ICD10: C34)

    Biliary tract cancer (ICD10: C23, C24, C78)

    Colon cancer (ICD10: C18-20)

    Duodenal cancer (ICD10: C17)

    Esophageal cancer (ICD10: C15)

    Stomach cancer (ICD10: C16, C78)

    Liver cancer (ICD10: C22)

    Pancreatic cancer (ICD10: C25)

Targets WGS
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Ten]
Library Source DNAs extracted from peripheral blood cells and organoids established from normal and cancer tissues of the patients
Cell Lines -
Library Construction (kit name) Library was constructed by BGI
Fragmentation Methods Ultrasonic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 300 bp
Japanese Genotype-phenotype Archive Dataset ID

JGAD000335

Dataset ID of the Processed data by JGA

JGAD000687

The way to Process

Total Data Volume 2.752 TB (fastq)
Comments (Policies) NBDC policy

 

RNA-seq (JGAS000237, JGAS000378)

Participants/Materials

Neoplastic disease: 23 + 4 cases (35 + 5 samples)

    Small cell lung cancer (ICD10: C34)

    Biliary tract cancer (ICD10: C23, C24, C78)

    Colon cancer (ICD10: C18-20)

    Duodenal cancer (ICD10: C17)

    Esophageal cancer (ICD10: C15)

    Stomach cancer (ICD10: C16, C78)

    Liver cancer (ICD10: C22)

    Pancreatic cancer (ICD10: C25)

Targets RNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq 2500/X Ten, NovaSeq 6000]
Library Source

RNAs extracted from organoids established from tumor tissues and a normal epithelial tissue of the patients

RNAs extracted from organoids with BET bromodomain inhibitor established from tumor tissues and a normal epithelial tissue of colon cancer patients

Cell Lines -
Library Construction (kit name)

TruSeq RNA Library Prep Kit v2

TruSeq Stranded mRNA Library Prep Kit

Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 250 bp/150 bp
Japanese Genotype-phenotype Archive Dataset ID

JGAD000336

JGAD000492

Total Data Volume

JGAD000336: 2.752 TB (fastq)

JGAD000492: 190.5 GB (fastq)

Comments (Policies) NBDC policy

 

RNA-seq (JGAS000256)

Participants/Materials Neoplastic disease (ICD10: C16): 1 case (1 sample)
Targets RNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Ten]
Library Source Total RNA extracted from normal duodenum organoids with or without a four-day medium rotation, and after 0, 1, 2 or 4 days of medium rotation
Cell Lines -
Library Construction (kit name)

TruSeq RNA Library Prep Kit v2

TruSeq Stranded mRNA Library Prep Kit

Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 250 bp/150 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000359
Total Data Volume 37 GB (fastq)
Comments (Policies) NBDC policy

 

RNA-seq (JGAS000350)

Participants/Materials Neoplastic disease (ICD10: C16): 2 cases (11 samples)
Targets RNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq 4000]
Library Source RNAs extracted from organoids derived from colon cancer tissues
Cell Lines -
Library Construction (kit name)

TruSeq RNA Library Prep Kit v2

TruSeq Stranded mRNA Library Prep Kit

Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 250 bp/150 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000464
Total Data Volume 108.9 GB (fastq)
Comments (Policies) NBDC policy

 

RNA-seq (JGAS000550)

Participants/Materials Neoplastic disease (ICD10: C18): 2 cases (8 samples)
Targets RNA-seq
Target Loci for Capture Methods -
Platform Illumina [NovaSeq 6000]
Library Source Total RNA extracted from normal duodenum organoids with or without a four-day medium rotation, and after 0, 1, 2 or 4 days of medium rotation
Cell Lines -
Library Construction (kit name)

TruSeq RNA Library Prep Kit v2

TruSeq Stranded mRNA Library Prep Kit

Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 x 2 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000669
Total Data Volume 46.6 GB (fastq)
Comments (Policies) NBDC policy

 

RNA-seq (JGAS000719)

Participants/Materials

Pancreatic cancer (ICD10: C25): 40 cases (50 samples)

        organoids derived from tumor tissues: 38 cases (38 samples)

        genetically engineered pancreatic duct organoids: 2 cases (12 samples)

Targets RNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq 4000, NovaSeq 6000]
Library Source RNAs extracted from pancreatic cancer organoids or genetically engineered pancreatic duct organoids
Cell Lines -
Library Construction (kit name)

TruSeq RNA Library Prep Kit v2

TruSeq Stranded mRNA Library Prep Kit

Fragmentation Methods Heat treatment
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 x 2 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000852
Total Data Volume 846.3 GB (fastq)
Comments (Policies) NBDC policy

 

scRNA-seq (JGAS000550)

Participants/Materials Neoplastic disease (ICD10: C16): 1 case (1 sample)
Targets scRNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq 4000]
Library Source RNAs extracted from organoids from normal colonic tissue
Cell Lines -
Library Construction (kit name) Single Cell 3′ Library & Gel Bead Kit v2 and the A Chip Kit (10X Genomics)
Fragmentation Methods Enzymatic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 28+91 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000669
Total Data Volume 66.0 GB (fastq)
Comments (Policies) NBDC policy

 

scRNA-seq (JGAS000719)

Participants/Materials

Pancreatic cancer (ICD10: C25): 1 case (6 samples)

 (genetically engineered pancreatic duct organoids)

Targets scRNA-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq 4000]
Library Source RNAs extracted from genetically engineered pancreatic duct organoids
Cell Lines -
Library Construction (kit name) Chromium Next GEM Single Cell 3’ Reagent Kits v3.1
Fragmentation Methods Enzymatic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 28+91 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000852
Total Data Volume 846.3 GB (fastq)
Comments (Policies) NBDC policy

 

ChIP-seq (JGAS000378)

Participants/Materials

Neoplastic disease: 4 cases (5 samples)

    Colon cancer (ICD10: C18)

Targets ChIP-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Ten]
Library Source DNAs extracted from organoids derived from colon cancer tissues and a normal epithelial tissue of the patients, and immunoprecipitated with an anti-histone antibody (H3K27Ac)
Cell Lines -
Library Construction (kit name) Illumina Tagment DNA TDE1 Enzyme and Buffer Kits
Fragmentation Methods Ultrasonic fragmentation (Bioruptor Ⅱ)
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000492
Total Data Volume 190.5 GB (fastq)
Comments (Policies) NBDC policy

 

 

ChIP-seq (JGAS000719)

Participants/Materials

Pancreatic cancer (ICD10: C25): 5 cases (13 samples)

        organoids derived from tumor tissues: 3 cases (3 samples)

        genetically engineered pancreatic duct organoids: 2 cases (10 samples)

Targets ChIP-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Ten]
Library Source DNAs extracted from pancreatic cancer organoids or genetically engineered pancreatic duct organoids, and immunoprecipitated with an anti-histone antibody (H3K27me3)
Cell Lines -
Library Construction (kit name) Illumina Tagment DNA TDE1 Enzyme and Buffer Kits
Fragmentation Methods Ultrasonic fragmentation (Bioruptor Ⅱ)
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp
QC FRiP score
Mapping Methods Bowtie2
Reference Genome Sequence hg38
Peak Calling Methods (software) MACS2
Japanese Genotype-phenotype Archive Dataset ID JGAD000852
Total Data Volume 882.8 GB (fastq, bed)
Comments (Policies) NBDC policy
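
The table above lists the FRiP (fraction of reads in peaks) score as the QC metric, with Bowtie2 for mapping and MACS2 for peak calling, but does not describe how the score was computed. The sketch below is a minimal illustration under the assumption that FRiP is the share of mapped reads overlapping at least one called peak. The intervals are hypothetical toy data, and a real pipeline would read alignments from BAM files and peaks from bed files rather than Python lists.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open coordinates as in BED files
Interval = Tuple[str, int, int]


def frip(reads: List[Interval], peaks: List[Interval]) -> float:
    """Fraction of reads that overlap at least one peak."""
    if not reads:
        return 0.0
    in_peaks = 0
    for chrom, start, end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(p_chrom == chrom and start < p_end and p_start < end
               for p_chrom, p_start, p_end in peaks):
            in_peaks += 1
    return in_peaks / len(reads)


# Hypothetical toy data (not taken from the dataset).
toy_peaks = [("chr1", 100, 200), ("chr2", 500, 600)]
toy_reads = [("chr1", 150, 300), ("chr1", 200, 350),
             ("chr2", 590, 740), ("chr3", 100, 250)]
toy_frip = frip(toy_reads, toy_peaks)
print(toy_frip)
```

The nested scan is quadratic in the worst case and is only meant to make the definition concrete; interval trees or sorted sweeps are the usual choice for genome-scale data.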

 

ATAC-seq (JGAS000719)

Participants/Materials

Pancreatic cancer (ICD10: C25): 12 cases (16 samples)

        organoids derived from tumor tissues: 10 cases (12 samples)

        genetically engineered pancreatic duct organoids: 2 cases (4 samples)

Targets ATAC-seq
Target Loci for Capture Methods -
Platform Illumina [HiSeq X Ten]
Library Source DNAs extracted from pancreatic cancer organoids or genetically engineered pancreatic duct organoids
Cell Lines -
Library Construction (kit name) Omni-ATAC protocol
Fragmentation Methods Tn5 transposase
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 x 2 bp
Japanese Genotype-phenotype Archive Dataset ID JGAD000852
Total Data Volume 846.3 GB (fastq)
Comments (Policies) NBDC policy

 

DATA PROVIDER

Principal Investigator: Toshiro Sato

Affiliation: Department of Organoid Medicine, Keio University School of Medicine

Project / Group Name: -

Funds / Grants (Research Project Number):

Name | Title | Project Number
Project for Elucidating and Controlling Mechanisms of Aging and Longevity, Japan Agency for Medical Research and Development (AMED) | Understanding changes in aging traits aimed at controlling the onset of gastrointestinal diseases | JP19gm5010002
Project for Cancer Research and Therapeutic Evolution (P-CREATE), Japan Agency for Medical Research and Development (AMED) | Development of advanced drug discovery system based on understanding of cancer multi-level phenotype | JP19cm0106206
Core Research and Evolutional Science and Technology, Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED-CREST) | Dissecting intestinal fibrogenic diseases by a newly developed 4D disease model system | JP19gm1210001
KAKENHI Grant-in-Aid for Scientific Research (S) | Gaining Integrative Understanding of Gastrointestinal Disease Phenotypes through Establishment of an Organoid Library | 17H06176
KAKENHI Grant-in-Aid for Scientific Research (B) | Functional analysis of small intestinal epithelial organoid-based transplant graft | 20H03746
KAKENHI Grant-in-Aid for Scientific Research (S) | Elucidating a role of niche construction in pathophysiological mechanism of human digestive diseases | 22H04995

 

PUBLICATIONS

Title | DOI | Dataset ID
1 Somatic inflammatory gene mutations in human ulcerative colitis epithelium | doi: 10.1038/s41586-019-1844-5 | JGAD000284
2 An Organoid Biobank of Neuroendocrine Neoplasms Enables Genotype-Phenotype Mapping | doi: 10.1016/j.cell.2020.10.023 | JGAD000335, JGAD000336
3 An organoid-based organ repurposing approach to treat short bowel syndrome | doi: 10.1038/s41586-021-03247-2 | JGAD000359
4 Organoid screening reveals epigenetic vulnerabilities in human colorectal cancer | doi: 10.1038/s41589-022-00984-x | JGAD000492

 

USERS (Controlled-access Data)

Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use
Klaus H. Kaestner | Institute for Diabetes, Obesity & Metabolism at University of Pennsylvania | | Somatic mutation analysis of the pancreas in type 1 diabetes | JGAD000284 | 2020/04/13-2021/03/03
Ulf Leser | Department of Mathematics and Computer Science, Humboldt-Universitaet zu Berlin | | MAPTor-NET: MAPK-mTOR network model driven individualized therapies of pancreatic neuro-endocrine tumors (pNETs) | JGAD000335, JGAD000336 | 2021/06/03-2023/04/01
Nobuhiro Tanuma | Miyagi Cancer Center Research Institute | Japan | Study on biological characters of pancreatic and gastrointestinal neuroendocrine tumors using patient-derived organoids. | JGAD000336 | 2022/09/19-2023/07/31
Michiaki Hamada | Faculty of Science and Engineering, Waseda University | Japan | Construction of RNA-targeted Drug Discovery Database | JGAD000336, JGAD000359, JGAD000492 | 2022/12/26-2025/03/31

NBDC Research ID: hum0474.v1

 

SUMMARY

Aims: Autism Spectrum Disorders (ASDs) are psychiatric disorders with a high prevalence and a significant genetic component. The aetiology is still unknown and there are still no fundamental treatments. The aim of this study was to investigate the presence or absence of variants (mutations or polymorphisms) in genes associated with the onset and transition of the condition in children with autistic spectrum disorder and their families, to examine the relationship between genetic variants and the clinical phenotype (clinical condition), and to use this information for diagnosis, treatment and support.

Methods: Based on the SFARI Gene database, 16 high-confidence ASD-associated genes, one promoter region, and 20 intergenic regions containing ASD-associated SNPs were selected for the biotinylated oligonucleotide probe design. For sequencing, buccal mucosa was collected using a swab. Genomic DNA extracted from the buccal mucosa samples was used for Target Capture Sequencing analysis.

Participants/Materials: 32 children with ASD, 8 children with low birth weight, 3 children with sub-threshold ASD, 36 typically developing children

URL: https://kodomokokoro.w3.kanazawa-u.ac.jp/en/

 

Dataset ID | Type of Data | Criteria | Release Date
JGAS000731 | NGS (Target Capture) | Controlled-access (Type I) | 2024/08/22

*Release Note

* Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data. Learn more

 

MOLECULAR DATA

JGAS000731

Participants/Materials

32 children with ASD (ICD10: F849)

8 children with low birth weight

3 children with sub-threshold ASD

36 typically developing children

       buccal mucosa: 1 sample each (total 79 samples)

Targets Target Capture
Target Loci for Capture Methods

ASD-associated genes: POGZ, SCN1A, SCN2A, FOXP1, SLC6A1, ARID1B, SYNGAP1, CNTNAP2, KCNQ3, PTEN, SUV420H1, GRIN2B, CHD8, ADNP, DYRK1A, SHANK3

Promoter region: HTTLPR

SNPs in intergenic regions: rs1620977, rs34213746, rs1452075, rs16854048, rs325506, rs2388334, rs111931861, rs7794745, rs10099100, rs11787216, rs2094530, rs10149470, rs113877277, rs6035856, rs6035857, rs6047381, rs6137325, rs6137326, rs71190156, rs910805

Platform Illumina [iSeq 100]
Library Source DNAs extracted from buccal mucosa samples
Cell Lines -
Library Construction (kit name) KAPA HyperPlus Kit
Fragmentation Methods Enzymatic fragmentation
Spot Type Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers) 150 bp
Mapping Methods Burrows-Wheeler Aligner v0.7.17
Mapping Quality -
Reference Genome Sequence GRCh38
Coverage (Depth) -
Japanese Genotype-phenotype Archive Dataset ID JGAD000864
Total Data Volume 4.5 GB (fastq, vcf)
Comments (Policies) NBDC policy

 

DATA PROVIDER

Principal Investigator: Shigeru Yokoyama

Affiliation: Research Center for Child Mental Development, Kanazawa University

Project / Group Name: Bambi Plan

URL: https://kodomokokoro.w3.kanazawa-u.ac.jp/en/

Funds / Grants (Research Project Number):

Name | Title | Project Number
KAKENHI Grant-in-Aid for JSPS Fellows | Establishment of diagnostic indices of sub-threshold autism spectrum disorder by brain function imaging and genomic analyses | 22J14602
KAKENHI Grant-in-Aid for Scientific Research (B) | A study of symptom variability corresponding to functional features of brain activity in children with autism spectrum disorder | 20H03599
KAKENHI Grant-in-Aid for Scientific Research (B) | study of biological investigation of autism spectrum disorder including sub-threshold | 23K27526

 

PUBLICATIONS

Title | DOI | Dataset ID
1 Association of Genetic Variants with Autism Spectrum Disorder in Japanese Children Revealed by Targeted Sequencing | doi: 10.3389/fgene.2024.1352480 | JGAD000864

USERS (Controlled-access Data)

Principal Investigator | Affiliation | Country/Region | Research Title | Data in Use (Dataset ID) | Period of Data Use