NBDC Research ID: hum0014.v35

SUMMARY

Aims: Identify disease-related genes and mobile element variations in the Japanese population; develop Japanese population-specific reference panels

Methods: Genomic DNA samples were genotyped by the following methods: Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip, HumanOmniExpress-12 BeadChip, HumanExome BeadChip, OmniExpressExome BeadChip (Illumina), high-density oligonucleotide arrays (Perlegen Sciences), or Invader (Hologic Japan). Genome-wide association studies (GWAS) for myocardial infarction (MI), type II diabetes mellitus (T2DM), atopic dermatitis (AD), atrial fibrillation (AF), body mass index (BMI), primary open-angle glaucoma (POAG), 58 quantitative traits, age at menarche/menopause, smoking behaviour, height, 42 diseases (among them, the samples of 4 diseases partially overlapped with those of the previous release), dietary habits, and coronary artery disease were performed using about 500K to 2,700K variants. Meta-analyses for T2DM with diabetic nephropathy and for T2DM were also performed. SNP array analysis for 51 diseases registered in BioBank Japan was performed. Whole-genome sequencing analyses for 1,026 + 1,007 patients who were registered in BioBank Japan from 2003 to 2007, 1,765 myocardial infarction patients, 199 dementia patients, 256 + 2,067 gastric cancer patients, 617 colorectal cancer patients, and 2,162 diabetes patients were performed with Illumina HiSeq 2500/X Five.
Target sequencing analyses of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls; 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls; 23 genes related to clonal hematopoiesis in 11,234 subjects extracted from approximately 200,000 subjects registered in BioBank Japan between fiscal years 2003 and 2007; 27 cancer-predisposing genes in 1,009 pancreatic cancer patients, 12,606 colorectal cancer patients, 740 renal cell cancer patients, 1,982 lymphoma patients, 10,366 gastric cancer patients, and 23,780 + 5,996 + 37,592 controls; and 13 renal cell carcinoma-related genes in 740 renal cell cancer patients and 5,996 controls were performed with Illumina HiSeq 2500. SNP array analysis for the 11,234 subjects was also performed. A new reference panel was built with WGS data of the BioBank Japan project (N=7,472) and the 1KGPp3v5 ALL (N=2,504). Sex-stratified genome-wide association studies using a Cox proportional hazards model under the assumption of an additive genetic model were performed. Associations of genetic variants estimated by saddlepoint approximation using SPACox software were also evaluated. A mobile element variation (MEV) search tool, MEGAnE, was applied to 4,880 WGS datasets generated in BBJ, and 24,933 MEVs were found. A genome-wide association study for atrial fibrillation was performed in 9,826 cases and 140,446 controls. A subsequent cross-ancestry meta-analysis with European GWAS (60,620 cases and 970,216 controls; http://csg.sph.umich.edu/willer/public/afib2018) and Finnish GWAS (7,244 cases and 56,378 controls; FinnGen; https://www.finngen.fi/en) was performed (77,690 cases and 1,167,040 controls in total). A polygenic risk score was constructed based on the cross-ancestry meta-analysis of atrial fibrillation.
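
As a quick consistency check on the atrial fibrillation meta-analysis described above, the following minimal Python sketch sums the per-cohort case and control counts quoted in the Methods text. The dictionary layout and the function name are illustrative assumptions, not part of the dataset.

```python
# (cases, controls) per cohort, copied from the Methods text above.
AF_COHORTS = {
    "BBJ": (9_826, 140_446),
    "European GWAS": (60_620, 970_216),
    "FinnGen": (7_244, 56_378),
}


def total_cases_controls(cohorts):
    """Return the summed (cases, controls) across all cohorts."""
    cases = sum(n_cases for n_cases, _ in cohorts.values())
    controls = sum(n_controls for _, n_controls in cohorts.values())
    return cases, controls


af_cases, af_controls = total_cases_controls(AF_COHORTS)
print(af_cases, af_controls)  # 77690 1167040
```

The sums agree with the totals stated for the cross-ancestry meta-analysis (77,690 cases and 1,167,040 controls).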

Participants/Materials: Participants for the Tailor-made Medical Treatment Program (BioBank Japan: BBJ)

URL: https://biobankjp.org/en

Dataset ID	Type of Data	Criteria	Release Date
hum0014.v1.freq.v1	GWAS for MI	Unrestricted-access	2014/09/30
hum0014.v2.jsnp.934ctrl.v1	Genotype frequencies in 934 healthy individuals (JSNP data)	Unrestricted-access	2015/12/28
35 Diseases	Genotype frequencies in each disease (JSNP data)	Unrestricted-access	2015/12/28
hum0014.v2.jsnp.182ec.v1	Genotype frequencies in 182 esophageal cancer patients (JSNP data)	Unrestricted-access	2015/12/28
hum0014.v2.jsnp.92als.v1	Genotype frequencies in 92 amyotrophic lateral sclerosis (ALS) patients (JSNP data)	Unrestricted-access	2015/12/28
hum0014.v3.T2DM-1.v1	GWAS for T2DM [1]	Unrestricted-access	2016/01/28
hum0014.v3.T2DM-2.v1	GWAS for T2DM [2]	Unrestricted-access	2016/01/28
hum0014.v4.AD.v1	GWAS for AD	Unrestricted-access	2016/02/02
hum0014.v5.AF.v1	GWAS for AF	Unrestricted-access	2016/05/18
JGAS000101	Genotype and phenotype data for 8180 AF patients	Controlled-access (Type I)	2016/05/18
hum0014.v6.158k.v1	GWAS for BMI	Unrestricted-access	2017/09/08
JGAS000114	BMI data for 158,284 individuals; Genotype data for 182,505 individuals	Controlled-access (Type I)	2017/09/08
hum0014.v7.POAG.v1	GWAS for POAG	Unrestricted-access	2018/04/04
hum0014.v8.58qt.v1	GWAS for 58 quantitative traits	Unrestricted-access	2018/05/01
JGAS000114	58 quantitative traits data for 200,849 individuals	Controlled-access (Type I)	2018/05/01
hum0014.v9.Men.v1 / hum0014.v9.MP.v1	GWAS for age at menarche and menopause	Unrestricted-access	2018/08/07
JGAS000114	WGS for 1,026 individuals	Controlled-access (Type I)	2018/08/13
JGAS000140	target sequencing of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls	Controlled-access (Type I)	2018/10/16
hum0014.v12.T2DMwN.v1	meta analysis of 2 GWASs for T2DM with diabetic nephropathy	Unrestricted-access	2018/12/10
hum0014.v13.T2DMmeta.v1	meta analysis of 4 GWASs for T2DM	Unrestricted-access	2019/01/25
hum0014.v14.smok.v1	GWAS for smoking behaviour	Unrestricted-access	2019/03/26
JGAS000114	a reference panel from WGS data of the biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504)	Controlled-access (Type I)	2019/09/27
hum0014.v15.ht.v1	GWAS for height	Unrestricted-access	2019/09/27
JGAS000203	target sequencing of 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls	Controlled-access (Type I)	2019/10/07
hum0014.v17	GWAS for 40 diseases	Unrestricted-access	2019/10/08
hum0014.v18	GWAS for Breast cancer	Unrestricted-access	2019/11/26
hum0014.v19	GWAS for dietary habits	Unrestricted-access	2020/04/20
hum0014.v20.cad.v1	GWAS for coronary artery disease	Unrestricted-access	2020/08/17
hum0014.v21	GWAS for coronary artery disease	Unrestricted-access	2020/08/25
JGAS000293	target sequencing of 23 genes related to clonal hematopoiesis and SNP array in 11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 to 2007	Controlled-access (Type I)	2021/05/21
JGAS000114 (Data addition)	bam/gvcf data of WGS (JGAD000220)	Controlled-access (Type I)	2021/07/13
JGAS000327	target sequencing of 27 cancer-predisposing genes in 1,005 pancreatic cancer patients	Controlled-access (Type I)	2021/11/26
JGAS000346	target sequencing of 27 cancer-predisposing genes in 12,503 colorectal cancer patients and 23,705 controls	Controlled-access (Type I)	2021/11/26
JGAS000381	WGS for 1,765 myocardial infarction patients and 199 dementia patients	Controlled-access (Type I)	2022/01/25
JGAS000414	target sequencing of 27 cancer-predisposing genes and 13 renal cell carcinoma-related genes in 740 renal cell cancer patients and 5,996 controls	Controlled-access (Type I)	2022/04/01
hum0014.v27.surv.v1	GWAS for survival time in 137,693 individuals from BBJ 1st cohort	Unrestricted-access	2022/12/31
hum0014.v28.MEs.v1	mobile element variations in 4,880 individuals from BBJ 1st cohort	Unrestricted-access	2023/04/05
hum0014.v29.AF.v1	GWAS for 9,826 AF patients and 140,446 controls from BBJ 1st cohort; GWAS meta-analysis for 77,690 AF patients and 1,167,040 controls	Unrestricted-access	2023/04/05
JGAS000347	target sequencing of 27 cancer-predisposing genes in 1,982 lymphoma patients	Controlled-access (Type I)	2023/04/20
JGAS000592	target sequencing of 27 cancer-predisposing genes in 10,366 gastric cancer patients	Controlled-access (Type I)	2023/04/20
JGAS000592	target sequencing of 27 cancer-predisposing genes in 37,592 controls	Controlled-access (Type I)	2023/04/20
JGAD000690	Processed data of JGAD000220 (WGS for 1,026 individuals) by JGA (CRAM, gVCF)	Controlled-access (Type I)	2023/08/31
JGAD000758	Processed data (joint call) of JGAD000220 (WGS for 1,026 individuals) by JGA (aggregate VCF)	Controlled-access (Type I)	2023/08/31
JGAD000679	Processed data of JGAD000220 (reference panel) by JGA (data for the TogoImputation reference panel)	Controlled-access (Type I)	2023/09/01
JGAS000647	WGS for 1,007 individuals	Controlled-access (Type I)	2024/01/11
JGAS000698	WGS for 256 gastric cancer patients	Controlled-access (Type I)	2024/05/27
JGAS000703	SNP array for 269,000 patients (51 diseases) in BBJ 1st and 2nd cohort	Controlled-access (Type I)	2024/05/27
JGAS000699	WGS for 617 colorectal cancer patients	Controlled-access (Type I)	2024/05/27
JGAS000700	low-depth WGS for 2,162 diabetes patients	Controlled-access (Type I)	2024/05/27
JGAS000701	low-depth WGS for 2,067 gastric cancer patients	Controlled-access (Type I)	2024/05/27
JGAD000867	Processed data of JGAD000220 (reference panel) by JGA (data for the TogoImputation reference panel)	Controlled-access (Type I)	2024/09/19
JGAD000868	Processed data of JGAD000495 (reference panel) by JGA (data for the TogoImputation reference panel)	Controlled-access (Type I)	2024/09/19
JGAS000738	Imputation reference panel for 7,472 Japanese WGS and 2,504 1000 Genome Project data	Controlled-access (Type I)	2024/10/25

*Release Note

*Data users need to submit an application for Using NBDC Human Data to access the Controlled-access Data. Learn more

* When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention them in the acknowledgments. Learn more

MOLECULAR DATA

hum0014.v1.freq.v1


Participants/Materials	1666 MI patients and 3198 controls
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina Human610-Quad Beadchip
Genotype Call Methods (software)	GenCall software (GenomeStudio)
Filtering Methods	sample call rate ≥ 0.98, SNP call rate ≥ 0.99, HWE P ≥ 1 x 10^-6
Marker Number (after QC)	455,781 SNPs (hg18/NCBI36)
NBDC Dataset ID	hum0014.v1.freq.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	71.3 MB (xlsx)
Comments (Policies)	NBDC policy
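
The call-rate and Hardy-Weinberg thresholds in the Filtering Methods row above can be read as inclusion criteria. The following Python sketch shows one way such thresholds might be applied. The `SnpStats` record, the function names, and the example values are hypothetical and do not reflect the file format of this dataset or the software the authors used.

```python
from dataclasses import dataclass

# Thresholds taken from the Filtering Methods row (treated as inclusion criteria).
MIN_SAMPLE_CALL_RATE = 0.98
MIN_SNP_CALL_RATE = 0.99
MIN_HWE_P = 1e-6


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary; not a format used by the dataset."""
    snp_id: str
    call_rate: float
    hwe_p: float


def sample_passes_qc(sample_call_rate: float) -> bool:
    """Keep a sample only if its call rate reaches the threshold."""
    return sample_call_rate >= MIN_SAMPLE_CALL_RATE


def snp_passes_qc(snp: SnpStats) -> bool:
    """Keep a SNP only if both its call rate and HWE P value reach the thresholds."""
    return snp.call_rate >= MIN_SNP_CALL_RATE and snp.hwe_p >= MIN_HWE_P


# Made-up example records.
snps = [
    SnpStats("example_snp_1", 0.995, 0.31),   # passes both checks
    SnpStats("example_snp_2", 0.970, 0.50),   # fails the call-rate check
    SnpStats("example_snp_3", 0.999, 1e-8),   # fails the HWE check
]
kept = [s.snp_id for s in snps if snp_passes_qc(s)]
print(kept)  # ['example_snp_1']
```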

hum0014.v2.jsnp.934ctrl.v1


Participants/Materials	934 Japanese healthy individuals (JSNP)
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanHap550v3 Genotyping BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina HumanHap550v3 Genotyping BeadChip
Genotype Call Methods (software)	GenCall software (GenomeStudio)
Filtering Methods	sample call rate < 0.98, SNP call rate < 0.99, HWE P < 1 x 10^-6
Marker Number (after QC)	515,286 SNPs
NBDC Dataset ID	hum0014.v2.jsnp.934ctrl.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	32.2 MB (zip [xls])
Comments (Policies)	NBDC policy
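
The Filtering Methods rows on this page use a Hardy-Weinberg equilibrium (HWE) P value, but the page does not state which test produced it. As an illustration only, the sketch below computes a P value with the one-degree-of-freedom chi-square goodness-of-fit test. That choice is an assumption (exact tests are also widely used), and the function name and genotype counts are hypothetical.

```python
import math


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) P value for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


print(hwe_chi2_p(25, 50, 25))         # 1.0 (counts match HWE expectations exactly)
print(hwe_chi2_p(50, 0, 50) < 1e-6)   # True (no heterozygotes at all)
```

Assuming the thresholds in the Filtering Methods row are exclusion criteria, a marker like the second example would be removed by the HWE P < 1 x 10^-6 rule.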

35 Diseases (JSNP)


Participants/Materials	Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer); Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans); Cerebrovascular disorders (Brain infarction, Intracranial aneurysm); Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma); Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis); Eye diseases (Cataract, Glaucoma); Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease); about 190 patients in each disease set
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Perlegen Sciences [high-density oligonucleotide arrays]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	-
Genotype Call Methods (software)	-
Filtering Methods	-
Marker Number (after QC)	About 200,000 SNPs (b129)
NBDC Dataset ID	Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer) Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans) Cerebrovascular disorders (Brain infarction, Intracranial aneurysm) Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma) Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis) Eye diseases (Cataract, Glaucoma) Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease) (Click the disease names to download the file) Dictionary file
Comments (Policies)	NBDC policy

*Chromosomal position of each SNP is based on dbSNP build 129. If you need other mapping information, please contact us.

hum0014.v2.jsnp.182ec.v1


Participants/Materials	182 esophageal cancer patients (JSNP)
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanHap550v3 Genotyping BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina HumanHap550v3 Genotyping BeadChip
Genotype Call Methods (software)	GenCall software (GenomeStudio)
Filtering Methods	sample call rate < 0.98, SNP call rate < 0.99, HWE P < 1 x 10^-6
Marker Number (after QC)	503,734 SNPs
NBDC Dataset ID	hum0014.v2.jsnp.182ec.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	6.6 MB (zip [txt])
Comments (Policies)	NBDC policy

hum0014.v2.jsnp.92als.v1


Participants/Materials	92 ALS patients (JSNP)
Targets	large-scale case-control association study
Target Loci for Capture Methods	-
Platform	Hologic Japan [Invader]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Invader assay system (Third Wave Technologies)
Genotype Call Methods (software)	ABI PRISM SDS versions 2.0 - 2.2
Filtering Methods	SNP call rate ≥ 0.95, HWE P ≥1.0 x 10^-2
Marker Number (after QC)	48,939 SNPs
NBDC Dataset ID	hum0014.v2.jsnp.92als.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	3.2 MB (zip [txt])
Comments (Policies)	NBDC policy

hum0014.v3.T2DM-1.v1


Participants/Materials	9817 T2DM patients and 6763 controls (healthy individuals and patients with Intracranial aneurysm, Esophageal cancer, Uterine cancer, Pulmonary emphysema, or Glaucoma [without T2DM])
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [OmniExpressExome Beadchip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina OmniExpressExome Beadchip kit
Genotype Call Methods (software)	GenCall software (GenomeStudio)
Filtering Methods	sample call rate < 0.98, SNP call rate < 0.99, MAF < 0.01, HWE P < 1 x 10^-6 in control
Marker Number (after QC)	552,915 SNPs (hg19)
NBDC Dataset ID	hum0014.v3.T2DM-1.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	84.0 MB (xlsx)
Comments (Policies)	NBDC policy

hum0014.v3.T2DM-2.v1


Participants/Materials	5646 T2DM patients and 19,420 controls (patients with Colorectal cancer, Breast cancer, Prostate cancer, Lung cancer, Gastric cancer, Arteriosclerosis obliterans, Cardiac arrhythmias, Brain infarction, Myocardial infarction, Gallbladder cancer and Cholangiocarcinoma, Pancreatic cancer, Drug eruptions, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Liver cancer, Liver cirrhosis, Osteoporosis, or Uterine myoma [without T2DM])
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [Human610-Quad BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina Human610-Quad Beadchip kit
Genotype Call Methods (software)	GenCall software (GenomeStudio)
Filtering Methods	sample call rate < 0.98, SNP call rate < 0.99, MAF < 0.01, HWE P < 1 x 10^-6 in control
Marker Number (after QC)	479,088 SNPs (hg18)
NBDC Dataset ID	hum0014.v3.T2DM-2.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	72.6 MB (xlsx)
Comments (Policies)	NBDC policy

hum0014.v4.AD.v1


Participants/Materials	1472 AD patients and 7966 controls (healthy individuals and patients with Intracranial aneurysm, Esophageal cancer, Uterine cancer, Pulmonary emphysema, or Glaucoma [without AD and Bronchial asthma])
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress BeadChip
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)]
Association Analysis (software)	mach2dat [GWAS]
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 in the control samples; Imputation QC: HWE P < 1 x 10^-6 or MAF < 0.01 in the reference panel; differences of MAF between the GWAS dataset and the reference panel > 0.16
Marker Number (after QC)	About 7,700,000 SNPs (hg19)
NBDC Dataset ID	hum0014.v4.AD.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	ADGWAS_auto.txt (525 MB); ADGWAS_X_females.txt (17 MB); ADGWAS_X_males.txt (15 MB)
Comments (Policies)	NBDC policy
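
The imputation QC in the Filtering Methods row above combines three conditions, including a cap on the difference in minor allele frequency (MAF) between the GWAS dataset and the reference panel. A minimal Python sketch of that rule follows, assuming each listed condition is an exclusion criterion. The function and argument names are hypothetical.

```python
# Thresholds copied from the Imputation QC part of the Filtering Methods row.
HWE_P_MIN = 1e-6
REF_MAF_MIN = 0.01
MAX_MAF_DIFFERENCE = 0.16


def excluded_by_imputation_qc(ref_hwe_p: float, ref_maf: float, gwas_maf: float) -> bool:
    """Return True if a variant meets any of the three exclusion conditions."""
    fails_hwe = ref_hwe_p < HWE_P_MIN
    too_rare = ref_maf < REF_MAF_MIN
    maf_mismatch = abs(gwas_maf - ref_maf) > MAX_MAF_DIFFERENCE
    return fails_hwe or too_rare or maf_mismatch


# Made-up values for illustration.
print(excluded_by_imputation_qc(0.5, 0.20, 0.25))    # False: passes all three checks
print(excluded_by_imputation_qc(0.5, 0.005, 0.005))  # True: too rare in the reference panel
print(excluded_by_imputation_qc(0.5, 0.20, 0.40))    # True: MAF differs by more than 0.16
```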

JGAS000101 / hum0014.v5.AF.v1


Participants/Materials	8180 atrial fibrillation patients and 28,612 controls
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	DNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)]; GenCall software (GenomeStudio)
Association Analysis (software)	mach2dat [GWAS]
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 in the control samples; Imputation QC: HWE P < 1 x 10^-6 or MAF < 0.01 in the reference panel; differences of MAF between the GWAS dataset and the reference panel > 0.16; R square < 0.9
Marker Number (after QC)	About 5,000,000 SNVs
Phenotype Data	Gender, Age
NBDC Dataset ID / Japanese Genotype-phenotype Archive Dataset ID	[GWAS stats] hum0014.v5.AF.v1 (Click the Dataset ID to download the file) Dictionary file [Individual datasets] Phenotype: JGAD000101 Genotype: JGAD000102
Total Data Volume	GWAS: 473 MB (txt); Individual phenotype-genotype data: 1 GB (txt)
Comments (Policies)	NBDC policy

JGAS000114 / hum0014.v6.158k.v1


Participants/Materials	182,505 individuals (158,284 individuals for BMI study)
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	DNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)]; GenCall software (GenomeStudio)
Association Analysis (software)	mach2qtl (v1.1.3)
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded.
Marker Number (after QC)	About 6,000,000 and 150,000 SNVs on autosomes and X-chromosome, respectively.
NBDC Dataset ID / Japanese Genotype-phenotype Archive Dataset ID	[GWAS] hum0014.v6.158k.v1 (Click the Dataset ID to download the file) Dictionary file [Individual datasets] Phenotype data (BMI): JGAD000124 Genotype data: JGAD000123 Detailed information on genotyping array Probe information (BLAST)
Total Data Volume	GWAS: 406 MB (zip); Phenotype data (BMI): 3.32 MB (txt.gz); Genotype data: 26.3 GB (csv.gz)
Comments (Policies)	NBDC policy

hum0014.v7.POAG.v1


Participants/Materials	3980 POAG patients (Male: 1,997, Female: 1,983) and 18,815 controls (Male: 7,817, Female: 10,998)
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac (ver. 0.1.1) [imputation (1000 genomes Phase I v3)]
Association Analysis (software)	mach2dat (ver. 1.0.19)
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. We also excluded variants with |beta| > 4 in the uploaded files.
Marker Number (after QC)	autosomes: 5,961,428 SNPs (hg19); male X-chromosome: 147,351 SNPs (hg19); female X-chromosome: 147,353 SNPs (hg19)
NBDC Dataset ID	hum0014.v7.POAG.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	113 MB (txt.zip)
Comments (Policies)	NBDC policy

JGAS000114 / hum0014.v8.58qt.v1


Participants/Materials	162,255 individuals for 58 quantitative traits
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	DNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)]; GenCall software (GenomeStudio)
Association Analysis (software)	mach2qtl (v1.1.3)
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded.
Marker Number (after QC)	5,961,600 and 147,353 SNVs on autosomes and X-chromosome, respectively.
NBDC Dataset ID / Japanese Genotype-phenotype Archive Dataset ID	Metabolic	Total cholesterol	JGAD000144
		High density lipoprotein cholesterol	JGAD000145
		Low density lipoprotein cholesterol	JGAD000146
		Triglyceride	JGAD000147
		Blood sugar	JGAD000148
		Hemoglobin A1c	JGAD000149
	Protein	Total protein	JGAD000150
		Albumin	JGAD000151
		Non-albumin protein	JGAD000152
		Albumin/globulin ratio	JGAD000153
	Kidney-related	Blood urea nitrogen	JGAD000154
		Serum creatinine	JGAD000155
		Estimated glomerular filtration rate	JGAD000156
		Uric acid	JGAD000157
	Electrolyte	Sodium	JGAD000158
		Potassium	JGAD000159
		Chlorine	JGAD000160
		Calcium	JGAD000161
		Phosphorus	JGAD000162
	Liver-related	Total bilirubin	JGAD000163
		Zinc sulfate turbidity test	JGAD000164
		Aspartate aminotransferase	JGAD000165
		Alanine aminotransferase	JGAD000166
		Alkaline phosphatase	JGAD000167
		Gamma-glutamyl transferase	JGAD000168
	Other biochemical	Activated partial thromboplastin time	JGAD000169
		Prothrombin time	JGAD000170
		Fibrinogen	JGAD000171
		Creatine kinase	JGAD000172
		Lactate dehydrogenase	JGAD000173
		C-reactive protein	JGAD000174
	Hematological	White blood cell count	JGAD000175
		Neutrophil count	JGAD000176
		Eosinophil count	JGAD000177
		Basophil count	JGAD000178
		Monocyte count	JGAD000179
		Lymphocyte count	JGAD000180
		Red blood cell count	JGAD000181
		Hemoglobin	JGAD000182
		Hematocrit	JGAD000183
		Mean corpuscular volume	JGAD000184
		Mean corpuscular hemoglobin	JGAD000185
		Mean corpuscular hemoglobin concentration	JGAD000186
		Platelet count	JGAD000187
	Blood pressure	Systolic blood pressure	JGAD000188
		Diastolic blood pressure	JGAD000189
		Mean arterial pressure	JGAD000190
		Pulse pressure	JGAD000191
	Echocardiographic	Interventricular septum thickness	JGAD000192
		Posterior wall thickness	JGAD000193
		Left ventricular internal dimension in diastole	JGAD000194
		Left ventricular internal dimension in systole	JGAD000195
		Left ventricular mass	JGAD000196
		Left ventricular mass index	JGAD000197
		Relative wall thickness	JGAD000198
		Fractional shortening	JGAD000199
		Ejection fraction	JGAD000200
		E/A ratio	JGAD000201
	(Click the trait names to download the gwas summary statistics) Dictionary file
Total Data Volume	GWAS: 123 MB (zip) on average; Phenotype data (58 quantitative traits): 2.4 MB (txt.gz) on average
Comments (Policies)	NBDC policy

hum0014.v9.Men.v1 / hum0014.v9.MP.v1


Participants/Materials	67,029 females with information on age at menarche; 43,861 females with information on age at menopause
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)]; GenCall software (GenomeStudio)
Association Analysis (software)	mach2qtl (v1.1.3)
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. We also excluded variants with |beta| > 4 in the uploaded files.
Marker Number (after QC)	9,296,729 SNPs (hg19)
NBDC Dataset ID	menarche: hum0014.v9.Men.v1; menopause: hum0014.v9.MP.v1 (Click the Dataset ID to download the file); menarche: Dictionary file; menopause: Dictionary file
Total Data Volume	menarche: 181 MB (txt.gz); menopause: 186 MB (txt.gz)
Comments (Policies)	NBDC policy

JGAS000114 (JGAD000220 / JGAD000410 / JGAD000690 / JGAD000758)


Participants/Materials	1,026 individuals
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	160 bp
QC	Data with bad base quality and high %GC content were removed. Alignment: data matching any of the following conditions were removed: low mapping rate; different insert size; gender information mismatch between meta-data and genotype data; suspected sex chromosome aberration. Genotyping: GATK’s best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR): DP/GQ (DP < 5, GQ < 20, DP > 60, GQ < 95); heterozygosity (F >= 0.05); Hardy-Weinberg equilibrium (p < 10^-6); repeat & low complexity. Principal Component Analysis (PCA): PCA was performed with individuals included in the 1000 Genomes Project, and outliers from the Japanese cluster were removed. After these filtering steps, variants located in the regions listed as the HighConfidenceRegion (Genome-In-A-Bottle project) were flagged.
Deduplication	Picard 2.10.6
Calibration for re-alignment and base quality	GATK 3.7
Mapping Methods	BWA mem 0.7.12
Mapping Quality	Reads with MAPQ<20 were excluded at variant calling with GATK 3.7 HaplotypeCaller
Reference Genome Sequence	GRCh37/hg19 (hs37d5)
Coverage (Depth)	31.8x
Detecting Methods for Variation	GATK 3.7 HaplotypeCaller
SNV Numbers (after QC)	76,768,387 (Autosomal Chromosomes) 2,898,518 (X Chromosome)
INDEL Numbers (after QC)	10,202,908 (Autosomal Chromosomes) 410,435 (X Chromosome)
Japanese Genotype-phenotype Archive Dataset ID	JGAD000220 (fastq); JGAD000410 (bam, vcf): Whole genome sequencing analyzed data included in the JGAD000117 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical Alliance Japan (GEM Japan, GEM-J). Learn more
Dataset ID of the Processed data by JGA	JGAD000690 JGAD000758 (joint call) The way to Process
Total Data Volume	JGAD000220: 73 TB (fastq) JGAD000410: 49 TB (bam, vcf) JGAD000690: 52.1 TB (bam, bai, vcf, document) JGAD000758: 203.8 GB (vcf_aggregate, tabix)
Comments (Policies)	NBDC policy

* Summarized data is available at JENGER site.

JGAS000140


Participants/Materials	7,104 breast cancer patients and 23,731 controls
Targets	Target Capture
Target Loci for Capture Methods	11 hereditary breast cancer genes (ATM, BRCA1, BRCA2, CDH1, CHEK2, NBN, NF1, PALB2, PTEN, STK11, TP53)
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) ^*1
Fragmentation Methods	-
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID	JGAD000209
Total Data Volume	1 TB (fastq)
Comments (Policies)	NBDC policy

*1 Hum Mol Genet. 25:5027-5034 (2016)

hum0014.v12.T2DMwN.v1


Participants/Materials	[GWAS-1] 2,380 T2DM patients with diabetic nephropathy and 5,234 T2DM patients without diabetic nephropathy; [GWAS-2] 429 T2DM patients with diabetic nephropathy and 358 T2DM patients without diabetic nephropathy
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [OmniExpressExome Beadchip, Human610-Quad BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina OmniExpressExome Beadchip kit; Illumina Human610-Quad Beadchip kit
Genotype Call Methods (software)	MACH and Minimac (1000 Genomes phased JPT, CHB and Han Chinese South data n = 275, March 2012); GenCall software (GenomeStudio)
Association Analysis (software)	mach2dat
Filtering Methods	sample call rate < 0.98, SNV call rate < 0.99, MAF < 0.1%, HWE P < 1 x 10^-6 in control
Marker Number (after QC)	7,521,072 SNPs (hg19)
NBDC Dataset ID	hum0014.v12.T2DMwN.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	310 MB (csv.zip)
Comments (Policies)	NBDC policy
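
This dataset is listed in the release table as a meta-analysis of 2 GWASs, but the page does not name the method used to combine them. As an illustration only, the Python sketch below shows a fixed-effect inverse-variance-weighted combination, which is a common choice and an assumption here, not the authors' confirmed procedure. The function name and the example effect sizes are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-study effect sizes with inverse-variance weights (assumed method)."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]  # weight = 1 / variance
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se


# Made-up log odds ratios and standard errors for one variant in two studies.
beta, se = fixed_effect_meta([0.10, 0.30], [0.10, 0.10])
print(round(beta, 3), round(se, 4))
```

With equal standard errors the combined estimate is the simple mean of the two effects, and its standard error shrinks by a factor of the square root of two.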

hum0014.v13.T2DMmeta.v1


Participants/Materials	[GWAS-1] 9,804 T2DM patients (ICD-10: E11) and 6,728 controls; [GWAS-2] 5,639 T2DM patients (ICD-10: E11) and 19,407 controls; [GWAS-3] 18,688 T2DM patients (ICD-10: E11) and 121,950 controls; [GWAS-4] 2,483 T2DM patients (ICD-10: E11) and 7,065 controls
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome, Human610-Quad BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome, Human610-Quad BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase 3)]; GenCall software (GenomeStudio)
Association Analysis (software)	mach2dat (v1.0.24)
Filtering Methods	Genotyping QC: exclusion criteria of GWAS1, GWAS3, GWAS4: (i) hetero count < 5, (ii) HWE P < 1.0 × 10^-6 on each chip, (iii) genotype concordance rate < 0.99 with in-house WGS data, (iv) SNV call rate < 0.99; exclusion criteria of GWAS2: (i) SNV call rate < 0.99, (ii) MAF < 0.01, (iii) differential missingness P < 1.0 × 10^-6, (iv) HWE P < 1.0 × 10^-6. Imputation QC: HWE P < 1 × 10^-6 or MAF < 0.01 in the reference panel; imputation quality (Rsq) < 0.3 in more than two GWAS
Marker Number (after QC)	12,557,761 SNPs (hg19)
NBDC Dataset ID	hum0014.v13.T2DMmeta.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	257 MB (txt)
Comments (Policies)	NBDC policy

hum0014.v14.smok.v1


Participants/Materials	165,436 individuals whose smoking status is available
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)] GenCall software (GenomeStudio)
Association Analysis (software)	BOLT-LMM (v2.2) ProbABEL(v0.4.5; for X chromosome)
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 and MAF < 0.01 were excluded.
Marker Number (after QC)	autosomes: 5,961,480 SNVs (hg19) male X-chromosome (Age of smoking initiation): 163,412 SNVs (hg19) female X-chromosome (Age of smoking initiation): 146,130 SNVs (hg19) male X-chromosome (Cigarettes per day): 166,111 SNVs (hg19) female X-chromosome (Cigarettes per day): 146,114 SNVs (hg19) male X-chromosome (Smoking initiation [Ever vs never smokers]): 166,138 SNVs (hg19) female X-chromosome (Smoking initiation [Ever vs never smokers]): 146,146 SNVs (hg19) male X-chromosome (Smoking cessation [Former vs current smokers]): 166,142 SNVs (hg19) female X-chromosome (Smoking cessation [Former vs current smokers]): 146,118 SNVs (hg19)
NBDC Dataset ID	hum0014.v14.asi.v1.zip (Age of smoking initiation) hum0014.v14.cpd.v1.zip (Cigarettes per day) hum0014.v14.ens.v1.zip (Smoking initiation [Ever vs never smokers]) hum0014.v14.fcs.v1.zip (Smoking cessation [Former vs current smokers]) (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	1.9 GB (txt.gz)
Comments (Policies)	NBDC policy

JGAS000114 reference panel


Participants/Materials	- WGS data (JGAD000220) of the biobank Japan project (N=1,037) - WGS data of 1KGP p3v5 ALL (N=2,504) (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/)
Targets	a reference panel from WGS data (variants on autosomal chromosomes and X-chromosome)
Target Loci for Capture Methods	-
QC*	We set exclusion criteria for genotypes as follows: (1) DP < 5, (2) GQ < 20, or (3) DP > 60 and GQ < 95, and regarded these genotypes as missing. Variants with call rates < 90% were excluded before variant quality score recalibration (VQSR). After VQSR, we excluded variants located in low-complexity regions (LCR), as defined by mdust software. Finally, we used BEAGLE to impute missing genotypes.
Deduplication*	Picard (version 1.106)
Calibration for re-alignment and base quality*	GATK (ver.3.2-2)
Mapping Methods*	BWA-MEM (version 0.7.5a)
Mapping Quality*	MAPQ < 20 were excluded (HaplotypeCaller)
Reference Genome Sequence*	GRCh37/hg19, hs37d5
Coverage (Depth)*	aimed at 30x depth
Detecting Methods for Variation*	GATK HaplotypeCaller (version 3.2-2)
Method for merging vcf files	autosomal chromosomes: Impute2 X-chromosome: Beagle (male), Impute2 (female)
Variant Numbers in reference panel	61,608,817 variants (autosomal chromosomes: 59,387,070; X-chromosome: 2,221,747)
Japanese Genotype-phenotype Archive Dataset ID	JGAD000220
Dataset ID of the Processed data by JGA	JGAD000679 The way to Process
Total Data Volume	about 15 GB (vcf.gz)
Comments (Policies)	NBDC policy

* These processes were performed only for biobank Japan project data.
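
The genotype-level criteria in the QC row above can be written as a short sketch. It is an illustration only, not the pipeline that was actually run; the function names `is_missing`, `call_rate` and `passes_call_rate`, and the representation of a genotype as a (DP, GQ) pair, are assumptions made for this example.

```python
from typing import Sequence, Tuple


def is_missing(dp: int, gq: int) -> bool:
    """True if a genotype is regarded as missing under the stated criteria:
    (1) DP < 5, (2) GQ < 20, or (3) DP > 60 and GQ < 95."""
    return dp < 5 or gq < 20 or (dp > 60 and gq < 95)


def call_rate(genotypes: Sequence[Tuple[int, int]]) -> float:
    """Fraction of (DP, GQ) genotypes at one variant that are not set to missing."""
    kept = sum(1 for dp, gq in genotypes if not is_missing(dp, gq))
    return kept / len(genotypes)


def passes_call_rate(genotypes: Sequence[Tuple[int, int]],
                     threshold: float = 0.90) -> bool:
    """Variants with a call rate below the threshold (90%) are excluded."""
    return call_rate(genotypes) >= threshold
```

For example, a genotype with DP = 61 and GQ = 94 is set to missing by criterion (3), while one with DP = 61 and GQ = 95 is kept.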

hum0014.v15.ht.v1


Participants/Materials	159,095 individuals (Male: 86,257, Female: 72,838)
Targets	genome wide variants
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	Minimac3 [imputation reference panel using WGS data of the biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504)]
Association Analysis (software)	BOLT-LMM (ver2.2), mach2qtl
Filtering Methods	Sample QCs: Exclusion criteria: 1) call rate < 98%, 2) closely related samples (PI_HAT > 0.175), and 3) outlier from Japanese cluster determined by PCA using GCTA. QC after imputation: Variants with imputation quality of Rsq < 0.3 were excluded.
Marker Number (after QC)	autosomes: 27,211,524 variants (hg19) male X-chromosome: 684,533 variants (hg19) female X-chromosome: 684,533 variants (hg19)
NBDC Dataset ID	hum0014.v15.ht.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	about 663 MB (txt.gz)
Comments (Policies)	NBDC policy

JGAS000203


Participants/Materials	7,636 prostate cancer patients (ICD10: C61) and 12,366 controls
Targets	Target Capture
Target Loci for Capture Methods	8 hereditary prostate cancer genes (ATM, BRCA1, BRCA2, BRIP1, CHEK2, HOXB13, NBN, PALB2)
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) ^*1
Fragmentation Methods	-
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID	JGAD000288
Total Data Volume	2.2 TB (fastq)
Comments (Policies)	NBDC policy

hum0014.v17 / hum0014.v18 / hum0014.v21


Participants/Materials	42 diseases (ICD10 code): Arrhythmia (I499), Bronchial asthma (J459), Atopic dermatitis (L209), Gallbladder/Cholangiocarcinoma (C23, C240), Cataract (H269), Cerebral aneurysm (I671), Cervical cancer (C539), Chronic hepatitis B (B181), Chronic hepatitis C (B182), Chronic obstructive pulmonary disease (J449), Liver cirrhosis (K746), Colorectal cancer (C189, C20), Heart failure (I509, I500), Drug eruption (L270), Uterine cancer (C549), Endometriosis (N809), Epilepsy (G409), Esophageal cancer (C159), Gastric cancer (C169), Glaucoma (H409), Graves' disease (E050), Hematopoietic tumor (C81-96), Liver cancer (C220), Interstitial lung disease/Pulmonary fibrosis (J849, J841), Cerebral infarction (I639), Keloid (L910), Lung cancer (C349), Nephrotic syndrome (N049), Osteoporosis (M8199), Ovarian cancer (C56), Pancreas cancer (C259), Periodontitis (K054), Peripheral artery disease (I709), Hay fever (J301), Prostate cancer (C61), Pulmonary tuberculosis (A169), Rheumatoid arthritis (M0690), Diabetes mellitus (E14), Urolithiasis (N209), Uterine fibroids (D259), Breast cancer (C509), Coronary artery disease (I200, I209, I219)
Targets	genome wide variants
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	Minimac3 [imputation (1000 genomes Phase 3 v5)] GenCall software (GenomeStudio)
Association Analysis (software)	SAIGE(v0.29.4.2)
Filtering Methods	QC after imputation: Exclusion criteria: Variants with imputation quality of Rsq < 0.7
Marker Number (after QC)	autosomes: 8,712,794 variants (hg19) X-chromosome: 207,198 variants (hg19)
NBDC Dataset ID	Arrhythmia	hum0014.v17.AR.v1
	Bronchial asthma	hum0014.v17.BA.v1
	Atopic dermatitis*	hum0014.v17.AD.v1
	Gallbladder/Cholangiocarcinoma	hum0014.v17.GCc.v1
	Cataract	hum0014.v17.Cat.v1
	Cerebral aneurysm	hum0014.v17.CA.v1
	Cervical cancer	hum0014.v17.CeC.v1
	Chronic hepatitis B	hum0014.v17.CHB.v1
	Chronic hepatitis C	hum0014.v17.CHC.v1
	Chronic obstructive pulmonary disease	hum0014.v17.COPD.v1
	Liver cirrhosis	hum0014.v17.Cir.v1
	Colorectal cancer	hum0014.v17.CC.v1
	Heart failure*	hum0014.v17.HF.v1
	Drug eruption	hum0014.v17.DE.v1
	Uterine cancer	hum0014.v17.UC.v1
	Endometriosis	hum0014.v17.EM.v1
	Epilepsy	hum0014.v17.Ep.v1
	Esophageal cancer	hum0014.v17.EC.v1
	Gastric cancer	hum0014.v17.GC.v1
	Glaucoma*	hum0014.v17.Gla.v1
	Graves' disease	hum0014.v17.GD.v1
	Hematopoietic tumor	hum0014.v17.HT.v1
	Liver cancer	hum0014.v17.LiC.v1
	Interstitial lung disease/Pulmonary fibrosis	hum0014.v17.IP.v1
	Cerebral infarction	hum0014.v17.CI.v1
	Keloid	hum0014.v17.Kel.v1
	Lung cancer	hum0014.v17.LuC.v1
	Nephrotic syndrome	hum0014.v17.NS.v1
	Osteoporosis	hum0014.v17.OP.v1
	Ovarian cancer	hum0014.v17.OC.v1
	Pancreas cancer	hum0014.v17.PaC.v1
	Periodontitis	hum0014.v17.PD.v1
	Peripheral artery disease	hum0014.v17.PAD.v1
	Hay fever	hum0014.v17.Hay.v1
	Prostate cancer	hum0014.v17.PrC.v1
	Pulmonary tuberculosis	hum0014.v17.PT.v1
	Rheumatoid arthritis	hum0014.v17.RA.v1
	Diabetes mellitus*	hum0014.v17.DM.v1
	Urolithiasis	hum0014.v17.Uro.v1
	Uterine fibroids	hum0014.v17.UF.v1
	Breast cancer	hum0014.v18.BC.v1
	Coronary artery disease	hum0014.v21.CAD.v1
	(Click the Dataset ID to download the file) Dictionary file Sample size file
Total Data Volume	autosomes: about 0.8-1.3 GB each X-chromosome: about 20-30 MB each
Comments (Policies)	NBDC policy

* Data of 4 diseases were partially overlapped with those of previous releases (Glaucoma [hum0014.v7.POAG.v1], Atrial fibrillation [hum0014.v5.AF.v1], Atopic dermatitis [hum0014.v4.AD.v1], and Diabetes mellitus [hum0014.v3.T2DM-2.v1]).

hum0014.v19


Participants/Materials	165,084 individuals whose dietary habits status is available (13 dietary traits)
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	GenCall software (GenomeStudio) MACH minimac (v.0.1.1) [imputation (1000 genomes Phase I v3)]
Association Analysis (software)	BOLT-LMM (v2.2) for autosomes ProbABEL (v0.4.5) for X chromosome
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, MAF < 0.005 QC for reference panel: Variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded from the reference panel. QC after imputation: Variants with imputation quality of Rsq < 0.7 and MAF < 0.01 were excluded.
Marker Number (after QC)	autosomes: 5,961,480 variants (hg19) X-chromosome: 148,568 variants for female, 170,117 variants for male (hg19)
NBDC Dataset ID	Ever versus never drinker	hum0014.v19.drink.v1.zip
	Drinks per week	hum0014.v19.dpw.v1.zip
	Coffee consumption	hum0014.v19.cafe.v1.zip
	Tea consumption	hum0014.v19.tea.v1.zip
	Milk consumption	hum0014.v19.milk.v1.zip
	Yogurt consumption	hum0014.v19.ygt.v1.zip
	Cheese consumption	hum0014.v19.cheese.v1.zip
	Natto consumption	hum0014.v19.natto.v1.zip
	Tofu consumption	hum0014.v19.tofu.v1.zip
	Fish consumption	hum0014.v19.fish.v1.zip
	Small fish consumption	hum0014.v19.sfish.v1.zip
	Vegetable consumption	hum0014.v19.vege.v1.zip
	Meat consumption	hum0014.v19.meat.v1.zip
	(Click the Dataset ID to download the file) Dictionary file
Total Data Volume	6.3 GB (txt.zip)
Comments (Policies)	NBDC policy

hum0014.v20.cad.v1


Participants/Materials	25,892 coronary artery disease patients (ICD10: I20-25) and 142,336 controls
Targets	genome wide variants
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	GenCall software (GenomeStudio) minimac3 (BBJ-CAD reference panel)
Association Analysis (software)	PLINK2
Filtering Methods	QC after imputation: Variants with imputation quality of Rsq < 0.3 and MAF < 0.0002 were excluded.
Marker Number (after QC)	autosomes: 19,707,525 variants (hg19)
NBDC Dataset ID	hum0014.v20.gwas.v1 (summary statistics) hum0014.v20.prs.v1 (polygenic risk score) (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	about 413 MB (txt.gz)
Comments (Policies)	NBDC policy

JGAS000293 (Target Capture)


Participants/Materials	11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 to 2007
Targets	Target Capture
Target Loci for Capture Methods	23 genes related to clonal hematopoiesis ASXL1, CBL, CEBPA, DDX41, DNMT3A, ETV6, EZH2, GATA2, GNAS, GNB1, IDH1, IDH2, JAK2, KRAS, MYD88, NRAS, PPM1D, RUNX1, SF3B1, SRSF2, TET2, TP53, U2AF1
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	Library was constructed as described in Momozawa, Y., et al. Low-frequency coding variants in CETP and CFB are associated with susceptibility of exudative age-related macular degeneration in the Japanese population. Hum Mol Genet 25, 5027-5034 (2016).
Fragmentation Methods	-
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp
QC	We selected samples in which a depth of ≥20 was achieved in ≥98% of the target regions.
Mapping Methods	bwa
Mapping Quality	≥40
Reference Genome Sequence	hg19
Coverage (Depth)	x800
Detecting Methods for Variation	Genomon pipeline
Filtering Methods	Genomon pipeline
Japanese Genotype-phenotype Archive Dataset ID	JGAD000399
Total Data Volume	5.3 MB (txt)
Comments (Policies)	NBDC policy
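
The sample-selection rule in the QC row above (depth ≥20 in ≥98% of regions) can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name `sample_passes` is hypothetical, and it assumes each target region is summarised by a single depth value, which the record does not specify.

```python
from typing import Sequence


def sample_passes(region_depths: Sequence[float],
                  min_depth: float = 20,
                  min_fraction: float = 0.98) -> bool:
    """True if at least `min_fraction` of regions reach `min_depth`.

    `region_depths` holds one depth value per target region (assumed input).
    """
    if not region_depths:
        return False  # no regions measured: treat the sample as failing
    covered = sum(1 for depth in region_depths if depth >= min_depth)
    return covered / len(region_depths) >= min_fraction
```

Under these assumptions, a sample with 98 of 100 regions at depth 20 or more is selected, and one with 97 of 100 is not.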

JGAS000293 (SNP array)


Participants/Materials	11,234 subjects extracted from approximately 200,000 subjects registered in Biobank Japan between fiscal years 2003 to 2007
Targets	SNP array
Target Loci for Capture Methods	-
Platform	Illumina [human OmniExpress, human OmniExpressExome]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	Illumina Infinium OmniExpress, Infinium OmniExpressExome v.1.0, or v.1.2
Genotype Call Methods (software)	GenCall software (GenomeStudio)
Algorithm for detecting chromosome abnormality (software)	Haplotype-based detection of allelic imbalances. Terao, C., et al. Chromosomal alterations among age-related haematopoietic clones in Japan. Nature (2020).
Filtering Methods	Only SNPs examined in all three versions of the array were retained
Marker Numbers (after QC)	515,355
Japanese Genotype-phenotype Archive Dataset ID	JGAD000400
Total Data Volume	5.3 MB (csv)
Comments (Policies)	NBDC policy

JGAS000327 / JGAS000346 / JGAS000414 / JGAS000347 / JGAS000592


Participants/Materials	1,005 pancreatic cancer patients (ICD10: C25) 12,503 colorectal cancer patients (ICD10: C18, C19, C20) 740 renal cell cancer patients (ICD10: C64) 1,982 lymphoma patients (ICD10: C81, C82, C83, C84, C85, C86, C88, C91) 10,366 gastric cancer patients (ICD10: C16) 23,705 + 5,996 + 37,592 controls
Targets	Target Capture
Target Loci for Capture Methods	27 cancer-predisposing genes (APC, ATM, BARD1, BMPR1A, BRCA1, BRCA2, BRIP1, CDK4, CDKN2A, CDH1, CHEK2, EPCAM, HOXB13, NBN, NF1, MLH1, MSH2, MSH6, MUTYH, PALB2, PMS2, PTEN, RAD51C, RAD51D, SMAD4, STK11, TP53)
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) ^*1
Fragmentation Methods	-
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID	pancreatic cancer: JGAD000438 colorectal cancer: JGAD000458 23,705 controls: JGAD000459 renal cell cancer and 5,996 controls: JGAD000531 lymphoma: JGAD000460 gastric cancer: JGAD000720 37,592 controls: JGAD000721
Total Data Volume	JGAD000438: 78 GB (fastq) JGAD000458: 956 GB (fastq) JGAD000459: 1.9 TB (fastq) JGAD000531: 961.8 GB (fastq) JGAD000460: 126 GB (fastq) JGAD000720, JGAD000721: 3.4 TB (fastq)
Comments (Policies)	NBDC policy

JGAS000414


Participants/Materials	740 renal cell cancer patients (ICD10: C64) 5,996 controls
Targets	Target Capture
Target Loci for Capture Methods	13 renal cell carcinoma-related genes (VHL, BAP1, FH, FLCN, MET, TSC1, TSC2, MITF, SDHA, SDHB, SDHC, SDHD, CDC73)
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) ^*1
Fragmentation Methods	-
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp x 2
Japanese Genotype-phenotype Archive Dataset ID	JGAD000531
Total Data Volume	JGAD000531: 961.8 GB (fastq)
Comments (Policies)	NBDC policy

JGAS000381


Participants/Materials	1,765 myocardial infarction patients (ICD10: I21) and 199 dementia patients (ICD10: F00-03)
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq X Five]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq DNA PCR-Free Prep kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	151 bp
QC	Data with bad base quality and high %GC content were removed. Alignment: Data matching any of the following conditions were removed. - Low mapping rate - Different insert size - Gender information mismatch between meta-data and genotype data - Suspected sex chromosome aberration Genotyping: GATK’s best practices include a variant filtering step following Variant Quality Score Recalibration (VQSR) - DP/GQ (DP < 5, GQ < 20, DP > 60, GQ < 95) - Heterozygosity (F >= 0.05) - Hardy-Weinberg equilibrium (p < 10^-6) - Repeat & Low Complexity Principal Component Analysis (PCA): PCA was performed with individuals included in the 1000 genomes project and outliers from Japanese cluster were removed. After these filtering steps, variants located in the regions listed as the HighConfidenceRegion (Genome-In-A-Bottle project) were flagged.
Deduplication	Picard 2.10.6
Calibration for re-alignment and base quality	GATK 3.7
Mapping Methods	BWA mem 0.7.12
Mapping Quality	Reads with MAPQ<20 were excluded at variant calling with GATK 3.7 HaplotypeCaller
Reference Genome Sequence	GRCh37/hg19 (hs37d5)
Coverage (Depth)	myocardial infarction: 15.0x, dementia: 30.0x
Detecting Methods for Variation	GATK 3.7 HaplotypeCaller
SNV Numbers (after QC)	76,768,387 (Autosomal Chromosomes) 2,898,518 (X Chromosome)
INDEL Numbers (after QC)	10,202,908 (Autosomal Chromosomes) 410,435 (X Chromosome)
Japanese Genotype-phenotype Archive Dataset ID	JGAD000495 (fastq) JGAD000496 (bam, vcf): Whole genome sequencing analyzed data included in the JGAD000117 were mapped to the GRCh37 reference genome sequence, and variant detection was carried out using the GATK (Genome Analysis Toolkit) standards. This project is an initiative of the GEnome Medical alliance Japan (GEM Japan, GEM-J). Learn more..
Total Data Volume	188.4 TB (fastq, bam, vcf)
Comments (Policies)	NBDC policy

hum0014.v27.surv.v1


Participants/Materials	137,693 individuals from BBJ 1st cohort
Targets	genome wide variants
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	minimac [imputation (1000 genomes Phase I v3)] GenCall software (GenomeStudio)
Association Analysis (software)	mach2qtl (v1.1.3) SPACox
Filtering Methods	Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6, MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded.
Marker Number (after QC)	6,108,833 variants (hg19)
NBDC Dataset ID	hum0014.v27.surv.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	789 MB (txt.gz)
Comments (Policies)	NBDC policy

hum0014.v28.MEs.v1


Participants/Materials	WGS data of 4,880 individuals from BBJ 1st cohort - WGS data (JGAD000220) of the biobank Japan project (N=1,037) - WGS data (JGAD000495) of 1,765 myocardial infarction patients and 199 dementia patients - WGS data (AGDD_000005) of 225 gastric cancer patients - 1,007 individuals from Asian Genome Project - 617 colorectal cancer patients - One individual excluded from AGDD_000005 by QC - 10 individuals excluded from JGAD000220 by QC
Targets	mobile element variations
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq X/2500]
Library Source	gDNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq DNA PCR-Free Sample Prep Kit, TruSeq Nano DNA HT Sample Prep Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	151 bp (HiSeq X), 126 bp (HiSeq 2500)
QC	-
Mapping Methods	BWA-MEM
Mapping Quality	-
Reference Genome Sequence	GRCh37 (hs37d5)
Coverage (Depth)	≥15× (≥25× for 1,235 individuals)
Detecting Methods for mobile element	MEGAnE ^*2
Mobile element Number	24,933 for 4,880 individuals 10,997 for 1,235 individuals
NBDC Dataset ID	hum0014.v28.MEs.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	1.1 MB (txt.gz)
Comments (Policies)	NBDC policy

*2 doi: 10.1101/2022.03.25.485726

hum0014.v29.AF.v1


Participants/Materials	77,690 atrial fibrillation patients and 1,167,040 controls BBJ: 9,826 atrial fibrillation patients and 140,446 controls European: 60,620 atrial fibrillation patients and 970,216 controls FinnGen: 7,244 atrial fibrillation patients and 56,378 controls
Targets	genome wide SNVs
Target Loci for Capture Methods	-
Platform	Illumina [HumanOmniExpress, HumanExome, OmniExpressExome BeadChip]
Source	DNA extracted from peripheral blood cells European GWAS: http://csg.sph.umich.edu/willer/public/afib2018 FinnGen GWAS: https://www.finngen.fi/en
Cell Lines	-
Reagents (Kit, Version)	HumanOmniExpress, HumanExome, OmniExpressExome BeadChip kit
Genotype Call Methods (software)	GenCall software (GenomeStudio), minimac [imputation (1000 genomes Phase I v3)]
Association Analysis (software)	PLINK2
Filtering Methods	BBJ GWAS: variants with imputation quality (Rsq) < 0.3 or MAF < 0.001 were excluded meta-analysis: variants with MAF < 1% were excluded
Calculation Methods for Polygenic Risk Score	pruning and thresholding method
Meta Analysis Methods	MANTRA, METAL
Marker Number (after QC)	BBJ GWAS: 16,817,144 SNPs meta-analysis: 5,158,449 SNPs
NBDC Dataset ID	hum0014.v29.AF.v1 (Click the Dataset ID to download the file) Dictionary file
Total Data Volume	summary statistics of BBJ: 424 MB (txt) summary statistics of meta analysis: 78 MB (txt) polygenic risk score: 59 KB (txt)
Comments (Policies)	NBDC policy
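
As a quick consistency check, the per-cohort case and control counts in the Participants/Materials row above can be summed and compared with the stated totals. The dictionary layout and variable names below are illustrative assumptions; the numbers are taken directly from the table.

```python
# (cases, controls) for each cohort listed in Participants/Materials
cohorts = {
    "BBJ": (9_826, 140_446),
    "European": (60_620, 970_216),
    "FinnGen": (7_244, 56_378),
}

total_cases = sum(cases for cases, _ in cohorts.values())
total_controls = sum(controls for _, controls in cohorts.values())

# Totals stated for the cross-ancestry meta-analysis
print(total_cases == 77_690, total_controls == 1_167_040)
```

Both comparisons hold, so the three cohorts account for the full meta-analysis sample.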

JGAS000647


Participants/Materials	1,007 individuals from BBJ 1st cohort
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq X Five]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	150 bp
QC	- Autosomal Chromosomes, X PAR, X NonPAR female - Genotypes with DP < 2 or GQ < 20 were set to missing - Variants with call rate < 90% were excluded - X NonPAR male, Y - Genotypes with DP < 1 or GQ < 20 were set to missing - Variants with call rate < 90% were excluded - X NonPAR HWE_P of female
Deduplication	Picard 2.10.10
Calibration for re-alignment and base quality	GATK 3.8
Mapping Methods	BWA-MEM (version 0.7.13)
Mapping Quality	Reads with MAPQ<20 were excluded at variant calling with GATK 3.8 HaplotypeCaller
Reference Genome Sequence	hs37d5
Coverage (Depth)	19.93455
Detecting Methods for Variation	GATK HaplotypeCaller (version 3.8)
SNV Numbers (after QC)	Autosomal Chromosomes: 71,643,487 X PAR: 82,997 X nonPAR: 2,618,495 Y Chromosome: 171,271 *Record numbers including AC=0
Japanese Genotype-phenotype Archive Dataset ID	JGAD000777
Total Data Volume	41.6 TB (fastq, vcf)
Comments (Policies)	NBDC policy
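
The QC row above applies a different depth threshold depending on the region and the sex of the sample. A minimal sketch of that rule is given below; the region labels (`"autosome"`, `"X_PAR"`, `"X_nonPAR"`, `"Y"`), the sex labels and the function names are hypothetical and chosen only for this example.

```python
def min_depth(region: str, sex: str) -> int:
    """Minimum DP below which a genotype is set to missing.

    X nonPAR in males and the Y chromosome use DP < 1;
    autosomes, X PAR and X nonPAR in females use DP < 2.
    """
    haploid = region == "Y" or (region == "X_nonPAR" and sex == "male")
    return 1 if haploid else 2


def set_missing(dp: int, gq: int, region: str, sex: str) -> bool:
    """True if the genotype is set to missing (DP below threshold or GQ < 20)."""
    return dp < min_depth(region, sex) or gq < 20
```

For instance, a genotype with DP = 1 and GQ = 30 is kept on X nonPAR in a male sample but set to missing on an autosome.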

JGAS000698


Participants/Materials	256 gastric cancer (ICD10: C16) patients registered in BBJ 1st cohort
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	2 x 125 bp
Japanese Genotype-phenotype Archive Dataset ID	JGAD000831
Total Data Volume	21.5 TB (fastq)
Comments (Policies)	NBDC policy

JGAS000703


Participants/Materials	269,000 patients (51 diseases) registered in BBJ 1st and 2nd cohorts
Targets	SNP array
Target Loci for Capture Methods	-
Platform	Illumina [HumanHap550, Human610-Quad v1, HumanExome-12 v1, Infinium OmniExpress-24, Infinium HumanOmniExpressExome-8, Infinium HumanOmniExpressExome-8 v1.2, Infinium HumanOmniExpressExome-8 v1.3, Infinium HumanOmniExpressExome-8 v1.4, Infinium HumanOmniExpressExome-8 v1.6]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Reagents (Kit, Version)	HumanHap550 kit, Human610-Quad kit v1, HumanExome-12 kit v1, Infinium OmniExpress-24 kit, Infinium HumanOmniExpressExome-8 kit, Infinium HumanOmniExpressExome-8 kit v1.2, Infinium HumanOmniExpressExome-8 kit v1.3, Infinium HumanOmniExpressExome-8 kit v1.4, Infinium HumanOmniExpressExome-8 kit v1.6
Genotype Call Methods (software)	-
Filtering Methods	-
Marker Numbers (after QC)	-
Japanese Genotype-phenotype Archive Dataset ID	JGAD000836
Total Data Volume	3.1 TB (idat)
Comments (Policies)	NBDC policy

JGAS000699


Participants/Materials	617 colorectal cancer (ICD10: C18, C19, C20) patients registered in BBJ 1st and 2nd cohorts
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	2 x 125 bp
Japanese Genotype-phenotype Archive Dataset ID	JGAD000832
Total Data Volume	25.5 TB (fastq)
Comments (Policies)	NBDC policy

JGAS000700 / JGAS000701


Participants/Materials	2,162 diabetes (ICD10: E11) patients registered in BBJ 1st cohort 2,067 gastric cancer (ICD10: C16) patients registered in BBJ 1st cohort
Targets	WGS
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	2 x 125 bp
Japanese Genotype-phenotype Archive Dataset ID	diabetes: JGAD000833 gastric cancer: JGAD000834
Total Data Volume	JGAD000833: 20.0 TB (fastq) JGAD000834: 18.7 TB (fastq)
Comments (Policies)	NBDC policy

TogoImputation reference panel (JGAD000867 /JGAD000868)


Participants/Materials	[JGAD000867] - WGS data (JGAD000220) of the biobank Japan project (N=1,026) [JGAD000868] - WGS data (JGAD000495) of the biobank Japan project (N=1,964)
Targets	a reference panel from WGS data (variants on autosomal chromosomes and X-chromosome)
Target Loci for Capture Methods	-
QC	Germline whole genome sequencing data were processed, and the aggregate VCF was calculated. Variants were then filtered based on the following conditions: (1) Variants that did not pass the VQSR filter were excluded (2) Multi-allelic sites were excluded (3) Variants with a call rate below 95% were excluded (4) Variants deviating from Hardy-Weinberg equilibrium (P < 1e-10) were excluded (5) Variants with a minor allele count (MAC) less than 2 were excluded
Deduplication	GATK MarkDuplicates (version 4.1.0.0)
Calibration for re-alignment and base quality	-
Mapping Methods	bwa mem (version 0.7.15)
Mapping Quality	-
Reference Genome Sequence	GRCh38
Coverage (Depth)	Mean ± Standard deviation JGAD000867: 28.70 ± 4.16 JGAD000868: 19.61 ± 5.41
Detecting Methods for Variation	GATK HaplotypeCaller -ERC GVCF (version 4.1.0.0) The ploidy for variant call was set as follows: Autosomes and pseudoautosomal regions (PARs): ploidy=2 Non-PARs on the X chromosome: ploidy=2 (female) and ploidy=1 (male) Non-PARs on the Y chromosome: ploidy=1 (male)
Method for phasing vcf files	After quality control, phasing was performed using Beagle program version 5.2 (21Apr21.304).
Variant Numbers in reference panel	JGAD000867: 17,167,510 variants (autosomal chromosomes: 16,677,000 variants, X-chromosome: 490,510 variants) JGAD000868: 21,596,248 variants (autosomal chromosomes: 21,157,732 variants, X-chromosome: 438,516 variants)
Japanese Genotype-phenotype Archive Dataset ID	JGAD000220 JGAD000495
Dataset ID of the Processed data by JGA	JGAD000867 JGAD000868 The way to Process
Total Data Volume	JGAD000867: 3.3 GB (vcf.gz) JGAD000868: 6.2 GB (vcf.gz)
Comments (Policies)	NBDC policy
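
The five variant-level filters listed in the QC row above can be summarised in a short sketch. It is not the code used to build the panel; the `Variant` record, its field names and `keep_variant` are assumptions introduced for illustration, and the thresholds are those stated in the table.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    vqsr_pass: bool      # passed the VQSR filter
    n_alt_alleles: int   # number of alternate alleles at the site
    call_rate: float     # fraction of samples with a called genotype
    hwe_p: float         # Hardy-Weinberg equilibrium P value
    mac: int             # minor allele count


def keep_variant(v: Variant) -> bool:
    """Apply conditions (1)-(5): a variant is kept only if it fails none of them."""
    return (
        v.vqsr_pass                # (1) VQSR failures excluded
        and v.n_alt_alleles == 1   # (2) multi-allelic sites excluded
        and v.call_rate >= 0.95    # (3) call rate below 95% excluded
        and v.hwe_p >= 1e-10       # (4) HWE P < 1e-10 excluded
        and v.mac >= 2             # (5) MAC less than 2 excluded
    )
```

A bi-allelic variant that passes VQSR with a call rate of 0.99, HWE P of 0.5 and MAC of 2 is kept; the same variant with MAC of 1 is dropped.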

JGAS000738


Participants/Materials	[WGS] Subjects registered in BioBank Japan: 7,472 individuals 15-30x depth: 3,256 individuals, 3x depth: 4,216 individuals including the following data: - 1,026 individuals (JGAD000220) - 256 gastric cancer patients registered in BBJ 1st cohort (JGAD000831) - 1,765 myocardial infarction patients and 199 dementia patients (JGAD000495) - 2,157 diabetes patients registered in BBJ 1st cohort (JGAD000833) - 2,059 gastric cancer patients registered in BBJ 1st cohort (JGAD000834) [reference panel] - WGS data of the BBJ 1st cohort (N=7,472) - WGS data of 1KGP p3v5 ALL (N=2,504) (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/)
Targets	WGS a reference panel from WGS data (variants on autosomal chromosomes)
Target Loci for Capture Methods	-
Platform	Illumina [HiSeq 2500/X Five]
Library Source	DNA extracted from peripheral blood cells
Cell Lines	-
Library Construction (kit name)	TruSeq Nano DNA Library Preparation Kit
Fragmentation Methods	Ultrasonic fragmentation
Spot Type	Paired-end
Read Length (without Barcodes, Adaptors, Primers, and Linkers)	125 bp x 2, 126 bp x 2, 151 bp x 2, 161 bp x 2
QC	We set exclusion criteria for genotypes sequenced at high depth (30x and 15x) as follows: (1) DP < 5, (2) GQ < 20, or (3) DP > 60 and GQ < 95, and regarded these genotypes as missing. Variants with call rates < 90% were excluded before variant quality score recalibration (VQSR). After VQSR, variants located in low-complexity regions (LCR), as defined by mdust software, were excluded in the high depth (30x) dataset. Finally, we used BEAGLE to impute missing genotypes.
Deduplication	15-30x depth: Picard 2.10.6 3x depth: Picard 1.106, 2.5.0
Calibration for re-alignment and base quality	GATK v.3.2-2
Mapping Methods	15-30x depth: BWA mem 0.7.12 3x depth: BWA mem 0.7.5a
Mapping Quality	15-30x depth: Reads with MAPQ<20 were excluded at variant calling with GATK 3.7 HaplotypeCaller 3x depth: Reads with MAPQ<20 were excluded at variant calling with GotCloud
Reference Genome Sequence	GRCh37
Coverage (Depth)	30x, 15x, 3x
Detecting Methods for Variation	15-30x depth: GATK 3.7 HaplotypeCaller 3x depth: GotCloud v1.17.5
SNV Numbers (after QC)	80,753,886
INDEL Numbers (after QC)	4,535,276
Method for merging vcf files	Impute2
Variant Numbers in reference panel	85,328,475 variants
Japanese Genotype-phenotype Archive Dataset ID	JGAD000873
Total Data Volume	40.5 GB (vcf)
Comments (Policies)	NBDC policy

DATA PROVIDER

Principal Investigator: Michiaki Kubo

Affiliation: RIKEN Center for Integrative Medical Sciences

Project / Group Name: Tailor-made Medical Treatment Program (Bio Bank Japan: BBJ)

URL: https://biobankjp.org/en

Funds / Grants (Research Project Number) :

Name	Title	Project Number
Ministry of Education, Culture, Sports, Science and Technology in Japan	Tailor-made Medical Treatment Program (the 3rd phase)	-
Tailor-Made Medical Treatment with the BioBank Japan Project (BBJ), Japan Agency for Medical Research and Development (AMED)	Generating large-scale data of genetic polymorphism to identify disease-related genes	17km0305002h0005
Project for Cancer Research and Therapeutic Evolution (P-CREATE), Japan Agency for Medical Research and Development (AMED)	Exploration of spatial and temporal diversity in genome and epigenome of hematological malignancies based on large-scale sequencing analyses.	JP19cm0106501
Core Research and Evolutional Science and Technology, Advanced Research & Development Programs for Medical Innovation, Japan Agency for Medical Research and Development (AMED-CREST)	Research on altered tissue functions caused by clonal expansion and remodeling of apparently normal tissues related to normal aging or exposure to chronic inflammation and other lifestyles	JP19gm1110011
KAKENHI Grant-in-Aid for Scientific Research (S)	Comprehensive studies on the molecular basis of early development and clonal evolution in cancer using advanced genomics.	19H05656
Program for Promoting Platform of Genomics based Drug Discovery, Project for Genome and Health Related Data, Japan Agency for Medical Research and Development (AMED)	Development of a large-scale database for effective drug treatment for breast, colorectal, and pancreas cancers	JP19kk0305010
KAKENHI Grant-in-Aid for Early-Career Scientists	Genome-wide association study integrating mobile genetic elements	22K15385
KAKENHI Grant-in-Aid for Scientific Research (B)	Elucidation of genetic factors that define myocardial vulnerability as a basis for the development of heart failure	21H02919
KAKENHI Grant-in-Aid for Scientific Research (S)	Genome immunity: elucidation of the antiviral activity of endogenous bornaviruses and their utilization as functional resources	20H05682
KAKENHI Grant-in-Aid for Scientific Research (B)	Integration and reactivation of human herpesvirus 6: association with diseases	21H02972
Biobank - Construction and Utilization biobank for genomic medicine REalization (B-Cure), Japan Agency for Medical Research and Development (AMED)	Management of the Japanese biobank	JP19km0605001
Practical Research Project for Life-Style related Diseases including Cardiovascular Diseases and Diabetes Mellitus, Japan Agency for Medical Research and Development (AMED)	Multi-layered and integrated research for prevention of atrial fibrillation and serious complications	JP22ek0210164
Biobank - Construction and Utilization biobank for genomic medicine REalization, Japan Agency for Medical Research and Development (AMED)	Understanding pathogenesis of atrial fibrillation and implementation of precision medicine by WGS and multi-omics	JP21tm0724601
Biobank - Construction and Utilization biobank for genomic medicine REalization, Japan Agency for Medical Research and Development (AMED)	Implementation of next-generation precision medicine for cardiovascular disease by multi-omics	JP20km0405209
Practical Research Project for Rare / Intractable Diseases, Japan Agency for Medical Research and Development (AMED)	Understanding pathology and implementation of precision medicine for intractable cardiovascular disease by multi-omics analysis	JP20ek0109487

PUBLICATIONS

	Title	DOI	Dataset ID
1	A genome-wide association study identifies PLCL2 and AP3D1-DOT1L-SF3A2 as new susceptibility loci for myocardial infarction in Japanese.	doi:10.1038/ejhg.2014.110	hum0014.v1.freq.v1
2	A functional variant in ZNF512B is associated with susceptibility to amyotrophic lateral sclerosis in Japanese.	doi:10.1093/hmg/ddr268	hum0014.v2.jsnp.92als.v1
3	Functional variants in ADH1B and ALDH2 coupled with alcohol and smoking synergistically enhance esophageal cancer risk.	doi: 10.1053/j.gastro.2009.07.070	hum0014.v2.jsnp.182ec.v1
4	SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations.	doi: 10.1038/ng.208	T2DM (JSNP)
5	Common variants in a novel gene, FONG on chromosome 2q33.1 confer risk of osteoporosis in Japanese.	doi: 10.1371/journal.pone.0019641	Osteoporosis (JSNP)
6	Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes.	doi: 10.1038/ncomms10531	hum0014.v3.T2DM-1.v1 hum0014.v3.T2DM-2.v1
7	Multi-ancestry genome-wide association study of 21,000 cases and 95,000 controls identifies new risk loci for atopic dermatitis.	doi: 10.1038/ng.3424	hum0014.v4.AD.v1
8	Genome-wide association study identifies eight new susceptibility loci for atopic dermatitis in the Japanese population.	doi: 10.1038/ng.2438	hum0014.v4.AD.v1
9	Identification of six new genetic loci associated with atrial fibrillation in the Japanese population.	doi: 10.1038/ng.3842	hum0014.v5.AF.v1
10	Genome-wide association study identifies 112 new loci for body mass index in the Japanese population.	doi:10.1038/ng.3951	hum0014.v6.158k.v1 JGAD000123 JGAD000124
11	Genome-wide association study identifies seven novel susceptibility loci for primary open-angle glaucoma.	doi: 10.1093/hmg/ddy053	hum0014.v7.POAG.v1
12	Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases.	doi:10.1038/s41588-018-0047-6	hum0014.v8.58qt.v1 JGAD000144-JGAD000201
13	Elucidating the genetic architecture of reproductive ageing in the Japanese population	doi: 10.1038/s41467-018-04398-z	hum0014.v9.Men.v1 hum0014.v9.MP.v1
14	Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese.	doi: 10.1038/s41467-018-03274-0	JGAD000220
15	Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls.	doi: 10.1038/s41467-018-06581-8	JGAD000209
16	A Variant within the FTO confers susceptibility to diabetic nephropathy in Japanese patients with type 2 diabetes	doi: 10.1371/journal.pone.0208654	hum0014.v12.T2DMwN.v1
17	Identification of 28 new susceptibility loci for type 2 diabetes in the Japanese population	doi: 10.1038/s41588-018-0332-4	hum0014.v13.T2DMmeta.v1
18	GWAS of smoking behaviour in 165,436 Japanese people reveals seven new loci and shared genetic architecture.	doi: 10.1038/s41562-019-0557-y	hum0014.v14.asi.v1 hum0014.v14.cpd.v1 hum0014.v14.ens.v1 hum0014.v14.fcs.v1
19	Characterizing rare and low-frequency height-associated variants in the Japanese population	doi: 10.1038/s41467-019-12276-5	JGAD000220 (fastq) JGAD000220 (reference panel) hum0014.v15.ht.v1
20	Germline pathogenic variants in 7,636 Japanese patients with prostate cancer and 12,366 controls.	doi: 10.1093/jnci/djz124	JGAD000288
21	Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases	doi: 10.1038/s41588-020-0640-3	hum0014.v17 hum0014.v18 hum0014.v21
22	GWAS of 165,084 Japanese individuals identified nine loci associated with dietary habits	doi: 10.1038/s41562-019-0805-1	hum0014.v19
23	Population-specific and transethnic genome-wide analyses identify distinct and shared genetic risk loci for coronary artery disease.	doi: 10.1038/s41588-020-0705-3	hum0014.v20.cad.v1
24	Genetic characterization of pancreatic cancer patients and prediction of carrier status of germline pathogenic variants in cancer-predisposing genes	doi: 10.1016/j.ebiom.2020.103033	JGAD000438
25	Population-based Screening for Hereditary Colorectal Cancer Variants in Japan	doi: 10.1016/j.cgh.2020.12.007	JGAD000458 JGAD000459
26	Genome-wide association study reveals BET1L associated with survival time in the 137,693 Japanese individuals	doi: 10.1038/s42003-023-04491-0	hum0014.v27.surv.v1
27	Cross-ancestry genome-wide analysis of atrial fibrillation unveils disease biology and enables cardioembolic risk prediction	doi: 10.1038/s41588-022-01284-9	hum0014.v29.AF.v1
28	Association between germline pathogenic variants in cancer-predisposing genes and lymphoma risk	doi: 10.1111/cas.15522	JGAD000460 JGAD000721
29	Helicobacter pylori, Homologous-Recombination Genes, and Gastric Cancer	doi: 10.1056/NEJMoa2211807	JGAD000720 JGAD000721
30	Germ line DDX41 mutations define a unique subtype of myeloid neoplasms	doi: 10.1182/blood.2022018221	JGAD000399 JGAD000400
31	Combined landscape of single-nucleotide variants and copy number alterations in clonal hematopoiesis	doi: 10.1038/s41591-021-01411-9	JGAD000399 JGAD000400
32	Characterizing rare and low-frequency height-associated variants in the Japanese population	doi: 10.1038/s41467-019-12276-5	JGAD000777
33	Chromosomal alterations among age-related haematopoietic clones in Japan	doi: 10.1038/s41586-020-2426-2	JGAD000777 JGAD000836
34	Detection of trait-associated structural variations using short-read sequencing	doi: 10.1016/j.xgen.2023.100328	JGAD000831
35	Population-specific reference panel improves imputation quality for genome-wide association studies conducted on the Japanese population		JGAD000873

USERS (Controlled-access Data)

Principal Investigator	Affiliation	Country/Region	Research Title	Data in Use (Dataset ID)	Period of Data Use
Mark Daly	Broad Institute of MIT and Harvard			JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2018/09/11-2023/07/31
Yukinori Okada	Department of Statistical Genetics, Osaka University Graduate School of Medicine			JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2018/09/20-2021/03/31
Shigeo Kamitsuji	Statistical Analysis Division, StaGen Co., Ltd.			JGAD000123	2018/10/04-2019/03/31
Katsushi Tokunaga	Department of Human Genetics, Graduate School of Medicine, The University of Tokyo			JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2018/11/13-2026/11/08
Tatsuhiko Tsunoda	Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University		Research on big data analysis for precision medicine	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2018/12/18-2021/06/19
Liming Liang	Harvard T.H. Chan School of Public Health, Department of Epidemiology			JGAD000123	2019/01/21-2021/12/31
Masao Nagasaki	Center for Genomic Medicine, Graduate School of Medicine Center for the Promotion of Interdisciplinary Education and Research, Kyoto University		Development and application of bioinformatics methods to facilitate the detection of genes associated with multifactorial disorders based on large-scale whole genome sequencing data of Japanese individuals	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2019/01/31-2027/03/31
Seishi Ogawa	Department of Pathology and Tumor Biology, Graduate School of Medicine, Kyoto University			JGAD000209	2019/02/04-2021/03/31
Shigeo Kamitsuji	Statistical Analysis Division, StaGen Co., Ltd.			JGAD000123	2019/03/13-2022/03/31
Takashi Kohno	National Cancer Research Institute, Division of genome biology			JGAD000123, JGAD000124, JGAD000220	2019/04/15-2019/12/31
Shigeo Horie	Department of Urology, Juntendo University, Graduate School of Medicine			JGAD000123, JGAD000220	2019/05/14-2024/03/31
Tatsuhiko Tsunoda	Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University		Research on sequence, image data analysis for precision medicine	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2019/06/06-2023/08/31
Kengo Kinoshita	Tohoku Medical Megabank Organization		Construction of Japanese whole genome database	JGAD000220	2019/06/24-2022/03/31
Kouya Shiraishi	Division of Genome Biology, National Cancer Research Institute		Elucidation of immune-system networks between host and tumor based on genomic analysis	JGAD000124	2019/08/05-2023/03/31
Shigeo Kamitsuji	Statistical Analysis Division, StaGen Co., Ltd.		Mendelian randomization study using genetic markers of uric acid levels as an instrumental variable	JGAD000123, JGAD000124, JGAD000146, JGAD000148, JGAD000149, JGAD000155, JGAD000156, JGAD000157, JGAD000174, JGAD000188	2019/08/16-2024/03/31
Shigeo Kamitsuji	Statistical Analysis Division, StaGen Co., Ltd.		Mendelian randomization study using 58 clinical laboratory tests and SNP genotype data.	JGAD000123, JGAD000124, JGAD000144-JGAD000201	2019/08/22-2024/03/31
Osamu Ogasawara	Bioinformation and DDBJ Center, National Institute of Genetics		Evaluation of human genome analysis workflow using JGA/AGD genome data.	JGAD000123, JGAD000220	2019/10/11-2024/03/31
Seishi Ogawa	Department of Medical Science, Kyoto University		Comprehensive analysis of genetic alterations in hematological malignancies	JGAD000102, JGAS000123, JGAD000220	2019/11/14-2024/03/31
Yasushi Okazaki	Diagnostics and Therapeutics of Intractable Diseases, Juntendo University Graduate School of Medicine		Identification of disease biomarkers by disease cohort research network -Whole genome sequencing of epilepsy-	JGAD000123	2020/06/04-2023/03/31
Chihiro Hata	Bioinformation and DDBJ Center, National Institute of Genetics		Identification of hypomorphic mutations in Japanese breast cancer patients	JGAD000209	2020/06/04-2023/03/31
Yosuke Kawai	Genome Medical Science Project, National Center for Global Health and Medicine		Large scale genome analysis of modern human genomes to infer the origin of Yaponesians	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/06/19-2022/03/31
Nakao Iwata	Department of Psychiatry, Fujita Health University School of Medicine		Research for investigating susceptibility of mental state, mental disorders, drug efficacy and side effects through genetic analysis	JGAD000123, JGAD000124	2020/08/17-2022/12/31
Atray Dixit	Coral Genomics, Inc.		Derivation and Evaluation of Functional Response Scores	JGAD000101, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/08/24-2021/07/21
Hae Kyung Im	Biological Sciences Division, University of Chicago		Predicted Gene Expression: High Power, Mechanism, and Direction of Effect	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/09/15-2023/06/23
Hongyu Zhao	Department of Biostatistics, Yale School of Public Health		Leveraging multi-ethnic data and functional annotations in causal variant identification, genetic correlation estimation, and genetic risk prediction	JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/09/24-2024/03/01
Charleston Chiang	Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California		Investigating the evolution of complex genetic architecture in participants of Biobank Japan	JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/09/28-2025/07/01
Shigeo Kamitsuji	Statistical Analysis Division, StaGen Co., Ltd.		Identifying the genetic risk factors for Stent Thrombosis by genome-wide association study	JGAD000123, JGAD000124, JGAD000145, JGAD000146, JGAD000149, JGAD000151, JGAD000155, JGAD000156, JGAD000158, JGAD000159, JGAD000163, JGAD000165-JGAD000167, JGAD000170, JGAD000172-JGAD000175, JGAD000182, JGAD000187-JGAD000189, JGAD000192-JGAD000196, JGAD000200, JGAD000201	2020/10/26-2025/03/31
Ali Torkamani	Scripps Research Institute		Genomics Deep Learning	JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/11/16-2023/07/10
Kazuhiro Nakayama	Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo		Investigation of genome variation influencing activity of brown adipose tissues	JGAD000123, JGAD000124	2020/11/26-2023/09/18
Keishi Fujio	Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo		Integrative analysis of immune-cell eQTL data and large-scaled GWAS data in Japanese	JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2020/12/14-2025/03/31
Masataka Kikuchi	Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo	Japan	Imputation analysis using a Japanese reference panel	JGAD000220	2020/12/14-2027/06/30
Fumihiko Matsuda	Center for Genomic Medicine, Kyoto University		Elucidation of Japanese genetic diversity	JGAD000220	2021/01/05-2025/03/31
Emiko Noguchi	Department of Medical Genetics, Faculty of Medicine, University of Tsukuba		Exploratory study of genetic factors in allergic diseases	JGAD000220	2021/03/16-2032/03/31
Noriko Sato	Department of Molecular Epidemiology, Medical Research Institute, Tokyo Medical and Dental University		Analysis of genetic and environmental risks of obesity and diabetes based on regional cohort longitudinal data	JGAD000220	2021/04/09-2023/03/31
Takashi Kohno	Division of Genome Biology, National Cancer Center Research Institute		Identification of genetic risk factors in AYA(Adolescence and Young Adult) cancer	JGAD000209, JGAD000220	2021/05/20-2025/03/31
Yasunobu Nagata	Department of Hematology, Nippon Medical School		Identification of the mechanisms for pathogenesis of hematologic tumors based on novel genetic abnormalities	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2021/05/26-2026/03/31
Atsushi Kawakami	Department of Immunology and Rheumatology, Nagasaki University Hospital		An exploratory study to determine the genetic polymorphisms or mutations associated with type 1 diabetes and interstitial lung disease induced by immune checkpoint inhibitor; nivolumab	JGAD000220	2021/06/14-2022/03/30
Akihiro Fujimoto	Department of Human Genetics, Graduate School of Medicine, The University of Tokyo		Comprehensive analysis of mutations and genetic diversity by analyzing whole-genome sequence data	JGAD000220, JGAD000410	2021/09/16-2024/11/30
Yoshihiro Asano	Department of Cardiovascular Medicine Graduate School of Medicine, Osaka University		Sensitive gene analysis of hereditary cardiovascular disease	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2021/07/16-2022/05/31
Hironori Masuko	Department of Pulmonary Medicine, University of Tsukuba		Search for susceptibility genes for chronic inflammatory airway diseases	JGAD000123	2021/08/11-2023/03/31
Takashi Kohno	Division of Genome Biology, National Cancer Center Research Institute		Identification of genetic risk factors in AYA(Adolescence and Young Adult) cancer	JGAD000209, JGAD000220	2021/09/27-2025/03/31
Fumihiko Matsuda	Center for Genomic Medicine, Kyoto University		Development of personalized medicine	JGAD000123, JGAD000220	2021/09/27-2025/03/31
Takashi Matsuda	Advanced Informatics & Analytics, Astellas Pharma Inc.		Investigation of the correlation between Liver cancer/Hepatitis B and polymorphism	JGAD000102, JGAD000123	2021/11/05-2022/07/31
Emiko Noguchi	Department of Medical Genetics, Faculty of Medicine, University of Tsukuba		Identification of the pathogenic factors for food allergy	JGAD000220	2021/12/03-2025/03/31
Yosuke Kawai	Division of Molecular Pathology, The Institute of Medical Science, The University of Tokyo		Population Genetic Analysis of the Origin of Japanese Populations	JGAD000123	2021/12/08-2023/03/31
Toshiharu Ninomiya	Department of Epidemiology and Public Health, Graduate School of Medical Sciences, Kyushu University		Japan Prospective Studies Collaboration for Aging and Dementia (JPSC-AD)	JGAD000220	2021/12/08-2026/02/28
Joshua Chiou	Internal Medicine Research Unit, Pfizer		Evaluating GWAS associations from Biobank Japan to Support Confidence in Rationale for Therapeutic Targets	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2022/01/27-2022/12/31
Gil McVean	Genomics plc	United Kingdom of Great Britain and Northern Ireland	Development of polygenic risk scores in diverse ancestries for diseases, traits and conditions	JGAD000101, JGAD000102	2022/07/19-2025/07/01
Gil McVean	Genomics plc	United Kingdom of Great Britain and Northern Ireland	Using large-scale reference panels for imputation and ancestry analysis to support target discovery and polygenic risk score models	JGAD000220, JGAD000410	2022/08/04-2025/08/01
Hirofumi Nakaoka	Department of Cancer Genome Research, Sasaki Institute	Japan	Analysis of hypomorphic variants in breast cancer-associated genes by using large-scale sequencing data sets	JGAD000209	2022/08/17-2024/03/31
Emiko Noguchi	Department of Medical Genetics, Faculty of Medicine, University of Tsukuba	Japan	Research on genetic predisposition to inflammatory lung disease	JGAD000220	2022/09/19-2027/03/31
Nuria Lopez-Bigas	Institute for Research in Biomedicine (IRB Barcelona)	Spain	Study of the genetic basis of clonal hematopoiesis	JGAD000399, JGAD000400	2022/11/06-2025/08/01
Keiko Yamazaki	Department of Public Health, Graduate School of Medicine, Chiba University	Japan	Prediction of effectiveness to molecular target drugs in Japanese patients with inflammatory bowel disease	JGAD000220	2022/11/09-2025/03/31
Ryosuke Kitoh	Department of Otorhinolaryngology-Head and Neck Surgery, Shinshu University School of Medicine	Japan	Genome-wide association study of the sudden sensorineural hearing loss	JGAD000123	2022/12/19-2027/03/31
Shigeo Kamitsuji	Statistical Analysis Division, StaGen Co., Ltd.	Japan	Genome-Wide Association Study to identify genetic factors for strabismus in Japanese population	JGAD000123	2023/02/13-2027/02/28
Kei Yura	Graduate School of Humanities and Sciences, Ochanomizu University	Japan	Phenotype Prediction of Cancer Suppressor Gene BRCA1 variants	JGAD000220	2023/03/16-2025/03/31
Yoshihiro Onouchi	Department of Public Health, Graduate School of Medicine, Chiba University	Japan	A Multicenter Study to Identify Genetic Factors in Kawasaki Disease	JGAD000220	2023/03/30-2025/03/31
Yoshihiro Onouchi	Department of Public Health, Graduate School of Medicine, Chiba University	Japan	A study of the genetic background of differences in antibody response to COVID-19 vaccine	JGAD000220	2023/04/19-2025/12/31
Masaki Kato	Kansai Medical University	Japan	Exploratory and validation study of genetic and biological factors for the development of precision medicine algorithms for psychiatric disorders	JGAD000101, JGAD000102, JGAD000123, JGAD000220	2023/08/25-2028/06/30
Hiroki Kimura	Department of Psychiatry, Nagoya University Graduate School of Medicine	Japan	Research on elucidation of susceptibility to brain and mental illness (vulnerability to disease onset) and efficacy and side effects of drugs (treatment responsiveness) through genetic analysis	JGAD000220	2023/11/17-2025/10/28
Chikashi Terao	Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences	Japan	Research on personalized medicine based on genomics information	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2023/11/21-2026/03/31
Yasuhiro Mochida	Kidney Disease and Transplant Center, Shonan Kamakura General Hospital	Japan	Association between Clonal hematopoiesis of indeterminate potential and Chronic Kidney Disease in Japanese cohort study	JGAD000399, JGAD000400	2024/02/07-2027/03/31
Hiroyuki Mishima	Department of Human Genetics, Atomic Bomb Disease Institute, Nagasaki University	Japan	Development of Methods to Mitigate Batch Effects in Human Whole Genome Sequencing	JGAD000220	2024/04/17-2027/03/31
Chikashi Terao	Clinical Research Center, Shizuoka General Hospital	Japan	Investigation of Genetic Factors Associated with Human Phenotypic Traits	JGAD000220, JGAD000495, JGAD000777	2024/04/17-2028/12/03
Koichi Matsuda	Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo	Japan	Disease Cohort Research Network for Disease Marker Exploratory Studies	JGAD000209, JGAD000220, JGAD000288, JGAD000438, JGAD000458-JGAD000460, JGAD000495, JGAD000531, JGAD000720, JGAD000721, JGAD000777, JGAD000831-JGAD000834	2024/06/17-2029/03/31
Masao Nagasaki	Division of Biomedical Information Analysis, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University	Japan	Development and application of bioinformatics methods to facilitate the detection of genes associated with multifactorial disorders based on large-scale whole genome sequencing data of Japanese individuals	JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220	2024/06/24-2027/03/31
Kouya Shiraishi	Department of Clinical Genomics, National Cancer Center Research Institute	Japan	Search for genes involved in susceptibility to lung cancer	JGAD000123	2024/06/24-2025/12/31
Chikashi Terao	Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences	Japan	Research on personalized medicine based on genomics information	JGAD000836	2024/07/08-2026/03/31
Chikashi Terao	Clinical Research Center, Shizuoka General Hospital	Japan	Investigation of Genetic Factors Associated with Human Phenotypic Traits	JGAD000836	2024/07/16-2028/12/03
Taisei Mushiroda	Laboratory for Pharmacogenomics, RIKEN Center for Integrative Medical Sciences	Japan	Search of genomic biomarkers associated with drug-induced eruptions	JGAD000220	2024/08/01-2028/03/31
Masahiro Nakatochi	Public Health Informatics, Department of Integrated Sciences, Nagoya University Graduate School of Medicine	Japan	Exploration of genetic factors involved in the onset, progression, and prognosis of amyotrophic lateral sclerosis	JGAD000679	2024/08/27-2030/03/31