NBDC Research ID: hum0014.v18
SUMMARY
Aims: Identify disease-related genes in the Japanese population
Methods: Genomic DNA samples were genotyped using the following methods: Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip, HumanOmniExpress-12 BeadChip, HumanExome BeadChip, OmniExpressExome BeadChip (Illumina), high-density oligonucleotide arrays (Perlegen Sciences), or Invader (Hologic Japan). Genome-wide association studies (GWAS) for myocardial infarction (MI), type II diabetes mellitus (T2DM), atopic dermatitis (AD), atrial fibrillation (AF), body mass index (BMI), primary open-angle glaucoma (POAG), 58 quantitative traits, age at menarche / menopause, smoking behaviour, height, and 41 diseases (the samples for 4 of these diseases partially overlap with those of previous releases) were performed using about 500-2700K variants. Meta-analyses for T2DM with diabetic nephropathy and for T2DM were also performed. Whole-genome sequencing of 1,026 patients who were registered in BioBank Japan from 2003 to 2007 was performed with Illumina HiSeq2500. Target sequencing analyses of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls, and of 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls, were also performed with Illumina HiSeq2500. A new reference panel was built from WGS data of the BioBank Japan project (N=1,037) and the 1KGP p3v5 ALL (N=2,504).
Participants/Materials: Participants for the Tailor-made Medical Treatment Program (BioBank Japan: BBJ)
URL: https://biobankjp.org/cohort_3rd/english/index.html
Data Set ID | Type of Data | Criteria | Release Date |
---|---|---|---|
hum0014.v1.freq.v1 | GWAS for MI | Un-restricted Access | 2014/09/30 |
hum0014.v2.jsnp.934ctrl.v1 | Genotype frequencies in 934 healthy individuals (JSNP data) | Un-restricted Access | 2015/12/28 |
35 Diseases | Genotype frequencies in each disease (JSNP data) | Un-restricted Access | 2015/12/28 |
hum0014.v2.jsnp.182ec.v1 | Genotype frequencies in 182 esophageal cancer patients (JSNP data) | Un-restricted Access | 2015/12/28 |
hum0014.v2.jsnp.92als.v1 | Genotype frequencies in 92 amyotrophic lateral sclerosis (ALS) patients (JSNP data) | Un-restricted Access | 2015/12/28 |
hum0014.v3.T2DM-1.v1 | GWAS for T2DM [1] | Un-restricted Access | 2016/01/28 |
hum0014.v3.T2DM-2.v1 | GWAS for T2DM [2] | Un-restricted Access | 2016/01/28 |
hum0014.v4.AD.v1 | GWAS for AD | Un-restricted Access | 2016/02/02 |
hum0014.v5.AF.v1 | GWAS for AF | Un-restricted Access | 2016/05/18 |
JGAS000101 | Genotype and phenotype data for 8180 AF patients | Controlled Access (Type I) | 2016/05/18 |
hum0014.v6.158k.v1 | GWAS for BMI | Un-restricted Access | 2017/09/08 |
JGAS000114 | BMI data for 158,284 individuals; genotype data for 182,505 individuals | Controlled Access (Type I) | 2017/09/08 |
hum0014.v7.POAG.v1 | GWAS for POAG | Un-restricted Access | 2018/04/04 |
hum0014.v8.58qt.v1 | GWAS for 58 quantitative traits | Un-restricted Access | 2018/05/01 |
JGAS000114 | 58 quantitative traits data for 200,849 individuals | Controlled Access (Type I) | 2018/05/01 |
hum0014.v9.Men.v1 / hum0014.v9.MP.v1 | GWAS for age at menarche and menopause | Un-restricted Access | 2018/08/07 |
JGAS000114 | WGS for 1,026 individuals | Controlled Access (Type I) | 2018/08/13 |
JGAS000140 | target sequencing of 11 hereditary breast cancer genes in 7,104 breast cancer patients and 23,731 controls | Controlled Access (Type I) | 2018/10/16 |
hum0014.v12.T2DMwN.v1 | meta analysis of 2 GWASs for T2DM with diabetic nephropathy | Un-restricted Access | 2018/12/10 |
hum0014.v13.T2DMmeta.v1 | meta analysis of 4 GWASs for T2DM | Un-restricted Access | 2019/01/25 |
hum0014.v14.smok.v1 | GWAS for smoking behaviour | Un-restricted Access | 2019/03/26 |
JGAS000114 | a reference panel from WGS data of the biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504) | Controlled Access (Type I) | 2019/09/27 |
hum0014.v15.ht.v1 | GWAS for height | Un-restricted Access | 2019/09/27 |
JGAS000203 | target sequencing of 8 hereditary prostate cancer genes in 7,636 prostate cancer patients and 12,366 controls | Controlled Access (Type I) | 2019/10/07 |
hum0014.v17 | GWAS for 40 diseases | Un-restricted Access | 2019/10/08 |
hum0014.v18 | GWAS for Breast cancer | Un-restricted Access | 2019/11/26 |
* To access Controlled Access data, data users need to submit Form 2 (Application Form for Using NBDC Human Data).
* When research results that include data downloaded from NHA/DRA are published or presented, the data user must cite the papers related to the data or mention the data in the acknowledgments.
MOLECULAR DATA
Participants/Materials | 1666 MI patients and 3198 controls |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [Human610-Quad BeadChip, HumanHap550v3 Genotyping BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Illumina Human610-Quad Beadchip |
Genotype Call Methods (software) | GenCall software (GenomeStudio) |
Filtering Methods | sample call rate ≧ 0.98, SNP call rate ≧ 0.99, HWE P ≧ 1 x 10^-6 |
Marker Number (after QC) | 455,781 SNPs (hg18/NCBI36) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 71.3 MB (xlsx) |
Comments (Policies) | NBDC policy |
Participants/Materials | 934 Japanese healthy individuals (JSNP) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanHap550v3 Genotyping BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Illumina HumanHap550v3 Genotyping BeadChip |
Genotype Call Methods (software) | GenCall software (GenomeStudio) |
Filtering Methods | sample call rate < 0.98, SNP call rate < 0.99, HWE P < 1 x 10^-6 |
Marker Number (after QC) | 515,286 SNPs |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 32.2 MB (zip [xls]) |
Comments (Policies) | NBDC policy |
35 Diseases (JSNP)
Participants/Materials |
Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer); Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans); Cerebrovascular disorders (Brain infarction, Intracranial aneurysm); Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma); Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis); Eye diseases (Cataract, Glaucoma); Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease); about 190 patients in each disease set |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Perlegen Sciences [high-density oligonucleotide arrays] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | - |
Genotype Call Methods (software) | - |
Filtering Methods | - |
Marker Number (after QC) | About 200,000 SNPs (b129) |
NBDC Data Set ID |
Cancer (Lung cancer, Breast cancer, Gastric cancer, Colorectal cancer, Prostate cancer) Cardiovascular diseases (Heart failure, Myocardial infarction, Unstable angina, Stable angina, Cardiac arrhythmias, Arteriosclerosis obliterans) Cerebrovascular disorders (Brain infarction, Intracranial aneurysm) Respiratory tract diseases (Interstitial pneumonitis & pulmonary fibrosis, Pulmonary emphysema, Bronchial asthma) Chronic liver diseases (Chronic hepatitis C, Liver cirrhosis) Eye diseases (Cataract, Glaucoma) Others (Epilepsy, Periodontal disease, Urolithiasis, Nephrotic syndrome, Uterine myoma, Endometriosis, Osteoporosis, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Hay fever, Atopic dermatitis, Drug eruptions, Hyperlipidemias, Diabetes mellitus, Basedow disease) (Click the disease names to download the file) |
Comments (Policies) | NBDC policy |
*Chromosomal position of each SNP is based on dbSNP build 129. If you need other mapping information, please contact us.
Participants/Materials | 182 esophageal cancer patients (JSNP) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanHap550v3 Genotyping BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Illumina HumanHap550v3 Genotyping BeadChip |
Genotype Call Methods (software) | GenCall software (GenomeStudio) |
Filtering Methods | sample call rate < 0.98, SNP call rate < 0.99, HWE P < 1 x 10^-6 |
Marker Number (after QC) | 503,734 SNPs |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 6.6 MB (zip [txt]) |
Comments (Policies) | NBDC policy |
Participants/Materials | 92 ALS patients (JSNP) |
Targets | large-scale case-control association study |
Target Loci for Capture Methods | - |
Platform | Hologic Japan [Invader] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Invader assay system (Third Wave Technologies) |
Genotype Call Methods (software) | ABI PRISM SDS versions 2.0 - 2.2 |
Filtering Methods | SNP call rate ≥ 0.95, HWE P ≥1.0 x 10^-2 |
Marker Number (after QC) | 48,939 SNPs |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 3.2 MB (zip [txt]) |
Comments (Policies) | NBDC policy |
Participants/Materials |
9817 T2DM patients and 6763 controls (healthy individuals and patients with Intracranial aneurysm, Esophageal cancer, Uterine cancer, Pulmonary emphysema, or Glaucoma [without T2DM]) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [OmniExpressExome Beadchip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Illumina OmniExpressExome Beadchip kit |
Genotype Call Methods (software) | GenCall software (GenomeStudio) |
Filtering Methods | sample call rate < 0.98, SNP call rate < 0.99, MAF < 0.01, HWE P < 1 x 10^-6 in control |
Marker Number (after QC) | 552,915 SNPs (hg19) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 84.0 MB (xlsx) |
Comments (Policies) | NBDC policy |
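The exclusion thresholds listed under Filtering Methods above can be read as a per-variant check. The following is a minimal sketch, not the data provider's pipeline: the function names and the genotype-count inputs are hypothetical, and the HWE P value is computed with a 1-degree-of-freedom chi-square test as an assumption, because the record does not state which HWE test was used.

```python
import math


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: the record does not name the HWE test; an exact test is a
    common alternative for small counts.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: no deviation can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def variant_excluded(n_aa, n_ab, n_bb, n_missing, control_counts,
                     min_call_rate=0.99, min_maf=0.01, min_hwe_p=1e-6):
    """Return True if the variant meets any exclusion criterion.

    n_aa, n_ab, n_bb, n_missing: genotype counts over all samples.
    control_counts: (AA, AB, BB) counts in controls only, for the HWE check.
    """
    called = n_aa + n_ab + n_bb
    total = called + n_missing
    if total == 0 or called == 0 or called / total < min_call_rate:
        return True  # SNP call rate < 0.99
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    if min(freq_a, 1.0 - freq_a) < min_maf:
        return True  # MAF < 0.01
    return hwe_chi2_p(*control_counts) < min_hwe_p  # HWE P < 1 x 10^-6
```

The sample call rate criterion (< 0.98) is a per-sample check and would be applied separately, before the per-variant checks sketched here.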
Participants/Materials |
5646 T2DM patients and 19,420 controls (patients with Colorectal cancer, Breast cancer, Prostate cancer, Lung cancer, Gastric cancer, Arteriosclerosis obliterans, Cardiac arrhythmias, Brain infarction, Myocardial infarction, Gallbladder cancer and Cholangiocarcinoma, Pancreatic cancer, Drug eruptions, Rheumatoid arthritis, Amyotrophic lateral sclerosis, Liver cancer, Liver cirrhosis, Osteoporosis, or Uterine myoma [without T2DM]) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [Human610-Quad BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | Illumina Human610-Quad Beadchip kit |
Genotype Call Methods (software) | GenCall software (GenomeStudio) |
Filtering Methods | sample call rate < 0.98, SNP call rate < 0.99, MAF < 0.01, HWE P < 1 x 10^-6 in control |
Marker Number (after QC) | 479,088 SNPs (hg18) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 72.6 MB (xlsx) |
Comments (Policies) | NBDC policy |
Participants/Materials |
1472 AD patients and 7966 controls (healthy individuals and patients with Intracranial aneurysm, Esophageal cancer, Uterine cancer, Pulmonary emphysema, or Glaucoma [without AD and Bronchial asthma]) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress BeadChip |
Genotype Call Methods (software) |
minimac [imputation (1000 genomes Phase I v3)] |
Association Analysis (software) | mach2dat [GWAS] |
Filtering Methods |
Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 in the control samples; Imputation QC: HWE P < 1 x 10^-6 or MAF < 0.01 in the reference panel; differences of MAF between the GWAS dataset and the reference panel > 0.16 |
Marker Number (after QC) | About 7,700,000 SNPs (hg19) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume |
ADGWAS_auto.txt (525 MB) ADGWAS_X_females.txt (17 MB) ADGWAS_X_males.txt (15 MB) |
Comments (Policies) | NBDC policy |
Participants/Materials |
8180 atrial fibrillation patients and 28,612 controls |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] |
Source | DNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit |
Genotype Call Methods (software) |
minimac [imputation (1000 genomes Phase I v3)] GenCall software(GenomeStudio) |
Association Analysis (software) | mach2dat [GWAS] |
Filtering Methods |
Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6 in the control samples; Imputation QC: HWE P < 1 x 10^-6 or MAF < 0.01 in the reference panel; differences of MAF between the GWAS dataset and the reference panel > 0.16; R square < 0.9 |
Marker Number (after QC) | About 5,000,000 SNVs |
Phenotype Data | Gender, Age |
NBDC Data Set ID / Japanese Genotype-phenotype Archive Data set ID |
[GWAS stats] (Click the Data Set ID to download the file) [Individual data sets] Phenotype: JGAD000101 Genotype: JGAD000102 |
Total Data Volume |
GWAS: 473 MB (txt) Individual phenotype-genotype data: 1 GB (txt) |
Comments (Policies) | NBDC policy |
JGAS000114 / hum0014.v6.158k.v1
Participants/Materials | 182,505 individuals (158,284 individuals for BMI study) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] |
Source | DNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit |
Genotype Call Methods (software) |
minimac [imputation (1000 genomes Phase I v3)] GenCall software (GenomeStudio) |
Association Analysis (software) | mach2qtl (v1.1.3) |
Filtering Methods |
Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6 or MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. |
Marker Number (after QC) | About 6,000,000 and 150,000 SNVs on autosomes and X-chromosome, respectively. |
NBDC Data Set ID / Japanese Genotype-phenotype Archive Data set ID |
[GWAS] (Click the Data Set ID to download the file) [Individual data sets] Phenotype data (BMI): JGAD000124 Genotype data: JGAD000123 |
Total Data Volume |
GWAS: 406 MB (zip) Phenotype data (BMI): 3.32 MB (txt.gz) Genotype data: 26.3GB (csv.gz) |
Comments (Policies) | NBDC policy |
Participants/Materials |
3980 POAG patients (Male: 1,997, Female: 1,983) and 18,815 controls (Male: 7,817, Female: 10,998) |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit |
Genotype Call Methods (software) |
minimac(ver. 0.1.1) [imputation (1000 genomes Phase I v3)] |
Association Analysis (software) | mach2dat(ver. 1.0.19) |
Filtering Methods |
Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6 or MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. We also excluded variants with |beta| > 4 in the uploaded files. |
Marker Number (after QC) |
autosomes: 5,961,428 SNPs (hg19); male X-chromosome: 147,351 SNPs (hg19); female X-chromosome: 147,353 SNPs (hg19) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 113 MB(txt.zip) |
Comments (Policies) | NBDC policy |
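The post-imputation exclusions listed above (Rsq < 0.7 and |beta| > 4) amount to a simple row filter over the summary statistics. The sketch below is illustrative only: the dictionary keys `rsq` and `beta` are hypothetical and need not match the column names in the downloadable files.

```python
from typing import Dict, Iterable, List


def keep_variant(rsq: float, beta: float,
                 min_rsq: float = 0.7, max_abs_beta: float = 4.0) -> bool:
    """Keep a variant unless Rsq < 0.7 or |beta| > 4."""
    return rsq >= min_rsq and abs(beta) <= max_abs_beta


def filter_summary_stats(rows: Iterable[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return only the rows that survive both exclusion criteria.

    Each row is a mapping with hypothetical keys "rsq" and "beta".
    """
    return [row for row in rows if keep_variant(row["rsq"], row["beta"])]
```

A variant with Rsq exactly 0.7 or |beta| exactly 4 is kept under this reading, since the record states the exclusions with strict inequalities.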
JGAS000114 / hum0014.v8.58qt.v1
hum0014.v9.Men.v1 / hum0014.v9.MP.v1
Participants/Materials |
67,029 females with information on age at menarche; 43,861 females with information on age at menopause |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit |
Genotype Call Methods (software) |
minimac [imputation (1000 genomes Phase I v3)] GenCall software (GenomeStudio) |
Association Analysis (software) | mach2qtl (v1.1.3) |
Filtering Methods |
Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6 or MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 were excluded. We also excluded variants with |beta| > 4 in the uploaded files. |
Marker Number (after QC) | 9,296,729 SNPs (hg19) |
NBDC Data Set ID |
menarche: hum0014.v9.Men.v1 menopause: hum0014.v9.MP.v1 (Click the Data Set ID to download the file) menarche: Dictionary file menopause: Dictionary file |
Total Data Volume |
menarche: 181 MB(txt.gz) menopause: 186 MB(txt.gz) |
Comments (Policies) | NBDC policy |
Participants/Materials | 1,026 individuals |
Targets | WGS |
Target Loci for Capture Methods | - |
Platform | Illumina [HiSeq2500] |
Library Source | DNA extracted from peripheral blood cells |
Cell Lines | - |
Library Construction (kit name) | TruSeq Nano DNA Library Preparation Kit |
Fragmentation Methods | Ultrasonic fragmentation |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 160 bp |
Japanese Genotype-phenotype Archive Data set ID | JGAD000220 |
Total Data Volume | 73 TB (fastq) |
Comments (Policies) | NBDC policy |
* Summarized data is available at JENGER site.
Participants/Materials | 7,104 breast cancer patients and 23,731 controls |
Targets | Target Capture |
Target Loci for Capture Methods | 11 hereditary breast cancer genes (ATM, BRCA1, BRCA2, CDH1, CHEK2, NBN, NF1, PALB2, PTEN, STK11, TP53) |
Platform | Illumina [HiSeq2500] |
Library Source | DNA extracted from peripheral blood cells |
Cell Lines | - |
Library Construction (kit name) | 1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) *1 |
Fragmentation Methods | - |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp x 2 |
Japanese Genotype-phenotype Archive Data set ID | JGAD000209 |
Total Data Volume | 1 TB (fastq) |
Comments (Policies) | NBDC policy |
*1 Hum Mol Genet. 25:5027-5034 (2016)
Participants/Materials |
[GWAS-1] 2,380 T2DM patients with diabetic nephropathy and 5,234 T2DM patients without diabetic nephropathy; [GWAS-2] 429 T2DM patients with diabetic nephropathy and 358 T2DM patients without diabetic nephropathy |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [OmniExpressExome Beadchip / Human610-Quad BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) |
Illumina OmniExpressExome Beadchip kit Illumina Human610-Quad Beadchip kit |
Genotype Call Methods (software) |
MACH and Minimac (1000 Genomes phased JPT, CHB and Han Chinese South data n = 275, March 2012) GenCall software (GenomeStudio) |
Association Analysis (software) | mach2dat |
Filtering Methods |
sample call rate < 0.98, SNV call rate < 0.99, MAF < 0.1%, HWE P < 1 x 10^-6 in control |
Marker Number (after QC) | 7,521,072 SNPs (hg19) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 310 MB (csv.zip) |
Comments (Policies) | NBDC policy |
Participants/Materials |
[GWAS-1] 9,804 T2DM patients (ICD-10: E11) and 6,728 controls; [GWAS-2] 5,639 T2DM patients (ICD-10: E11) and 19,407 controls; [GWAS-3] 18,688 T2DM patients (ICD-10: E11) and 121,950 controls; [GWAS-4] 2,483 T2DM patients (ICD-10: E11) and 7,065 controls |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome / Human610-Quad BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome / Human610-Quad Beadchip kit |
Genotype Call Methods (software) |
minimac [imputation(1000 genomes Phase 3)] GenCall software (GenomeStudio) |
Association Analysis (software) | mach2dat (v1.0.24) |
Filtering Methods |
Genotyping QC: exclusion criteria of GWAS1, GWAS3, GWAS4 (i) hetero count < 5 (ii) HWE P < 1.0 × 10^-6 on each chip (iii) genotype concordance rate < 0.99 with in-house WGS data (iv) SNV call rate < 0.99
exclusion criteria of GWAS2 (i) SNV call rate < 0.99 (ii) MAF < 0.01 (iii) differential missingness P < 1.0 × 10^-6 (iv) HWE P < 1.0 × 10^-6
Imputation QC: HWE P < 1 × 10^-6 or MAF < 0.01 in the reference panel Imputation quality (Rsq) < 0.3 in more than two GWAS |
Marker Number (after QC) | 12,557,761 SNPs (hg19) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | 257 MB (txt) |
Comments (Policies) | NBDC policy |
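The imputation QC rule "Imputation quality (Rsq) < 0.3 in more than two GWAS" can be sketched as a count over the per-study Rsq values of one variant. This is an illustrative reading, not the provider's code: the function name is hypothetical, and "more than two" is interpreted here as three or more of the four GWAS, which is an assumption.

```python
from typing import Sequence


def excluded_by_imputation_quality(rsq_per_gwas: Sequence[float],
                                   min_rsq: float = 0.3,
                                   max_low_quality: int = 2) -> bool:
    """Return True if Rsq < min_rsq in more than max_low_quality studies.

    rsq_per_gwas holds one Rsq value per GWAS (GWAS-1 to GWAS-4) for a
    single variant.
    """
    low_quality = sum(1 for rsq in rsq_per_gwas if rsq < min_rsq)
    return low_quality > max_low_quality
```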
Participants/Materials | 165,436 individuals whose smoking status is available |
Targets | genome wide SNVs |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit |
Genotype Call Methods (software) |
minimac [imputation (1000 genomes Phase I v3)] GenCall software (GenomeStudio) |
Association Analysis (software) |
BOLT-LMM (v2.2) ProbABEL(v0.4.5; for X chromosome) |
Filtering Methods |
Genotyping QC: sample call rate < 0.98, SNV call rate < 0.99, HWE P < 1 x 10^-6. QC for reference panel: After excluding 11 closely related individuals, variants with HWE P < 1.0 x 10^-6 or MAF < 0.01 were excluded. QC after imputation: Variants with imputation quality of Rsq < 0.7 and MAF < 0.01 were excluded. |
Marker Number (after QC) |
autosomes: 5,961,480 SNVs (hg19) male X-chromosome (Age of smoking initiation): 163,412 SNVs (hg19) female X-chromosome (Age of smoking initiation): 146,130 SNVs (hg19) male X-chromosome (Cigarettes per day): 166,111 SNVs (hg19) female X-chromosome (Cigarettes per day): 146,114 SNVs (hg19) male X-chromosome (Smoking initiation [Ever vs never smokers]): 166,138 SNVs (hg19) female X-chromosome (Smoking initiation [Ever vs never smokers]): 146,146 SNVs (hg19) male X-chromosome (Smoking cessation [Former vs current smokers]): 166,142 SNVs (hg19) female X-chromosome (Smoking cessation [Former vs current smokers]): 146,118 SNVs (hg19) |
NBDC Data Set ID |
hum0014.v14.asi.v1.zip (Age of smoking initiation) hum0014.v14.cpd.v1.zip (Cigarettes per day) hum0014.v14.ens.v1.zip (Smoking initiation [Ever vs never smokers]) hum0014.v14.fcs.v1.zip (Smoking cessation [Former vs current smokers]) (Click the Data Set ID to download the file) |
Total Data Volume | 1.9 GB(txt.gz) |
Comments (Policies) | NBDC policy |
Participants/Materials |
- WGS data (JGAD000220) of the biobank Japan project (N=1,037) - WGS data of 1KGP p3v5 ALL (N=2,504) (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/) |
Targets |
a reference panel from WGS data (variants on autosomal chromosomes and X-chromosome) |
Target Loci for Capture Methods | - |
QC* |
We set exclusion criteria for genotypes as follows: (1) DP < 5, (2) GQ < 20, or (3) DP > 60 and GQ < 95, and regarded these genotypes as missing. Variants with call rates < 90% were excluded before variant quality score recalibration (VQSR). After VQSR, we excluded variants located in low-complexity regions (LCR), as defined by the mdust software. Finally, we used BEAGLE to impute missing genotypes.
|
Deduplication* | Picard (version 1.106) |
Calibration for re-alignment and base quality* | GATK (ver.3.2-2) |
Mapping Methods* | BWA-MEM (version 0.7.5a) |
Mapping Quality* | MAPQ < 20 were excluded (HaplotypeCaller) |
Reference Genome Sequence* | GRCh37/hg19, hs37d5 |
Coverage (Depth)* | aimed at 30x depth |
Detecting Methods for Variation* | GATK HaplotypeCaller (version 3.2-2) |
Method for merging vcf files |
autosomal chromosomes: Impute2; X-chromosome: Beagle (male), Impute2 (female) |
Variant Numbers in reference panel | 61,608,817 variants (autosomal chromosomes: 59,387,070; X-chromosome: 2,221,747) |
Japanese Genotype-phenotype Archive Data set ID | JGAD000220 |
Total Data Volume | about 15 GB (vcf.gz) |
Comments (Policies) | NBDC policy |
* These processes were performed only for biobank Japan project data.
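The genotype-level exclusion criteria in the QC row above (DP < 5, GQ < 20, or DP > 60 and GQ < 95, followed by a 90% call-rate cut) can be sketched as follows. This is a minimal illustration under stated assumptions: the `(GT, DP, GQ)` tuple layout and all function names are hypothetical, and the sketch does not cover VQSR, the LCR filter, or the BEAGLE imputation step.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical per-sample call: (genotype string, read depth DP, genotype quality GQ)
Call = Tuple[str, int, int]


def mask_genotype(gt: str, dp: int, gq: int) -> Optional[str]:
    """Return None (missing) if the genotype meets any exclusion criterion."""
    if dp < 5 or gq < 20 or (dp > 60 and gq < 95):
        return None
    return gt


def variant_call_rate(calls: Sequence[Call]) -> float:
    """Fraction of samples whose genotype is still called after masking."""
    if not calls:
        return 0.0
    called = sum(1 for gt, dp, gq in calls if mask_genotype(gt, dp, gq) is not None)
    return called / len(calls)


def variant_passes(calls: Sequence[Call], min_call_rate: float = 0.90) -> bool:
    """Variants with call rates < 90% are excluded."""
    return variant_call_rate(calls) >= min_call_rate
```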
Participants/Materials | 159,095 individuals (Male: 86,257, Female: 72,838) |
Targets | genome wide variants |
Target Loci for Capture Methods | - |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] |
Source | gDNA extracted from peripheral blood cells |
Cell Lines | - |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit |
Genotype Call Methods (software) | Minimac3 [imputation reference panel using WGS data of the biobank Japan project (N=1,037) and 1KGP p3v5 ALL (N=2,504)] |
Association Analysis (software) | BOLT-LMM (ver2.2), mach2qtl |
Filtering Methods |
Sample QCs: Exclusion criteria: 1) call rate < 98%, 2) closely related samples (PI_HAT > 0.175), and 3) outlier from Japanese cluster determined by PCA using GCTA. QC after imputation: Variants with imputation quality of Rsq < 0.3 were excluded. |
Marker Number (after QC) |
autosomes: 27,211,524 variants (hg19); male X-chromosome: 684,533 variants (hg19); female X-chromosome: 684,533 variants (hg19) |
NBDC Data Set ID |
(Click the Data Set ID to download the file) |
Total Data Volume | about 663 MB(txt.gz) |
Comments (Policies) | NBDC policy |
Participants/Materials | 7,636 prostate cancer patients (ICD10:C61) and 12,366 controls |
Targets | Target Capture |
Target Loci for Capture Methods | 8 hereditary prostate cancer genes (ATM, BRCA1, BRCA2, BRIP1, CHEK2, HOXB13, NBN, PALB2) |
Platform | Illumina [HiSeq2500] |
Library Source | DNA extracted from peripheral blood cells |
Cell Lines | - |
Library Construction (kit name) | 1st PCR was performed with 2X Platinum Multiplex PCR Master Mix (Thermo Fisher Scientific) to amplify the target region, followed by the 2nd PCR with 8-bp barcode and adapter sequences added using KAPA HiFi HotStart DNA Polymerase (KAPA) *1 |
Fragmentation Methods | - |
Spot Type | Paired-end |
Read Length (without Barcodes, Adaptors, Primers, and Linkers) | 150 bp x 2 |
Japanese Genotype-phenotype Archive Data set ID | JGAD000288 |
Total Data Volume | 2.2 TB (fastq) |
Comments (Policies) | NBDC policy |
Participants/Materials |
41 diseases (ICD10 code): Arrhythmia (I499), Bronchial asthma (J459), Atopic dermatitis (L209), Gallbladder/Cholangiocarcinoma (C23, C240), Cataract (H269), Cerebral aneurysm (I671), Cervical cancer (C539), Chronic hepatitis B (B181), Chronic hepatitis C (B182), Chronic obstructive pulmonary disease (J449), Liver cirrhosis (K746), Colorectal cancer (C189, C20), Heart failure (I509, I500), Drug eruption (L270), Uterine cancer (C549), Endometriosis (N809), Epilepsy (G409), Esophageal cancer (C159), Gastric cancer (C169), Glaucoma (H409), Graves' disease (E050), Hematopoietic tumor (C81-96), Liver cancer (C220), Interstitial lung disease/Pulmonary fibrosis (J849, J841), Cerebral infarction (I639), Keloid (L910), Lung cancer (C349), Nephrotic syndrome (N049), Osteoporosis (M8199), Ovarian cancer (C56), Pancreas cancer (C259), Periodontitis (K054), Peripheral artery disease (I709), Hay fever (J301), Prostate cancer (C61), Pulmonary tuberculosis (A169), Rheumatoid arthritis (M0690), Diabetes mellitus (E14), Urolithiasis (N209), Uterine fibroids (D259), Breast cancer (C509) |
|
Targets | genome wide variants | |
Target Loci for Capture Methods | - | |
Platform | Illumina [HumanOmniExpress / HumanExome / OmniExpressExome BeadChip] | |
Source | gDNA extracted from peripheral blood cells | |
Cell Lines | - | |
Reagents (Kit, Version) | HumanOmniExpress / HumanExome / OmniExpressExome BeadChip kit | |
Genotype Call Methods (software) |
Minimac3 [imputation (1000 genomes Phase 3 v5)] GenCall software (GenomeStudio) |
|
Association Analysis (software) | SAIGE(v0.29.4.2) | |
Filtering Methods |
QC after imputation: Exclusion criteria: Variants with imputation quality of Rsq < 0.7 |
|
Marker Number (after QC) |
autosomes: 8,712,794 variants (hg19); X-chromosome: 207,198 variants (hg19) |
|
NBDC Data Set ID | Arrhythmia | hum0014.v17.AR.v1 |
Bronchial asthma | hum0014.v17.BA.v1 | |
Atopic dermatitis* | hum0014.v17.AD.v1 | |
Gallbladder/Cholangiocarcinoma | hum0014.v17.GCc.v1 | |
Cataract | hum0014.v17.Cat.v1 | |
Cerebral aneurysm | hum0014.v17.CA.v1 | |
Cervical cancer | hum0014.v17.CeC.v1 | |
Chronic hepatitis B | hum0014.v17.CHB.v1 | |
Chronic hepatitis C | hum0014.v17.CHC.v1 | |
Chronic obstructive pulmonary disease | hum0014.v17.COPD.v1 | |
Liver cirrhosis | hum0014.v17.Cir.v1 | |
Colorectal cancer | hum0014.v17.CC.v1 | |
Heart failure* | hum0014.v17.HF.v1 | |
Drug eruption | hum0014.v17.DE.v1 | |
Uterine cancer | hum0014.v17.UC.v1 | |
Endometriosis | hum0014.v17.EM.v1 | |
Epilepsy | hum0014.v17.Ep.v1 | |
Esophageal cancer | hum0014.v17.EC.v1 | |
Gastric cancer | hum0014.v17.GC.v1 | |
Glaucoma* | hum0014.v17.Gla.v1 | |
Graves' disease | hum0014.v17.GD.v1 | |
Hematopoietic tumor | hum0014.v17.HT.v1 | |
Liver cancer | hum0014.v17.LiC.v1 | |
Interstitial lung disease/Pulmonary fibrosis | hum0014.v17.IP.v1 | |
Cerebral infarction | hum0014.v17.CI.v1 | |
Keloid | hum0014.v17.Kel.v1 | |
Lung cancer | hum0014.v17.LuC.v1 | |
Nephrotic syndrome | hum0014.v17.NS.v1 | |
Osteoporosis | hum0014.v17.OP.v1 | |
Ovarian cancer | hum0014.v17.OC.v1 | |
Pancreas cancer | hum0014.v17.PaC.v1 | |
Periodontitis | hum0014.v17.PD.v1 | |
Peripheral artery disease | hum0014.v17.PAD.v1 | |
Hay fever | hum0014.v17.Hay.v1 | |
Prostate cancer | hum0014.v17.PrC.v1 | |
Pulmonary tuberculosis | hum0014.v17.PT.v1 | |
Rheumatoid arthritis | hum0014.v17.RA.v1 | |
Diabetes mellitus* | hum0014.v17.DM.v1 | |
Urolithiasis | hum0014.v17.Uro.v1 | |
Uterine fibroids | hum0014.v17.UF.v1 | |
Breast cancer | hum0014.v18.BC.v1 | |
(Click the Data Set ID to download the file) |
||
Total Data Volume |
autosomes: about 0.8-1.3 GB each; X-chromosome: about 20-30 MB each |
|
Comments (Policies) | NBDC policy |
* Data for 4 diseases partially overlap with those of previous releases (Glaucoma [hum0014.v7.POAG.v1], Atrial fibrillation [hum0014.v5.AF.v1], Atopic dermatitis [hum0014.v4.AD.v1], and Diabetes mellitus [hum0014.v3.T2DM-2.v1]).
DATA PROVIDER
Principal Investigator: Michiaki Kubo
Affiliation: RIKEN Center for Integrative Medical Sciences
Project / Group Name: Tailor-made Medical Treatment Program (Bio Bank Japan: BBJ)
URL: https://biobankjp.org/english/index.html
Funds / Grants (Research Project Number) :
Name | Title | Project Number |
---|---|---|
Ministry of Education, Culture, Sports, Science and Technology in Japan | Tailor-made Medical Treatment Program (the 3rd phase) | - |
Tailor-Made Medical Treatment with the BioBank Japan Project (BBJ), Japan Agency for Medical Research and Development (AMED) | Generating large-scale data of genetic polymorphism to identify disease-related genes | 17km0305002h0005 |
PUBLICATIONS
 | Title | DOI | Data Set ID |
---|---|---|---|
1 | A genome-wide association study identifies PLCL2 and AP3D1-DOT1L-SF3A2 as new susceptibility loci for myocardial infarction in Japanese. | doi: 10.1038/ejhg.2014.110 | hum0014.v1.freq.v1 |
2 | A functional variant in ZNF512B is associated with susceptibility to amyotrophic lateral sclerosis in Japanese. | doi: 10.1093/hmg/ddr268 | hum0014.v2.jsnp.92als.v1 |
3 | Functional variants in ADH1B and ALDH2 coupled with alcohol and smoking synergistically enhance esophageal cancer risk. | doi: 10.1053/j.gastro.2009.07.070 | hum0014.v2.jsnp.182ec.v1 |
4 | SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. | doi: 10.1038/ng.208 | T2DM (JSNP) |
5 | Common variants in a novel gene, FONG on chromosome 2q33.1 confer risk of osteoporosis in Japanese. | doi: 10.1371/journal.pone.0019641 | Osteoporosis (JSNP) |
6 | Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes. | doi: 10.1038/ncomms10531 | |
7 | Multi-ancestry genome-wide association study of 21,000 cases and 95,000 controls identifies new risk loci for atopic dermatitis. | doi: 10.1038/ng.3424 | hum0014.v4.AD.v1 |
8 | Genome-wide association study identifies eight new susceptibility loci for atopic dermatitis in the Japanese population. | doi: 10.1038/ng.2438 | hum0014.v4.AD.v1 |
9 | Identification of six new genetic loci associated with atrial fibrillation in the Japanese population. | doi: 10.1038/ng.3842 | hum0014.v5.AF.v1 |
10 | Genome-wide association study identifies 112 new loci for body mass index in the Japanese population. | doi: 10.1038/ng.3951 | |
11 | Genome-wide association study identifies seven novel susceptibility loci for primary open-angle glaucoma. | doi: 10.1093/hmg/ddy053 | hum0014.v7.POAG.v1 |
12 | Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. | doi: 10.1038/s41588-018-0047-6 | |
13 | Elucidating the genetic architecture of reproductive ageing in the Japanese population | doi: 10.1038/s41467-018-04398-z | |
14 | Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese. | doi: 10.1038/s41467-018-03274-0 | JGAD000220 |
15 | Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls. | doi: 10.1038/s41467-018-06581-8 | JGAD000209 |
16 | A Variant within the FTO confers susceptibility to diabetic nephropathy in Japanese patients with type 2 diabetes | doi: 10.1371/journal.pone.0208654 | hum0014.v12.T2DMwN.v1 |
17 | Identification of 28 new susceptibility loci for type 2 diabetes in the Japanese population | doi: 10.1038/s41588-018-0332-4 | hum0014.v13.T2DMmeta.v1 |
18 | GWAS of smoking behaviour in 165,436 Japanese people reveals seven new loci and shared genetic architecture. | doi: 10.1038/s41562-019-0557-y | |
19 | Characterizing rare and low-frequency height-associated variants in the Japanese population | doi: 10.1038/s41467-019-12276-5 | |
20 | Germline pathogenic variants in 7,636 Japanese patients with prostate cancer and 12,366 controls. | doi: 10.1093/jnci/djz124 | JGAD000288 |
21 | In submission | | |
USERS (Controlled-Access Data)
Principal Investigator: | Affiliation: | Data in Use (Data Set ID) | Period of Data Use |
---|---|---|---|
Mark Daly | Broad Institute of MIT and Harvard | JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220 | 2018/09/11-2023/07/31 |
Yukinori Okada | Department of Statistical Genetics, Osaka University Graduate School of Medicine | JGAD000101, JGAD000102, JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220 | 2018/09/20-2021/03/31 |
Shigeo Kamitsuji | Statistical Analysis Division, StaGen Co., Ltd. | JGAD000123 | 2018/10/04-2019/03/31 |
Katsushi Tokunaga | Department of Human Genetics, Graduate School of Medicine, The University of Tokyo | JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220 | 2018/11/13-2026/11/08 |
Tatsuhiko Tsunoda | Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University | JGAD000123, JGAD000124, JGAD000144-JGAD000201, JGAD000220 | 2018/12/18-2020/03/31 |
Liming Liang | Harvard T.H. Chan School of Public Health, Department of Epidemiology | JGAD000123 | 2019/01/21-2021/12/31 |
Seishi Ogawa | Department of Pathology and Tumor Biology, Graduate School of Medicine, Kyoto University | JGAD000209 | 2019/02/04-2021/03/31 |
Shigeo Kamitsuji | Statistical Analysis Division, StaGen Co., Ltd. | JGAD000123 | 2019/03/13-2022/03/31 |
Takashi Kohno | National Cancer Research Institute, Division of Genome Biology | JGAD000123, JGAD000124, JGAD000220 | 2019/04/15-2019/12/31 |
Shigeo Horie | Department of Urology, Juntendo University, Graduate School of Medicine | JGAD000123, JGAD000220 | 2019/05/14-2024/03/31 |
Kengo Kinoshita | Tohoku Medical Megabank Organization | JGAD000220 | 2019/06/24-2022/03/31 |
Kouya Shiraishi | Division of Genome Biology, National Cancer Research Institute | JGAD000124 | 2019/08/05-2023/03/31 |
Shigeo Kamitsuji | Statistical Analysis Division, StaGen Co., Ltd. | JGAD000123, JGAD000124, JGAD000146, JGAD000148, JGAD000149, JGAD000155, JGAD000156, JGAD000157, JGAD000174, JGAD000188 | 2019/08/16-2024/03/31 |
Shigeo Kamitsuji | Statistical Analysis Division, StaGen Co., Ltd. | JGAD000123, JGAD000124, JGAD000144-JGAD000201 | 2019/08/22-2024/03/31 |
Osamu Ogasawara | Bioinformation and DDBJ Center, National Institute of Genetics | JGAD000123, JGAD000220 | 2019/10/11-2024/03/31 |
Seishi Ogawa | Department of Medical Science, Kyoto University | JGAD000102, JGAS000123, JGAD000220 | 2019/11/14-2024/03/31 |