The Biobank Japan project genotype data

Overall description:

These files contain genotype data of Biobank Japan participants in GenomeStudio Final Report format.
In total, genome-wide genotype data from 182,557 samples are available.
Please note that 52 samples were genotyped with the Exome array only.
Phenotype information has also been uploaded via the Japan Genotype-phenotype Archive (JGA) with the support of the National Bioscience Database Center (NBDC).
Please note that these are controlled-access data (please see the NBDC Data Sharing Policy).

Methodological description:

All DNA samples were provided by the Biobank Japan at the Institute of Medical Science, The University of Tokyo.
We performed whole-genome genotyping using Illumina’s OmniExpressExome microarray (HumanOmniExpressExome-8 v1.0 or v1.2) or a combination of the OmniExpress (HumanOmniExpress-12 v1.0) and Exome (HumanExome-12 v1.0 or v1.1) microarrays.
All genotyping experiments were performed according to the manufacturer’s protocol.
After genotyping, we imported the idat files into GenomeStudio software (version 2011.1.0.24550) using the manifest file for each microarray.

Manifest files used for each microarray are listed below:

We performed the following quality control procedures for samples and SNPs in GenomeStudio, using a separate project constructed for each microarray. We first excluded samples with a call rate below 0.98, based on the standard cluster file provided by Illumina.
We then reclustered all SNPs using only the samples with a call rate of 0.98 or higher.
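
The sample-level step can be sketched as follows. This is a minimal illustration, not the GenomeStudio procedure itself: the function name, the sample IDs and the call-rate values are hypothetical, and only the 0.98 threshold comes from the text.

```python
CALL_RATE_MIN = 0.98  # threshold stated in the text


def split_by_call_rate(call_rates, threshold=CALL_RATE_MIN):
    """Split sample IDs into (kept, excluded) by call rate.

    Samples with a call rate exactly equal to the threshold are kept,
    matching "0.98 or more" in the description.
    """
    kept = [s for s, rate in call_rates.items() if rate >= threshold]
    excluded = [s for s, rate in call_rates.items() if rate < threshold]
    return kept, excluded


# Hypothetical sample IDs and call rates, for illustration only.
example_call_rates = {"S1": 0.995, "S2": 0.979, "S3": 0.98}
kept, excluded = split_by_call_rate(example_call_rates)
print(kept)      # ['S1', 'S3']
print(excluded)  # ['S2']
```

Only the kept samples would then be used for reclustering.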

After the reclustering procedure, we excluded SNPs that did not meet all of the following quality control criteria:

  1. Cluster Sep ≥ 0.4,
  2. AA_R Mean ≥ 0.25 and AB_R Mean ≥ 0.25 and BB_R Mean ≥ 0.25,
  3. 0.2 ≤ AB_T Mean ≤ 0.8,
  4. -0.3 ≤ Het_Excess ≤ 0.2, and
  5. Call freq ≥ 0.99.
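
The five criteria can be expressed as a single predicate. The sketch below assumes that a SNP is retained only when it satisfies every criterion, and that the per-SNP statistics are already available as a dictionary. The key names and the example SNPs are hypothetical, not GenomeStudio column names. How SNPs with missing cluster statistics were handled is not described, so this sketch expects all values to be present.

```python
def passes_snp_qc(snp):
    """Return True if the SNP satisfies all five listed criteria."""
    return (
        snp["cluster_sep"] >= 0.4                # 1. Cluster Sep
        and snp["aa_r_mean"] >= 0.25             # 2. AA_R Mean
        and snp["ab_r_mean"] >= 0.25             #    AB_R Mean
        and snp["bb_r_mean"] >= 0.25             #    BB_R Mean
        and 0.2 <= snp["ab_t_mean"] <= 0.8       # 3. AB_T Mean
        and -0.3 <= snp["het_excess"] <= 0.2     # 4. Het_Excess
        and snp["call_freq"] >= 0.99             # 5. Call freq
    )


# Hypothetical SNP statistics, for illustration only.
good_snp = {
    "cluster_sep": 0.75, "aa_r_mean": 0.9, "ab_r_mean": 1.0,
    "bb_r_mean": 0.95, "ab_t_mean": 0.5, "het_excess": 0.01,
    "call_freq": 0.999,
}
# Same SNP, but with poorly separated clusters (fails criterion 1).
bad_snp = dict(good_snp, cluster_sep=0.2)

print(passes_snp_qc(good_snp))  # True
print(passes_snp_qc(bad_snp))   # False
```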

After the quality control procedures for samples and SNPs described above, we exported a Final Report from GenomeStudio software for each genotyping microarray.
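
For readers who want to load these files, the following is a minimal reading sketch. It assumes the usual Final Report layout: a `[Header]` section of key/value lines, followed by a `[Data]` section that starts with a column row. The comma delimiter, the header keys and the data columns in the example are assumptions for illustration; the exact columns exported for these files are not stated here.

```python
import csv
import io


def read_final_report(handle, delimiter=","):
    """Return (header dict, lazy iterator over data rows as dicts).

    Rows are yielded lazily because real Final Report files hold one
    row per SNP per sample and can be very large.
    """
    header = {}
    for raw in handle:
        line = raw.strip()
        if line == "[Data]":
            break  # the column row and data rows follow
        if not line or line == "[Header]":
            continue
        key, _, value = line.partition(delimiter)
        header[key.strip()] = value.strip()
    # The handle now points at the column row of the [Data] section.
    return header, csv.DictReader(handle, delimiter=delimiter)


# Hypothetical miniature file, for illustration only.
EXAMPLE = """[Header]
Num SNPs,2
Num Samples,1
[Data]
SNP Name,Sample ID,Allele1 - Top,Allele2 - Top
rs0000001,SAMPLE_A,A,G
rs0000002,SAMPLE_A,C,C
"""

header, rows = read_final_report(io.StringIO(EXAMPLE))
records = list(rows)
print(header["Num SNPs"])      # 2
print(records[0]["SNP Name"])  # rs0000001
```

For a file on disk, pass a handle opened with `open(path, newline="")` and iterate over the returned rows instead of collecting them into a list.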

Please see the Infinium® Genotyping Data Analysis technical note (https://www.illumina.com/Documents/products/technotes/technote_infinium_genotyping_data_analysis.pdf) for further details of the sample and SNP quality control procedures.
Lists of genotyped SNPs and related information (manifest, probe information, etc.) for each microarray are available on the manufacturer’s website (https://support.illumina.com/downloads.html).

Uploaded files:

■ Autosomal variants:

File name               Genotyping array               No. of samples  No. of variants
OE13_OEEv10_Cutoff.csv  HumanOmniExpressExome-8v1_A    34,739          951,117
OE13_OEEv12_Cutoff.csv  HumanOmniExpressExome-8v1-2_A  112,888         964,193
OE13_OE_Cutoff.csv      HumanOmniExpress-12v1_J        34,878          730,525
OE13_HEv10_Cutoff.csv   HumanExome-12v1_A              13,421          247,870
OE13_HEv11_Cutoff.csv   HumanExome-12v1-1_A            21,112          242,901

■ X-chromosomal variants:

File name                    Genotyping array               No. of samples  No. of variants
OE13_ChrX_OEEv10_Cutoff.csv  HumanOmniExpressExome-8v1_A    34,739          22,394
OE13_ChrX_OEEv12_Cutoff.csv  HumanOmniExpressExome-8v1-2_A  112,888         22,927
OE13_ChrX_OE_Cutoff.csv      HumanOmniExpress-12v1_J        34,878          18,055
OE13_ChrX_HEv10_Cutoff.csv   HumanExome-12v1_A              13,421          5,205
OE13_ChrX_HEv11_Cutoff.csv   HumanExome-12v1-1_A            21,112          5,105
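
The per-file sample counts can be cross-checked against the totals in the overall description. One reading, which is an assumption rather than something the text states, is that the 182,557 total counts each OmniExpressExome and OmniExpress sample once and adds the 52 samples genotyped with the Exome array only. The arithmetic below is consistent with that reading; the dictionary keys are shorthand labels taken from the file names.

```python
# Sample counts copied from the tables above.
samples = {
    "OEEv10": 34_739,
    "OEEv12": 112_888,
    "OE": 34_878,
    "HEv10": 13_421,
    "HEv11": 21_112,
}
EXOME_ONLY = 52  # stated in the overall description

# Samples with genome-wide array data (OmniExpressExome or OmniExpress).
genome_wide = samples["OEEv10"] + samples["OEEv12"] + samples["OE"]
# Samples with Exome array data (HumanExome v1.0 or v1.1).
exome = samples["HEv10"] + samples["HEv11"]

print(genome_wide)               # 182505
print(genome_wide + EXOME_ONLY)  # 182557
print(exome)                     # 34533
```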

Link:

Japan Genotype-phenotype Archive (JGA)
National Bioscience Database Center (NBDC)
The Biobank Japan
RIKEN Center for Integrative Medical Sciences

Reference:

If you use the genotype data, please cite the following paper:

  1. Akiyama et al. Genome-wide association study identifies 112 loci for body mass index in the Japanese population. Nat. Genet. (2017). doi: 10.1038/ng.3951

To reference the BioBank Japan project, please cite the following papers:

  1. Nagai et al. Overview of the BioBank Japan Project: Study design and profile. J. Epidemiol. 27, 2–8 (2017).

  2. Hirata et al. Cross-sectional analysis of BioBank Japan clinical data: A large cohort of 200,000 patients with 47 common diseases. J. Epidemiol. 27, 9–21 (2017).