NBDC Research ID: hum0311.v2


 

SUMMARY

Aims: BioBank Japan (BBJ) is a biobank established at the Institute of Medical Science, the University of Tokyo, to collect clinical information and biological materials (DNA and serum samples). In collaboration with 12 medical centers, it enrolled about 200 thousand participants with 47 diseases starting in 2003 (BBJ 1st cohort) and 67 thousand participants with 38 diseases starting in 2013 (BBJ 2nd cohort). This project aims to further utilize the materials and the clinical and genomic information managed by BBJ to contribute to precision medicine by storing, managing, and providing the materials and data, and by identifying biomarkers associated with disease risk, prognosis, and drug sensitivity.

Methods: Asian Screening Array (ASA-24v1-0_A2); imputation results based on TOPMed r2 (GRCh38)

Participants/Materials: 11,716 + 180,882 patients from BBJ 1st cohort and 42,689 patients from BBJ 2nd cohort

URL: https://biobankjp.org/en/index.html

 

Data Set ID | Type of Data | Criteria | Release Date
JGAS000412 | Genotype data for 11,716 patients from BBJ 1st cohort and 42,689 patients from BBJ 2nd cohort | Controlled Access (Type I) | 2021/11/30
JGAS000541 | Imputation data and index data for 180,882 patients from BBJ 1st cohort | Controlled Access (Type I) | 2022/07/28


*Data users must submit an application for using NBDC Human Data to access the controlled-access data.

 

MOLECULAR DATA

JGAS000412

Participants/Materials

11,716 patients from BBJ 1st cohort and 42,689 patients from BBJ 2nd cohort

ICD10: C34, C15, C16, C18-C21, C22, C25, C23, C24, C61, C50, C53, C54, C56, C81-C86, C90-C93, I63, G45.9, I67.1, G40, J45, A15-A19, J44.9, J84.1-9, I21.0-9, I20.0, I20.1, I20.8, I20.9, R00, I44.0-3, I45.5-6, I47-I49, I50, I70.9, B18.1, B18.2, K74.6, N04, N20-N23, M80-M81, E10, E11, E88.8, E78.0-5, E78.8-9, E05.0, M05-M06, J30.1, L91.0, L20, L51.1-2, L27.0, D25, N80, R56.0, H40, H25-H26, K05, G12.2, I61, C64

Targets: genome-wide SNPs
Target Loci for Capture Methods: -
Platform: Illumina [Asian Screening Array (ASA-24v1-0_A2)]
Library Source: DNAs extracted from peripheral blood cells or saliva
Cell Lines: -
Reagents (Kit, Version): Infinium Asian Screening Array-24 v1.0 BeadChip Kit
Genotype Call Methods (software): GenomeStudio
Marker Number (after QC): 657,060 SNVs (GRCh38)
Japanese Genotype-phenotype Archive Data set ID: JGAD000529
Total Data Volume: 1,020 GB (idat, csv, plink binary)
Comments (Policies): NBDC policy

 

JGAS000541

Participants/Materials

180,882 patients from BBJ 1st cohort

ICD10: A15-A16, B16-B17.0, B18.0-B18.1, B17.1, B18.2, C15, C16, C18, C22, C23-C24, C25, C33-C34, C50, C53, C54, C56, C61, C81, D25, E05, E10, E78.0-E78.5, G12, G40-G41, H25-H26, H40-H42, I20, I21-I22, I44-I49, I50, I60, I69.0, I63, I69.3, I70, J30, J41-J44, J45-J46, J80-J84, K05, K74.3-K74.6, L00-L99, L20, M05-M06, M80-M82, N04, N20-N23, N80, R00-R9

Targets: genome-wide SNVs
Target Loci for Capture Methods: -
Platform: Illumina [HumanOmniExpressExome, HumanOmniExpress, HumanExome]
Library Source: DNAs extracted from peripheral blood cells or saliva
Cell Lines: -
Reagents (Kit, Version): HumanOmniExpressExome-8, HumanOmniExpress-12, HumanExome-12 kit
Genotype Call Methods (software):

    - GenomeStudio Software

    - Eagle software (v2.4.1) without a reference panel

    - Minimac4 software (v1.0.2)

Reference Genome Sequence: TOPMed reference panel (Version R2 on GRCh38)
Filtering Methods

Before imputation, we excluded SNPs using the following criteria:

    - Heterozygosity count for each chip < 5

    - P-value for Hardy–Weinberg equilibrium (HWE) for each chip < 1.0 x 10^-6 *

    - Genotype concordance rate with whole-genome sequencing (WGS) for 939 samples < 99.5% and its non-reference discordance rate >= 0.5%

    - For SNPs at the same position when merging datasets, the SNP with the lower call rate

    - Call rate < 99%

   * P-values for chrX SNPs were calculated by using female samples
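The SNP exclusion criteria listed above can be sketched as a small filter. This is a minimal illustration, not the providers' pipeline: the `VariantStats` fields and all function names are hypothetical, rates are assumed to be stored as fractions, and the concordance criterion is read literally (both conditions must hold). The record does not say which HWE test was used, so `hwe_chisq_p` shows only the common chi-square approximation as an assumption.

```python
from dataclasses import dataclass
from math import erfc, sqrt


@dataclass
class VariantStats:
    """Per-chip summary statistics for one SNP (hypothetical field names)."""
    chrom: str
    pos: int
    het_count: int             # number of heterozygous genotype calls
    hwe_p: float               # HWE p-value (female samples only for chrX)
    wgs_concordance: float     # concordance with WGS, as a fraction
    nonref_discordance: float  # non-reference discordance, as a fraction
    call_rate: float           # fraction of samples with a genotype call


def hwe_chisq_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) HWE p-value; an assumed approximation, not an exact test."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: nothing to test
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2.0))


def passes_variant_qc(v: VariantStats) -> bool:
    """Return True if the SNP survives the per-SNP thresholds listed above."""
    if v.het_count < 5:
        return False
    if v.hwe_p < 1.0e-6:
        return False
    # Literal reading of the criterion: both conditions must hold to exclude.
    if v.wgs_concordance < 0.995 and v.nonref_discordance >= 0.005:
        return False
    if v.call_rate < 0.99:
        return False
    return True


def keep_best_per_position(variants):
    """When SNPs share a position after merging, keep the higher call rate."""
    best = {}
    for v in variants:
        key = (v.chrom, v.pos)
        if key not in best or v.call_rate > best[key].call_rate:
            best[key] = v
    return list(best.values())
```

A caller would typically apply `keep_best_per_position` to the merged variant list and then keep only entries for which `passes_variant_qc` returns True; the order of these two steps is itself an assumption, since the record does not specify it.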

We also excluded samples using the following criteria:

   - Call Rate < 98%

   - Samples whose inferred sex was not matched with the clinical information

   - For duplicated samples or monozygotic twins in the dataset, the sample with the lower call rate

   - Outliers from East Asian clusters from principal component analysis with 1KGp3v5 samples.
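The sample exclusion criteria above can be sketched in the same spirit. This is a hedged illustration only: `SampleStats`, its fields, and `filter_samples` are hypothetical names, the principal component analysis outlier call is assumed to be precomputed (the record gives no numeric threshold), and resolving duplicates after the other filters is an assumed ordering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleStats:
    """Per-sample QC inputs (hypothetical field names)."""
    sample_id: str
    call_rate: float                 # fraction of SNPs with a genotype call
    inferred_sex: str                # sex inferred from genotype data
    reported_sex: str                # sex from the clinical information
    is_pca_outlier: bool             # precomputed: outside the East Asian cluster
    dup_group: Optional[str] = None  # shared ID for duplicates or MZ twins


def filter_samples(samples):
    """Return the samples that survive the exclusion criteria listed above."""
    passing = [
        s for s in samples
        if s.call_rate >= 0.98
        and s.inferred_sex == s.reported_sex
        and not s.is_pca_outlier
    ]

    # Within each duplicate or monozygotic twin group, keep the sample with
    # the highest call rate and drop the others.
    best_in_group = {}
    for s in passing:
        if s.dup_group is None:
            continue
        current = best_in_group.get(s.dup_group)
        if current is None or s.call_rate > current.call_rate:
            best_in_group[s.dup_group] = s

    return [
        s for s in passing
        if s.dup_group is None or best_in_group[s.dup_group] is s
    ]
```

Taking the outlier flag as an input keeps the sketch free of any invented clustering rule; in practice that flag would come from a separate analysis with the 1KGp3v5 reference samples.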

Marker Number (after QC):

    - autosomes: 515,587 SNVs (GRCh38)

    - X-chromosome: 11,140 SNVs (GRCh38)

Japanese Genotype-phenotype Archive Data set ID: JGAD000660
Total Data Volume: 11.1 TB (vcf, tbi)
Comments (Policies): NBDC policy

 

DATA PROVIDER

Principal Investigator: Koichi Matsuda

Affiliation: Graduate School of Frontier Sciences, The University of Tokyo

Project / Group Name: Management of disease-oriented biobank in Japan for utilization

URL: https://biobankjp.org/en/index.html

Funds / Grants (Research Project Number):

Name | Title | Project Number
Biobank - Construction and Utilization biobank for genomic medicine REalization (B-Cure), Japan Agency for Medical Research and Development (AMED) | Management of disease-oriented biobank in Japan for utilization | JP20km0605001
Platform Program for Promotion of Genome Medicine, Japan Agency for Medical Research and Development (AMED) | Phenotype-wide association study of 180,000 Biobank Japan samples using high density imputation of TOPMED reference panel | JP21km0405215

 

PUBLICATIONS

Title | DOI | Data Set ID
1
2

 

USERS (Controlled-Access Data)

Principal Investigator | Affiliation | Research Title | Data in Use (Data Set ID) | Period of Data Use