To determine the genomic position and strand of the probe sequences on the GWAS arrays, we aligned them to the reference genome using BLAST.
The uploaded files were generated for the genotype data of the Biobank Japan (Study ID: JGAS00000000114).
These results can be used to convert genotypes to the forward strand of the reference genome.
BLAST version: 2.2.26
Reference version: GRCh37.3
We ran BLAST using the following command:

    blastall \
      -F F \
      -i ${TopGenomicSequence_in_manifest_file} \
      -p blastn \
      -m 0 \
      -e 1e-40 \
      -d ${ncbi_build37.3} \
      -o ${result}

After that, we excluded the information by the following criteria:
| File name | No. of variants |
|---|---|
| HumanOmniExpressExome-8v1_A_FullBlastInformation.txt | 951,116 |
| HumanOmniExpressExome-8v1-2_A_FullBlastInformation.txt | 964,193 |
| HumanOmniExpress-12v1_J_FullBlastInformation.txt | 731,442 |
| HumanExome-12v1_A_FullBlastInformation.txt | 247,870 |
| HumanExome-12v1-1_A_FullBlastInformation.txt | 242,901 |
| Column no. | Column name | Description |
|---|---|---|
| 1 | #ID | Variant ID in the Illumina manifest file |
| 2 | MapType | Please see the descriptions shown in the table below |
| 3 | BlastChr | Chromosome estimated by BLAST |
| 4 | BlastPos | Chromosomal position estimated by BLAST |
| 5 | IlmnAllele1 | Allele1 in the Illumina manifest file (shown as ‘A’ in FinalReport) |
| 6 | IlmnAllele2 | Allele2 in the Illumina manifest file (shown as ‘B’ in FinalReport) |
| 7 | ForwardAllele1 | Allele1 estimated by BLAST (forward strand of the reference, shown as ‘A’ in FinalReport) |
| 8 | ForwardAllele2 | Allele2 estimated by BLAST (forward strand of the reference, shown as ‘B’ in FinalReport) |
| 9 | NewID | Variant ID in the source database (1000 Genomes Project phase 3 or dbSNP144) |
| 10 | Source | Source of NewID (1000 Genomes Project phase 3 or dbSNP144) |
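
As an illustration of how these columns might be used, the following minimal Python sketch translates an ‘A’/‘B’ genotype from a FinalReport into forward-strand alleles using ForwardAllele1 and ForwardAllele2. It assumes that the files are tab-delimited and that the first line is a header with the column names listed above; neither is stated here, so please check the actual files. The function names and the example row (`demo_var1`, `rs_demo1`) are hypothetical.

```python
import csv
import io


def load_forward_alleles(handle):
    """Return {variant ID: (ForwardAllele1, ForwardAllele2)}.

    Assumes a tab-delimited file with a header line (an assumption).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["#ID"]: (row["ForwardAllele1"], row["ForwardAllele2"])
        for row in reader
    }


def ab_to_forward(ab_genotype, forward_pair):
    """Translate an A/B genotype such as 'AB' into forward-strand alleles.

    'A' maps to ForwardAllele1 and 'B' to ForwardAllele2. Any other
    character (for example a no-call symbol) raises KeyError.
    """
    lookup = {"A": forward_pair[0], "B": forward_pair[1]}
    return "".join(lookup[call] for call in ab_genotype)


# Made-up example row, for illustration only.
example = (
    "#ID\tMapType\tBlastChr\tBlastPos\tIlmnAllele1\tIlmnAllele2\t"
    "ForwardAllele1\tForwardAllele2\tNewID\tSource\n"
    "demo_var1\tsinglemap\t1\t12345\tA\tG\tT\tC\trs_demo1\tdbSNP144\n"
)
alleles = load_forward_alleles(io.StringIO(example))
print(ab_to_forward("AB", alleles["demo_var1"]))  # prints TC
```

For a real file, `open(path, newline="")` can be passed in place of the `io.StringIO` object. No-calls are not handled in this sketch and would need their own rule.
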
We classified the BLAST results as follows:
| MapType | Definition |
|---|---|
| unmap | Probe sequence was not aligned to the reference sequence |
| multimap | Probe sequence was aligned to the reference sequence at two or more locations |
| singlemap | Probe sequence was uniquely aligned to the reference sequence |
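
The MapType values above can be used to select variants before strand conversion. The sketch below counts variants per MapType and keeps the IDs of those classified as `singlemap`. Restricting to `singlemap` is only one possible choice, not a rule stated here. As before, a tab-delimited file with a header line is assumed, and the function name and example rows are hypothetical (the example is trimmed to two columns for brevity).

```python
import csv
import io
from collections import Counter


def split_by_maptype(handle):
    """Count variants per MapType and collect the IDs of 'singlemap' variants.

    Assumes a tab-delimited file with a header line (an assumption).
    """
    counts = Counter()
    singlemap_ids = []
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[row["MapType"]] += 1
        if row["MapType"] == "singlemap":
            singlemap_ids.append(row["#ID"])
    return counts, singlemap_ids


# Made-up rows, trimmed to the two columns this sketch reads.
example = (
    "#ID\tMapType\n"
    "demo_var1\tsinglemap\n"
    "demo_var2\tunmap\n"
    "demo_var3\tmultimap\n"
    "demo_var4\tsinglemap\n"
)
counts, singlemap_ids = split_by_maptype(io.StringIO(example))
print(singlemap_ids)       # prints ['demo_var1', 'demo_var4']
print(counts["multimap"])  # prints 1
```
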
Japan Genotype-phenotype Archive (JGA)
National Bioscience Database Center (NBDC)
The Biobank Japan
RIKEN Center for Integrative Medical Sciences
BLAST