June 10, 2019
Ver. 4.0
Introduction
The National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency operates the NBDC Human Database in accordance with the NBDC Guidelines for Human Data Sharing (hereinafter, the Data Sharing Guidelines). This “NBDC Security Guidelines for Human Data (for Data Users)” (hereinafter, the User Security Guidelines) provides the minimum set of requirements that should be fulfilled in order to safely utilize controlled-access data defined in the Data Sharing Guidelines for the purpose of research activities, while protecting data confidentiality.
The controlled-access data may contain data that could be used to identify individuals in combination with other information. Therefore, measures must be implemented as required for the security level (standard-level (Type I) or high-level (Type II)) designated by a data submitter for each data set.
Because the information technology (IT) environments surrounding data users are diverse and ever-changing, merely complying with the User Security Guidelines may not be sufficient for data security. Data users are responsible for thoroughly understanding the IT environments in which they store and perform calculations on controlled-access data, and for taking additional security measures as deemed necessary, e.g., by referring to the security rules defined by the administrator of each IT environment as well as other guidelines[1][2].
The User Security Guidelines will be updated appropriately in response to IT developments.
1. Definitions
- Controlled-access data
- The “controlled-access data” as defined in the Data Sharing Guidelines.
- Principal investigator (PI)
- The “PI” as defined in the Data Sharing Guidelines.
- Data user
- The “data user” as defined in the Data Sharing Guidelines.
- Data server (see Figure 1)
- A computer for data users to store and calculate controlled-access data, owned by the data user or the organization to which data users belong, or the “available server outside of affiliated organization (hereinafter, “off-premise-server”)” as defined in the Data Sharing Guidelines. In the IT environment including the data server, it is necessary to satisfy the following (1) to (4) as preconditions (except in the case of using only the off-premise-server).
- (1) Devices with high mobility that are at high risk of loss or theft, such as notebook PCs, are not used.
- (2) The equipment of the data server and the storage device / medium storing the data are managed by the organization that owns them.
- (3) When the data server is installed in a LAN, the LAN must be owned by the data users’ affiliated organization. In addition, on the LAN in which the data server is installed (hereinafter, data server-installed LAN), the network administrator must install a firewall that restricts communication between the external network and the data server-installed LAN, and access from/to the outside must be kept to the necessary minimum (e.g., by limiting the IP addresses and ports of the source and destination) to maintain high security.
- (4) In the data server-installed LAN, if there is a computer used by a person other than the data users, the communication with other computers is appropriately managed by the firewall function.
- Data access terminal (see Figure 1)
- A device that is for data users to access data in the data server and that does not permanently save the data locally. When transmitting data between the data access terminal and the data server, via a communication path outside the data server-installed LAN, it is necessary that all communication paths are encrypted with sufficient strength or that the data themselves are encrypted before being transmitted.
Figure 1 Data server-installed LAN, off-premise-server, data server and data access terminal
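Precondition (3) in the data server definition asks that access from/to the outside be limited, for example by source and destination IP address and port. The following is a minimal sketch of that allowlist idea, not an actual firewall configuration. The `AllowRule` type, the `is_allowed` function, and all addresses (taken from the ranges reserved for documentation) are hypothetical and chosen only for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    """One permitted flow: source network, destination network, destination port."""
    source: str       # CIDR notation, e.g. "198.51.100.0/24"
    destination: str  # CIDR notation, e.g. "192.0.2.10/32"
    port: int


# Hypothetical rule set: only SSH from one external network to the data server.
ALLOW_RULES = [
    AllowRule(source="198.51.100.0/24", destination="192.0.2.10/32", port=22),
]


def is_allowed(src_ip, dst_ip, dst_port, rules=ALLOW_RULES):
    """Return True only if some rule matches source, destination and port.

    Anything not explicitly listed is denied (default deny).
    """
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (src in ipaddress.ip_network(rule.source)
                and dst in ipaddress.ip_network(rule.destination)
                and dst_port == rule.port):
            return True
    return False
```

In this sketch a connection is rejected unless all three attributes match a listed rule, which is one way to read "kept to the necessary minimum". The real rules are set by the network administrator on the firewall itself.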
2. Measures to Be Taken under Standard-Level (Type I) Security
2.1 Basic Rules for Data Use
Data users must use the controlled-access data based on the following basic rules.
- Data users must store the controlled-access data in the data server and, in principle, must not move the data outside of the server.
- In cases where it is unavoidable to temporarily move the controlled-access data outside of the data server but within the data server-installed LAN, data users must delete the data outside of the server promptly after use in a way that does not allow restoration.
- Data users must not duplicate data except for the following cases. In any case, however, duplicated data must be deleted promptly after use in a way that does not allow restoration.
- Creation of a backup copy of the data
- Temporary duplication for data transfer
- Temporary duplication performed by software
- Access to the controlled-access data is granted exclusively to the data users and must be conducted solely from data servers or data access terminals.
- Because IT environments surrounding data users are diverse and ever-changing, complying with these guidelines alone does not necessarily guarantee data security. Therefore, data users must thoroughly understand the IT environment used for data storage and data calculation, and take additional security measures as deemed necessary based on the security rules specified by the administrator of each IT environment and other guidelines[1][2].
2.2 What the Principal Investigator Must Do
Data use in general
- The PI should ensure that all data users fully understand and comply with the User Security Guidelines (For Data Users).
- The PI should confirm that the data users have received information security training provided by their affiliated organization or the like.
- The PI should keep a record of information regarding data users and the data server (including information on the data storage place in the file system) in an electronic file or the like accessible to only data users, and update the record every time a change occurs. The record must be managed so that the update history can be reviewed.
- The PI should accept an audit conducted by the NBDC Human Data Review Board or a third party commissioned by the NBDC with regard to the state of implementation of security measures.
- The PI should submit Form 5 (Checklist for the NBDC Security Guidelines for Human Data) to the NBDC Human Data Review Board Office at the time of application for data use and, in principle, every August thereafter. However, if the end of the first August comes within six months from the starting date of the data use, the submission of the form in that August may be exempted.
- In case of a security incident such as data breach, the PI must follow the procedure described in “Responsibilities of Data Users” in the Data Sharing Guidelines and take measures such as notification to the NBDC.
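The Form 5 timing rule in the list above (submission every August, with a possible exemption when the end of the first August falls within six months of the start of data use) can be sketched as date arithmetic. This is only one reading of the rule: treating the boundary as inclusive and clamping month ends are assumptions, and the function names `add_months` and `first_required_august` are hypothetical. The NBDC Human Data Review Board Office remains the authority on any actual deadline.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month
    (e.g. 31 August + 6 months -> 28/29 February). Assumption for this sketch.
    """
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def first_required_august(start):
    """Return the year of the first August in which Form 5 would be due.

    Assumed reading: the "first August" is the first one ending on or after
    the start date; it may be skipped if its last day falls within six
    months (inclusive) of the start date.
    """
    year = start.year if start <= date(start.year, 8, 31) else start.year + 1
    end_of_august = date(year, 8, 31)
    if end_of_august <= add_months(start, 6):
        return year + 1  # first August exempted
    return year
```

For example, under these assumptions a start date of 15 March 2019 gives a first required submission in August 2020, while a start date of 15 February 2019 gives August 2019.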
Data server
When an “off-premise-server” is used, the PI must clarify the division of responsibilities with the “off-premise-server” by means of the server usage rules, etc.
- The PI should prepare for a server (including a virtual server) and file system that are dedicated to the study as described in the Application Form for Data Use. When there is no choice but to use a server shared with other persons who are not data users, the access to the folders containing the controlled-access data should be limited only to the data users.
- If a computer used by a person other than data users exists within the data server-installed LAN, the PI should at least enable the firewall functions provided by the operating system (OS) (e.g., iptables in Linux) and restrict communication from/to the inside of the data server-installed LAN properly.
- The PI should not allow sharing of a user ID or a password for the data server, even among data users. In addition, the PI must set a sufficiently strong password that cannot be guessed by others. (It must be at least 8 characters long. It is desirable to combine numbers, upper case letters, lower case letters and symbols. Do not use those that are easy to guess such as name, phone number, birthday, and the like.)
- The PI should apply the latest security patches insofar as possible for all software installed on the data server.
- The PI should not install unnecessary software, particularly, file sharing software (also called file exchange or P2P software; e.g., Winny, BitTorrent).
- The PI should install antivirus software, and run a virus scan immediately whenever a file is moved onto the data server from outside. The PI must keep the antivirus software and virus definition file up to date.
- The PI should, as far as possible, not start unnecessary processes, e.g., when the OS boots up.
- As a security monitoring measure, it is desirable that the PI periodically acquire and analyze various logs of the data server.
- When discarding a device that saved the controlled-access data, the PI should initialize the data storage in a way that does not allow restoration of the data.
- In case of a security incident such as data breach, the PI should disconnect the relevant device immediately from the data server-installed LAN.
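The password rule in the list above (at least 8 characters, desirably mixing numbers, upper case letters, lower case letters and symbols, and avoiding easily guessed items such as a name, phone number or birthday) can be illustrated with a small checker. This is a sketch under stated assumptions, not an official validation tool: the function name `check_password`, the `personal_terms` parameter, and the split into hard problems versus softer warnings are all hypothetical.

```python
import string

MIN_LENGTH = 8  # minimum length stated in the guidelines


def check_password(password, personal_terms=()):
    """Check a candidate password against the rule sketched above.

    Returns (problems, warnings):
      problems - violations of the "must" parts (length, guessable terms)
      warnings - missing character classes, which are only "desirable"
    `personal_terms` is a hypothetical list of strings such as the user's
    name, phone number or birthday.
    """
    problems = []
    warnings = []

    if len(password) < MIN_LENGTH:
        problems.append(f"shorter than {MIN_LENGTH} characters")

    lowered = password.lower()
    for term in personal_terms:
        if term and term.lower() in lowered:
            problems.append("contains an easily guessed personal term")

    classes = {
        "number": any(c.isdigit() for c in password),
        "upper case letter": any(c.isupper() for c in password),
        "lower case letter": any(c.islower() for c in password),
        "symbol": any(c in string.punctuation for c in password),
    }
    for name, present in classes.items():
        if not present:
            warnings.append(f"no {name}")

    return problems, warnings
```

Separating problems from warnings mirrors the wording of the rule, where the length is mandatory and the mix of character classes is described as desirable.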
2.3 What the Data User Must Do
- The data user should receive information security training provided by the affiliated organization or the like.
- When logging in to the data server from a data access terminal via a network outside the data server-installed LAN, the data user should, every time data are transmitted between the data access terminal and the data server, either encrypt all the communication paths using a sufficiently strong encryption method or encrypt the data themselves before transmitting them. It is desirable to perform similar encryption when logging in to the data server from inside the data server-installed LAN.
- The data user should not access data from a terminal application on a device that can be used by many and unspecified persons (e.g., a PC in an Internet cafe).
- The data user should apply the latest security patches to the data access terminal whenever possible.
- When leaving a terminal, the data user should log out from the data server or lock the terminal. In addition, the terminal should be configured so that the screen is locked after a certain period of inactivity (around 15 minutes).
- The data user should not copy or save data displayed on a data access terminal screen to the local disk. It is desirable to use a terminal application which does not permit copying and saving of data displayed on the terminal screen to the local disk.
- The data user should disable a cache function, if any, which automatically saves data on the data access terminal.
- When obtaining a backup, the data user should ensure that one of the following requirements is met:
- The backup is saved on the data server.
- When the backup is saved in a mobile device (e.g., a tape, USB memory, CD-ROM, notebook PC), the data is encrypted and deleted after use in a way that does not allow restoration. A record of information regarding the mobile device should be kept, e.g., in an electronic file accessible to only the data user, to minimize the risk of theft or loss and to enable early detection of an incident of data theft or data loss.
- When the use of a mobile device for temporary data transfer is unavoidable, the data user should handle the data in the same way as backup data.
- When printing data out is unavoidable, the data user should strictly manage the printout to protect the confidentiality of the data, and shred the printout after use.
- After finishing data use, the data user should delete all data including all backups from all the devices in a way that does not allow restoration of the data. If the data cannot be deleted by the above method, such as for paper or a mobile device, the data should be physically destroyed by cutting or the like. It is preferable to delete any temporary file that was generated during computations as soon as it becomes unnecessary.
- In case of a security incident such as data breach, the data user should immediately disconnect the relevant device from the data server-installed LAN and report the incident to the PI. In cases where an “off-premise-server” is used, the data user should immediately take actions according to the server usage rules, etc.
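Several items above call for deleting data "in a way that does not allow restoration". As a rough illustration only, the sketch below overwrites a file with random bytes before removing it. The function name `overwrite_and_delete` and its parameters are hypothetical. Overwriting in place is a best-effort measure: on SSDs, on journaling or copy-on-write file systems, and where snapshots or backups exist, earlier copies of the data may survive, so this sketch does not by itself guarantee that restoration is impossible. The deletion method approved by the administrator of the IT environment should take precedence.

```python
import os
import secrets


def overwrite_and_delete(path, passes=1, chunk_size=1024 * 1024):
    """Best-effort overwrite of a file with random bytes, then removal.

    Assumption: the storage actually rewrites the same blocks in place,
    which is NOT true of all devices and file systems.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(secrets.token_bytes(n))
                remaining -= n
            # Push the overwritten bytes to the device before the next pass.
            f.flush()
            os.fsync(f.fileno())
    os.remove(path)
```

A call such as `overwrite_and_delete("/path/to/temporary_copy")` would be applied to temporary copies or intermediate files once they are no longer needed. The path shown is a placeholder.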
3. Measures to Be Taken under High-Level (Type II) Security (except for using only an “off-premise-server”)
In addition to the measures listed in the previous section, “2. Measures to Be Taken under Standard-Level (Type I) Security,” the following measures must be taken with regard to the data server.
- The PI should place the data server in a server room that meets all of the following requirements.
- (1) Access to the room is limited, using multi-factor authentication with at least two of the following three authentication methods.
- Biometric authentication (e.g., vein, fingerprint, iris, and face recognition).
- Property-based authentication (e.g., IC card, one-time password, and USB token).
- Knowledge-based authentication (e.g., password).
- (2) Record of access to the room is automatically obtained and made available for later audit.
- (3) The server room must be dedicated to the purpose as described in the Application Form for Data Use. If a dedicated server room cannot be set up, the data server must be stored in a locked dedicated server rack.
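Requirement (1) for the server room asks for at least two of the three authentication categories, not merely two methods. A minimal sketch of that check follows. The mapping `FACTOR_CATEGORY` uses the examples given in the list, while the identifier spellings and the function name `meets_room_access_requirement` are hypothetical.

```python
# Example methods from the guidelines, mapped to their category.
FACTOR_CATEGORY = {
    "vein": "biometric",
    "fingerprint": "biometric",
    "iris": "biometric",
    "face": "biometric",
    "ic_card": "property",
    "one_time_password": "property",
    "usb_token": "property",
    "password": "knowledge",
}


def meets_room_access_requirement(methods):
    """Return True if the methods cover at least two distinct categories.

    Unknown method names are ignored in this sketch rather than guessed at.
    """
    categories = {FACTOR_CATEGORY[m] for m in methods if m in FACTOR_CATEGORY}
    return len(categories) >= 2
```

Under this reading, a fingerprint plus a password satisfies the requirement, whereas an IC card plus a USB token does not, because both are property-based.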
4. Contact Information for Inquiries about the User Security Guidelines
The NBDC Data Sharing Subcommittee Office
http://humandbs.biosciencedbc.jp/en/contact-us
References
[1] NCBI. NIH Security Best Practices for Controlled-Access Data Subject to the NIH Genomic Data Sharing (GDS) Policy. (Online) March 9, 2015.
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/GetPdf.cgi?document_name=dbgap_2b_security_procedures.pdf
[2] Ministry of Health, Labor, and Welfare. Iryojoho shisutemu no anzenkanri ni kansuru gaidorain (Guidelines for Security Management for Medical Information Systems) [in Japanese]. Version 5, May 2017.
http://www.mhlw.go.jp/file/05-Shingikai-12601000-Seisakutoukatsukan-Sanjikanshitsu_Shakaihoshoutantou/0000166260.pdf
Special Notes on Revisions
Amendments to Ver. 3.0
As a data server that uses controlled-access data, in addition to a server owned by the data users’ affiliated organization, the “off-premise-server” is made usable. Also, the security items are revised.
Amendments to Ver. 2.0
The term “Open data” in Ver. 2.0 is replaced by “unrestricted-access data” in Ver. 3.0.
Amendments to Ver. 1.0
In Ver. 1.0, Type II security required only biometric authentication; in Ver. 2.0, either property-based or knowledge-based authentication is additionally required even when biometric authentication is performed. If a room access control system has already been installed in accordance with Ver. 1.0, the system must be updated in compliance with Ver. 2.0 at an appropriate time (e.g., at the time of authentication device renewal or upgrade).