Last updated: August 20 2014
Participants in the Open Humans: Public Data Sharing study generously make their data public under the Creative Commons Zero Public Domain Dedication (v1.0 or later). The following usage guidelines are based on goodwill. These are not a legal contract, but we request you follow these guidelines when using data from our project.
If an individual has not intentionally shared their name/identity in their Open Humans profile, you should not make specific efforts to re-identify that individual from this data unless you meet the following guidelines:
If you have suggestions for changes to these guidelines, please feel free to contact us.
Last updated: February 16 2016
Members of Open Humans must choose names and usernames that meet the following guidelines. These names are public and can uniquely identify their account.
Open Humans accounts must represent individual Members, not organizations or positions within those organizations. Names and usernames must reflect this and may not imply shared use.
You may use your real name as your name or username (or both). You are not required to reveal your real name, but remember that your account's data and information might still be highly identifiable.
You may not use a name that implies you are a specific, identifiable person, unless that is your real name. If you are using your real name and it matches that of a well-known person to whom you are unrelated, you should state clearly on your userpage that you are unrelated to that person.
Examples of disruptive or offensive names or usernames are names that contain or imply profanity or personal attacks.
Your name or username should not promote an organization or product.
Your name or username should not be otherwise misleading or confusing. For example, you may not have a username that leads people to believe the account has permissions it does not have (e.g. "administrator" or "moderator").
We're also flattered by the enthusiasm demonstrated by usernames similar to our own organization (e.g. "OpenHuman1"), but we believe these could be misleading and are therefore not allowed.
Open Humans has the following practices that it expects connected studies and other projects to follow.
Explain the data you'll receive
Give a plain English list of the data your project will access and store. Describe the potential sensitivity and identifiability of this data. Give these lists to your participants or users, and (if you are a study) to your IRB or equivalent ethics board.
For example, instead of saying:
"We will access and store your Hypothetical Diet Tracker App data."
say:
"We will access and store the following data from Hypothetical Diet Tracker App: your ZIP code, food diary, and weight log data."
Explain your data security
You are responsible for how your project manages data.
Give a plain English description of how you will manage the data. Explain whether that data is identified or potentially identifiable, its sensitivity, and other security issues that may be relevant. Share this with your participants or users, and (if you are a study) with your IRB or equivalent ethics board.
Be aware of existing de-identification standards
You should be aware of what types of data are considered "identifiable" when you're deciding which data to collect and how to manage it. Although you may have access to data without explicit personal identifiers, that data can still be highly identifiable.
"De-identification" refers to processing personal data to make it very difficult or impossible to re-identify an individual. Open Humans does NOT de-identify data. The most well known standard for data de-identification is HIPAA's safe harbor guidelines.
Don't ask for more data than you need
When you're requesting data and information, be considerate. Don't needlessly increase the identifiability and/or sensitivity of the data you'll be collecting.
For example, avoid unnecessary granularity that makes data more identifiable. If someone's year of birth is sufficient for your research, don't ask for the month and day.
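As a rough illustration of reducing granularity, the sketch below replaces a full date of birth with the year alone before a record is stored. The field names (`participant_id`, `birth_date`, `birth_year`) are hypothetical and not part of any Open Humans API.

```python
from datetime import date


def coarsen_record(record: dict) -> dict:
    """Return a copy of a record with the full birth date reduced to a year.

    The keys "birth_date" and "birth_year" are hypothetical field names
    chosen for this example.
    """
    # Copy every field except the full birth date.
    reduced = {key: value for key, value in record.items() if key != "birth_date"}
    # Keep only the year, discarding month and day.
    reduced["birth_year"] = record["birth_date"].year
    return reduced


if __name__ == "__main__":
    original = {"participant_id": "p01", "birth_date": date(1980, 5, 17)}
    print(coarsen_record(original))
```

Doing this at the point of collection, rather than later, means the finer-grained value never has to be stored at all.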
Share data with project members
Open Humans supports the philosophy of "equal access": when generating data about individuals, we should try to give them access to that data. For example, we would like to support a study that wishes to give its participants access to their resulting raw genome data.
Projects can use our APIs to upload data for their project members. Your data will be private in their account, where they will be able to manage it as an additional data source.
Organize data according to type
If you have data types that are very different, consider sharing them as separate units (e.g. "sequencing data" and "survey data") to facilitate your participants' downstream management of that data.
Minimize the use of personal data
It's trivial to identify a data set if someone's name or email is included. Avoid collecting or maintaining this information, if possible. When you do collect such information, try to minimize its use (e.g. don't include it in data analysis files).
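One way to keep names and emails out of analysis files is to drop those columns at export time. The following is a minimal sketch, assuming records are held as dictionaries; the column names `name` and `email` and the helper names are hypothetical.

```python
import csv
import io

# Hypothetical names of columns that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "email"}


def strip_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without directly identifying columns."""
    return [
        {key: value for key, value in row.items() if key not in identifiers}
        for row in rows
    ]


def write_analysis_csv(rows, out):
    """Write rows to a CSV stream after removing direct identifiers."""
    cleaned = strip_identifiers(rows)
    if not cleaned:
        return
    writer = csv.DictWriter(out, fieldnames=list(cleaned[0].keys()))
    writer.writeheader()
    writer.writerows(cleaned)


if __name__ == "__main__":
    rows = [{"participant_id": "p01", "name": "Example Person",
             "email": "person@example.org", "weight_kg": 70}]
    buffer = io.StringIO()
    write_analysis_csv(rows, buffer)
    print(buffer.getvalue())
```

Removing these columns does not make the remaining data de-identified; it only avoids the most direct identifiers.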
Use HTTPS (HTTP over TLS, formerly known as SSL) to encrypt interactions. This is required to protect user information, tokens, passwords, and other sensitive data in transit.
If you are running your own website, audit your SSL/TLS configuration for weak encryption algorithms and for perfect forward secrecy support, using a tool like Qualys SSL Server Test.
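On the client side, a project's own scripts can also refuse weak connections. The sketch below uses Python's standard `ssl` module to build a context that verifies certificates and rejects protocol versions older than TLS 1.2. The helper name and the choice of TLS 1.2 as the floor are assumptions for illustration, and this does not replace an audit of your server.

```python
import ssl


def make_strict_client_context() -> ssl.SSLContext:
    """Build a client TLS context that verifies certificates and
    refuses protocol versions older than TLS 1.2."""
    # The default context already enables certificate and hostname checks.
    context = ssl.create_default_context()
    # Assumption for this example: TLS 1.2 is the oldest acceptable version.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_strict_client_context()
    print(ctx.minimum_version, ctx.verify_mode)
```

Such a context can be passed to standard library calls that accept one, for example `urllib.request.urlopen(url, context=ctx)`.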
Keep secrets secret
Your project will have secret keys, codes, and tokens that are used to authenticate identity and encrypt interactions. These MUST be kept secret and out of your source code (e.g. stored as local files or environment variables). You should use encrypted communications to share these with other administrators.
If a secret is accidentally leaked, e.g. in a code commit (even if private!), make sure removal is complete (e.g. using git-filter-branch) but – more importantly – make sure this secret is invalidated and a new key or token is generated.
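A minimal sketch of keeping a secret out of source code by reading it from an environment variable. The variable name `EXAMPLE_PROJECT_TOKEN` and the helper `load_secret` are hypothetical.

```python
import os


def load_secret(name: str) -> str:
    """Read a secret from the environment, failing loudly if it is absent."""
    value = os.environ.get(name)
    if not value:
        # Fail early rather than continuing with a missing or empty secret.
        raise RuntimeError(f"Missing secret: set the {name} environment variable")
    return value


if __name__ == "__main__":
    # EXAMPLE_PROJECT_TOKEN is a hypothetical variable name.
    os.environ.setdefault("EXAMPLE_PROJECT_TOKEN", "not-a-real-token")
    token = load_secret("EXAMPLE_PROJECT_TOKEN")
    print("Token loaded, length:", len(token))
```

Because the value never appears in the code, it cannot be leaked through a commit, and it can be rotated without a code change.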
Monitor and limit admin access
Have a policy for who has administrative access to servers and data. Revoke access when it is no longer needed.
Use standard software and services
Security is hard to implement correctly. Standard web frameworks and packages implement common security features for you (e.g. password hashing), so prefer them over writing your own. Standard services, like platform cloud hosting, can also help by implementing and updating standard security tools (e.g. SSL).
Stay up to date
Security practices constantly need updating. Update operating systems and software packages regularly so that you have the latest security fixes.
If you're using passwords for account management, there is no need to store the passwords themselves. Use a well-established salted cryptographic hash function (e.g. bcrypt) and store only the salt and hash. This lets you verify passwords without storing them, and minimizes the damage a database breach could cause.
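The sketch below shows the general idea of verifying a password from a stored salt and hash. To stay within Python's standard library it uses PBKDF2 (`hashlib.pbkdf2_hmac`) rather than bcrypt, which requires a third-party package. The iteration count is an illustrative assumption, and in practice a maintained framework's built-in password handling is the safer choice.

```python
import hashlib
import hmac
import os

# Illustrative work factor (an assumption); tune it for your hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password. Store both, never the password."""
    salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected_digest)


if __name__ == "__main__":
    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))
    print(verify_password("wrong guess", salt, digest))
```

The database holds only the salt and digest, so a breach does not directly reveal any password.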
Back up your database
Perform regular backups, and regularly test the data restoration process. It's easy to think you are performing backups correctly, until it's too late.
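As a small illustration of testing a backup rather than assuming it worked, the sketch below copies a SQLite database with the standard `sqlite3` module and then opens the copy to check its integrity and row count. SQLite, the file paths, and the table name are assumptions for the example; other database systems have their own backup and restore tools.

```python
import sqlite3


def backup_database(source_path: str, backup_path: str) -> None:
    """Copy a SQLite database to backup_path using SQLite's online backup API."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


def verify_backup(backup_path: str, table: str, expected_rows: int) -> bool:
    """Open the backup, run an integrity check, and compare the row count.

    `table` must be a trusted name (it is interpolated into the query).
    """
    conn = sqlite3.connect(backup_path)
    try:
        intact = conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
        count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    finally:
        conn.close()
    return intact and count == expected_rows
```

A scheduled job could call `backup_database` and then alert an administrator whenever `verify_backup` returns `False`.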