Genetic testing for nerds
August 29, 2015 4:05 PM
I'm thinking of getting my genome sequenced by 23andme. However, I am not interested in a responsive, user-friendly experience. I want to get my raw data, upload it to Promethease, and also play around with it myself. I also have some privacy concerns, and am wary of the family/ancestry features. Help me think this through.
I don't believe there's a huge privacy risk involved in giving 23andme my genome, as long as I can trust that they'll delete my information if I ask, and that they won't sell individually-identifiable information. Is it reasonable to assume that they'll stick to their privacy policy? Are regulations like HIPAA and GINA strong enough to protect me if they don't?
While 23andme doesn't do too many "nasty surprise" tests (e.g. no Huntington's), if something seemingly innocent that they do sequence were later implicated in disease, would I face job/insurance discrimination that I could have avoided by refusing to get my genome sequenced? (Like, being forced to disclose certain genes, but if I haven't been tested then I can truthfully say I don't know.)
A more immediate concern is the fact that the service shows you your genetic relatives. I am not interested in being connected to these people. I also don't want to get any family surprises. It's okay if these connections are stored in a database somewhere, as long as I can't see my relatives and they can't see me under my real name. If a past customer can confirm that it's possible to avoid the "DNA Relatives" features entirely, I would feel a lot more comfortable using it.
My main interest is in uploading the raw data to Promethease, since 23andme can no longer give health information. Most of the stuff I'm interested in is trivial but interesting (e.g. earwax, caffeine metabolism). I also might want to play around with bioinformatics tools, if there's anything interesting to be done with an individual genome. Suggestions also welcome for cool hobbyist projects; the tools don't need to be user-friendly at all.
23 and me no longer shows even traits (hair, ear wax, etc.) for new members. Yes, I'm still cranky.
posted by wintersweet at 4:45 PM on August 29, 2015 [1 favorite]
Health information is fully available for Canadian customers, and they're adding new studies and analysis to my profile constantly. So if you can get a Canadian address...
posted by blue_beetle at 4:49 PM on August 29, 2015
Response by poster: Their website does claim they can't test for Huntington's, because it isn't an SNP. Thanks for warning me that they test BRCA and APOE, though.
wintersweet -- you might want to upload your raw data to Promethease, although bear in mind the potential for frightening revelations. It's $5, and they have example reports on their website.
posted by vogon_poet at 5:09 PM on August 29, 2015 [1 favorite]
I have done this thru Ancestry.com and then have run the raw data thru Promethease. I am also involved in the Human Genome Project.
I am older and already on Medicare. If I was younger I would only do it if the DNA was not linked to me in any way. I don't trust that the current privacy systems can really protect the information. I don't have any real proof that it is not secure, I'm just old and suspicious of big business.
If you want to go through an extensive consent process sign up for the Human Genome Project. They practically talk you out of it before letting you sign up.
posted by cairnoflore at 6:59 PM on August 29, 2015
Best answer: I think you should do a little bit of extra research before you dive in-- it seems like you may have a few misconceptions about the service 23andMe is offering, and understanding it better might settle some of your concerns.
Last time I checked 23andMe does not sequence your genome or any part of it. What they do is SNP (single nucleotide polymorphism) genotyping, which looks for small genetic variations associated with specific traits. This is a decent introduction to SNPs.
To unpack this a bit:
SNP genotyping looks for specific, known single-letter (that is, single nucleotide) variations (polymorphisms) in your genome. To fabricate an example:
99% of the population has, somewhere in their genome, the following sequence:
CGGAGCAAGGTAC
The other 1% has the following sequence:
CGGAGCTAGGTAC
..and also has a distinctive third nostril. That thymine in question isn't necessarily intrinsic to the genetic variation which causes an embarrassing supernumerary nostril, but it's a handy "flag" which can be used to test for its presence.
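To make the made-up example concrete, here is a minimal Python sketch that lines the two sequences up and reports where they differ. The function name `find_snps` is just an illustration, and it assumes both strings are already aligned and the same length, which real genotyping data would not hand you so neatly.

```python
def find_snps(reference, sample):
    """Return (index, reference_base, sample_base) for every mismatch.

    Assumes the two sequences are already aligned and equal in length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


common = "CGGAGCAAGGTAC"  # the 99% sequence from the example above
rare = "CGGAGCTAGGTAC"    # the 1% sequence from the example above

print(find_snps(common, rare))  # [(6, 'A', 'T')]
```

The single mismatch (an A swapped for a T at zero-based position 6) is the "flag" a SNP test would look for.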
The implications of this are:
1. The data 23andMe produces is not your personal genome, or even part of it. It's just a list of SNPs, and which variant you carry.
2. 23andMe can only look for specific, already known genetic variants/SNPs.
3. Since only known SNPs can be identified, the analysis is essentially static, and the updates they send to their customers are limited to new research on SNPs identified in the initial run. The potential to re-analyse this static list of SNPs in the light of genetic variations identified in the future is limited. I suppose there's an outside chance that a currently unknown condition could be linked to an existing SNP (I'd be interested in a real geneticist's opinion on this), but to retrieve data about SNPs identified after you purchase their service, 23andMe would need a new spit sample, and would need to run its analysis again.
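If you want to poke at that list of SNPs yourself, a rough Python sketch follows. It assumes the raw download is a tab-separated text file with four columns (rsid, chromosome, position, genotype) and comment lines starting with `#`; check your own file before relying on that. The rsids, positions and genotypes in `SAMPLE` are invented placeholders, not real markers.

```python
import csv
import io

# Invented placeholder rows, laid out in the assumed four-column format.
SAMPLE = """# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tAT
rs0000003\tX\t3000\t--
"""


def parse_raw(handle):
    """Map each rsid to (chromosome, position, genotype), skipping '#' comments."""
    calls = {}
    data_lines = (line for line in handle if not line.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) != 4:
            continue  # ignore blank or malformed lines
        rsid, chromosome, position, genotype = row
        calls[rsid] = (chromosome, int(position), genotype)
    return calls


calls = parse_raw(io.StringIO(SAMPLE))

# Treat a '-' in the genotype as a position the chip could not read (an assumption).
no_calls = [rsid for rsid, (_, _, genotype) in calls.items() if "-" in genotype]

print(len(calls), "markers parsed;", "no-calls:", no_calls)
```

For a real file you would swap `io.StringIO(SAMPLE)` for `open("your_raw_data.txt")`, where the filename is whatever your download is called.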
Of course, it's always possible that they'll retain a vial of your spit in a freezer somewhere which could be screened for novel SNPs at a later date (or even used to sequence your genome!) but if you're paranoid about this, you had better be worried about protecting your precious bodily fluids in general.
posted by pullayup at 9:37 PM on August 29, 2015 [14 favorites]
Of course, it's always possible that they'll retain a vial of your spit in a freezer somewhere which could be screened for novel SNPs at a later date (or even used to sequence your genome!) but if you're paranoid about this, you had better be worried about protecting your precious bodily fluids in general.
They do retain your sample "for a minimum of one year and a maximum of ten years" by default, and notify you of that in the initial setup, and allow you to request that the sample be destroyed in the Privacy/Consent part of your personal settings.
posted by russm at 9:52 PM on August 29, 2015
Best answer: I suppose there's an outside chance that a currently unknown condition could be linked to an existing SNP (I'd be interested in a real geneticist's opinion on this),
This is, for me, the biggest unknown and the thing that makes me a bit uncomfortable. My understanding is that these days SNPs are identified via high throughput sequencing, so we'll presumably reach a point where all SNPs and major allelic variants in the human genome are fairly well characterized, even if we don't yet know exactly what they're doing. So, then, the question is which SNPs 23andMe is checking---are they limiting themselves (for cost or technical reasons) to a circumscribed set, or are they taking a maximalist approach, including SNPs which may not yet be addressed in the literature?
posted by pullayup at 10:24 PM on August 29, 2015 [1 favorite]
Best answer: So, then, the question is which SNPs 23andMe is checking---are they limiting themselves (for cost or technical reasons) to a circumscribed set, or are they taking a maximalist approach, including SNPs which may not yet be addressed in the literature?
From 23andme directly:
The technology that we use, the Illumina HumanOmniExpress-24 format chip, analyzes hundreds of thousands of SNPs that cover the entire genome. Although this is still only a fraction of the more than 10 million SNPs that are estimated to be in the human genome, these were specially selected because they provide a lot of information about other nearby SNPs. This maximizes the information we can get from every SNP we analyze, while keeping the cost low.
In addition, we have hand-picked more than 30,000 additional SNPs of particular interest from the scientific literature. As a result, we can provide you with unique, genetic information available through no other service.
The HumanOmniExpress-24 assay finds ~713k markers (cite). Of these, ~395k are within proximal regulatory regions of annotated genes (cite) where they might be presumed to have some effect associated with functional genes in literature. Then add 23andme's 30k additional custom, annotated SNPs.
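For a sense of scale, here is the back-of-the-envelope arithmetic on those figures as a short Python sketch. It uses only the approximate numbers quoted in this comment (~713k chip markers, ~395k near annotated genes, ~30k custom SNPs, and "more than 10 million" SNPs genome-wide, taken here as a flat 10 million), so the percentages are rough.

```python
# Approximate figures quoted in the comment above; all are estimates.
chip_markers = 713_000
near_genes = 395_000
custom_snps = 30_000
estimated_genome_snps = 10_000_000  # "more than 10 million", so coverage is an upper bound

total_assayed = chip_markers + custom_snps
chip_coverage = chip_markers / estimated_genome_snps
total_coverage = total_assayed / estimated_genome_snps
near_gene_share = near_genes / chip_markers

print(f"total SNPs assayed: {total_assayed:,}")
print(f"chip alone covers at most {chip_coverage:.1%} of estimated SNPs")
print(f"chip plus custom covers at most {total_coverage:.1%}")
print(f"share of chip markers near annotated genes: {near_gene_share:.1%}")
```

So even with the custom additions, the test directly reads well under a tenth of the estimated SNPs, and only a bit over half of the chip's markers sit near annotated genes.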
Without going into too much detail about my bioinformatics work, I can tell you that SNP annotation — associating real genomic variants with real literature — is very time-consuming, statistically complex and error-prone work. Pharmaceutical companies would love if this technology worked reliably because it would literally save them billions in R&D on candidate drugs. We're good at finding SNPs, but not yet good at figuring out what they do.
Technology and predictive difficulties aside, as far as privacy goes, with 23andme all your data gets sucked up by Google/Alphabet, which has a history of respecting personal privacy up to the line of the law, and absolutely no better. It's a simple fact that this company is run by people who have very little regard for individual privacy rights. When it comes to dealing with Google, caveat emptor.
Further, the scope of protection provided by GINA is purposefully very narrow. Exemptions allow openings for discrimination. There are also other avenues for genetic discrimination down the road that GINA (as currently written) will do nothing to protect against.
Aside from Google and GINA, issues with anonymity in genetic testing are such that there is no assurance of anonymity. You can read this PLoS paper by Homer et al. for a technical discussion.
Our collective genetic history is fascinating. It's too bad that money fucks it all up. Good luck with whatever you decide to do.
posted by a lungful of dragon at 1:07 AM on August 30, 2015 [10 favorites]
Best answer: No expert, but one of the attorneys at my job in health policy, who is a genomic medicine policy expert, highly advised staff to avoid these kinds of tests. Basically, the current business model for them is selling their database to third parties.
The issue is that, yes for the time being this information is anonymous, but who knows in 15 years? The big question is whether laws protecting the public will keep up with the science. I don't know anything about the current legal protections, but this attorney seemed to think they are already insufficient.
http://www.scientificamerican.com/article/23andme-is-terrifying-but-not-for-reasons-fda/
posted by forkisbetter at 6:51 AM on August 30, 2015
Best answer: It is not excessively prudent to treat SNPchip data as potentially personally identifiable - if not now then in the future - by any sufficiently invested data-mining entity. For the immediate future, much of the possible hazard from personal genetic data stems not from what it can predict with reasonable accuracy, but from what interested entities believe it can predict. New methods for bootstrapping genetic datasets are continually being developed and refined, and it is hard to predict how a small dataset might be transformed in the future. If the moderately probable chance that this dataset could be persistently available (in a fashion deemed anonymized by present standards) would cause you to worry, I would suggest that you look into 2 options:
1. look into one-off 'factoid' tests for your primary markers of interest (several of the CYP* markers have been available at various points by various testing companies, for example)
2. if you have access to family and family history, try to estimate how much of your genotype-of-interest can be inferred through direct observation + classical pedigrees. This can be surprisingly informative for many of the more visible markers (pigmentation, earwax and tooth shape, etc) that pique people's interest, and can be a great way to learn about genetic inheritance.
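As a rough illustration of the pedigree approach in option 2, here is a minimal Python sketch of a Punnett square for one gene with two alleles. It assumes simple Mendelian inheritance, and the allele letters `W` (dominant) and `w` (recessive) are placeholders rather than real marker names; plenty of visible traits are messier than this.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(parent1, parent2):
    """Return the probability of each child genotype for one two-allele gene.

    Each parent is a two-letter genotype such as "Ww"; every pairing of one
    allele from each parent is treated as equally likely.
    """
    combos = Counter(
        "".join(sorted(allele1 + allele2))
        for allele1, allele2 in product(parent1, parent2)
    )
    total = sum(combos.values())
    return {genotype: Fraction(count, total) for genotype, count in combos.items()}


# Two carrier parents: each shows the dominant trait but carries one recessive allele.
for genotype, probability in offspring_distribution("Ww", "Ww").items():
    print(genotype, probability)
```

Run the other way, this is how observation narrows things down: a child showing the recessive trait must be `ww`, which means both parents carry at least one `w` even if neither shows it.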
posted by Svejk at 8:34 PM on August 30, 2015
Response by poster: Everybody's answers were really helpful. I best answered the stuff a future reader will want to skim through. I'm not going to 23andme but I might look for a CYP* test from somewhere.
posted by vogon_poet at 2:53 PM on September 4, 2015
This thread is closed to new comments.
23andme's chip, unless it's purposefully gotten less useful, does sequence serious alleles involved in the 'nasty surprise' tests, including Huntington's, apoe4, brca, etc. When they had the health information in place, you had to go through another layer of protection for those particularly potentially consequential expressions to be revealed to you. Uploading to Promethease removes that double layer and if you've got it, it'll show up in the raw.
However, I think things like 'traits' (eg earwax) may still be available to post-2013 users. Someone else who joined after the health info cutoff may be able to update this.
This is how I dealt with my privacy concerns:
1. I bought it 'as a gift', and the giftee, which was actually me, has a false last name.
2. Re 'family and friends' sharing, you can opt out of sharing. When I log into my account and look at the 'family and friends' tab 'manage sharing,' it currently says "You currently are not sharing your genetic data with anyone. Click here if you would like to share your data with a friend or family member." So yes, you can totally avoid it.
The future legal concerns are something I did not deal with in particular. I figured GINA and the dissociation of myself from the sequence in the database were enough.
posted by cobaltnine at 4:35 PM on August 29, 2015