Help me name a genome!
August 14, 2013 8:30 AM

I'm working on a new book and I need to name a genome. I can't handwave it: while the book is SF, I don't want people in the know to look at it the way people look at a Law & Order address (i.e., nobody from New York would ever think that's a real address; nobody in science would ever think that's a real genome).

I have no idea how the naming protocol works. Like, I know that BRCA is the breast cancer genome, but I don't know why it's named that. I need a genome that I can mutate (it has to do with cognitive perception, if that helps) and that I can also mechanically mutate with a virus (one that could turn the gene on, when the default is off).

Chances are, I sound like a total idiot right now. Please help me not sound like a total idiot.
posted by headspace to Media & Arts (11 answers total) 2 users marked this as a favorite
First: gene, not genome. Genome is the complete set.

Have you seen this?
posted by supercres at 8:36 AM on August 14, 2013 [7 favorites]

Genes are often named for the protein they encode, and so you usually get acronyms. The TNFA gene makes the TNFA protein, which stands for tumor necrosis factor alpha. So start by coming up with a name for what you need the gene to do. That said, protein names, and thus gene names, can be random. There's a pretty important gene named after Sonic the Hedgehog, for example.
posted by tau_ceti at 8:44 AM on August 14, 2013 [1 favorite]

Like, I know that BRCA is the breast cancer genome, but I don't know why it's named that.

Ha, from supercres' link: BRCA is literally just short for BReast CAncer.

There's a gene called COPE so you can't use that for cognitive perception, but none called CGPR...
posted by showbiz_liz at 8:46 AM on August 14, 2013 [2 favorites]

Everybody else is covering the gene naming, so I'll address your other requirements.

If you want more background on how a virus can turn on a gene, look into how oncoviruses work, the slow transforming kind. You will want to read up on regulation of gene expression and epigenetics. Here's a recent press release on research done into gene activation.

If it's relevant to your story and you want to get the terminology and phrasing right for how research into this kind of thing is done, read The Emperor of All Maladies: A Biography of Cancer. Towards the end, it gets into genetic research on cancer. I'm thinking the sections on how cancer can be activated in already existing genes and how it mutates in response to treatments might be useful to you.
posted by foxfirefey at 8:56 AM on August 14, 2013 [1 favorite]

Seconding everything said above. For example, I use a gene in my work called PRLR, which stands for prolactin receptor, and Cytb stands for cytochrome b.

You could check out the GenBank website to see how abbreviations stand for names. GenBank is a repository for gene sequences (genes may also be called loci; singular, locus) and is used, at least in the US, by pretty much all academics who publish papers involving genetics (many journals make depositing sequences in GenBank a requirement for publication). The website is somewhat technical if you don't have a background in genetics or biology: it lists aspects such as whether a sequence is an intron (non-coding) or exon (coding) section of a gene, whether it's mitochondrial or nuclear, etc. Still, anyone can use it, and it might be a good place to mess around and get some ideas on gene names and how the abbreviations go with them. You can also look up entire genomes there. Keep in mind that people do not always annotate their entries very well, and sometimes only the abbreviation is listed, not what it stands for.
posted by PinkPoodle at 9:03 AM on August 14, 2013 [1 favorite]

The HUGO Gene Nomenclature Committee (HGNC) decides on the gene symbols (the abbreviations) and the full versions of the names. The official decisions will be on the web site that supercres linked to, not on GenBank.

If you want to make up your own name, the HGNC Guidelines for Human Gene Nomenclature explain how the names work. Human gene symbols should be printed in italic capitals, so BRCA1 set in italics rather than plain BRCA1.

Keep in mind there are different conventions for different species or substances, so a viral gene would be named differently.
posted by grouse at 9:18 AM on August 14, 2013 [1 favorite]

Yeah, there's a lot of room for creative license when it comes to gene names (somebody compiled a top ten of the more interesting ones here). And some of them are real misnomers, too. I mean, for all of the things that TNFa does, necrosing tumours is not one of them!

There are a few things you may want to consider here: Is your gene specific to humans? Or is it present in all mammals? Most genes have homologs, which can be highly conserved (if they're really important) or not (if they're redundant, or possibly evolutionarily "new"). This kind of matters, because the naming rights often go to the first person to describe the gene (though re-naming of genes goes on all the time).

If your gene is present in other animals, then animal researchers might have been the first to find it, such that the human gene equivalent might then be called "gene x homolog" (e.g. "Gardner-Rasheed feline sarcoma viral (v-fgr) oncogene homolog"). Animal researchers would likely be able to knock the gene out of mice, so they could very well name it based on the phenotype of the knockout mouse (a la the tinman mutation in flies described in the link above).

If the gene has only been found in humans, then its function would likely be less well understood, which would affect its name. For example, if the only thing the discoverer can determine about the gene (at the time of discovery) is that it encodes a protein that binds to another protein, it might get something boring like "Fibronectin binding protein A". If the gene has come out of a screen for genes associated with cancer or some other disease, you might get something like BRCA mentioned above. If the only information about the gene is structural, then you'll get something even drier, like "receptor-type tyrosine-protein kinase FLT3."

Finally, many genes are discovered simultaneously by multiple researchers, who each assign the gene their own name. Thus, genes frequently have lots of names, although ultimately an official name will be adopted, so this is a (relatively) short-lived phenomenon.

In other words, gene names tell a story. They are based on the nature of their discovery and the original investigations done by the discoverers to understand the gene's function. If you really want to get authentic, your gene needs a backstory. I am sure that people here can help you come up with one, but you'll need to tell us a few more things about what this gene does...

Also, when scientists describe a gene to a colleague, they will often tell the story of its discovery and characterisation, so it's definitely something that could go into dialogue.

Good luck!
posted by kisch mokusch at 9:22 AM on August 14, 2013 [4 favorites]

That said, protein names, and thus gene names, can be random. There's a pretty important gene named after Sonic the Hedgehog, for example.

It's not actually all that random if you look at the history, which underscores kisch mokusch's excellent point. A gene called "hedgehog" was identified in flies because it made the cuticle of the larvae look spiky like a hedgehog when the gene was mutated. Much later, multiple homologs of hedgehog were found in mammals and the researchers wanted to give them more specific names. Which is how we got "Desert", "Indian", and yes, "Sonic."

(BTW, homologs are basically genes that share a lot of sequence similarity and are thought to be evolutionarily related.)
posted by en forme de poire at 9:39 AM on August 14, 2013 [1 favorite]

My quick and dirty take on this - gene names are often three or four letters with a number at the end, like SDH, MACB, TMEM7. The letters are often an acronym of the protein the gene makes or the effect the gene has.

Most combinations of a few numbers and letters won't look ridiculous - using "ZIF2" would work, and you don't have to say outright what it stands for (Zombie Inducing Factor 2?)
posted by abecedarium radiolarium at 11:53 AM on August 14, 2013

(That's good advice, but I would add to check with supercres's link to make sure you're not occupying the same namespace as a real human gene.)
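If you'd rather screen a list of candidate names before looking each one up, here is a rough Python sketch of the two checks discussed above: does the symbol have the usual shape, and does it collide with a gene someone in this thread already named as real? The pattern is a simplification I made up from the "few capital letters, maybe a number at the end" description, not the actual HGNC rules, and the set of taken symbols is just a hand-typed stand-in for a proper search of the HGNC site. The function names are hypothetical too.

```python
import re

# Assumption: a simplified shape for a human-style gene symbol
# (2-5 capital letters, then up to 2 digits). Not the full HGNC rule set.
SYMBOL_SHAPE = re.compile(r"[A-Z]{2,5}[0-9]{0,2}")

# Symbols this thread mentions as existing genes. A real check would
# search the HGNC database instead of this tiny hand-made set.
KNOWN_IN_THREAD = {"BRCA1", "COPE", "TNFA", "PRLR", "FLT3"}


def looks_plausible(symbol: str) -> bool:
    """True if the whole symbol matches the simplified shape above."""
    return SYMBOL_SHAPE.fullmatch(symbol) is not None


def is_free(symbol: str) -> bool:
    """True if the symbol looks plausible and is not in the known set."""
    return looks_plausible(symbol) and symbol not in KNOWN_IN_THREAD


if __name__ == "__main__":
    for candidate in ["ZIF2", "CGPR", "NSTN", "COPE", "zif2"]:
        print(candidate, looks_plausible(candidate), is_free(candidate))
```

Anything that passes both checks still needs a manual search on the HGNC site before you commit to it in print.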
posted by en forme de poire at 12:41 PM on August 14, 2013 [1 favorite]

Hard to say because I don't know that you give enough info... but cognitive perception makes me think of intellect. How about NSTN? (Einstein? Yes? No? If the answer is yes, I will take a free, autographed copy, please...)
posted by brownrd at 3:47 PM on August 14, 2013
