Automated Drug Discovery?
September 23, 2009 8:58 AM   Subscribe

A question about automated drug discovery

What are the obstacles to completely (or nearly completely) automated drug discovery, where machines use AI algorithms to come up with potential drugs, the molecules are produced, and then tested against an array of targets (similar to DNA microarrays)?

posted by mpls2 to Science & Nature (13 answers total) 5 users marked this as a favorite
This looks like a three phase problem, roughly speaking: Design molecule. Create molecule. Test molecule.

Design molecule — you have at least two approaches here. A) take pre-existing molecule of known utility and, say, substitute a sulphur for an oxygen. In other words, variations. That's probably already in the works somewhere. B) Look at pre-existing protein, substrate, or whatever, then design a molecule to fit into it. This would be hard, because it is non-trivial to work backwards from a final shape to, say, a series of amino acids which fold into that shape. It might be easy for the simpler chemicals, but we've probably already mined that to some degree.
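A toy sketch of the "variations" idea in (A), in Python. Note that real cheminformatics tools (RDKit and the like) operate on the parsed molecular graph, not raw text, and many blind substitutions like this would produce invalid or unsynthesizable structures; this is only to show the brute-force shape of the approach:

```python
# Toy sketch: enumerate single-atom substitutions in a molecule's
# SMILES string, e.g. swapping each oxygen for a sulphur.
# Real tools work on the molecular graph, not the raw string.

def single_substitutions(smiles, old="O", new="S"):
    """Return every variant with exactly one `old` atom replaced by `new`."""
    variants = []
    for i, ch in enumerate(smiles):
        if ch == old:
            variants.append(smiles[:i] + new + smiles[i + 1:])
    return variants

# Aspirin has four oxygens, so this yields four candidate variants
aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
for v in single_substitutions(aspirin):
    print(v)
```

Even this trivial enumeration shows how fast the candidate list grows once you allow more than one substitution at a time.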

Create molecule — for molecules in category A, you could take pre-existing procedures and attempt to modify them as a starting point, but I am guessing that is full of surprises. For molecules in category B, if you are producing proteins, that might be not as hard as you might expect. Once you had the sequence of amino acids, you could work backwards to the DNA or RNA encoding, and it would be theoretically possible (though non-trivial) to produce that and attempt to splice it into some kind of workhorse bacteria good for small-scale tests. I've been following the production of one such protein and this is still a hard thing to scale up, in some cases taking the work of years to produce enough for even a pilot study.
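The amino-acids-back-to-DNA step can be sketched with a codon table; the genetic code is degenerate (several codons per amino acid), and a real design would also optimize codon usage for the host organism. The mini codon table and peptide below are purely illustrative:

```python
# Toy sketch: reverse-translate a short peptide to one possible DNA
# coding sequence. One arbitrary codon is chosen per amino acid;
# real designs pick codons to suit the expression host (e.g. E. coli).

CODON = {
    "M": "ATG",  # Met (start)
    "G": "GGT",  # Gly
    "S": "TCT",  # Ser
    "K": "AAA",  # Lys
    "L": "CTG",  # Leu
    "*": "TAA",  # stop
}

def reverse_translate(peptide):
    """Map each amino acid (one-letter code) to a codon and join."""
    return "".join(CODON[aa] for aa in peptide)

print(reverse_translate("MGSK*"))  # ATGGGTTCTAAATAA
```

Getting from a sequence like that to a bacterium actually expressing useful amounts of protein is, as noted above, where the years of work come in.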

Test molecule — I don't even know where to begin on this, but once you take it all the way out to human testing, this strikes me as Really Hard. It's difficult for humans to design good tests for some drugs; it is also difficult to teach computers to do things we ourselves have not mastered, particularly if the process we are describing is still a human process.

This is all utterly handwavy, but it seems fairly far out from both an AI and, to a lesser extent, a robotics standpoint. Automated chip design is comparatively easier, and that's not exactly a walk in the park.
posted by adipocere at 9:11 AM on September 23, 2009

You'll need to clarify the question. Combinatorial chemistry methods produce and test hundreds of thousands of compounds every year. There are no fundamental obstacles.

Did you mean how could it be improved? If so, along what lines? The number of compounds, the diversity of compounds, the usefulness of the compounds, the cost?

One current obstacle is that so far combinatorial chemistry has not been particularly fruitful. Although it's been in heavy use since the mid-90s, it has only produced a relative handful of promising compounds, and so far only one has become an approved drug.
posted by jedicus at 9:15 AM on September 23, 2009

Molecules in solution are really complicated, as is predicting their interaction. Cells and life are extremely complicated. Many of the problems that you want to address are either not that well understood or have fundamental tradeoffs, like drugs for depression and cytotoxic agents for cancer. How to make a given compound is not always obvious.

In reality, if you come up with a specific molecule that you want to target, pharmaceutical companies do have giant compound libraries that they screen for binding / interaction in a parallel way after doing some basic computational screens. If you have a drug and want to make it better, computation can come up with something similar and maybe better. That's part of why we now have 800 statins.
posted by a robot made out of meat at 9:15 AM on September 23, 2009

I am not a pharmacologist, but here is my understanding of the current issues in the field. Someone will likely be along to correct my misconceptions and distortions. Offhand, the issues include the development of the AI algorithms themselves, the development of syntheses for each of the predicted compounds, and the development of functional target arrays.

The first is an extremely complicated problem: while it's typically possible to describe how a drug binds to a target receptor post hoc, the models of protein-drug interactions currently available aren't adequate to the task of predictive design, at least in most cases. Traditionally, what has been done is to place the synthesis step first: generate large libraries of related compounds, then screen them against your targets and try to identify structure-activity relationships from there; the best 'hits' can then be derivatized further into new libraries to try to increase their activity.
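That screen-then-derivatize loop can be caricatured in a few lines. The compound names and "assay" scores below are all invented (a real assay measures binding or a cellular response), but the shape of the workflow is the same:

```python
import random

random.seed(0)  # deterministic toy data

# Toy library: each compound is just an ID with a hidden "activity"
library = {f"cmpd-{i}": random.random() for i in range(1000)}

def screen(compounds, assay):
    """Rank compounds by assay score, best first."""
    return sorted(compounds, key=assay, reverse=True)

def derivatize(hit, n=10):
    """Pretend to synthesize n analogues of a hit (names only, here)."""
    return [f"{hit}-analog{j}" for j in range(n)]

hits = screen(library, assay=library.get)[:5]        # top 5 primary hits
second_gen = [a for h in hits for a in derivatize(h)]
print(len(second_gen))  # 50 analogues for the next screening round
```

The hard parts that this hides are exactly the ones monocyte describes: the assay itself, and actually making the analogues.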

Another approach which has gained traction lately is to let the 'design' be done by a biological system: that is, to make an array of antibodies against the target and screen those. Mammalian immune systems have excellent systems built-in for creating high-affinity compounds, but the disadvantage is that the resultant product is very expensive to produce in large scale and will have very specific pharmacokinetic properties.

The problem of creating screening arrays is complicated by the fact that activity usually includes biological response, not just binding, so you typically can't just array out the receptor of interest like a microarray. Cell-based assays, where the receptor is expressed in a cell line and the resultant signaling pathway coupled to a reporter, are amenable to this kind of large-scale screening, but it's non-trivial to set up reporters if your receptor falls outside of the 'normal' receptor classes (like G-protein coupled receptors, for which there are off-the-shelf solutions available.)
posted by monocyte at 9:26 AM on September 23, 2009

I'm not sure what you mean by "AI algorithms". Combinatorial chemistry is a mature field and it has no shortage of algorithms for designing drugs. Candidate molecules are produced at a rate of millions/year.

Otherwise, things already work pretty much as you describe them in your question. The production and testing is done using robots.
posted by mr_roboto at 10:16 AM on September 23, 2009

Response by poster: Thanks for your fantastic answers so far.

One problem that seems so intriguingly simple, but apparently difficult, is cancer drug therapy (I guess that's "chemotherapy"), where the problem (as I understand it, but I'm definitely no expert) is to find a drug that selectively kills cancer cells but leaves normal cells alone. It just seems like that's a problem highly amenable to automation and that if you just try enough molecules, you're inevitably going to hit on one that does the job.

Again, I'm no expert, but I'm intrigued why progress in that area seems so frustratingly slow.
posted by mpls2 at 10:42 AM on September 23, 2009

Best answer: I'm intrigued why progress in that area seems so frustratingly slow

The biggest problem is that if you just throw together a few million random compounds you'll find that the vast majority are either inert or toxic. Even for a compound that is both effective and non-toxic, it's common for the vast majority of nearly identical compounds to either be inert or toxic. Some drugs, like the statins mentioned earlier, seem to lend themselves to variation, but most do not.

to find a drug that selectively kills cancer cells but leaves normal cells alone. It just seems like that's a problem highly amenable to automation and that if you just try enough molecules, you're inevitably going to hit on one that does the job.

Right now automated testing only checks for basic indicators of biological activity. It does not test all possible effects a compound could have because organisms (especially ones like people) are complex and the way a given drug works is often unintuitive or poorly understood, and thus difficult to test for.

For example, the mechanism of action of minoxidil (Rogaine) is still not understood. Since we don't know how it works, we can't devise a test to see if other compounds might work in the same way. Well, you can do the obvious test of applying the compound to a bald rat and seeing if hair grows back, but that's time-consuming, expensive, and difficult to automate.

So, for an example close to your question: not all cancer cells are alike, and different chemotherapy drugs are more or less effective for different kinds of cancers. Testing all those different kinds of cancers would be expensive and difficult, if not impossible in the case of a drug that only worked well in the whole body but not on an isolated tissue sample.

Another issue is that different kinds of cancers are in different parts of the body. Depending on how the drug is handled by the body, it may be more or less able to reach a given part of it. For example, drugs that are primarily excreted in the urine unmetabolized can be very effective against diseases that affect the urinary tract. Along those same lines, it can be very difficult to treat brain cancer with drugs because very few drugs can pass the blood-brain barrier. Thus, a drug might test as very effective in vitro but be useless in vivo because it's very hard to get the drug to the part of the body where it would be effective, or difficult to do so without the dosage becoming toxic.

Some of the most promising research in chemotherapy is not to try to find a magic bullet compound that kills only cancer cells but rather to get better at delivering existing chemotherapy compounds to cancer cells and only cancer cells. This can be done by attaching the drug to a ligand that binds more readily with cancer cells than normal cells, or it can be done by distributing the drug throughout the body and then using something like ultrasound to activate the drug in a specific area (e.g., a tumor).

Unfortunately, as indicated above, those kinds of approaches do not lend themselves to automated testing.
posted by jedicus at 11:12 AM on September 23, 2009 [1 favorite]

Best answer: if you just try enough molecules, you're inevitably going to hit on one that does the job

Let's start there. Okay, so how many molecules of drug-like size and basic properties could possibly be made through combinatorial chemistry? Why, it's about 10^200. Is that a big number? Oh yeah. It's so big the universe couldn't hold all of those potential molecules - not even close. You can read more about that one very tiny part of the drug discovery problem here.

Let's then say you have all those compounds and you exclude, using well-known rules, the very obviously toxic ones. Because this is still a mind-bogglingly large number, you want to narrow it down using a selection of other well-known rules, and perhaps tailor your search to a particular shape, basic chemical structure, or set of surface properties. Let's say a computer can do this. Again, you're dealing with such a large number that running programs to do this is very expensive in terms of computational cost. How expensive? You're talking days, on massive clusters, for molecule libraries that are vastly smaller than 10^200.
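To get a feel for the filtering step: one widely used set of "well-known rules" is Lipinski's rule of five, a rough screen for likely oral bioavailability. The property values below are made up for illustration; computing them for real molecules requires cheminformatics software:

```python
# Sketch of a rule-based pre-filter (Lipinski's rule of five).
# Property tuples are (molecular weight, logP, H-bond donors,
# H-bond acceptors); the numbers here are invented examples.

def passes_rule_of_five(mw, logp, h_donors, h_acceptors):
    """Lipinski's criteria for a likely orally active compound."""
    return mw <= 500 and logp <= 5 and h_donors <= 5 and h_acceptors <= 10

candidates = {
    "cmpd-A": (320.0, 2.1, 2, 5),   # drug-like
    "cmpd-B": (812.0, 6.3, 7, 12),  # too big, too lipophilic
    "cmpd-C": (455.0, 4.9, 1, 8),   # borderline but passes
}
survivors = [name for name, props in candidates.items()
             if passes_rule_of_five(*props)]
print(survivors)  # ['cmpd-A', 'cmpd-C']
```

Cheap rules like this throw out whole swathes of chemical space quickly, which is exactly why they run before the expensive shape- and surface-matching steps.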

And those are just the first steps in intelligent drug design.
posted by Cuppatea at 11:33 AM on September 23, 2009 [2 favorites]

Best answer: The long, hard, and expensive part isn't discovering candidate molecules, it's developing them into safe and effective drugs. Let's say you've got a candidate that shows great binding for your target. Now you've got to make sure that:

1) The candidate shows good oral bioavailability—i.e., it can pass from the gut to the bloodstream in significant quantities. Drugs that can't be taken orally generally won’t sell enough to offer a return on investment.
2) The candidate distributes properly to the location of the target. This is particularly troublesome for psychoactive drugs, as most large molecules will not pass through the blood-brain barrier.
3) The candidate has strong specificity for its target. A drug which binds to several targets is likely to have more side effects, which could affect its tolerability.
4) The candidate is nontoxic. This is not always easy to predict.
5) The candidate drug shows a good metabolic/elimination profile. A drug which is broken down too quickly or too slowly can be problematic. Its metabolites may be bioactive too, so these also have to be tested.
6) The drug can be synthesized in an easy and cost-efficient manner. A wonder drug is worthless if it can't be mass-produced at a reasonable cost.

If the candidate does not meet all of these conditions (and it never does initially), its structure has to be modified. But a modification to improve its absorption might, for instance, decrease its activity, or increase its toxicity. So there's a constant back-and-forth, involving many different iterations of synthesis and testing. Some of these steps can be automated, but many of them cannot. Many projects are abandoned because they drag on too long or reach a dead end.
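That back-and-forth is essentially a multi-objective optimization problem where the objectives are coupled. A deliberately crude sketch (all numbers, thresholds, and the coupling itself are invented) in which every "modification" nudges all properties at once, so fixing one can break another; note it may never find an acceptable compound within its budget, which is also true to life:

```python
import random

random.seed(1)  # deterministic toy run

def modify(props):
    """One synthesis cycle: every property shifts a little, together."""
    return {k: v + random.uniform(-0.1, 0.1) for k, v in props.items()}

def acceptable(props):
    """Invented go/no-go thresholds on the three coupled properties."""
    return (props["activity"] > 0.7 and
            props["absorption"] > 0.5 and
            props["toxicity"] < 0.3)

# Start with a potent but poorly absorbed candidate
props = {"activity": 0.9, "absorption": 0.3, "toxicity": 0.2}
for cycle in range(1, 201):
    props = modify(props)
    if acceptable(props):
        break
print(cycle, acceptable(props))
```

Each real "cycle" here is weeks or months of synthesis and assays, which is why the iteration count matters so much.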

You also have to keep in mind that the way a drug performs in an assay doesn’t necessarily translate to how it performs in an organism. Similarly, the way a drug performs in an animal model doesn’t necessarily translate to humans. From what I understand, there are actually a surprising number of cancer drugs that show efficacy in animal models of cancer, but which, for one reason or another, don’t seem to work in humans. In general, most drugs that are developed never make it past clinical trials.
posted by dephlogisticated at 12:35 PM on September 23, 2009 [1 favorite]

It’s also worth noting that most drug companies don’t do basic research. It’s too expensive, with too little chance of a payoff. NIH and academia are the ones that do exploratory research, investigating disease mechanisms and identifying potential drug targets. Almost all for-profit pharmaceutical research starts with a target that’s already been identified.
posted by dephlogisticated at 12:46 PM on September 23, 2009 [1 favorite]

Agreeing with everybody above - The main stumbling block is that, even if you can generate theoretical models on the fly, our modelling of the human system is way too incomplete to effectively evaluate the results.

As far as automated testing goes - I took a tour of one of ******'s GMO dev laboratories this summer, and they've gotten the biological testing down to an incredibly automated business. Each modified seed/crossbreed is ID-tracked from planting, grown in a barcoded soil pot that is enclosed so it can be handled in an automated manner: it's ferried out of the greenhouse, photographed, sampled, watered, and returned, on an incredibly tight timetable without human intervention - which lets them brute-force through a lot of the time-consuming aspects of development.

Not really applicable to humans, but if there were any way to avoid all that, they wouldn't have sunk millions into building it.
posted by Orb2069 at 1:14 PM on September 23, 2009

Lots and lots of good stuff in In the Pipeline about the drug discovery business today - some posts address these very issues.
posted by lalochezia at 2:13 PM on September 23, 2009

I keep trying to find a good way to explain this, and keep failing. So instead, here - play this game for a while! What you'll be doing is folding proteins. This is something a lot of computer cycles have been plowed into and, well, not much has happened as a result. It appears that computers are really good at finding local minima but do a relatively crappy job of finding the global minimum. High school students with web connections, on the other hand, are pretty good at it.
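A one-dimensional caricature of the local-vs-global-minimum problem: greedy descent on a toy "energy landscape" (an invented function, nothing to do with a real force field) settles into the nearer, shallower well and never reaches the deep one:

```python
# Toy energy landscape with a shallow local minimum near x=2
# and the true global minimum near x=8.

def energy(x):
    return min((x - 2) ** 2, (x - 8) ** 2 - 5)

def greedy_descent(x, step=0.1):
    """Move downhill one step at a time; stop when no neighbor is lower."""
    while energy(x + step) < energy(x) or energy(x - step) < energy(x):
        x = x + step if energy(x + step) < energy(x - step) else x - step
    return x

stuck = greedy_descent(0.0)
print(round(stuck, 1))  # 2.0 -- the local minimum, not the global one at 8
```

Real folding landscapes have astronomically many such wells in thousands of dimensions, which is why brute computation struggles and why human pattern-spotting (as in that game) can still win.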

And that's just finding stable conformations of proteins we already know the amino acid sequence of. Imagine picking and choosing amino acids as you went along and then hoping you got something that worked. There is one particular site in one version of botulinum toxin where one amino acid makes it one of the most deadly toxins known, while the other 19 render it effectively inert.
posted by Kid Charlemagne at 6:46 PM on September 23, 2009 [1 favorite]
