Wednesday, February 6, 2013

Wanna buy some tetrahedral centers?

Chemists: Devise syntheses for the following molecules in enantiopure form, starting from affordable, commercially available precursors.* Oh, and make sure your routes are three steps or fewer.


As conventional total syntheses, of course, that would be a tall order, and each molecule would take a substantial amount of time and effort. But in an interesting new paper (behind a paywall at Nature Chemistry) by Paul Hergenrother's group at the University of Illinois at Urbana-Champaign, the authors do just that, and prepare numerous other, equally complex molecules as well. The effort introduces a new variant on diversity-oriented synthesis (DOS), which the authors dub "Complexity to Diversity"--or CtD.

Complexity to Diversity


What CtD entails is this: the authors take stereochemically defined, readily available natural products (here from three classes of biosynthetic small molecules) and perform skeletal transformations on them, using the complexity (chirality and ring structure) present in the molecules to create new compounds that have unusually high numbers of stereocenters and structural sophistication. Moreover, the compounds are produced in very few steps (from one to about five, averaging three), meaning that regardless of individual stepwise yield, any of the materials may be obtained in quantities sufficient for biological testing or (importantly to chemists) full analytical characterization (check out the SI; there's a lot of NMR data and even a crystal structure!). (This, of course, stands in stark contrast to total synthesis approaches, wherein 3 mg may be the sum total available at the end and making more means gallons of tears and sweat.)
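To put rough numbers on the "few steps beats good yields" point, here is a minimal sketch of how overall yield compounds along a linear route. The step yields and the 1 g starting quantity are hypothetical values chosen for illustration, not figures from the paper, and the milligram estimate ignores changes in molecular weight.

```python
from math import prod


def overall_yield(step_yields):
    """Overall fractional yield of a linear route: the product of its step yields."""
    return prod(step_yields)


# Hypothetical routes (assumed numbers, not taken from the paper).
short_route = overall_yield([0.50] * 3)   # three mediocre 50% steps
long_route = overall_yield([0.80] * 20)   # twenty respectable 80% steps

start_mg = 1000  # assume 1 g of starting material
for label, y in [("3 steps at 50%", short_route), ("20 steps at 80%", long_route)]:
    print(f"{label}: {y:.1%} overall, roughly {start_mg * y:.0f} mg from {start_mg} mg")
```

Under these assumptions the short, sloppy route still returns about an eighth of the material, while the long, well-behaved one returns only about one percent.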

At first, the idea of using natural products (typically thought of as targets) as starting materials may sound odd, but (1) that's how nature does it; and (2) semi-synthesis from easily procured natural materials is a common strategy, the most common example being paclitaxel. One note, in case this wasn't clear: the authors aren't proposing this as an alternative to traditional total synthesis; rather, it's a forward approach designed to generate a library of novel compounds.

The authors employed three starting materials readily available on multigram scales: gibberellic acid (pronounced like "jib", not "gib"), adrenosterone, and quinine. These each represent a major class of biosynthetic natural products: terpenes, steroids, and alkaloids, respectively. Of course, each of these compounds has been the subject of total synthesis efforts (indeed, quinine is covered in KCN's Classics in Total Synthesis Vol. II, while adrenosterone pops up in Carreira's Classics in Stereoselective Synthesis).


The key was the use of structurally transformative reactions (i.e. ring distortions, as in the title of the paper). Take the following example: adrenosterone was submitted to sodium azide and sulfuric acid, giving an interesting tandem ring-expansion (Schmidt reaction) and ring-cleavage. This product (already non-trivially different from adrenosterone in terms of both ring structure and functional group presence) was subjected to a Luche reduction of the unsaturated ketone, giving stereoselectively an alcohol which was then acetylated.


Another example (this one from gibberellic acid): an initial treatment of gibberellic acid with aqueous base resulted in allylic rearrangement of the lactone to give the trisubstituted alkene. The carboxylic acid moiety was then subjected to amidation and a subsequent dual-purpose treatment with in situ-generated trifluoroperacetic acid, resulting in two stereoselective epoxidations and opening of one of the epoxides via a Wagner-Meerwein rearrangement.


A third example (of course, with quinine). In a (to me) pretty neat first step, an acid-catalyzed elimination (described as similar to a Hofmann) followed by a carboxybenzyl N-protection step gives a rearranged ring system that has lost one of two fused rings but produced a ketone (via the enol tautomer). The ketone is then subjected to Petasis methylenation (aka Diet Tebbe), setting it up for a nice Grubbs-catalyzed RCM ring closure to afford the cis-decalin (sort of) moiety.



There are many more examples than that in the paper. In fact, from those three starting compounds, the authors managed to generate a decent-sized proof-of-concept library. (A cool feature of the journal's web interface is that a list of all the compounds in the paper is available here, complete with easily accessed ChemDraw files and PubChem links. There are 169 molecules listed. The SI is big.) Granted, all the combichem types and HTS folks will dismiss that as a very small library, but it covers a much wider area of chemical space than a typical HTS collection, as the authors point out--I'll get to this shortly.

The birth of the CtD


The concept of CtD is interesting, as it has its roots in two areas of organic chemistry which haven't been in vogue recently: diversity-oriented synthesis (as mentioned before), and chiral pool synthesis.

Diversity-oriented synthesis (which is conceptually similar to combinatorial chemistry but differs in its emphasis on skeletal diversity over substituent diversity) received a lot of attention when it was first championed by Stuart Schreiber, but industry hasn't adopted it as a strategy (though Schreiber and other proponents haven't given up on the concept; check out this article for an anti-malarial 'hit' generated by DOS in 2011). Derek Lowe has written about it several times, with appropriate reservations (incidentally, I'm a little amazed at how much chemist-rage Schreiber seems to induce in Derek's comment sections).

Incidentally, Hergenrother was a postdoc for Stuart Schreiber around 1999-2001, making him part of a group of several Schreiber alumni who utilize and extend DOS methodology (including Derek Tan at Sloan-Kettering and David Spring at Cambridge).

Chiral and inexpensive.
Chiral pool synthesis (aka chiral template synthesis), on the other hand, is simply using readily available chiral starting materials to build complex targets (sometimes called first-generation asymmetric synthesis). For instance, common "chirons" (i.e. chiral synthons) include amino acids and carbohydrates. Nowadays, this method has been largely supplanted by chiral auxiliaries and chiral catalysis (sometimes called second- and third-generation approaches) because of their broader scope and other advantages--you don't need a completely new starting material for the other enantiomer, for one, and the chiral reagent can be used in very, very small (hence reduced-cost) amounts if it's a catalyst.

So CtD seems to be a child of these two methods. Its birth was also likely motivated by biomedical goals: Hergenrother's group is very involved in high-throughput screening (HTS) efforts for anticancer and antibacterial purposes. I mentioned that DOS gets a bad rap partially because it is an "academic exercise" without use in real industry; I noticed that Hergenrother has (non-CtD) licensing agreements with two companies (StemPar Sciences and startup Vanquish Oncology). It'll be revealing to see if CtD spills over into any industrial connections.

Natural product-like compound libraries


So why bother? Aren't there screening libraries out there? Aren't some of these libraries huge? Isn't combinatorial chemistry well-established? Can't you get, like, six thousand billion compounds and count on one being the magic winner?

Well, the authors conducted a significant cheminformatic analysis of screening collections and marketed drugs in order to support their strategy. They noted a recent survey article from J. Med. Chem.:
A recent study examined eight structural parameters (molecular weight, ClogP, polar surface area, rotatable bonds, hydrogen-bond donors and acceptors, and complexity and fraction of sp3-hybridized carbons (Fsp3)) of compounds synthesized by medicinal chemists over the past 50 years, and then compared them to marketed drugs
The point was this: the properties of screening collections don't generally match up well to marketed drugs, and in certain sub-categories of drugs (say, antibiotics) the mismatch is worse than for others (say, kinase inhibitors). Hence, HTS efforts using these collections are putatively destined for higher-than-expected inefficiency.

In analyzing the results of CtD, Hergenrother et al. chose to focus particularly on the proportion of tetrahedral (vs. planar) carbons (Fsp3) and ClogP, comparing the CtD library to the ChemBridge 150,000-compound collection. See Figure 5 of the article (reproduced partially below) for the analysis, presented in shiny, colorful graphs. They demonstrate a clear difference between commercial libraries and the CtD compounds on three metrics: stereocenters, tetrahedral content (representing complexity), and ClogP.

Example of chemoinformatic analysis from paper, differentiating ChemBridge
library (red) from novel CtD library (blue). Source: part of Figure 5 from the
article (Nature).
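For anyone unfamiliar with Fsp3, it is just the number of sp3-hybridized carbons divided by the total number of carbons. Here is a minimal sketch using hand-counted carbons rather than a cheminformatics toolkit. The function name is made up for illustration, and the quinine count (9 sp3 carbons out of 20: seven in the quinuclidine, the carbinol carbon, and the methoxy carbon) is my own tally, not a number quoted from the paper.

```python
def fsp3(sp3_carbons, total_carbons):
    """Fraction of sp3-hybridized carbons: sp3 carbon count / total carbon count."""
    if total_carbons <= 0:
        raise ValueError("molecule must contain at least one carbon")
    if not 0 <= sp3_carbons <= total_carbons:
        raise ValueError("sp3 carbon count must lie between 0 and the total")
    return sp3_carbons / total_carbons


# (sp3 carbons, total carbons), counted by hand for illustration.
examples = {
    "benzene": (0, 6),       # completely flat
    "cyclohexane": (6, 6),   # completely tetrahedral
    "quinine": (9, 20),      # hand count (assumption, see text above)
}

for name, (n_sp3, n_c) in examples.items():
    print(f"{name}: Fsp3 = {fsp3(n_sp3, n_c):.2f}")
```

A flat aromatic compound sits near 0 and a saturated one near 1, which is the axis the paper uses to separate the two libraries.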

Additionally, a matrix is shown with Tanimoto similarity coefficients (essentially a geometry-based metric of 'similarity') that indicates substantial geometrical diversification even within groups derived from a common precursor. I'm not 100% convinced on how well Tanimoto scores predict useful diversity (for instance, a compound and its enantiomer would have a coefficient of 1.0 for complete similarity, and so would brominated and fluorinated versions of each other**). Still, it's an interesting metric! Note: a Tanimoto matrix of all 169 compounds is in the SI, if you like that kind of thing.
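As a rough illustration of the Tanimoto idea: if each molecule is reduced to a set of features (a "fingerprint"), the coefficient is the number of shared features divided by the number of features present in either molecule. The sketch below uses made-up feature labels, not real fingerprints, and I am not assuming anything about which fingerprint the authors actually used. It does show why a stereo-blind fingerprint would score a compound and its enantiomer as identical. Whether a bromo/fluoro pair also scores 1.0 depends on whether the fingerprint encodes the element.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(a & b) / len(union)


# Hypothetical feature sets (illustrative labels only).
parent = {"C=O", "OH", "ring6", "ring5", "C=C"}
enantiomer = set(parent)  # a stereo-blind fingerprint sees mirror images as the same set
ring_expanded = {"C=O", "OH", "ring6", "ring7", "lactam"}

print(f"parent vs enantiomer:    {tanimoto(parent, enantiomer):.2f}")
print(f"parent vs ring-expanded: {tanimoto(parent, ring_expanded):.2f}")
```

In this toy case the ring-expanded compound shares three of seven total features with its parent, so a single skeletal change already drops the score well below 1.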

To sum that up: the group argues that they've created a library that is more 'drug-like' (and/or natural-product-like) than traditional (read: flat and boring) screening collections. Seems reasonable, but I wish there were more chemoinformatic analysis included.

It's tricky (potential pitfalls)


The synthetic chemist in me likes this paper a lot: after all, who doesn't like a healthy dose of wedges and dashes in as few steps as possible? However, I've got a few questions. Some potential limitations:

Derivatization. Med chem efforts tend to involve lots of taking a compound and slightly modifying it a bunch of times followed by screening of the derivatives (this is why med chem articles are pretty much the most boring thing in the world to read). I worry that leads generated in this way would be difficult to conduct derivatization studies on. The authors do address this:
To demonstrate that traditional derivatization strategies can be applied even to these highly complex compounds that contain an array of chemical moieties, small libraries were synthesized based on 12 of the 49 compounds. As shown in Supplementary Fig. S4, small collections of imides, N-benzylated amides, aryl amides, amides, lactones, secondary and tertiary alcohols, epoxides, triazoles, ureas and sulfonamides were created readily from these 12 small molecules, and in this manner an additional 119 highly complex compounds were synthesized.
Still, with the strategies employed here, it's very easy to envision only being able to functionalize a small area of a given molecule--and it's also feasible that the functionalizable area would be distant from the actual pharmacophore.

Throughput. Though the output here is good, the reactivity of complex materials is often, well, rather unpredictable. Accordingly, thorough purification and characterization are needed at each step. That rather limits the high-throughput aspect of an approach like this, especially compared to combi-chem and DOS approaches that use highly predictable pathways that can be automated. After all, the idea is to generate a library. Numbers-wise, it's like comparing your bookshelf to your university library (although, if your university library has three million slightly different copies of Twilight but your bookshelf has Dostoevsky, Hugo, Hemingway, Shakespeare, and Poe, numbers might not matter).

Scope/compound selection. The authors do lay out some guidelines for selecting CtD compounds and reactions near the end of the paper. Still, when compared to simple, achiral starting materials, the selection of multigram-available, affordable natural products with appropriate orthogonality of functional groups seems scant. It could very well be that the good CtD compounds get taken very quickly, leaving few useful options. A lot of that depends on availability of natural products, of course--but is industry really isolating and/or making enough of these for this purpose? The authors address this, somewhat, giving a list of some suggested natural products. But it's a short list.

Does CtD walk the walk? It's interesting to see that no biological screening was reported. As one of the goals of this kind of research is to expand the scope of chemical space covered in screening collections, and by doing so, to improve screening efforts, it will be important to see if that benefit comes to fruition. The chemoinformatic analysis in the paper suggests these compounds to be more natural product-like/more drug-like--will that come to anything, practically? I hope it does. But there's a big gamble here that because complexity is correlated with many drugs (e.g. antibiotics), it'll be causative too.

(End of gloom-rant).

I do think this kind of project would be an excellent training exercise for early graduate students. Routes are short, the chemist would get exposed to a variety of reactions, structural elucidation skills would get quickly strengthened, and the results could very easily be contributed to screening libraries, potentially generating leads for biologically-driven studies.

One last thought: this work has the potential to annoy a lot of people (perhaps for bad reasons). I can see total synthesis chemists getting annoyed at the economy of steps; I can see med chemists getting annoyed at the lack of trigonal carbons and flat rings; I can see methodology or process chemists getting annoyed at the lack of optimization (since yields here aren't important); and I can see chemical biologists being confused as to whether this is or isn't just a rehash of DOS.

But I think it's a cool paper.

Comparison of synthetic approaches*** (a) Target-oriented synthesis;
(b) Medicinal chemistry/lead optimization; (c) Diversity-oriented synthesis.


Note: this journal article has also been covered in C&EN and by Chemistry Cascade.

* Of course, this instruction is disingenuous, given that retrosynthetic analysis is not really feasible here and it's not target-oriented synthesis anyway, but hey. 
** I think.
*** Alternate interpretation: total synthesis is not as good as Come On Eileen, med chem is better than but pretty much as boring as Nickelback, and DOS confuses as many people as David Bowie.
**** I'm guilty of using lots of footnotes. Sorry, See Arr Oh!

[Edit: fixed minor typographical errors.]

1 comment:

  1. A fair bit of my own work has started with fully formed natural products. It's a very different feeling, and (in my case) has instilled a strong appreciation for the Greene Protective Groups book.

    One thing that stuck out when I read this paper was how old most of the natural products were. There's a large body of synthetic work around all of them, which I think gave the Hergenrother group a significant edge. They didn't have to adapt these transformations to their scaffolds, just chain them together.
