Sunday, February 24, 2013

Reading assignments, vol. 11

Communication of science

Peer review and publication

  • Neuroskeptic writes about the perhaps-sensical, perhaps-counterintuitive situation of stats quality in journals. It seems that high-impact journals (e.g. Science and especially Nature) are more likely than many low-impact journals to have insufficient statistical analysis. This may not be surprising, given the incentive for those journals to publish hyped-up work. Is there a similar trend in chemistry? I suspect that many medium-tier journals provide more solid experimental characterization and writeup than some of the flashier ones.
  • Scientists unsatisfied with the status quo of journal publishing practices will find this development interesting. Biologist Michael Eisen has declared that he will publicly post each paper from his lab prior to journal submission for pre-publication, community-oriented peer review. It's a refreshing idea--hopefully others will follow suit.
  • Derek Lowe points to the disappearance of the electronic-only open-access publication Journal of Advances in Developmental Research. Although he notes the relative unimportance of that particular journal, he brings up some points that open-access advocates should pay close attention to: predatory publishing and digital preservation. On a related note, Kevin Bonham writes about the premiere of a new prominent online-only journal, PeerJ.
  • The story of the recent Xi Yan plagiarism endeavor (and the journal's lackluster, non-punitive response) has been written about with proper consternation by See Arr Oh. This kind of case is amazing, as it's the kind of thing routinely warned against in undergraduate writing courses and in orientation lectures at grad programs. Despite this, the plagiarists quite often win (for another plagiarist who 'won', check out Jonah Lehrer). For those with a spare half-hour, check out the Chemjobber/See Arr Oh podcast about plagiarism and peer review. Also, the comments section at the relevant In the Pipeline posting contains a discussion of the ethics of paper submission and whose fault plagiarism is (one commenter seems to think it lies with editors/reviewers and not professors).
  • In the last two weeks, two more entries came out at Blog Syn (a Pd-catalyzed site-selective C-H olefination and an IBX-mediated benzylic oxidation). Give them a read--and submit your comments if you have suggestions or questions! Blog Syn is supposed to be a discussion-oriented endeavor (and the further updates to Blog Syn #003 illustrate that, I think). On another note, despite broad support (including from more than one big-name prof), Blog Syn does have its critics. See particularly the comments section at In The Pipeline, which is brimming with vitriol (so much that Derek Lowe jumped in to defend Blog Syn).
  • Science librarian Bonnie Swoger discusses common metrics of scientific publishing, such as h-index and impact factor, citing the importance of context in any comparisons. 

The job market

Research highlights

Thursday, February 14, 2013

A rose by any other IUPAC name...

Today is Valentine's Day, and what's more quintessentially appropriate than roses? If you're enjoying the scent of flowers today, you have chemicals (*gasp*) to thank!

Here's a few of those chemicals found naturally in the scent of various rose cultivars:

Enjoy your terpenes, everyone!

References (further reading for the brave)
  1. Flament, I., Debonneville, C., and Furrer, A. (1993). Volatile constituents of roses: Characterization of cultivars based on the headspace analysis of living flower emissions. In Bioactive Volatile Compounds from Plants, R. Teranishi, R.G. Buttery, and H. Sugisawa, eds (Washington, DC: American Chemical Society), pp. 269–281. DOI: 10.1021/bk-1993-0525.ch019
  2. Charles S. Sell. (2003). A fragrant introduction to terpenoid chemistry. Cambridge: RSC, Royal Society of Chemistry. pp. 256–257. ISBN 978-0-85404-681-2.

Tuesday, February 12, 2013

Stop using that word: Facile

Certain words tend to catch on in the scientific literature, such as "novel", which increased exponentially in usage starting in the 1980s. One of those catchy words is one that I previously* used like candy: facile.

Chemists like to describe reactions as "facile." By that, they usually mean easily-performed, smooth, or simple. You know, not much fuss involved. And indeed, that's one of the definitions (from the OED):
adj. (1) a. That can be achieved with little effort; straightforward, easy. In later use freq. in disparaging sense: contemptibly easy. b. Of instructions, a device, etc.: easy to understand or make use of; simple. c. Of a course of action, a method, etc.: presenting few difficulties.
That is the original, historical meaning. Another (more modern) definition carries a different implication (from the Oxford Pocket Dictionary):
adj. (1) ignoring the true complexities of an issue; superficial; or (2) (of a person) having a superficial or simplistic knowledge or approach
And if you type "facile" into Google, you get this immediately:
adj. (esp. of a theory or argument) Appearing neat and comprehensive only by ignoring the true complexities of an issue; superficial
Essentially, "facile" in modern usage (last century or so) has a negative connotation. And take a look at the synonyms for facile (Oxford again):
simplistic, superficial, oversimple, oversimplified, schematic, black and white; shallow, pat, glib, slick, jejune, naive
It's interesting to note that wiktionary has a chemistry-specific entry for "facile":
(chemistry) Of a reaction or other process, taking place readily.
Of course, language is defined by usage, so maybe I'm being picky here. But why not describe procedures as "straightforward", "robust", "easily performed", or any number of other, less ambiguous descriptors? (My intuition says those terms sound too common/"blue-collar" to many academics)

When did "facile" catch on? Well, from a crude PubMed search, it looks like the late 1990s was the tipping point (incidentally, PubMed makes it way easier than SciFinder to get data on this kind of thing. Sorry, CAS). Additionally, most of these references are (not surprisingly) from synthetic organic chemistry papers. See also this description of the two meanings of the word and their historical context.

The shrouded meaning of facile adds overlooked complication when authors start to invent words (after all, organic chemists are wont to derivatize things). There's a handful of examples from the literature that use "nonfacile" (which isn't even a word; go ahead, check the OED or Merriam-Webster).

Take this example from an otherwise-good trifluoroborate-preparation paper (Lennox, A.J.J., Lloyd-Jones, G.C. Angew. Chem. Int. Ed. 2012, 51(37), 9385-9388. doi: 10.1002/anie.201203930):
Replacing MeOH with diethyl ether led to co-precipitation of 2a with other potassium salts (KF/RCO2K etc.), thus making isolation of pure 2a nonfacile. Switching to MeCN kept trifluoroborate 2a in solution, but an excess of carboxylic acid (e.g. acetic or ortho-iodobenzoic acid) was still required (Scheme 2).
So if "facile" means "contemptibly easy" or "appearing neat and comprehensive only by ignoring the true complexities of an issue", what does "nonfacile" mean? Was isolation of 2a not contemptibly easy (just acceptably easy?). Did isolation merely appear difficult while really, under the surface, it was simple?

Stop using that word.

* i.e. before a reviewer pointed out the proper definition.

Sunday, February 10, 2013

Reading assignments, vol. 10

Here's the link roundup for the week:

Science communication

Denialism, chemophobia, and fraud

Chemical education & academia

Public policy


Wednesday, February 6, 2013

Wanna buy some tetrahedral centers?

Chemists: Devise syntheses for the following molecules in enantiopure form, starting from affordable, commercially available precursors.* Oh, and make sure your routes are three steps or fewer.

As conventional total syntheses, of course, that would be a tall order, and each molecule would take a substantial amount of time and effort. But in an interesting new paper (behind paywall at Nature Chemistry) by Paul Hergenrother's group at the University of Illinois at Urbana-Champaign, the authors do just that, as well as preparing numerous other, equally complex molecules. The effort introduces a new variant on diversity-oriented synthesis (DOS) which the authors dub "Complexity to Diversity"--or CtD.

Complexity to Diversity

What CtD entails is this: the authors take stereochemically defined, readily available natural products (here from three classes of biosynthetic small molecules) and perform skeletal transformations on them, using the complexity (chirality and ring structure) present in the molecules to create new compounds that have unusually high numbers of stereocenters and structural sophistication. Moreover, the compounds are produced in very few steps (from one to about five, averaging three), meaning that regardless of individual stepwise yield, any of the materials may be obtained in quantities sufficient for biological testing or (importantly to chemists) full analytical characterization (check out the SI; there's a lot of NMR data and even a crystal structure!). (This, of course, stands in stark contrast to total synthesis approaches, wherein 3 mg may be the sum total available at the end and making more means gallons of tears and sweat.)
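To put rough numbers on why step count matters so much: the overall yield of a linear sequence is just the product of the individual step yields. Here's a minimal Python sketch of that arithmetic. The step yields are made-up round numbers for illustration (they are not taken from the paper), and `overall_yield` is a hypothetical helper name.

```python
from math import prod


def overall_yield(step_yields):
    """Overall fractional yield of a linear sequence (product of step yields)."""
    return prod(step_yields)


# Hypothetical yields, for illustration only (not data from the paper).
short_route = [0.50] * 3   # three mediocre 50% steps
long_route = [0.80] * 20   # twenty respectable 80% steps

print(f"3 steps at 50% each:  {overall_yield(short_route):.1%}")
print(f"20 steps at 80% each: {overall_yield(long_route):.1%}")
```

With these made-up inputs, the three mediocre steps keep 12.5% of the material, while the twenty good steps keep only about 1.2%.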

At first, the idea of using natural products (typically thought of as targets) as starting materials may sound odd, but (1) that's how nature does it; and (2) semi-synthesis from easily-procured natural materials is a common strategy, the most common example being paclitaxel. One note, in case this wasn't clear: the authors aren't proposing this as an alternative to traditional total synthesis; rather, it's a forward approach designed to generate a library of novel compounds.

The authors employed three starting materials readily available on multigram scales: gibberellic acid (pronounced like "jib", not "gib"), adrenosterone, and quinine. These each represent a major class of biosynthetic natural products: terpenes, steroids, and alkaloids, respectively. Of course, each of these compounds has been the subject of total synthesis efforts (indeed, quinine is covered in KCN's Classics in Total Synthesis Vol. II, while adrenosterone pops up in Carreira's Classics in Stereoselective Synthesis).

The key was the use of structurally transformative reactions (i.e. ring distortions, as in the title of the paper). Take the following example: adrenosterone was submitted to sodium azide and sulfuric acid, giving an interesting tandem ring-expansion (Schmidt reaction) and ring-cleavage. This product (already non-trivially different from adrenosterone in terms of both ring structure and functional group presence) was subjected to a Luche reduction of the unsaturated ketone, giving stereoselectively an alcohol which was then acetylated.

Another example (this one from gibberellic acid): an initial treatment of gibberellic acid with aqueous base resulted in allylic rearrangement of the lactone to give the trisubstituted alkene. The carboxylic acid moiety was then subjected to amidation and a subsequent dual-purpose treatment with in situ-generated trifluoroperacetic acid, resulting in two stereoselective epoxidations and opening of one of the epoxides via a Wagner-Meerwein rearrangement.

A third example (of course, with quinine). In a (to me) pretty neat first step, an acid-catalyzed elimination (described as similar to a Hofmann) followed by a carboxybenzyl N-protection step gives a rearranged ring system that has lost one of two fused rings but produced a ketone (via the enol tautomer). The ketone is then subjected to Petasis methylenation (aka Diet Tebbe), setting it up for a nice Grubbs-catalyzed RCM ring closure to afford the cis-decalin (sort of) moiety.

There's lots more examples than that in the paper. In fact, from those three starting compounds, the authors managed to generate a decent-sized proof-of-concept library (a cool feature of the web interface of the journal is that a list of all the compounds in the paper is available here, complete with easily accessed ChemDraw files and PubChem links. There are 169 molecules listed. The SI is big. Granted, all the combichem types and HTS folks will dismiss that as a very small library, but it covers a much wider area of chemical space than a typical HTS collection, as the authors point out--I'll get to this shortly). 

The birth of the CtD

The concept of CtD is interesting, as it has its roots in two areas of organic chemistry which haven't been in vogue recently: diversity-oriented synthesis (as mentioned before), and chiral pool synthesis.

Diversity-oriented synthesis (which is conceptually similar to combinatorial chemistry but differs in its emphasis on skeletal diversity over substituent diversity) received a lot of attention when it was first championed by Stuart Schreiber, but industry hasn't adopted it as a strategy (though Schreiber and other proponents haven't given up on the concept; check out this article for an anti-malarial 'hit' generated by DOS in 2011). Derek Lowe has written about it several times, with appropriate reservations (incidentally, I'm a little amazed at how much chemist-rage Schreiber seems to induce in Derek's comment sections).

Incidentally, Hergenrother was a postdoc for Stuart Schreiber around 1999-2001, making him part of a group of several Schreiber alumni who utilize and extend DOS methodology (including Derek Tan at Sloan-Kettering and David Spring at Cambridge).

Chiral and inexpensive.
Chiral pool synthesis (aka chiral template synthesis), on the other hand, is simply using readily available chiral starting materials to build complex targets (sometimes called first-generation asymmetric synthesis). For instance, common "chirons" (i.e. chiral synthons) include amino acids and carbohydrates. Nowadays, this method has been largely supplanted by chiral auxiliaries and chiral catalysis (sometimes called second- and third-generation approaches) because of their broader scope and other advantages--you don't need a completely new starting material for the other enantiomer, for one, and the chiral reagent can be used in very, very small (hence reduced-cost) amounts if it's a catalyst.

So CtD seems to be a child of these two methods. Its birth was also likely driven by biomedical motives: Hergenrother's group is very involved in high-throughput screening (HTS) efforts for anticancer and antibacterial purposes. I mentioned that DOS gets a bad rap partially because it is an "academic exercise" without use in real industry; I noticed that Hergenrother has (non-CtD) licensing agreements with two companies (StemPar Sciences and startup Vanquish Oncology). It'll be revealing to see if CtD spills over into any industrial connections.

Natural product-like compound libraries

So why bother? Aren't there screening libraries out there? Aren't some of these libraries huge? Isn't combinatorial chemistry well-established? Can't you get, like, six thousand billion compounds and count on one being the magic winner?

Well, the authors conducted a significant cheminformatic analysis of screening collections and marketed drugs in order to support their strategy. They noted a recent survey article from J. Med. Chem.:
A recent study examined eight structural parameters (molecular weight, ClogP, polar surface area, rotatable bonds, hydrogen-bond donors and acceptors, and complexity and fraction of sp3-hybridized carbons (Fsp3)) of compounds synthesized by medicinal chemists over the past 50 years, and then compared them to marketed drugs
The point was this: the properties of screening collections don't generally match up well to marketed drugs, and in certain sub-categories of drugs (say, antibiotics) the mismatch is worse than for others (say, kinase inhibitors). Hence, HTS efforts using these collections are putatively destined for higher-than-expected inefficiency.

In analyzing the results of CtD, Hergenrother et al. chose to focus particularly on proportion of tetrahedral (vs. planar) carbons (Fsp3) and ClogP, comparing the CtD library to the ChemBridge 150,000-compound collection. See Figure 5 of the article (reproduced partially below) for the analysis, presented in shiny, colorful graphs. They demonstrate a clear difference between commercial libraries and the CtD compounds on three metrics: stereocenters, tetrahedral content (representing complexity), and ClogP.

Example of chemoinformatic analysis from paper, differentiating ChemBridge library (red) from novel CtD library (blue). Click the image for a larger (i.e. readable) version. Source: part of Figure 5 from the article (Nature).
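For anyone unfamiliar with Fsp3: it's simply the number of sp3-hybridized carbons divided by the total number of carbons in the molecule. Here's a minimal sketch, assuming hand-counted carbons for three textbook molecules (these are not compounds from the paper). A real analysis would use a cheminformatics toolkit to do the counting; `fsp3` is a hypothetical helper name.

```python
def fsp3(sp3_carbons, total_carbons):
    """Fraction of sp3-hybridized carbons: (sp3 carbon count) / (total carbon count)."""
    if total_carbons <= 0:
        raise ValueError("molecule must contain at least one carbon")
    return sp3_carbons / total_carbons


# Hand-counted (sp3 carbons, total carbons) for simple textbook molecules.
examples = {
    "benzene": (0, 6),      # all six carbons are aromatic (sp2)
    "toluene": (1, 7),      # only the methyl carbon is sp3
    "cyclohexane": (6, 6),  # fully saturated ring
}

for name, counts in examples.items():
    print(f"{name}: Fsp3 = {fsp3(*counts):.2f}")
```

Flat aromatic compounds sit near 0 on this scale and saturated ring systems sit near 1, which is the "flat and boring" versus "wedges and dashes" distinction in numerical form.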

Additionally, a matrix is shown with Tanimoto similarity coefficients (essentially a geometry-based metric of 'similarity') that indicates substantial geometrical diversification even within groups derived from a common precursor. I'm not 100% convinced on how well Tanimoto scores predict useful diversity (for instance, a compound and its enantiomer would have a coefficient of 1.0 for complete similarity, and so would brominated and fluorinated versions of each other**). Still, it's an interesting metric! Note: a Tanimoto matrix of all 169 compounds is in the SI, if you like that kind of thing.
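For reference, the Tanimoto coefficient in its most common form is a set-overlap score: the number of features two molecules share, divided by the number of features present in either. Here's a minimal sketch, assuming each molecule has already been reduced to a set of fingerprint features. The integer feature IDs below are hypothetical placeholders, and I'm not claiming this is exactly the fingerprint or software setup the authors used.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient: |A & B| / |A | B| for two feature sets."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical fingerprints (arbitrary feature IDs, for illustration only).
fp_1 = {1, 2, 3, 4}
fp_2 = {3, 4, 5, 6}

print(tanimoto(fp_1, fp_2))  # two shared features out of six total
print(tanimoto(fp_1, fp_1))  # identical feature sets
```

Under this definition, the score depends entirely on what the fingerprint encodes: if stereochemistry isn't part of the features, a compound and its enantiomer get identical sets and score 1.0, and whether a halogen swap also scores 1.0 depends on whether the fingerprint distinguishes the two atoms.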

To sum that up: the group argues that they've created a library that is more 'drug-like' (and/or natural-product-like) than traditional (read: flat and boring) screening collections. Seems reasonable, but I wish there were more chemoinformatic analysis included.

It's tricky (potential pitfalls)

The synthetic chemist in me likes this paper a lot: after all, who doesn't like a healthy dose of wedges and dashes in as few steps as possible? However, I've got a few questions. Some potential limitations:

Derivatization. Med chem efforts tend to involve lots of taking a compound and slightly modifying it a bunch of times followed by screening of the derivatives (this is why med chem articles are pretty much the most boring thing in the world to read). I worry that leads generated in this way would be difficult to conduct derivatization studies on. The authors do address this:
To demonstrate that traditional derivatization strategies can be applied even to these highly complex compounds that contain an array of chemical moieties, small libraries were synthesized based on 12 of the 49 compounds. As shown in Supplementary Fig. S4, small collections of imides, N-benzylated amides, aryl amides, amides, lactones, secondary and tertiary alcohols, epoxides, triazoles, ureas and sulfonamides were created readily from these 12 small molecules, and in this manner an additional 119 highly complex compounds were synthesized.
Still, with the strategies employed here, it's very easy to envision only being able to functionalize a small area of a given molecule--and it's also feasible that the functionalizable area would be distant from the actual pharmacophore.

Throughput. Though the output here is good, the reactivity on complex materials is often, well, rather unpredictable. Accordingly, thorough purification and characterization are needed at each step. That rather limits the high-throughput aspect of an approach like this, especially compared to combi-chem and DOS approaches that use highly predictable pathways that can be automated. After all, the idea is to generate a library. Numbers-wise, it's like comparing your bookshelf to your university library (although, if your university library has three million slightly different copies of Twilight but your bookshelf has Dostoevsky, Hugo, Hemingway, Shakespeare, and Poe, numbers might not matter).

Scope/compound selection. The authors do place some guidelines for selecting CtD compounds and reactions near the end of the paper. Still, when compared to simple, achiral starting materials, the selection of multigram-available, affordable natural products with appropriate orthogonality of functional groups seems scant. It could very well be that the good CtD compounds get taken very quickly, leaving few useful options. A lot of that depends on availability of natural products, of course--but is industry really isolating and/or making enough of these for this purpose? The authors address this, somewhat, giving a list of some suggested natural products. But it's a short list.

Does CtD walk the walk? It's interesting to see that no biological screening was reported. As one of the goals of this kind of research is to expand the scope of chemical space covered in screening collections, and by doing so, to improve screening efforts, it will be important to see if that benefit comes to fruition. The chemoinformatic analysis in the paper suggests these compounds to be more natural product-like/more drug-like--will that come to anything, practically? I hope it does. But there's a big gamble here that because complexity is correlated with many drugs (e.g. antibiotics), it'll be causative too.

(End of gloom-rant).

I do think this kind of project would be an excellent training exercise for early graduate students. Routes are short, the chemist would get exposed to a variety of reactions, structural elucidation skills would get quickly strengthened, and the results could very easily be contributed to screening libraries, potentially generating leads for biologically-driven studies.

One last thought: this work has the potential to annoy a lot of people (perhaps for bad reasons). I can see total synthesis chemists getting annoyed at the economy of steps; I can see med chemists getting annoyed at the lack of trigonal carbons and flat rings; I can see methodology or process chemists getting annoyed at the lack of optimization (since yields here aren't important); and I can see chemical biologists being confused as to whether this is or isn't just a rehash of DOS.

But I think it's a cool paper.

Comparison of synthetic approaches*** (a) Target-oriented synthesis; (b) Medicinal chemistry/lead optimization; (c) Diversity-oriented synthesis.

Note: this journal article has also been covered in C&EN and by Chemistry Cascade.

* Of course, this instruction is disingenuous, given that retrosynthetic analysis is not really feasible here and it's not target-oriented synthesis anyway, but hey. 
** I think.
*** Alternate interpretation: total synthesis is not as good as Come On Eileen, med chem is better than but pretty much as boring as Nickelback, and DOS confuses as many people as David Bowie.
**** I'm guilty of using lots of footnotes. Sorry, See Arr Oh!

[Edit: fixed minor typographical errors.]

Sunday, February 3, 2013

Reading assignments, vol. 9

This week's stuff is pretty heavily communication-themed; a lot of that is going around with the ScienceOnline 2013 deal having gone down. Anyway, here's some general science enjoyment:


  • Chemophobia has been a major topic of the social-media-dom recently due to the ScienceOnline 2013 conference. In particular, Saturday marked the chemophobia-specific portion of the conference (Session 8A), which included a contemporaneous Twitter discussion via the hashtag #chemophobia. For those who had to work this Saturday (woo, columns, woo) the session notes have been posted online, and there's a quite impressive wiki entry containing an abundance of relevant and interesting chemophobia-related links and discussions.
  • Michelle at The Culture of Chemistry has a thoughtful analysis of a recent chemophobia-rife New York Times story; she points to language and how it affects perception of concepts.
  • Paul at ChemBark shares his tips and proposed strategies for how to combat chemophobia. It's a good read that sums up the origins and dangers of chemophobia pretty well. The recommendations are good, too: ACS should be doing its part (come on, guys!) but graduate students and faculty need to take it upon themselves to do outreach, regardless of the perceived waste of time. (That being said, the hostile intellectual atmosphere and the rough job market make spending any time on outreach seem unappealing to those trying to get as many ninth-author Tet. Lett. papers as possible published before graduating).
  • Don't miss this latest Chemjobber podcast, wherein he discusses chemophobia and chemical communication with freelance writer/chemist Rebecca Guenard. 

Science communication

  • See Arr Oh pokes fun at general features of chemistry blog entries.
  • I found this guest post by Frank Swain both insightful and heartening. He writes of his UK-based BenchPress Project, which seeks, among other things, to have volunteer scientists give guest lectures to journalism students. The goal is to increase science and math (maths) literacy among journalists. I think it's a pretty important effort; even if scientists themselves try to do outreach and writing, journalists have the broadest audience and the means to reach them. Changes in science communication have to come from within both sectors!
  • David Rubenson argues that despite a growing need for science communication, the quality of science communication has been in decline. He points to several symptoms (e.g. cluttered slides) and causative agents (e.g. overstretched researchers). I found significant his reference to two Nobelists who published infrequently (also, it reminded me of Daniel Day-Lewis).
  • Always-interesting and often-controversial, Keith Kloor discusses the relative importance of general science literacy and news literacy. He argues for the importance of the latter (while not neglecting the former); in particular, he calls for news literacy to have a place in education. It shouldn't be an unfamiliar concept to scientists, who (should) be experienced at evaluating credibility of sources.
  • UIUC anthropologist and science blogger Kate Clancy has an interesting piece (relevant to anyone who uses social media, especially those who write) about the pros and cons of filling out your online presence with your real identity.

Pseudoscience and denialism


[Edit: I forgot Brandon Findlay's columns this week! Urp!]