The question of how new protein functions arise in evolution via duplication has a certain “chicken and egg” flavor to it. It’s clear that genes with new functions do often arise after older genes undergo a duplication event, but which comes first, the new function or the duplication? For a long time it seemed obvious that the duplication would come first. If the gene you start with has an important function, the number of mutations you can tolerate in that gene is limited. But if, by chance, that gene gets duplicated, evolution has a new degree of freedom. One copy must continue to perform the important function, and any deleterious mutations in that copy must be expunged. But the other copy isn’t important any more: it’s free to accumulate mutations and perhaps find its way to a new function. This is called the “mutation during nonfunctionality” model, or we might call it the “chicken first” model.
The “egg first” model (i.e., the idea that the new function may come before the duplication) was proposed about 15 years ago by Austin Hughes. He argued that a duplicate gene lying around in the genome accumulating mutations isn’t necessarily a good, or neutral, thing for the organism, and that there’s evidence that duplicate genes are either silenced, or under selection to continue to produce functional protein. (One reason for this might be the toxic effects of misfolded proteins, for example.) He put this together with then-recent evidence that a single gene could encode two functions — the lens protein tau-crystallin and the enzyme alpha-enolase had recently been shown to be the same protein — and made the following suggestion: perhaps many proteins have some level of multifunctionality, but it’s not always possible for both functions to be optimized at once. If so, then an accidental duplication of the gene encoding a multifunctional protein would allow the two gene copies to evolve in different directions and optimize different functions of the protein. This has been called “escape from adaptive conflict”.
Since then, many more multifunctional proteins have been discovered (though the term multifunctional is not as precise as it sounds: one might argue that any protein with more than one binding partner is multifunctional), perhaps increasing the plausibility of the “egg first” model. But showing that multifunctional proteins can exist, or that they can give rise to two single-function proteins is not enough; we need examples where the two functions in a single protein are in conflict, unable to reach their full potential without destroying each other. Then the importance of gene duplications would become clear.
A couple (but only a couple) of nice examples have been described in which a gene encoding a protein with two different functions in a single pathway was duplicated and (arguably) resolved an adaptive conflict, with the added wrinkle that in one case the main conflict appears to be in the regulation of the gene, not its function. Now, a new paper (Deng et al. 2010, Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict, PNAS, doi:10.1073/pnas.1007883107) describes what is in some ways a more dramatic example: the apparent resolution of an adaptive conflict between a cytoplasmic enzyme, sialic acid synthase, and a secreted plasma protein that acts as an antifreeze.
Antifreeze proteins evolved relatively recently in multiple lineages of fish. There are many different antifreeze proteins, and the ancestral genes from which they evolved have many different functions; so this seems like a good place to look for evidence of either chicken-first or egg-first events. The antifreeze protein that Deng et al. study is called AFPIII, and it’s found in eel pouts, ocean pouts and wolffishes. Looking at the genome of the eel pout Lycodichthys dearborni and comparing it with the genome of a related fish that lacks AFPIII, Deng et al. came to the conclusion that the AFPIII gene originally arose from a 12kb duplication of a region of the genome that contained the genes for sialic acid synthase, of which there are two, SAS-A and SAS-B. The new gene was then itself duplicated many times, presumably because the fish needed lots of antifreeze; the locus now contains over 30 copies of the AFPIII gene. Fortunately, there’s an AFPIII pseudogene in the locus as well, a copy of AFPIII that got silenced along the way and therefore stopped being under selective pressure; this gives the authors a useful “fossil” of an early stage in the evolution of the antifreeze function. Using information from the AFPIII pseudogene, the authors show that the ancestor of AFPIII was more closely related to SAS-B than SAS-A. Both of the SAS genes have 6 exons, and presumably the ancestor of SAS-B from which AFPIII evolved did too; AFPIII has lost exons 2-5 entirely, retaining only exon 6 and a portion of exon 1. It turns out that exon 1 of AFPIII includes a stretch that is homologous to a segment of untranslated sequence before exon 1 of SAS-B; this segment provides most of the new signal peptide responsible for secretion of the protein.
Was the ancestor of SAS-B bifunctional? Almost certainly, because the modern form of SAS-B is. Deng et al. cloned SAS-B, expressed it in bacteria, and tested its effects on ice crystal growth. Normally, ice crystals form as discs at 0°C. In the presence of 2 mg/ml SAS-B the shape of the crystal changes, becoming hexagonal, and the freezing point is slightly depressed. AFPIII also causes the hexagonal shape change, and does better at depressing the freezing point. The portion of SAS-B that is responsible for its modest antifreeze effect is the C-terminal domain: the very portion of the protein that’s encoded by exon 6, the only exon to be completely retained in AFPIII.
Was there adaptive conflict between the antifreeze function and the sialic acid synthase function in the SAS-B precursor? Yes, at least in one direction: optimizing antifreeze function would probably have been hard while maintaining SAS function. As noted above, AFPIII managed to evolve a secretion signal peptide out of some bits of untranslated sequence and the SAS-B exon 1. Sialic acid synthase is an enzyme that you want to keep inside the cell; but you want antifreeze proteins to be everywhere. Leaving location aside, Deng et al. also asked whether SAS-B could have become more efficient as an antifreeze protein without losing its sialic acid synthase activity. They did this by taking residues from AFPIII that are important for its antifreeze function, and inserting them into the SAS-B sequence. The resulting “hybrid” protein had no detectable SAS activity at all, showing that improved antifreeze function was unlikely to evolve (at least via this route) if the two functions continued to share one gene.
To be a perfect “escape from adaptive conflict” story, we need one more piece: the SAS-B gene should also have been released from a constraint (the constraint of maintaining the modest antifreeze activity of its precursor), and should have been freed to evolve improved function once the constraint was removed. This part of the story is shakier, since after all SAS-B still does have antifreeze activity; but SAS was around for a long time before the need for an antifreeze protein popped up, so it would be expected to be quite an efficient enzyme already. Deng et al. argue that both SAS-B and (especially) AFPIII evolved rapidly after the duplication, using the inferred ratio of non-synonymous changes to synonymous changes as evidence (non-synonymous changes, dN, alter amino acid identity, while synonymous changes, dS, do not; the ratio dN/dS is taken as a measure of the strength of Darwinian selection). One might be able to quibble with this for SAS-B — we don’t know what the ancestral SAS-B looked like and we can’t be sure that the modern one has improved function — but there is no question that AFPIII did indeed get better at its job, and could not have done so (at least along the route it took) without the separation in functions that the gene duplication allowed. The whole story seems like good evidence that eggs can indeed come first; whether this is the only, or the main way that duplication helps novel protein functions to evolve remains unclear.
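For readers who want to see what the dN/dS ratio mentioned above looks like in practice, here is a minimal sketch. It assumes you have already counted the synonymous and non-synonymous differences between two aligned coding sequences, along with the number of sites of each kind, and it applies the standard Jukes-Cantor correction for multiple substitutions at the same site. The function names and the input counts are hypothetical, chosen for illustration only; this is not necessarily the method Deng et al. used.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor correction: convert an observed proportion of
    differing sites into an estimated number of substitutions per site."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return dN/dS from pre-counted differences and sites.

    All four inputs are hypothetical counts supplied by the caller;
    counting them from real sequences is a separate step not shown here.
    """
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Made-up counts: few amino-acid-changing differences relative to silent ones.
ratio = dn_ds(nonsyn_diffs=10, nonsyn_sites=300, syn_diffs=10, syn_sites=100)
print(f"dN/dS = {ratio:.2f}")
```

As a rough guide, a ratio well below 1 is usually read as purifying selection (amino acid changes are being weeded out), a ratio near 1 as neutral drift, and a ratio above 1 as a sign of positive selection for change.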
A side note: in researching this article I found myself almost irrevocably stuck in the fascinating question of how antifreeze proteins work, and how hard or easy you would expect it to be to evolve an antifreeze function. (If antifreeze function is too easy to evolve, maybe almost any protein is “bifunctional”; the fact that it’s evolved in many different ways suggests this may be true. On the other hand, if the need for antifreeze function is so recent, maybe we haven’t had time to evolve a really complicated antifreeze protein yet. Do all recently evolved functions look “easy”?) I eventually extricated myself by deciding that an explanation of the biophysical mechanism of AFPIII is not essential for this story; those of you who are made of sterner stuff than I am may want to read more, in which case I suggest you start here.
Deng C, Cheng CH, Ye H, He X, & Chen L (2010). Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proceedings of the National Academy of Sciences of the United States of America PMID: 21115821