Free Republic
Browse · Search
News/Activism
Topics · Post Article


Systemic determinants of gene evolution and function
Molecular Systems Biology ^ | 9/13/05 | Eugene V Koonin

Posted on 10/03/2005 1:17:35 PM PDT by <1/1,000,000th%

What determines a gene's evolutionary rate? In particular, does it depend solely on functional constraints imposed on the structure of the encoded protein or are there higher-level factors related to the selection at the organismal level? These questions seem to be among the most fundamental ones in biology because comprehensive answers will reveal the nature of the links between genome evolution and the phenotypes of organisms. A recent study by Wall et al (2005) proves more convincingly than ever before that systemic determinants of gene evolution rate do exist, and an intriguing paper by Fraser (2005) sheds light on some of the underlying mechanisms. However, a recent report by Coulomb et al (2005) issues an important warning by showing that some of the intuitively plausible connections discovered by Systems Biology may be due to biases in the data.

Nearly 30 years ago, Wilson et al (1977) put forward a general proposition that may be called the rate-dispensability conjecture: the evolutionary rate should be a function of, firstly, the constraints on the function of the given gene (protein) and, secondly, the 'importance' (fitness effect of knockout or dispensability) of the gene for the organism, that is, R_i = f(P_i) f(Q_i), where R_i is the rate of evolution of the given protein, P_i is the probability that a substitution is compatible with the function of this protein, and Q_i is the probability that the organism survives and reproduces without this protein.

The prediction, thus, is that essential (indispensable) genes, on average, should evolve slower than nonessential genes. This conjecture generally follows from Kimura's neutral theory of evolution but is nontrivial given the broad variance of structural-functional constraints on proteins, regardless of their dispensability; in principle, this variance could completely explain the distribution of evolutionary rates among genes without invoking the fitness connection. Thus, empirical tests of the conjecture are of interest, and such tests were conducted as soon as the combination of genome sequences and genome-wide knockout fitness effect data became available. The results, however, were ambiguous. The first attempt by Hurst and Smith (1999) involving only 100 orthologous human and mouse genes, for which knockout effect data in mouse were available, failed to detect the predicted connection. A subsequent study by Hirsh and Fraser (2001) dealt with 300 yeast genes, with quantitative fitness effect data taken from the results of a genome-wide measurement in yeast and the rates derived from a comparison with the nematode orthologs. These authors reported a weak but statistically significant negative correlation between the knockout fitness effect and evolution rate, in accord with the Wilson conjecture. However, when the genes were classified into two categories, essential and nonessential, no significant difference in rates was detected. In contrast, Jordan et al analyzed much larger sets of orthologous genes in bacteria for which knockout data were available and came to the conclusion that essential genes, indeed, on average, evolved slower than nonessential ones (Jordan et al, 2002).

The issue has been further confounded by two studies that examined partial correlations between evolution rate, fitness effect, and expression level of a gene and concluded that the link between evolution rate and fitness effect vanished once expression level was taken into account (Pal et al, 2003; Rocha and Danchin, 2004).

A recent study by Wall et al (2005) makes major strides to finally settle the issue. These authors produced robust estimates of short-term evolutionary rates for >3000 orthologous gene sets from four yeast species of the genus Saccharomyces and compared them with two independent data sets on the phenotypic effects of yeast gene knockouts and two measures of gene expression (experimentally determined mRNA abundance and codon adaptation index). Now, partial correlation analysis gave an unequivocal answer: a gene's evolutionary rate significantly depends both on its dispensability and on expression level, and the contributions of these two variables are, largely, independent. Thus, 'important' genes and genes that are highly expressed tend to evolve slowly, supporting and extending Wilson's conjecture.
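
For readers unfamiliar with partial correlation, here is a minimal sketch of the textbook first-order formula, which asks how strongly two variables are correlated once a third is held fixed. It is only an illustration: the function names and the toy numbers are made up, and it uses plain Pearson correlation, which is an assumption and not necessarily the statistic used by Wall et al.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z.

    Textbook first-order formula:
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical toy values for six genes (NOT real data).
rate = [0.9, 0.7, 0.8, 0.4, 0.3, 0.2]            # evolutionary rate
dispensability = [0.8, 0.9, 0.5, 0.4, 0.3, 0.1]  # fitness after knockout
expression = [1, 2, 3, 5, 8, 9]                  # expression level

print("raw correlation:    ", pearson(rate, dispensability))
print("partial correlation:", partial_corr(rate, dispensability, expression))
```

If the partial correlation between rate and dispensability stays clearly away from zero after controlling for expression, the two predictors are contributing at least partly independently, which is the kind of result the paragraph above describes.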

This is not the final word on the connection between evolutionary rate, dispensability, and expression, as much work remains to be carried out to obtain reliable quantitative estimates of the strength of the dependences involved. It does seem, however, that, at least for yeast, the reality of these links is now established beyond reasonable doubt. The simple and not particularly new methodological lesson from this work is that, in many cases, careful analysis of improved data sets will do more to resolve a fundamental scientific issue than sophisticated theoretical considerations.

Gene dispensability and expression level are not the only functional variables that have been linked to the evolution rate. In the current era of Systems Biology, many researchers have been particularly intrigued by the possibility that gene evolution is affected by the topology of various interaction networks. In particular, a negative correlation has been reported between evolutionary rate and a gene's node degree both in protein-protein interaction networks (Fraser et al, 2002) and in coexpression networks (Jordan et al, 2004). In other words, genes that interact with many other genes either at the level of coexpression or through physical interaction between their protein products tend to evolve slowly.

However, at least the connection between a protein's position in the interaction network and evolutionary rate has been no less contentious than the link with dispensability. Subsequent to the original report on the correlation, one re-analysis failed to confirm the overall connection although the most prolific interactors (network hubs) did seem to evolve slowly (Jordan et al, 2003), whereas another study denied the link altogether, suggesting that it was an artifact of protein abundance (Bloom and Adami, 2003).

A recent study by Fraser (2005) seems to clarify the issue and provides an intriguing insight into the evolutionary forces that may be at play in network evolution. Fraser partitioned the interaction network hubs into two classes and showed that they dramatically differ in terms of the connection with the evolutionary rate (or, more precisely, the strength of purifying selection measured as the ratio of the rates for synonymous and nonsynonymous positions in coding sequences).

It turns out that hubs that interact with numerous partners within a network module (intramodule hubs, also known under the more appealing name of 'party hubs'; Han et al, 2004), indeed, are strongly constrained and evolve much slower than either proteins that have no partners at all or intermodule hubs ('date hubs'; Han et al, 2004) that interact with partners from different modules. The intermodule hubs are only slightly more constrained than noninteractors. This observation leads to the intuitively plausible hypothesis that organization and functions of network modules tend to be conserved during evolution, whereas intermodule hubs are involved in network rewiring and could be foci of innovation.
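
The distinction between the two hub classes can be sketched in a few lines. This is a rough illustration only, not the procedure of Fraser (2005) or Han et al (2004): the function name, the degree cutoff, the 80% threshold, and the toy network are all assumptions made for the example.

```python
from collections import defaultdict

def classify_hubs(edges, module_of, min_degree=3, intra_fraction=0.8):
    """Label well-connected proteins as intramodule or intermodule hubs.

    edges          -- iterable of (protein_a, protein_b) interaction pairs
    module_of      -- dict mapping each protein to its module label
    min_degree     -- hypothetical cutoff for calling a protein a hub
    intra_fraction -- hypothetical share of same-module partners needed
                      to call a hub 'intramodule'
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    labels = {}
    for protein, partners in neighbors.items():
        if len(partners) < min_degree:
            continue  # not a hub under this cutoff
        same = sum(1 for p in partners if module_of[p] == module_of[protein])
        if same / len(partners) >= intra_fraction:
            labels[protein] = "intramodule"
        else:
            labels[protein] = "intermodule"
    return labels

# Made-up toy network: A sits inside module 1, X bridges modules 1, 2 and 3.
edges = [("A", "B"), ("A", "C"), ("A", "D"),
         ("X", "B"), ("X", "E"), ("X", "F")]
module_of = {"A": 1, "B": 1, "C": 1, "D": 1, "X": 2, "E": 2, "F": 3}

print(classify_hubs(edges, module_of))
```

With labels like these in hand, one could compare the evolutionary rates of the two groups, which is the comparison the paragraph above summarizes.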

Taken together, these recent studies make, perhaps, relatively small but concrete inroads into the domain of Evolutionary Systems Biology (Medina, 2005). This area of inquiry is just making its baby steps, and the road ahead will be long and hard. That this is so is demonstrated by the recent analysis of Coulomb et al (2005), which, while not dealing directly with evolution, is an important note of caution for systems biologists. These authors take on the connection between essentiality and a gene's position in biological networks, in particular, genome-wide networks of protein-protein interactions. It seems intuitively almost obvious that genes with many connections (network hubs) are 'important' and should be essential more often than poorly connected genes; of course, this is perfectly compatible with the observations on slow evolution of both network hubs and essential genes discussed above. Indeed, such a connection between 'centrality and lethality' has been reported by several groups (Jeong et al, 2001); apparent links between a gene's essentiality and other topological characteristics of networks, such as clustering coefficient, also have been reported (Yu et al, 2004). However, Coulomb et al (2005) argue that these effects were caused by biases in the analyzed interaction data that contained a greater number of valid interactions for essential genes. When a supposedly unbiased data set (Ito et al, 2001) was analyzed, only a marginal correlation between node degree (centrality) and essentiality was detected, and no dependence at all was seen for other topological features of networks (Coulomb et al, 2005).

The current state of Evolutionary Systems Biology is typical of any burgeoning discipline: it is clear that there are important signals out there but our ability to discern and understand these signals is hampered both by inaccuracies and biases in the data and the inadequacy of the existing theoretical models. These difficulties notwithstanding, we should be motivated by the (I believe, reasonable) hope that, as this field matures, our one-dimensional understanding of genome evolution develops into a multidimensional picture of evolution of organisms as systems.


TOPICS: News/Current Events
KEYWORDS: crevolist; evolution
To: jennyp
[E]very yeast cell churns out about 1.26 million individual PMA1 molecules, making it the second-most abundant cellular protein.

Which, of course, leaves me wondering what the *first*-most abundant cellular protein might be, and ready to smack the writer of the article for keeping us hanging like that.

21 posted on 10/03/2005 9:30:32 PM PDT by Ichneumon

To: Ichneumon
leaves me wondering what the *first*-most abundant cellular protein might be

A piece of the larger Ribosome?

22 posted on 10/04/2005 4:32:38 AM PDT by donh

To: donh; jennyp; PatrickHenry
[leaves me wondering what the *first*-most abundant cellular protein might be]

A piece of the larger Ribosome?

I haven't found an answer for animals yet, but apparently the most abundant protein in plants is Rubisco (also often named as "the most abundant protein on Earth"), and for prokaryotes it's elongation factor Tu (EF-Tu).

23 posted on 10/04/2005 5:05:32 AM PDT by Ichneumon

To: jennyp; PatrickHenry
Here's a link to the actual research paper, no subscription required: Why highly expressed proteins evolve slowly
24 posted on 10/04/2005 5:10:56 AM PDT by Ichneumon

To: Ichneumon

Thanks for the link, but I know my limitations. I'm going to leave all rebuttals requiring links to such material -- or professional-level comprehension thereof -- to you and a few others. I'll stick to my area of expertise -- BS detection.


25 posted on 10/04/2005 6:43:22 AM PDT by PatrickHenry (Disclaimer -- this information may be legally false in Kansas.)

To: Ichneumon; jennyp; PatrickHenry

I noticed that the article didn't discuss the role of regulatory genes in preventing changes to PMA1. Seems like they might've made a mention of it.

I'm reading James Valentine's book, "On the Origin of Phyla". He spends a lot of time talking about regulatory genes controlling gene change as the molecular basis for evolution.

Again I remind everyone that I'm a physicist, not a biologist.


26 posted on 10/04/2005 6:54:13 AM PDT by <1/1,000,000th%

To: Ichneumon
Rubisco (also often named as "the most abundant protein on Earth"),

Because no creature can resist those rubisco crackers with the cream filling.

27 posted on 10/04/2005 10:49:57 AM PDT by donh

To: Ichneumon
Here's a link to the actual research paper, no subscription required: Why highly expressed proteins evolve slowly

Thanks for the link!

28 posted on 10/04/2005 12:42:39 PM PDT by jennyp (WHAT I'M READING NOW: my sterling prose)

To: neverdem
A recent study by Fraser (2005) seems to clarify the issue and provides an intriguing insight into the evolutionary forces that may be at play in network evolution. Fraser partitioned the interaction network hubs into two classes and showed that they dramatically differ in terms of the connection with the evolutionary rate (or, more precisely, the strength of purifying selection measured as the ratio of the rates for synonymous and nonsynonymous positions in coding sequences). It turns out that hubs that interact with numerous partners within a network module (intramodule hubs, also known under the more appealing name of 'party hubs';...

Taken together, these recent studies make, perhaps, relatively small but concrete inroads into the domain of Evolutionary Systems Biology (Medina, 2005). This area of inquiry is just making its baby steps, and the road ahead will be long and hard. ...

It seems intuitively almost obvious that genes with many connections (network hubs) are 'important' and should be essential more often than poorly connected genes; ...

... The current state of Evolutionary Systems Biology is typical of any burgeoning discipline: it is clear that there are important signals out there but our ability to discern and understand these signals is hampered both by inaccuracies and biases in the data and the inadequacy of the existing theoretical models. These difficulties notwithstanding, we should be motivated by the (I believe, reasonable) hope that, as this field matures, our one-dimensional understanding of genome evolution develops into a multidimensional picture of evolution of organisms as systems.

Wow, I thought I was going to have a hard time getting to sleep tonight! This article is actually so fascinating in how it relates to systems engineering and network-centric operations... I'll sleep well now...

29 posted on 10/04/2005 9:32:37 PM PDT by phantomworker (Let freedom ring...)

Comment #30 Removed by Moderator



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.


FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson