A more accurate relationship between ‘effective number of codons’ and GC3s under assumptions of no selection
The ‘effective number of codons’ (Nc), introduced by Frank Wright in 1990, is one of the best measures of codon usage bias in genes and genomes. Although several investigators have since improved the methods for estimating Nc, no one has noticed that the relationship between Nc and GC3s under the assumption of no selection, as given by Wright, has a small but significant deviation. Because the curve showing this relationship in the Nc-plot is a useful reference line for displaying the main features of the codon usage pattern of a set of genes, its accuracy is important. Under the ideal and ultimate conditions listed in this text, a computational sample of Nc versus GC3s was derived and calculated. By nonlinear regression analysis, the relationship between Nc and GC3s without synonymous codon selection can be approximated by Nc = 2.5 − s + 29.5/(s² + (1 − s)²), instead of Wright's Nc = 2 + s + 29/(s² + (1 − s)²), where s denotes GC3s. Goodness-of-fit analysis of both formulas confirmed that the new formula presented in this text is more accurate than the original one. In addition, when the same method of estimating Nc is used, overestimation is reduced to a certain extent by using the new reference line rather than Wright's.
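The two reference curves quoted in the abstract can be compared numerically. The following is a minimal sketch, not the authors' code: the function names `nc_wright` and `nc_new` are hypothetical, and only the two closed-form expressions are taken from the text.

```python
def nc_wright(s: float) -> float:
    """Wright's reference line: Nc = 2 + s + 29 / (s^2 + (1 - s)^2)."""
    return 2.0 + s + 29.0 / (s**2 + (1.0 - s) ** 2)


def nc_new(s: float) -> float:
    """New reference line from the abstract: Nc = 2.5 - s + 29.5 / (s^2 + (1 - s)^2)."""
    return 2.5 - s + 29.5 / (s**2 + (1.0 - s) ** 2)


if __name__ == "__main__":
    # s is GC3s, a fraction between 0 and 1.
    print(f"{'GC3s':>6} {'Wright':>10} {'new':>10} {'diff':>10}")
    for i in range(0, 11):
        s = i / 10.0
        w, n = nc_wright(s), nc_new(s)
        print(f"{s:6.2f} {w:10.4f} {n:10.4f} {n - w:10.4f}")
```

Evaluating both expressions at s = 0.5 gives 60.5 for Wright's formula and 61 for the new one; at s = 0 they give 31 and 32, and at s = 1 they give 32 and 31, respectively.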
Highlights
► Ultimate conditions for building the relationship between Nc and GC3s were proposed.
► Codon homozygosities and a computational sample were derived and calculated.
► Based on the sample, a new formula representing this relationship was created.
► Analyses confirmed that the new formula gives an accurate reference line in the Nc-plot.
Journal: Computational Biology and Chemistry - Volume 42, February 2013, Pages 35–39