
68, P < 0.001. To uncover the relationship between variation in gene expression and molecular conservation, all CDS genes were classified into five subclasses based on expression level. Briefly, we first assumed that at a given time point some transcripts are highly expressed, while others are lowly expressed or not transcribed at all. Then, excluding the non-expressed genes, we used quartiles to classify all expressed genes into three expression-level groups: the genes with the top 25% RPKM in a sample were defined as highly expressed genes (HEG), the lowest 25% were defined as lowly expressed genes (LEG), and the remaining middle group was defined as moderately expressed genes (MEG). Thus, when one gene’s expression level is traced across multiple samples, it may be consistently classified as HEG, MEG, LEG, or NEG (non-expressed genes); such genes were collectively designated constantly expressed genes (CEG). Otherwise, the gene was defined as a variably expressed gene (VEG). All MED4 CDS genes were thereby classified into five subgroups (HEG, MEG, LEG, NEG, and VEG). HEG had a significantly lower nonsynonymous substitution rate (Ka) than MEG or LEG (Kruskal-Wallis test, two-tailed P < 0.001; Figure 3a), indicating a strong negative correlation between gene expression level and evolutionary rate. Intriguingly, the CEG subclass had a lower Ka than VEG (Mann–Whitney U test, two-tailed P < 0.001; Figure 3b), even when HEG, which have the lowest evolutionary rate among all expression subclasses and could therefore bias the comparison, were excluded from the CEG (data not shown).

Figure 3 Gene expression and molecular evolution of the core genome and flexible genome of Prochlorococcus MED4. (a) Box plot of the correlation between gene expression levels and the nonsynonymous substitution rates (Ka). The line is drawn through the median. A circle represents an outlier, and an asterisk represents an extreme data point. (b) Nonsynonymous substitution rate comparison between CEG and VEG (Mann–Whitney U test, two-tailed). A circle represents an outlier, and an asterisk represents an extreme data point. (c) Comparison of the five expression subclasses between the core genome and flexible genome (Fisher’s exact test, one-tailed). P-values ≤ 0.05 are indicated in the figure. HEG, highly expressed genes; MEG, moderately expressed genes; LEG, lowly expressed genes; NEG, non-expressed genes; CEG, constantly expressed genes (comprising the four expression subclasses mentioned above); VEG, variably expressed genes.

Next, we compared the distribution of the five gene expression subclasses in the core genome with that in the flexible genome. The HEG and MEG subclasses were enriched in the core genome relative to the flexible genome (17.7% > 11.5% and 26.8% > 15.3%, respectively; P < 0.001; Figure 3c). Conversely, the core genome contained a smaller proportion of NEG and VEG than the flexible genome (1.5% < 6.6% and 49.6% < 64.6%, respectively; P < 0.001; Figure 3c).
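The five-subclass scheme described above (quartile grouping within each sample, then tracing each gene across samples) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy RPKM values, the rule that a gene is non-expressed when its RPKM is 0, and the rank-based handling of the 25% cutoffs are all assumptions made here.

```python
def classify_sample(rpkm):
    """Label every gene in one sample as HEG, MEG, LEG or NEG.

    rpkm: dict mapping gene id -> RPKM value in this sample.
    Assumption: a gene counts as non-expressed (NEG) when its RPKM is 0.
    """
    labels = {gene: "NEG" for gene, value in rpkm.items() if value <= 0}
    # Rank the expressed genes from highest to lowest RPKM.
    expressed = sorted(
        (gene for gene, value in rpkm.items() if value > 0),
        key=lambda gene: rpkm[gene],
        reverse=True,
    )
    quarter = len(expressed) // 4  # size of the top and bottom 25% groups
    for rank, gene in enumerate(expressed):
        if rank < quarter:
            labels[gene] = "HEG"
        elif rank >= len(expressed) - quarter:
            labels[gene] = "LEG"
        else:
            labels[gene] = "MEG"
    return labels


def classify_across_samples(samples):
    """Give each gene its constant class, or VEG if the class changes.

    samples: list of dicts (one per sample) sharing the same gene ids.
    """
    per_sample = [classify_sample(sample) for sample in samples]
    result = {}
    for gene in samples[0]:
        classes = {labels[gene] for labels in per_sample}
        # One class in every sample -> constantly expressed (CEG member).
        result[gene] = classes.pop() if len(classes) == 1 else "VEG"
    return result


# Toy data (hypothetical values, not from the study).
samples = [
    {"g1": 120.0, "g2": 40.0, "g3": 30.0, "g4": 2.0, "g5": 0.0},
    {"g1": 200.0, "g2": 3.0, "g3": 25.0, "g4": 60.0, "g5": 0.0},
]
result = classify_across_samples(samples)
print(result)
```

In this toy input, g1 stays in the top quarter and g5 is never expressed, so both keep a constant class, whereas g2 and g4 swap between MEG and LEG and are therefore labelled VEG.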
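The core-versus-flexible comparison relies on a one-tailed Fisher's exact test. The sketch below shows how such a test can be computed from a 2×2 table using the hypergeometric distribution. The function name is hypothetical, and the gene counts in the example are invented for illustration only: the actual numbers of core and flexible genes are not given in this section, so the resulting P-value is not the study's.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups (e.g. core, flexible); columns are
    "in the class" and "not in the class" (e.g. HEG, non-HEG).
    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least `a` class members in the first group at random.
    """
    row1 = a + b        # genes in the first group
    col1 = a + c        # genes in the class, over both groups
    n = a + b + c + d   # all genes
    upper = min(row1, col1)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


# Hypothetical counts: 1,000 genes per group, chosen only so that the
# class proportions (17.7% vs 11.5%) resemble those quoted in the text.
p_value = fisher_exact_greater(177, 823, 115, 885)
print(f"one-tailed P = {p_value:.2e}")
```

A one-tailed test fits a directional question of this kind, such as whether HEG are over-represented in the core genome. For the opposite direction (for example, fewer NEG in the core genome), the two rows can simply be swapped.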
