histolytica Rahman is a nonvirulent strain. We have previously demonstrated that gene expression profiles are substantially different between these two strains and have used the distinct strain-specific expression profiles to identify virulence genes. Given the data above, and to explore whether small RNAs play a role in strain-specific and/or virulence gene regulation, we constructed a small RNA library from trophozoites of the E. histolytica Rahman strain. We visualized the small RNA populations in E. histolytica Rahman by separating total RNA on a 12% denaturing polyacrylamide gel followed by SYBR Gold staining, and observed an abundant 27 nt small RNA population. A small RNA library was constructed from size-fractionated RNA using a 5′-P-independent cloning approach, and limited pyrophosphate sequencing was performed, generating 151,656 reads.
For the purpose of mapping, we used the E. histolytica HM-1:IMSS genome as the reference genome rather than the current E. histolytica Rahman assembly, for three reasons: first, the current E. histolytica Rahman genome assembly is at a preliminary stage, containing 17,378 small contigs, and is unannotated; second, the two strains are highly similar, with one previous study estimating that only 5 out of a sample of 1,817 genes were highly or significantly divergent; and third, Affymetrix platform microarrays found no difference in overall hybridization efficiency levels compared to HM-1:IMSS, indicating a high level of sequence identity for the protein-coding genes. We realize that sequence differences between the E.
histolytica HM-1:IMSS and Rahman strains may cause us to lose some data. However, the advantages of being able to map to an annotated genome, and thus determine how many small RNAs map to protein-coding genes and to intergenic regions, were significant enough that we proceeded with the data generated by aligning the E. histolytica Rahman small RNA library to the E. histolytica HM-1:IMSS genome sequence. The E. histolytica Rahman dataset was analyzed following the same small RNA sequence analysis flow chart as applied to the E. histolytica HM-1:IMSS library. Overall, there were 98,414 unique sequence reads, with 84.1% of the sequences found to have been sequenced only once. Small RNAs that mapped to tRNAs, rRNAs and repetitive elements were subtracted from the dataset.
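The read-collapsing step described above can be illustrated with a minimal sketch. It assumes that the singleton percentage is taken over unique sequences rather than over raw reads. The function name `summarize_reads` and the toy reads are hypothetical and are not part of the analysis pipeline used in this study.

```python
from collections import Counter


def summarize_reads(reads):
    """Collapse raw reads into unique sequences and report the singleton share.

    Assumption: the singleton fraction is computed over unique sequences,
    i.e. (sequences seen exactly once) / (number of distinct sequences).
    """
    counts = Counter(reads)  # distinct sequence -> number of times it was read
    unique = len(counts)
    singletons = sum(1 for n in counts.values() if n == 1)
    return {
        "total_reads": len(reads),
        "unique_sequences": unique,
        "singleton_fraction": singletons / unique if unique else 0.0,
    }


# Toy input for illustration only; these are not reads from the Rahman library.
toy_reads = ["GATTACA", "GATTACA", "GCCGTA", "GTTTAAC", "ACGT"]
summary = summarize_reads(toy_reads)
```

In this toy input, one sequence is read twice and three are read once, so three of the four unique sequences are singletons.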
The remaining reads were aligned to the E. histolytica HM-1:IMSS genome and to the predicted protein-coding genes. The mapping of the E. histolytica Rahman small RNA dataset showed a similar overall distribution pattern to that of the small RNA EhAGO2-2 IP library: many small RNA reads from the Rahman library mapped antisense to genes, and the two other main categories of small RNAs were those that mapped sense to genes and those that mapped to intergenic regions. In addition, the size distribution of the aligned reads in Rahman showed a peak at 27 nt with a 5′-G bias.
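A size distribution and 5′-nucleotide profile of this kind could be computed along the following lines. This is a sketch under stated assumptions, not the method used in the study: the function name `length_and_5prime_profile` is hypothetical, the input is assumed to be a plain list of aligned read sequences, and the toy reads are synthetic.

```python
from collections import Counter


def length_and_5prime_profile(reads):
    """Return the most common read length and the 5' nucleotide frequencies."""
    reads = [r for r in reads if r]  # ignore empty strings
    length_counts = Counter(len(r) for r in reads)
    first_base_counts = Counter(r[0] for r in reads)
    modal_length = length_counts.most_common(1)[0][0] if length_counts else None
    total = sum(first_base_counts.values())
    first_base_freq = {base: n / total for base, n in first_base_counts.items()}
    return modal_length, first_base_freq


# Synthetic reads for illustration only: three 27 nt reads and one 21 nt read.
toy_aligned = [
    "G" + "A" * 26,
    "G" + "C" * 26,
    "T" + "A" * 26,
    "G" + "A" * 20,
]
modal_length, first_base_freq = length_and_5prime_profile(toy_aligned)
```

Here the most common length is 27 and three of the four reads start with G, which is the kind of pattern reported as a 27 nt peak with a 5′-G bias.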