This new method was applied to the multiple sequence alignment of the LRRs in the yddK protein. The analysis predicted 13 LRRs rather than nine and also revealed that their “phasing” differs significantly. We noticed that LRRs 1, 5, 7, 8, 9, and 10 contain a unique domain whose consensus is LxxLxLxxNxLxxLxLxxxxx
with 21 residues. The variable segment shows a characteristic hydrophobic pattern that has not been identified previously (Figure 1A). Each LRR domain is a nested sequence consisting of alternating 10- and 11-residue units of LxxLxLxxNx(x/-).

LRR proteins having IRREKO@LRR domains were identified in three steps:

Step 1: Detection of LRR proteins containing the six novel LRRs of E. coli yddK by using FASTA.
Step 2: Identification of the IRREKO@LRRs in individual LRR proteins by the new method.
Step 3: Iteration of these two steps using novel LRRs in newly identified LRR proteins.

In step 1, we performed a similarity search with FASTA, using the six novel LRRs as probes, at the Bioinformatic Center, Institute for Chemical Research, Kyoto University, on April 27, 2009 (http://www.genome.ad.jp/). This procedure detected many yddK
homologs from Escherichia coli strains and Shigella flexneri [Q0T447 and Q83R94] with significant similarity (E-values < 6.5 × 10⁻²⁹). In addition, two other proteins were detected with significant similarity (E-values < 3.3 × 10⁻⁹). One is SSON_1653, which is 387 residues long [Q3Z1L5]. The other is Sd1012_2081, with 163 residues [B3WXZ7].

In step 2,
we performed multiple sequence alignments of the LRR domains of SSON_1653 and Sd1012_2081. SSON_1653 contains 14 LRRs and 9 of the 12 repeats consist of LxxLxLxxNxLxxL(D/N)(L/F)xxxxx, where “L” is Leu, Val, or Ile. Sd1012_2081 contains 4.5 LRRs; 3.5 of these repeats consist of LxxLxLxxNxLxxIx(I/A/F)xxaxx.

In step 3, the above procedures were iterated to identify other LRR proteins having this IRREKO@LRR domain.

Sequence Analyses

Dot-matrix comparisons were performed using the BLOSUM62 scoring matrix and a window size of 21 residues (http://emboss.bioinformatics.nl/cgi-bin/emboss/dotmatcher). A radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point (http://en.wikipedia.org/wiki/Radar_chart). For a given observation, the length of each ray is the occurrence frequency of each amino acid at two positions of the 21-residue “IRREKO” LRR. Multiple sequence alignments were performed by CLUSTALW at the Bioinformatic Center. Protein secondary structure prediction was performed by SSpro4.0 (http://contact.ics.uci.edu/sspro4.html) [30] and Proteus (http://129.128.185.184/proteus/#) [31]. Signal sequence analysis was carried out using the program SignalP [39].

Acknowledgements

We thank Dr. Robert H.