********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 2.2 (Release date: 1998/05/05 20:35:42)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://www.sdsc.edu/MEME.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://www.sdsc.edu/MEME.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= adh.s (deleted by web version of MEME)
ALPHABET= ACDEFGHIKLMNPQRSTVWY
Sequence name   Weight  Length    Sequence name   Weight  Length
-------------   ------  ------    -------------   ------  ------
2BHD_STREX      1.0000     255    3BHD_COMTE      1.0000     253
ADH_DROME       1.0000     255    AP27_MOUSE      1.0000     244
BA72_EUBSP      1.0000     249    BDH_HUMAN       1.0000     343
BPHB_PSEPS      1.0000     275    BUDC_KLETE      1.0000     241
DHES_HUMAN      1.0000     327    DHGB_BACME      1.0000     262
DHII_HUMAN      1.0000     292    DHMA_FLAS1      1.0000     270
ENTA_ECOLI      1.0000     248    FIXR_BRAJA      1.0000     278
GUTD_ECOLI      1.0000     259    HDE_CANTR       1.0000     906
HDHA_ECOLI      1.0000     255    LIGD_PSEPA      1.0000     305
NODG_RHIME      1.0000     245    RIDH_KLEAE      1.0000     249
YINL_LISMO      1.0000     248    YRTP_BACSU      1.0000     238
CSGA_MYXXA      1.0000     166    DHB2_HUMAN      1.0000     387
DHB3_HUMAN      1.0000     310    DHCA_HUMAN      1.0000     276
FABI_ECOLI      1.0000     262    FVT1_HUMAN      1.0000     332
HMTR_LEIMA      1.0000     287    MAS1_AGRRA      1.0000     476
PCR_PEA         1.0000     399    RFBB_NEIGO      1.0000     346
YURA_MYXXA      1.0000     258
********************************************************************************


********************************************************************************
EXPLANATION OF RESULTS
********************************************************************************
For each motif that it discovers in the training set, MEME prints the
following information:

  Summary Line
    This line gives the width (`width') and expected number of
    occurrences in the training set (`sites') of the motif.  MEME numbers
    the motifs consecutively from one as it finds them.  MEME usually
    finds the most statistically significant motifs first.  Each motif
    describes a pattern of a fixed width--no gaps are allowed in MEME
    motifs.  MEME estimates the number of places the motif occurs in the
    training set.  This need not be an integer value.

  Simplified Motif Letter-probability Matrix
    MEME motifs are represented by letter-probability matrices that
    specify the probability of each possible letter appearing at each
    possible position in an occurrence of the motif.  In order to make it
    easier to see which letters are most likely in each of the columns of
    the motif, the simplified motif shows the letter probabilities
    multiplied by 10 and rounded to the nearest integer.  Zeros are
    replaced by ":" (the colon) for readability.

  Information Content Diagram
    The information content diagram provides an idea of which positions
    in the motif are most highly conserved.  Each column (position) in a
    motif can be characterized by the amount of information it contains
    (measured in bits).  Highly conserved positions in the motif have
    high information; positions where all letters are equally likely
    have low information.  The diagram is printed so that each column
    lines up with the same column in the simplified motif
    letter-probability matrix above it.

    Summing the information content for each position in the motif gives
    the total information content of the motif (shown in parentheses to
    the left of the diagram).  This gives a measure of the usefulness of
    the motif for database searches.  For a motif to be useful for
    database searches, it must as a rule contain at least log_2(N) bits
    of information, where N is the number of sequences in the database
    being searched.  For example, to effectively search a database
    containing 100,000 sequences for occurrences of a single motif, the
    motif should have an information content (IC) of at least 16.6 bits.
    Motifs with lower information content are still useful when a family
    of sequences shares more than one motif, since they can be combined
    in multiple motif searches (using MAST).

  Multilevel Consensus Sequence
    The multilevel consensus sequence corresponding to the motif is an
    aid in remembering and understanding the motif.  It is calculated
    from the motif letter-probability matrix as follows.
    Separately for each column of the motif, the letters in the alphabet
    are sorted in decreasing order by the probability with which they
    are expected to occur in that position of motif occurrences.  The
    sorted letters are then printed vertically with the most probable
    letter on top.  Only letters with probabilities of 0.2 or higher at
    that position in the motif are printed.  As an example, the
    multilevel consensus sequence of motif 2 in the sample output is:

        Multilevel           LITGAASGIG
        consensus             V  GS
        sequence                  G

    This multilevel consensus sequence says several things about the
    motif.  First, the most likely form of the motif can be read from
    the top line as LITGAASGIG.  Second, only the letter L has
    probability greater than 0.2 in position 1 of the motif, both I and
    V have probability greater than 0.2 in position 2, etc.  Third, a
    rough approximation of the motif can be made by converting the
    multilevel consensus sequence into the Prosite signature
    L-[IV]-T-G-[AG]-[ASG]-S-G-I-G.  The multilevel consensus sequence is
    printed so that each column lines up with the same column in the
    simplified motif and information content diagrams above it.

  Motif in BLOCKS or FASTA format
    For use with the BLOCKS (http://www.blocks.fhcrc.org/blocks) tools,
    MEME prints the sites in the sequences which were used to construct
    the motif in BLOCKS format.  The sites reported are, for the
    different model types:

        OOPS    position with highest z_i in each sequence,
        ZOOPS   position with highest z_i > 0.5 in each sequence,
        TCM     all positions with z_i > 0.5,

    where z_i is the probability that an occurrence of the motif starts
    at position i in the sequence, given the sequence and the motif
    model.  If you include the -print_fasta switch on the command line,
    MEME prints the motif sites in FASTA format instead of BLOCKS format.

  Possible Examples of the Motif
    As a further aid in understanding the motif, MEME displays a list of
    possible occurrences of the motif in the training set.
    This list is made by converting the motif letter-probability matrix
    into a position-dependent scoring matrix (log-odds matrix) and using
    that to compute a match score between each position in the training
    set and the motif.  All positions which score above a threshold
    score are listed.  (The threshold score is chosen by MEME such that
    the expected number of non-motif positions listed in error will
    equal the number of actual motif positions not listed.)  The format
    of the list is sequence name, starting position of the (putative)
    occurrence, match score of the position, and the actual sequence
    including the ten positions before and after the motif occurrence
    (`site').

  Position-dependent Scoring Matrix
    The position-dependent scoring matrix corresponding to the motif is
    printed for use by database search programs such as MAST.  This
    matrix is a log-odds matrix calculated by taking the log (base 2) of
    the ratio p/f at each position in the motif, where p is the
    probability of a particular letter at that position in the motif,
    and f is the average frequency of that letter in the non-redundant
    database as of 9/22/96.  The scoring matrix is printed
    "sideways"--columns correspond to the letters in the alphabet (in
    the same order as shown in the simplified motif) and rows correspond
    to the positions of the motif, position one first.  The scoring
    matrix is preceded by a line starting with "log-odds matrix:" and
    containing the length of the alphabet, width of the motif, number of
    characters in the training set and the scoring threshold used in the
    list of possible motif examples.

  Motif Letter-probability Matrix
    The motif itself is a position-dependent letter-probability matrix
    giving, for each position in the pattern, the probabilities of each
    possible letter occurring there.
    The letter-probability matrix is printed "sideways"--columns
    correspond to the letters in the alphabet (in the same order as
    shown in the simplified motif) and rows correspond to the positions
    of the motif, position one first.  The motif is preceded by a line
    starting with "letter-probability matrix:" and containing the length
    of the alphabet, width of the motif and number of characters in the
    training set.
********************************************************************************


********************************************************************************
MOTIF  1    width = 10    sites = 34.5
********************************************************************************
Simplified          A  ::::331:::
motif letter-       C  ::::::::::
probability         D  ::::::1:::
matrix              E  ::::::1:::
                    F  1:::::::::
                    G  :::911:9:9
                    H  ::::::::::
                    I  23::::::7:
                    K  ::::::1:::
                    L  51::::::1:
                    M  1:::::::::
                    N  ::::::1:::
                    P  ::::::::::
                    Q  ::::::1:::
                    R  ::::::1:::
                    S  ::1:221:::
                    T  ::7:111:::
                    V  14::11::1:
                    W  ::::::::::
                    Y  ::::::::::

         bits    6.2
                 5.6
                 5.0
                 4.4
Information      3.7
content          3.1
(17.5 bits)      2.5      *   * *
                 1.9     **   ***
                 1.2   ****   ***
                 0.6   ****** ***
                 0.0   ----------

Multilevel             LVTGAAxGIG
consensus               I
sequence

--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=10 seqs=33
2BHD_STREX    (  10) IITGGARGLG  0.999594
3BHD_COMTE    (  10) LVTGGASGVG  0.999864
AP27_MOUSE    (  11) LVTGAGKGIG  0.999988
BA72_EUBSP    (  10) IITGGTRGIG  0.999953
BDH_HUMAN     (  59) LVTGCDSGFG  0.995269
BPHB_PSEPS    (   9) LITGGASGLG  0.999774
BUDC_KLETE    (   6) LVTGAGQGIG  0.99999
DHES_HUMAN    (   6) LITGCSSGIG  0.999994
DHES_HUMAN    ( 138) LVTGSVGGLM  0.860467
DHGB_BACME    (  11) VITGSSTGLG  0.999483
DHII_HUMAN    (  38) IVTGASKGIG  0.999991
DHMA_FLAS1    (  18) IVTGAAGGIG  0.999951
DHMA_FLAS1    ( 245) FITGSTITID  0.776778
ENTA_ECOLI    (   9) WVTGAGKGIG  0.999929
FIXR_BRAJA    (  40) LLTGASRGIG  0.999972
HDE_CANTR     (  12) IITGAGGGLG  0.998651
HDE_CANTR     ( 326) LITGAGAGLG  0.999595
HDE_CANTR     ( 534) PVTGETSEIG  0.712874
HDHA_ECOLI    (  15) IITGAGAGIG  0.999948
LIGD_PSEPA    (  10) FITGGASGAG  0.995832
NODG_RHIME    (  10) LVTGASGAIG  0.998811
RIDH_KLEAE    (  18) AITGAASGIG  0.999806
YINL_LISMO    (   9) IITGASSGIG  0.999988
YRTP_BACSU    (  10) LITGGGRGIG  0.999964
DHB2_HUMAN    (  86) LVTGGDCGLG  0.991564
DHB3_HUMAN    (  52) VITGAGDGIG  0.999911
DHCA_HUMAN    (   8) LVTGGNKGIG  0.99994
FVT1_HUMAN    (  36) VVTGGSSGIG  0.999911
HMTR_LEIMA    (  10) LVTGAAKRLG  0.995564
MAS1_AGRRA    ( 249) LVSGSNRGVG  0.998001
PCR_PEA       (  90) VITGASSGLG  0.999547
RFBB_NEIGO    (  10) LVTGGAGFIG  0.991357
//

-----------------------------------------------------------------------
    Possible examples of motif 1 in the training set
-----------------------------------------------------------------------
Sequence name  Start  Score             Site
-------------  -----  -----             ----------
2BHD_STREX        10  21.35   MNDLSGKTV IITGGARGLG AEAARQAVAA
3BHD_COMTE        10  23.50   TNRLQGKVA LVTGGASGVG LEVVKLLLGE
AP27_MOUSE        11  26.66  MKLNFSGLRA LVTGAGKGIG RDTVKALHAS
BA72_EUBSP        10  23.77   MNLVQDKVT IITGGTRGIG FAAAKIFIDN
BDH_HUMAN         59  16.32  AAEPVGSKAV LVTGCDSGFG FSLAKHLHSK
BPHB_PSEPS         9  22.05    MKLKGEAV LITGGASGLG RALVDRFVAE
BUDC_KLETE         6  26.65       MQKVA LVTGAGQGIG KAIALRLVKD
DHES_HUMAN         6  25.56       ARTVV LITGCSSGIG LHLAVRLASD
DHES_HUMAN       138  12.43  DMKRRGSGRV LVTGSVGGLM GLPFNDVYCA
DHGB_BACME        11  19.96  MYKDLEGKVV VITGSSTGLG KSMAIRFATE
DHII_HUMAN        38  26.48  RPEMLQGKKV IVTGASKGIG REMAYHLAKM
DHMA_FLAS1        18  25.44  RPGRLAGKAA IVTGAAGGIG RATVEAYLRE
DHMA_FLAS1       245   9.69  AVFLAEDGSS FITGSTITID GGLSAMIFGG
ENTA_ECOLI         9  23.60    MDFSGKNV WVTGAGKGIG YATALAFVEA
FIXR_BRAJA        40  24.62  RVDRGEPKVM LLTGASRGIG HATAKLFSEA
HDE_CANTR         12  20.05  SPVDFKDKVV IITGAGGGLG KYYSLEFAKL
HDE_CANTR        326  22.03  PTVSLKDKVV LITGAGAGLG KEYAKWFAKY
HDE_CANTR        534   9.57  LLVYLGTDDV PVTGETSEIG GGWIGNTRWQ
HDHA_ECOLI        15  25.00  DNLRLDGKCA IITGAGAGIG KEIAITFATA
LIGD_PSEPA        10  18.28   MKDFQDQVA FITGGASGAG FGQAKVFGQA
NODG_RHIME        10  20.26   MFELTGRKA LVTGASGAIG GAIARVLHAQ
RIDH_KLEAE        18  23.13  MNTSLSGKVA AITGAASGIG LECARTLLGA
YINL_LISMO         9  25.71    MTIKNKVI IITGASSGIG KATALLLAEK
YRTP_BACSU        10  24.61   MQSLQHKTA LITGGGRGIG RATALALAKE
DHB2_HUMAN        86  16.21  ELLPVDQKAV LVTGGDCGLG HALCKYLDEL
DHB3_HUMAN        52  23.96  SFLRSMGQWA VITGAGDGIG KAYSFELAKR
DHCA_HUMAN         8  23.55     SSGIHVA LVTGGNKGIG LAIVRDLCRL
FABI_ECOLI        10   9.02   MGFLSGKRI LVTGVASKLS IAYGIAQAMH
FVT1_HUMAN        36  23.31  KPLALPGAHV VVTGGSSGIG KCIAIECYKQ
HMTR_LEIMA        10  17.92   MTAPTVPVA LVTGAAKRLG RSIAEGLHAE
MAS1_AGRRA       249  18.26  TVEIHQSPVI LVSGSNRGVG KAIAEDLIAH
PCR_PEA           90  20.85  GKKTLRKGNV VITGASSGLG LATAKALAES
RFBB_NEIGO        10  17.19   MQTEGKKNI LVTGGAGFIG SAVVRHIIQN
-----------------------------------------------------------------------

log-odds matrix: alength= 20 w= 10 n= 9699 bayes= 8.12955
-1.910 -1.252 -4.780 -4.068  1.075 -4.036 -2.490  1.442 -3.846  2.317  2.196 -3.919 -3.202 -2.182 -3.104 -3.829 -1.873  0.422 -0.747 -1.085
-0.807 -0.256 -3.664 -3.469 -1.042 -3.564 -4.001  2.474 -3.427  0.327  0.299 -3.144 -3.427 -3.636 -3.884 -3.054 -0.784  2.552 -3.504 -2.417
-1.566 -1.340 -2.614 -3.032 -2.747 -3.240 -2.189 -1.695 -2.172 -2.867 -1.200 -1.000 -2.943 -1.886 -2.212  0.546  3.539 -1.369 -2.783 -3.060
-1.642 -2.941 -2.196 -2.942 -4.021  3.647 -2.698 -4.037 -2.767 -4.485 -3.269 -1.901 -3.582 -3.212 -2.706 -2.161 -3.192 -3.547 -3.246 -3.567
 2.113  1.088 -2.159 -2.218 -1.863  0.649 -1.582 -1.866 -2.400 -1.992 -0.995 -1.019 -0.740 -1.296 -1.737  1.334  0.678 -0.240 -1.931 -2.177
 2.108  1.077 -2.128 -2.226 -1.864  0.646 -1.582 -1.867 -2.400 -1.993 -0.995 -1.003 -0.740 -1.292 -1.737  1.338  0.683 -0.238 -1.930 -2.177
 0.248 -1.407  0.222  0.781 -1.389 -0.851  0.537 -1.271  0.956 -1.142 -0.235  0.447 -0.941  0.947  0.446  0.269  0.164 -0.837 -1.448 -0.733
-1.642 -2.941 -2.196 -2.941 -4.018  3.647 -2.698 -4.037 -2.766 -4.485 -3.269 -1.901 -3.582 -3.211 -2.705 -2.161 -3.191 -3.548 -3.246 -3.567
-2.773 -2.761 -3.678 -3.740 -2.263 -4.217 -3.619  3.598 -3.307 -0.251 -0.323 -3.301 -4.223 -3.451 -3.634 -3.254 -2.443  1.129 -3.248 -2.723
-1.642 -2.941 -2.195 -2.942 -4.021  3.647 -2.698 -4.037 -2.767 -4.485 -3.266 -1.901 -3.582 -3.211 -2.706 -2.160 -3.192 -3.547 -3.246 -3.567

letter-probability matrix: alength= 20 w= 10 n= 9699
0.019468 0.007628 0.001883 0.003718 0.084879 0.004226 0.003992 0.152918 0.004068 0.456928 0.105716 0.003046 0.005508 0.008981 0.006034 0.005192 0.016214 0.086245 0.007949 0.015407
0.041830 0.015208 0.004080 0.005631 0.019562 0.005863 0.001400 0.312623 0.005438 0.115021 0.028379 0.005211 0.004711 0.003278 0.003515 0.008884 0.034510 0.377562 0.001176 0.006117
0.024709 0.007177 0.008452 0.007621 0.006001 0.007337 0.004919 0.017378 0.012982 0.012573 0.010041 0.023035 0.006591 0.011026 0.011199 0.107731 0.690444 0.024926 0.001938 0.003919
0.023441 0.002365 0.011294 0.008112 0.002481 0.868738 0.003455 0.003429 0.008595 0.004096 0.002393 0.012341 0.004230 0.004400 0.007952 0.016507 0.006501 0.005505 0.001406 0.002758
0.316482 0.038599 0.011587 0.013399 0.011077 0.108699 0.007489 0.015436 0.011081 0.023052 0.011576 0.022736 0.030337 0.016596 0.015565 0.186034 0.095024 0.054505 0.003500 0.007226
0.315485 0.038323 0.011837 0.013326 0.011067 0.108487 0.007491 0.015426 0.011082 0.023046 0.011576 0.022990 0.030350 0.016644 0.015570 0.186634 0.095380 0.054560 0.003500 0.007226
0.086862 0.006847 0.060366 0.107117 0.015381 0.038424 0.032536 0.023327 0.113447 0.041556 0.019593 0.062826 0.026397 0.078566 0.070709 0.088904 0.066555 0.036030 0.004889 0.019669
0.023447 0.002365 0.011294 0.008116 0.002487 0.868710 0.003456 0.003429 0.008599 0.004096 0.002393 0.012341 0.004231 0.004400 0.007958 0.016506 0.006507 0.005504 0.001406 0.002758
0.010702 0.002679 0.004041 0.004666 0.008394 0.003728 0.001826 0.681525 0.005910 0.077057 0.018435 0.004676 0.002713 0.003726 0.004181 0.007733 0.010928 0.140725 0.001404 0.004949
0.023443 0.002365 0.011299 0.008112 0.002481 0.868723 0.003455 0.003429 0.008596 0.004096 0.002398 0.012340 0.004231 0.004400 0.007952 0.016510 0.006502 0.005505 0.001406 0.002758

Time 449.09 secs.
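
The simplified motif and the multilevel consensus for motif 1 can be
re-derived from its letter-probability matrix.  The Python sketch below
is illustrative only and is not MEME's own code: the function names are
hypothetical, round-half-up is an assumption about how "rounded to the
nearest integer" is applied, and the "x" filler for a column with no
letter at 0.2 or higher is inferred from the LVTGAAxGIG consensus above
rather than stated in the explanation.  Only positions 1, 2 and 7 are
copied in, to keep the example short.

```python
# Illustrative sketch; names are hypothetical and not part of MEME.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

# Positions 1, 2 and 7 of the motif 1 letter-probability matrix (copied from above).
ROWS = {
    1: [0.019468, 0.007628, 0.001883, 0.003718, 0.084879, 0.004226, 0.003992,
        0.152918, 0.004068, 0.456928, 0.105716, 0.003046, 0.005508, 0.008981,
        0.006034, 0.005192, 0.016214, 0.086245, 0.007949, 0.015407],
    2: [0.041830, 0.015208, 0.004080, 0.005631, 0.019562, 0.005863, 0.001400,
        0.312623, 0.005438, 0.115021, 0.028379, 0.005211, 0.004711, 0.003278,
        0.003515, 0.008884, 0.034510, 0.377562, 0.001176, 0.006117],
    7: [0.086862, 0.006847, 0.060366, 0.107117, 0.015381, 0.038424, 0.032536,
        0.023327, 0.113447, 0.041556, 0.019593, 0.062826, 0.026397, 0.078566,
        0.070709, 0.088904, 0.066555, 0.036030, 0.004889, 0.019669],
}


def simplified_column(probs):
    """Map each letter to round-half-up(10 * p); zeros are shown as ':'."""
    column = {}
    for letter, p in zip(ALPHABET, probs):
        digit = int(p * 10 + 0.5)  # assumed rounding rule
        column[letter] = str(digit) if digit > 0 else ":"
    return column


def consensus_column(probs, cutoff=0.2, filler="x"):
    """Letters with probability >= cutoff, most probable first.

    The filler for an empty column is an assumption inferred from the output.
    """
    ranked = sorted(zip(probs, ALPHABET), reverse=True)
    letters = [letter for p, letter in ranked if p >= cutoff]
    return letters if letters else [filler]


if __name__ == "__main__":
    for position, probs in ROWS.items():
        column = simplified_column(probs)
        nonzero = {a: d for a, d in column.items() if d != ":"}
        print(position, nonzero, consensus_column(probs))
```

For these three rows the sketch should agree with the printed output:
position 1 gives L alone, position 2 gives V over I, and position 7 has
no letter at 0.2 or higher.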
********************************************************************************


********************************************************************************
MOTIF  2    width = 9    sites = 30.5
********************************************************************************
Simplified          A  :::::::7:
motif letter-       C  :::::::::
probability         D  :9:::::::
matrix              E  :::::::::
                    F  :::::::::
                    G  ::::::::9
                    H  :::::::::
                    I  3:3:2::::
                    K  :::::::::
                    L  1:181::::
                    M  :::::::::
                    N  :::::99::
                    P  :::::::::
                    Q  :::::::::
                    R  :::::::::
                    S  :::::::1:
                    T  :::::::::
                    V  4:4:6::::
                    W  :::::::::
                    Y  :::::::::

         bits    6.2
                 5.6
                 5.0
                 4.4
Information      3.7
content          3.1    *   **
(21.4 bits)      2.5    *   ** *
                 1.9    * * ****
                 1.2   *********
                 0.6   *********
                 0.0   ---------

Multilevel             VDVLVNNAG
consensus              I I
sequence

--------------------------------------------------------------------------------
    Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=9 seqs=33
2BHD_STREX    (  81) VDGLVNNAG  0.990346
3BHD_COMTE    (  81) LNVLVNNAG  0.999878
ADH_DROME     (  86) VDVLINGAG  0.999089
AP27_MOUSE    (  77) VDLLVNNAA  0.996075
BA72_EUBSP    (  86) LDVMINNAG  0.999891
BDH_HUMAN     ( 138) MWGLVNNAG  0.978579
BPHB_PSEPS    (  79) IDTLIPNAG  0.998506
BUDC_KLETE    (  80) FNVIVNNAG  0.998856
DHES_HUMAN    (  84) VDVLVCNAG  0.992109
DHGB_BACME    (  87) LDVMINNAG  0.999911
DHMA_FLAS1    ( 198) VDVTGNNTG  0.945818
ENTA_ECOLI    (  73) LDALVNAAG  0.99176
FIXR_BRAJA    ( 112) LHALVNNAG  0.998391
GUTD_ECOLI    (  82) VDLLVYSAG  0.98111
HDE_CANTR     (  58) VDEIVKNGG  0.600202
HDE_CANTR     (  92) VHVIINNAG  0.999218
HDE_CANTR     ( 396) IDILVNNAG  0.999983
HDHA_ECOLI    (  89) VDILVNNAG  0.99704
NODG_RHIME    (  81) VDILVNNAG  0.99994
RIDH_KLEAE    (  89) LDIFHANAG  0.754506
YINL_LISMO    (  83) VDAIFLNAG  0.736596
YRTP_BACSU    (  84) IDILINNAG  0.99995
CSGA_MYXXA    (  13) VDVLINNAG  0.998813
DHB2_HUMAN    ( 161) LWAVINNAG  0.878164
DHB3_HUMAN    ( 125) IGILVNNVG  0.998525
DHCA_HUMAN    (  83) LDVLVNNAG  0.999836
FVT1_HUMAN    ( 115) VDMLVNCAG  0.997644
HMTR_LEIMA    ( 103) CDVLVNNAS  0.999715
MAS1_AGRRA    ( 320) IDGLVNNAG  0.997085
PCR_PEA       ( 165) LDVLINNAA  0.999678
YURA_MYXXA    (  90) LDLVVANAG  0.952505
//

----------------------------------------------------------------------
    Possible examples of motif 2 in the training set
----------------------------------------------------------------------
Sequence name  Start  Score             Site
-------------  -----  -----             ---------
2BHD_STREX        81  24.73  VAYAREEFGS VDGLVNNAG ISTGMFLETE
3BHD_COMTE        81  23.86  MAAVQRRLGT LNVLVNNAG ILLPGDMETG
ADH_DROME         86  21.87  LKTIFAQLKT VDVLINGAG ILDDHQIERT
AP27_MOUSE        77  23.24  TEKALGGIGP VDLLVNNAA LVIMQPFLEV
BA72_EUBSP        86  24.86  VGQVAQKYGR LDVMINNAG ITSNNVFSRV
BDH_HUMAN        138  14.76  PFEPEGPEKG MWGLVNNAG ISTFGEVEFT
BPHB_PSEPS        79  18.10  ASRCVARFGK IDTLIPNAG IWDYSTALVD
BUDC_KLETE        80  19.24  VEQARKALGG FNVIVNNAG IAPSTPIESI
DHES_HUMAN        84  23.61  AARERVTEGR VDVLVCNAG LGLLGPLEAL
DHGB_BACME        87  24.86  VQSAIKEFGK LDVMINNAG MENPVSSHEM
DHMA_FLAS1       198  13.71  ILVNMIAPGP VDVTGNNTG YSEPRLAEQV
ENTA_ECOLI        73  17.55  CQRLLAETER LDALVNAAG ILRMGATDQL
FIXR_BRAJA       112  18.90  EVKKRLAGAP LHALVNNAG VSPKTPTGDR
GUTD_ECOLI        82  15.25  SRGVDEIFGR VDLLVYSAG IAKAAFISDF
HDE_CANTR         58   9.83  GGNSKAADVV VDEIVKNGG VAVADYNNVL
HDE_CANTR         92  19.87  VETAVKNFGT VHVIINNAG ILRDASMKKM
HDE_CANTR        396  30.52  IKNVIDKYGT IDILVNNAG ILRDRSFAKM
HDHA_ECOLI        89  30.64  ADFAISKLGK VDILVNNAG GGGPKPFDMP
NODG_RHIME        81  30.64  GQRAEADLEG VDILVNNAG ITKDGLFLHM
RIDH_KLEAE        89  10.85  LQGILQLTGR LDIFHANAG AYIGGPVAEG
YINL_LISMO        83  11.38  VELAIERYGK VDAIFLNAG IMPNSPLSAL
YRTP_BACSU        84  29.19  VAQVKEQLGD IDILINNAG ISKFGGFLDL
CSGA_MYXXA        13  29.42  AFATNVCTGP VDVLINNAG VSGLWCALGD
DHB2_HUMAN       161  12.24  KVAAMLQDRG LWAVINNAG VLGFPTDGEL
DHB3_HUMAN       125  18.92  HIKEKLAGLE IGILVNNVG MLPNLLPSHF
DHCA_HUMAN        83  28.54  RDFLRKEYGG LDVLVNNAG IAFKVADPTP
FVT1_HUMAN       115  21.37  IKQAQEKLGP VDMLVNCAG MAVSGKFEDL
HMTR_LEIMA       103  22.15  VAACYTHWGR CDVLVNNAS SFYPTPLLRN
MAS1_AGRRA       320  24.62  VTAAVEKFGR IDGLVNNAG YGEPVNLDKH
PCR_PEA          165  21.92  VDNFRRSEMP LDVLINNAA VYFPTAKEPS
YURA_MYXXA        90  14.75  IRALDAEAGG LDLVVANAG VGGTTNAKRL
----------------------------------------------------------------------

log-odds matrix: alength= 20 w= 9 n= 9732 bayes= 8.31344
-0.794 -0.235 -3.664 -3.466 -1.025 -3.556 -4.003  2.449 -3.426  0.346  0.313 -3.144 -3.419 -3.636 -3.888 -3.052 -0.772  2.561 -3.500 -2.412
-3.028 -3.426  4.054 -0.789 -3.902 -3.514 -2.254 -3.814 -3.608 -4.155 -3.469 -0.631 -4.393 -3.163 -3.581 -3.065 -3.539 -3.609 -3.670 -3.453
-0.772 -0.241 -3.644 -3.423 -1.021 -3.463 -3.896  2.440 -3.390  0.334  0.317 -3.113 -3.392 -3.578 -3.822 -3.005 -0.759  2.561 -3.429 -2.376
-2.595 -2.183 -4.170 -3.431 -0.726 -4.168 -2.707 -0.189 -3.214  3.056  0.709 -3.579 -3.330 -2.413 -2.775 -3.338 -2.478 -0.849 -2.219 -2.309
-0.752 -0.661 -3.402 -3.195 -1.561 -3.541 -3.019  1.775 -3.224 -0.348 -0.309 -3.170 -3.203 -3.325 -3.178 -2.968 -1.005  3.112 -3.455 -2.882
-3.437 -2.925 -1.882 -3.843 -3.461 -3.330 -0.592 -2.976 -2.890 -3.900 -2.993  4.221 -3.664 -2.402 -3.185 -1.632 -2.356 -3.444 -2.990 -3.206
-3.438 -2.925 -1.882 -3.843 -3.461 -3.330 -0.592 -2.976 -2.890 -3.901 -2.993  4.221 -3.664 -2.402 -3.185 -1.632 -2.356 -3.444 -2.990 -3.207
 3.325 -0.324 -3.054 -2.712 -2.489 -1.260 -2.673 -2.170 -2.722 -2.320 -1.470 -2.780 -3.797 -2.712 -2.675 -0.342 -1.535 -0.707 -2.685 -2.979
-1.641 -2.941 -2.196 -2.942 -4.021  3.647 -2.698 -4.037 -2.767 -4.485 -3.269 -1.901 -3.582 -3.211 -2.706 -2.160 -3.192 -3.548 -3.246 -3.567

letter-probability matrix: alength= 20 w= 9 n= 9732
0.042191 0.015429 0.004081 0.005642 0.019801 0.005893 0.001399 0.307360 0.005444 0.116530 0.028654 0.005213 0.004739 0.003279 0.003505 0.008899 0.034787 0.379833 0.001180 0.006142
0.008971 0.001690 0.859467 0.036081 0.002695 0.006067 0.004702 0.004001 0.004798 0.005149 0.002084 0.029756 0.002413 0.004550 0.004338 0.008820 0.005111 0.005274 0.001048 0.002985
0.042839 0.015368 0.004138 0.005812 0.019848 0.006286 0.001507 0.305449 0.005579 0.115589 0.028728 0.005326 0.004827 0.003413 0.003669 0.009196 0.035102 0.379789 0.001239 0.006297
0.012108 0.004000 0.002874 0.005778 0.024360 0.003855 0.003434 0.049356 0.006302 0.762948 0.037711 0.003855 0.005040 0.007653 0.007582 0.007297 0.010661 0.035723 0.002866 0.006597
0.043431 0.011487 0.004894 0.006807 0.013654 0.005955 0.002767 0.192553 0.006259 0.072045 0.018615 0.005119 0.005503 0.004068 0.005734 0.009435 0.029596 0.556429 0.001217 0.004433
0.006754 0.002392 0.014037 0.004345 0.003658 0.006893 0.014882 0.007153 0.007891 0.006141 0.002897 0.858991 0.003999 0.007709 0.005705 0.023817 0.011601 0.005915 0.001679 0.003540
0.006753 0.002392 0.014037 0.004345 0.003658 0.006895 0.014882 0.007153 0.007890 0.006140 0.002897 0.858995 0.003997 0.007709 0.005705 0.023818 0.011601 0.005915 0.001679 0.003539
0.732947 0.014511 0.006231 0.009513 0.007177 0.028951 0.003516 0.012507 0.008865 0.018363 0.008327 0.006709 0.003645 0.006221 0.008128 0.058231 0.020508 0.039432 0.002075 0.004144
0.023452 0.002365 0.011294 0.008113 0.002482 0.868722 0.003455 0.003429 0.008595 0.004096 0.002393 0.012340 0.004230 0.004400 0.007952 0.016512 0.006501 0.005504 0.001406 0.002758

Time 842.83 secs.

Stopped because nmotifs = 2 reached.

CPU: ghidorah

********************************************************************************
DEBUG INFORMATION
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

model:   mod=         tcm    nmotifs=       2    chi=          1
width:   minw=          8    maxw=         57    shorten=    yes
lambda:  minsites=      0    maxsites=      0
theta:   prob=          1    spmap=       pam    spfuzz=     120
em:      prior=      mega    b=         99960    maxiter=     50
         distance=  0.001
data:    n=          9996    N=            33
strands: w53
sample:  seed=          0    seqfrac=       1
LRT:     adj=        root
Dirichlet mixture priors file: prior30.plib
Letter frequencies:
A 0.111 C 0.012 D 0.050 E 0.055 F 0.036 G 0.090 H 0.018 I 0.057 K 0.052 L 0.092
M 0.027 N 0.041 P 0.041 Q 0.029 R 0.049 S 0.064 T 0.057 V 0.083 W 0.010 Y 0.027
Non-redundant database letter frequencies:
A 0.073 C 0.018 D 0.052 E 0.062 F 0.040 G 0.069 H 0.022 I 0.056 K 0.058 L 0.092
M 0.023 N 0.046 P 0.051 Q 0.041 R 0.052 S 0.074 T 0.059 V 0.064 W 0.013 Y 0.033
Effective length of alphabet = 20
Entropy of dataset (bits) = -4.11

meme adh.s -mod tcm -protein -nostatus -nmotifs 2 -gcg
********************************************************************************
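
Two pieces of arithmetic described in the explanation can be checked
against the numbers in this file.  The Python sketch below is
illustrative only, with hypothetical function names: it computes a
log-odds score as log2(p/f) using the non-redundant database letter
frequencies listed in the debug information, and the log_2(N) rule of
thumb for the information content needed in a database search.  Because
the frequencies are printed to only three decimals, the recomputed
scores are expected to match the printed log-odds entries approximately,
not exactly.

```python
import math

# Illustrative sketch; names are hypothetical and not part of MEME.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

# Non-redundant database letter frequencies, copied from the debug information.
NR_FREQ = dict(zip(ALPHABET, [
    0.073, 0.018, 0.052, 0.062, 0.040, 0.069, 0.022, 0.056, 0.058, 0.092,
    0.023, 0.046, 0.051, 0.041, 0.052, 0.074, 0.059, 0.064, 0.013, 0.033,
]))


def log_odds(p, letter):
    """Log (base 2) of the ratio p/f for one letter at one motif position."""
    return math.log2(p / NR_FREQ[letter])


def min_bits_for_search(num_sequences):
    """Rule-of-thumb information content, log_2(N), for an N-sequence database."""
    return math.log2(num_sequences)


if __name__ == "__main__":
    # (motif, position, letter, probability, printed log-odds entry)
    checks = [
        (1, 4, "G", 0.868738, 3.647),
        (2, 2, "D", 0.859467, 4.054),
        (2, 8, "A", 0.732947, 3.325),
    ]
    for motif, position, letter, p, printed in checks:
        print(motif, position, letter, round(log_odds(p, letter), 3), printed)

    # Compare the rule-of-thumb threshold with the two reported totals.
    threshold = min_bits_for_search(100_000)
    for motif, total_bits in [(1, 17.5), (2, 21.4)]:
        print(motif, total_bits, total_bits >= threshold)
```

Under that rule of thumb, both motifs reported here (17.5 and 21.4 bits)
clear the threshold for a 100,000-sequence database.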