Motif 7 presented a typical K1-type signature, with AGT coding for Ser, as opposed to a TCA/G codon in the Mad20 types. All K1-type alleles contained more than one motif sequence, resulting in eleven di-motif combinations (hexapeptides). Most alleles had three or four different motifs (Figure 4A). Some di-motifs were very frequent, and di-motif 3 1 was present in all alleles [see Additional file 4]. A clear dichotomy could be delineated based on the first 5′ di-motif being either 3 1 (group 1, 28 alleles)
or 3 4 (group 2, 49 alleles), with the exception of allele 28, which displayed a 3 3 motif. Limited polymorphism was observed in the 3′ family-specific region, with a non-synonymous S to L (tca>tta) mutation observed in three alleles and a six-amino-acid insertion, SPPADA, observed in a single allele (Table 2).

Figure 4 Frequency distribution of tripeptide motif usage in the DK and DM alleles. A. Frequency distribution of K1-type alleles (DK alleles) by number of distinct tripeptides present. B. Frequency distribution of Mad20-type alleles (DM alleles) by number of distinct tripeptide nucleotide sequences present (DMR, DMRK and MK hybrids excluded). C. Frequency distribution of Mad20-type alleles (DM alleles) by number of distinct tripeptide protein sequences present (DMR, DMRK and MK hybrids excluded).

Similar findings were observed for the Mad20-type alleles, which differed mainly in the number, arrangement and coding sequence of six tripeptide motifs (coded 5-9). There were two synonymous sequences coding for SGG (5 and 5) such that all Mad20-type
alleles contained an SGG-encoding motif [see Additional file 4]. In this family too, all alleles contained more than one motif sequence. The majority had four distinct nucleotide sequence motifs (Figure 4B), encoding three different tripeptide sequences (Figure 4C). Some di-motifs were highly represented, with the SVA SGG motif (6 5 or 6 5) being present in virtually all alleles. There was a dichotomy within the family based on the first 5′ motif, being either 5/5 (group 1, 8 alleles) or 8 (group 2, 26 alleles) (Table 2). This group-specific 5′ end was followed by a variable copy number and arrangement of six di-motif sequences, which at the protein level translated into variable combinations of the SGG and SVA tripeptides. All Mad20-type block2 repeats except two (DM9 and DM29) terminated with the (5 6 5) sequence. The flanking non-repeated region upstream from the tripeptide motifs was identical in all alleles. Downstream from the repeats, a nine-amino-acid deletion (NSRRTNPSD) was observed in three alleles, but otherwise the family-specific region was monomorphic. Sequencing showed that 22 fragments assigned to the Mad20 family by semi-nested PCR were in fact Mad20/RO33 (MR) hybrids.
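
The motif-level summaries used above for the repeat region (number of distinct motifs per allele, di-motif usage, and the first 5′ di-motif that defines the groups) can be illustrated with a minimal sketch. It assumes that an allele's repeat region has already been recoded as an ordered list of motif codes and that di-motifs are read as overlapping adjacent pairs; the function names and the example allele are hypothetical and are not taken from the study data.

```python
from collections import Counter


def di_motifs(motif_codes):
    """Return adjacent motif pairs (di-motifs) in 5' to 3' order.

    Assumption: di-motifs are overlapping neighbouring pairs.
    """
    return list(zip(motif_codes, motif_codes[1:]))


def summarize_repeats(motif_codes):
    """Summarize one allele's repeat region given its ordered motif codes."""
    pairs = di_motifs(motif_codes)
    return {
        "n_distinct_motifs": len(set(motif_codes)),
        "first_di_motif": pairs[0] if pairs else None,
        "di_motif_counts": Counter(pairs),
    }


# Hypothetical K1-type repeat array, for illustration only.
example_allele = [3, 1, 3, 4, 3, 1]
summary = summarize_repeats(example_allele)
print(summary["n_distinct_motifs"], summary["first_di_motif"])
```

Under these assumptions, an allele whose first di-motif is (3, 1) would fall into group 1 of the K1-type family and one starting with (3, 4) into group 2; the same recoding would apply to Mad20-type alleles with their own motif codes.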