Write a program that translates chains of messenger RNA (sequences derived from DNA) into proteins using the genetic code.
The genetic code is a set of rules that translates sequences of messenger RNA into proteins. A sequence of messenger RNA is a sequence of bases. There are four possible bases: A, C, G and U. The bases of a gene are grouped in threes, forming codons. Every codon corresponds to an amino acid. A protein is a sequence of amino acids.
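As a small illustration of the grouping described above, the following sketch cuts a base sequence into consecutive codons. The function name `codons` is hypothetical, and dropping a trailing incomplete group is an assumption made here, not something the statement specifies.

```python
def codons(rna):
    """Split a string of bases into consecutive groups of three.

    Assumption: a trailing group of one or two bases is not a codon
    and is silently dropped.
    """
    return [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]


print(codons("AUGACUAAG"))
```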
The following figure shows the genetic code. It can be seen, for instance, that the codon GGA corresponds to glycine and that the codon AUC corresponds to isoleucine. There are also three special codons, marked with the stop symbol, that do not encode any amino acid but indicate the end of the coding sequence. Once a stop codon is found, the gene is finished (there is no need to search for another AUG after it). Moreover, proteins only start to be synthesized from the first appearance of the codon AUG. Thus, an imaginary gene GCCAAUGACUAAGGCCUAAAGA would correspond to the protein ThrLysAla: reading starts right after the first AUG (GCCA AUG ACU AAG GCC UAA ...) and ends at the stop codon UAA.
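The translation rules above can be sketched as follows. The table is the standard genetic code written out in code; the names `CODON_TABLE` and `translate` are hypothetical. Two points are assumptions inferred from the worked example rather than stated rules: the initial AUG is located at any offset (not only at multiples of three), and it is not itself written to the output.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with codons enumerated in
# UCAG order (UUU, UUC, UUA, UUG, UCU, ...). '*' marks the stop codons
# UAA, UAG and UGA.
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}
# Maps each codon to its three-letter amino acid name, or None for a stop.
CODON_TABLE = {
    "".join(codon): THREE_LETTER.get(letter)
    for codon, letter in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate(rna):
    """Return the list of amino acid names encoded by `rna`.

    Assumptions (inferred from the sample, not stated as rules): the
    first AUG may start at any position, and it is not emitted itself.
    """
    start = rna.find("AUG")
    if start == -1:
        return []
    protein = []
    # Read codon by codon, beginning just after the start codon.
    for i in range(start + 3, len(rna) - 2, 3):
        amino = CODON_TABLE[rna[i:i + 3]]
        if amino is None:  # stop codon: the gene is finished
            break
        protein.append(amino)
    return protein


print("".join(translate("GCCAAUGACUAAGGCCUAAAGA")))  # ThrLysAla
```

Building the table from a 64-character string keeps the sketch short; a hand-written dictionary of 64 entries would work equally well.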
Input
Input is a gene obtained from GenBank, a genome bank that can be consulted on the Internet. The gene consists of a brief description ending in ‘:’, followed by the sequence of messenger RNA bases corresponding to this gene. An AUG codon always appears before a stop codon.
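A minimal sketch of reading this input format, assuming that the description contains no ‘:’ of its own and that any whitespace inside the base sequence (the second sample shows the sequence broken into several chunks) should simply be discarded. The name `parse_gene` is hypothetical.

```python
def parse_gene(text):
    """Split raw input into (description, sequence).

    Assumptions: the first ':' ends the description, and spaces or line
    breaks inside the sequence carry no meaning.
    """
    description, _, body = text.partition(":")
    sequence = "".join(body.split())  # drop all whitespace
    return description.strip(), sequence


print(parse_gene("Small test: GCCAAUGACUAAGGCCUAAAGA"))
```

In a full program the text would typically come from `sys.stdin.read()`.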
Output
The output must be the protein synthesized by this gene according to the rules of the genetic code given above. Your program must print the sequence using the standard three-letter name of each amino acid. Print 26 amino acids per line, except on the last line, which may contain fewer.
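The line-breaking rule can be sketched as below, assuming the protein is already available as a list of three-letter names. The name `format_protein` and the `per_line` parameter are hypothetical.

```python
def format_protein(amino_acids, per_line=26):
    """Join amino acid names into lines of `per_line` names each.

    The last line holds whatever remains, which may be fewer names.
    """
    lines = []
    for i in range(0, len(amino_acids), per_line):
        lines.append("".join(amino_acids[i:i + per_line]))
    return "\n".join(lines)


print(format_protein(["Thr", "Lys", "Ala"]))  # ThrLysAla
```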
Observation
The second instance is an artificial extract of the genome of the hepatitis C virus. The private test data contain the complete genome (10 kilobases).
Input
Small test: GCCAAUGACUAAGGCCUAAAGA
Output
ThrLysAla
Input
Hepatitis C virus, partial genome: UUGUGGUACUGCCUGAUAGGGUGCUUGCGAGUGCCCCGGGAGGUCUCGUAGACCGUGCACCAUGAGCACG
AAUCCUAAACCUCAAAGAAAAACCAAACGUAACACCAACCGUCGCCCACAGGACGUCAAGUUCCCGGGUG
GCGGUCAGAUCGUUGGUGGAGUUUACUUGUUGCCGCGCAGGGGCCCUAGAUUGGGUGUGCGCGCGACGAG
GAAGACUUCCGAGCGGUCGCAACCUCGAGGUAGACGUCAGCCUAUCCCCAAGGCACGUCGGCCCGAGGGC
AGGACCUGGGCUCAGCCCGGGUACCCUUGGCCCCUCUAUGGCAAUGAGGGUUGCGGGUGGGCGGGAUGGC
UCCUGUCUCCCCGUGGCUCUCGGCCUAGCUGGGGCCCCACAGACCCCCGGCGUAGGUCGCGCAAUUUGGG
UAAGGUCAUCGAUACCCUUACGUGCGGCUUCGCCGACCUCAUGGGGUACAUACCGCUCGUCGGCGCCCCU
CUUGGAGGCGCUGCCAGGGCCCUGGCGCAUGGCGUCCGGGUUCUGGAAGACGGCGUGAACUAUGCAACAG
GGAACCUUCCUGGUUGCUCUUUCUCUAUCUUCCUUCUGGCCCUGCUCUCUUGCCUGACUGUGCCCGCUUC
AGCGUUGGUGGUAGCUCAGCUGCUCCGGAUCCCACAAGCCAUCAUGGACAUGAUCGCUGGUGCUCACUGG
GGAGUCCUGGCGGGCAUAGCGUAUUUCUCCAUGGUGGGGAACUGGGCGAAGGUCCUGGUAGUGCUGCUGC
UAUUUGCCGGCGUCGACGCGGAAACCCACGUCACCGGGGGAAGUGCCGGCCGCACCACGGCUGGGCUUGU
UGGUCUCCUUACACCAGGCGCCAAGCAGAACAUCCAACUGAUCAACACCAACGGCAGUUGGCACAUCAAU
AGCACGGCCUUGAACUGCAAUGAAAGCCUUAACACCGGCUGGUUAGCAGGGCUCUUCUAUCAGCACAAAU
UCAACUCUUCAGGCUGUCCUGAGAGGUUGGCCAGCUGCCGACGCCUUACCGAUUUUGCCCAGGGCUGGGG
UCCUAUCAGUUAUGCCAACGGAAGCGGCCUCGACGAACGCCCCUACUGCUGGCACUAACCUCCAAGACCU
Output
SerThrAsnProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPheProGlyGly
GlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeuGlyValArgAlaThrArgLysThrSer
GluArgSerGlnProArgGlyArgArgGlnProIleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnPro
GlyTyrProTrpProLeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArgPro
SerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAspThrLeuThrCysGlyPheAla
AspLeuMetGlyTyrIleProLeuValGlyAlaProLeuGlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgVal
LeuGluAspGlyValAsnTyrAlaThrGlyAsnLeuProGlyCysSerPheSerIlePheLeuLeuAlaLeuLeuSer
CysLeuThrValProAlaSerAlaLeuValValAlaGlnLeuLeuArgIleProGlnAlaIleMetAspMetIleAla
GlyAlaHisTrpGlyValLeuAlaGlyIleAlaTyrPheSerMetValGlyAsnTrpAlaLysValLeuValValLeu
LeuLeuPheAlaGlyValAspAlaGluThrHisValThrGlyGlySerAlaGlyArgThrThrAlaGlyLeuValGly
LeuLeuThrProGlyAlaLysGlnAsnIleGlnLeuIleAsnThrAsnGlySerTrpHisIleAsnSerThrAlaLeu
AsnCysAsnGluSerLeuAsnThrGlyTrpLeuAlaGlyLeuPheTyrGlnHisLysPheAsnSerSerGlyCysPro
GluArgLeuAlaSerCysArgArgLeuThrAspPheAlaGlnGlyTrpGlyProIleSerTyrAlaAsnGlySerGly
LeuAspGluArgProTyrCysTrpHis