=Paper=
{{Paper
|id=None
|storemode=property
|title=Decryption Through the Likelihood of Frequency of Letters
|pdfUrl=https://ceur-ws.org/Vol-686/paper07.pdf
|volume=Vol-686
}}
==Decryption Through the Likelihood of Frequency of Letters==
<pdf width="1500px">https://ceur-ws.org/Vol-686/paper07.pdf</pdf>
<pre>
         Decryption Through the Likelihood of
                 Frequency of Letters

 Barbara Sánchez Rinza, Fernando Zacarias Flores, Luna Pérez Mauricio, and
                      Martínez Cortés Marco Antonio

                  Benemérita Universidad Autónoma de Puebla,
                               Computer Science
                     14 Sur y Av. San Claudio, Puebla, Pue.
                                  72000 México
                  brinza@cs.buap.mx, fzflores@yahoo.com.mx


      Abstract. Decrypting information by means of probability requires
      thorough preparatory work, because one must know the percentage with
      which each letter occurs in the language under analysis, which here is
      Spanish. Besides the probabilities of single letters, one can also
      consider those of syllables, groups of three or four letters, and even
      whole words. With these in hand, the frequencies of the ciphertext are
      compared with the frequencies of the language, and symbols are replaced
      according to that correspondence. Finally, the text is passed through a
      scanner to obtain the decrypted text.

      Keywords: Probability, Decryption.


1   Introduction
Cryptography is the science that alters the linguistic representation of a message
[1]. There are different methods for doing so, the most common of which is
encryption. It masks the original content of the information through a conversion
method governed by an algorithm that also allows the reverse operation, the
decryption of the information. Using this or other techniques makes it possible to
exchange messages that can only be read by the recipients designated as
'coherent'. A coherent recipient is the person to whom the sender intentionally
addresses the message. The coherent recipient therefore knows the secret used to
mask the message, and so either has the means to submit the encrypted message to
the reverse process or can infer the process that turns it into a publicly
readable message. The original information to be protected is called plaintext or
cleartext. Encryption is the process of converting plaintext into unreadable
gibberish called ciphertext or cryptogram. In general, the concrete application of
the encryption algorithm (also called a cipher) relies on the existence of a key:
secret information that adapts the encryption algorithm to each different use [2].

   Decryption is the reverse process, which recovers the plaintext from the
ciphertext and the key. A cryptographic protocol specifies the details of how
algorithms and keys (and other primitive operations) are used to achieve the
desired effect. The set of protocols, encryption algorithms, key management
processes and user actions together constitutes a cryptosystem, which is what the
end user works and interacts with. In this work, we must first have a ciphertext
that meets certain requirements: the encryption should be bijective, so that each
element of the domain corresponds to exactly one element of the codomain and vice
versa. In addition, we must also take into account Kerckhoffs's rules [3].
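
One way to read the bijectivity requirement is as a monoalphabetic substitution,
in which every plaintext letter has exactly one cipher letter and the key can be
inverted. The sketch below illustrates that reading only; the 27-letter alphabet,
the seed value and the function names are assumptions made for the example and do
not come from the paper.

```python
import random
import string

# Assumption: a 27-letter Spanish alphabet (A-Z plus Ñ).
ALPHABET = string.ascii_uppercase + "Ñ"

def make_key(seed):
    """Return a random one-to-one mapping plain letter -> cipher letter."""
    shuffled = list(ALPHABET)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def invert(key):
    """Invert a mapping; raise if two letters share the same image."""
    inverse = {cipher: plain for plain, cipher in key.items()}
    if len(inverse) != len(key):
        raise ValueError("mapping is not bijective")
    return inverse

def substitute(text, mapping):
    """Apply the mapping letter by letter; other characters pass through."""
    return "".join(mapping.get(ch, ch) for ch in text.upper())

key = make_key(seed=2010)
cipher = substitute("EL MENSAJE ES SECRETO", key)
plain = substitute(cipher, invert(key))  # recovers the original message
```

Because the mapping is a permutation of the alphabet, the frequency of each
plaintext letter is carried over unchanged to its cipher letter, which is what
makes the frequency comparison of the following sections possible.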


2      Development work

2.1     Frequencies in Spanish

Decrypting a text by probability requires knowing how often each letter of the
alphabet is used; this work considers only the Spanish language [5].

      The frequencies of Spanish used in this study were:

 1. Frequency of trigraphs
 2. Frequency of digraphs
 3. Most common words
 4. Frequency of letters at the beginning of words
 5. Frequency of letters in Spanish
 6. Frequency of words


2.2     Triglyphs Frequencies

Letter frequency statistics may vary from one source to another, depending on the
corpus the author chose to compile them. Differences usually arise according to
whether the corpus is literary or consists of texts of different origins. Table 1
shows the frequency of each letter of the Spanish alphabet with its respective
percentage.


 High frequency      Medium frequency    Low frequency       Frequencies 0.5%
 letter   freq. %    letter   freq. %    letter   freq. %
   E       16.78       R        4.94       Y        1.54     G, F, V, W
   A       11.96       U        4.80       Q        1.53     J, Z, X, K, Ñ
   O        8.69       I        4.15       B        0.92
   L        8.37       T        3.31       H        0.89
   S        7.88       C        2.92
   N        7.01       P        2.76
   D        6.87       M        2.12

                     Table 1. Frequency of letters in Spanish
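
The comparison between ciphertext frequencies and Table 1 can be sketched as a
rank matching: the most frequent cipher letter is paired with E, the next with A,
and so on. This is a minimal first-guess heuristic under the assumption of a
one-to-one letter substitution, not necessarily the exact procedure used by the
authors; the names `TABLE_1`, `letter_counts` and `first_guess` are illustrative.

```python
from collections import Counter

# Percentages copied from Table 1 (letters with a listed value only).
TABLE_1 = {
    "E": 16.78, "A": 11.96, "O": 8.69, "L": 8.37, "S": 7.88, "N": 7.01,
    "D": 6.87, "R": 4.94, "U": 4.80, "I": 4.15, "T": 3.31, "C": 2.92,
    "P": 2.76, "M": 2.12, "Y": 1.54, "Q": 1.53, "B": 0.92, "H": 0.89,
}

# Spanish letters ordered from most to least frequent.
SPANISH_BY_FREQUENCY = sorted(TABLE_1, key=TABLE_1.get, reverse=True)

def letter_counts(text):
    """Count alphabetic characters, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.upper() if ch.isalpha())

def first_guess(ciphertext):
    """Pair the i-th most frequent cipher letter with the i-th Spanish letter."""
    ranked = [letter for letter, _ in letter_counts(ciphertext).most_common()]
    return dict(zip(ranked, SPANISH_BY_FREQUENCY))

guess = first_guess("XXXX QQQ ZZ K")  # X is the most frequent, so X -> E
```

On short ciphertexts the ranks of letters with similar percentages (for example
L, S, N and D) are easily swapped, so a guess of this kind is only a starting
point for the later passes.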


2.3     Most Frequent words
The vowels make up about 46.38% of the text. The high-frequency letters account
for 67.56% of the text, and the medium-frequency letters for 25% [4]. In the
dictionary the most common vowel is A, but in written texts it is E, because of
prepositions, conjunctions, verbs, etc. The most common consonants are L, S, N
and D, with about 30%. The six least frequent letters are V, Ñ, J, Z, X and K
(just over 1%). The average length of a Spanish word is 5.9 letters. The index of
coincidence for Spanish is 0.0775. As a further aid to solving the cipher, Table 2
lists the most frequently used words in a text of 10,000 words.
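
The index of coincidence quoted above can be estimated from any sample as the
probability that two letters drawn at random without replacement are equal. The
sketch below assumes the standard definition, the sum of n_i (n_i - 1) over all
letters divided by N (N - 1), which the paper does not spell out; the function
name is illustrative.

```python
from collections import Counter

def index_of_coincidence(text):
    """Probability that two distinct positions in the text hold the same letter."""
    counts = Counter(ch for ch in text.upper() if ch.isalpha())
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two letters")
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

# A one-to-one letter substitution only renames letters, so the value is unchanged.
same = index_of_coincidence("ABBA") == index_of_coincidence("XYYX")
```

Since the value survives a one-to-one substitution, a long ciphertext whose index
is close to 0.0775 is consistent with Spanish plaintext under such a cipher.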


           Most common words    Two-letter words     Three-letter words
           Word   Frequency     Word   Frequency     Word   Frequency
           DE        778        DE        778        QUE       289
           LA        460        LA        460        LOS       196
           EL        339        EL        339        DEL       156
           EN        302        EN        302        LAS       114
           QUE       289        SE        119        POR       110
           Y         226                   98        CON        82
           A         213                   74        UNA        78
           LOS       196                   64        MAS        36
           DEL       156                   63        SUS        27
           SE        119                   47        HAN        19
           LAS       114

               Table 2. Most frequent words of one, two and three letters
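
Short words give strong hints because few Spanish words fit them. The sketch below
pairs the most frequent cipher words of each length with the words of Table 2 and
checks whether a pairing is compatible with the letters already assigned. It
assumes words are separated by spaces in the ciphertext; the helper names are
illustrative and the word lists contain only entries legible in Table 2.

```python
from collections import Counter

# Words of Table 2 grouped by length, most frequent first.
COMMON_WORDS = {
    1: ["Y", "A"],
    2: ["DE", "LA", "EL", "EN", "SE"],
    3: ["QUE", "LOS", "DEL", "LAS", "POR", "CON", "UNA", "MAS", "SUS", "HAN"],
}

def ranked_cipher_words(ciphertext, length):
    """Cipher words of the given length, most frequent first."""
    words = [w for w in ciphertext.upper().split()
             if len(w) == length and w.isalpha()]
    return [w for w, _ in Counter(words).most_common()]

def word_hypotheses(ciphertext):
    """Pair the i-th most frequent cipher word of each length with Table 2."""
    pairs = []
    for length, spanish in COMMON_WORDS.items():
        pairs.extend(zip(ranked_cipher_words(ciphertext, length), spanish))
    return pairs

def letters_implied(cipher_word, plain_word, known):
    """Extend the cipher -> plain mapping `known`, or return None on conflict."""
    mapping = dict(known)
    for c, p in zip(cipher_word, plain_word):
        if mapping.setdefault(c, p) != p:
            return None  # cipher letter already decodes to something else
    if len(set(mapping.values())) != len(mapping):
        return None  # two cipher letters would decode to the same letter
    return mapping

hypotheses = word_hypotheses("XQ XQ XQ ZW ZW K")
```

Each hypothesis is only a candidate; rejecting those that conflict with the
letter-frequency guess keeps the mapping one-to-one.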


      Table 3 shows the frequencies of four-letter words.

2.4     Frequency digraphs
The size of the corpus is 60,115 letters, and the frequencies are absolute counts.
Each digraph is read as the row letter followed by the column letter. Table 4
shows the digraphs formed by joining pairs of letters.
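
A table of this kind can be built for a ciphertext by counting adjacent letter
pairs. In the sketch below, pairs are counted inside words only; whether the
paper's corpus counts also span word boundaries is not stated, so that choice is
an assumption, as is the function name.

```python
from collections import Counter

def digraph_counts(text):
    """Absolute counts of adjacent letter pairs within each word."""
    counts = Counter()
    for word in text.upper().split():
        letters = [ch for ch in word if ch.isalpha()]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    return counts

counts = digraph_counts("DE LA DEL")
# Row-then-column lookup, as in Table 4: row "D", column "E".
cell = counts["D" + "E"]
```

Comparing the most frequent cipher digraphs with the largest cells of Table 4
gives a second source of evidence besides single-letter frequencies.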

2.5     Most common initial letter
The letters that most frequently start a word in Spanish are listed in Table 5.
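
The same statistic is easy to collect from a ciphertext whose word boundaries are
preserved. This is a minimal sketch under that assumption; the function name is
illustrative.

```python
from collections import Counter

def initial_letter_counts(text):
    """Count the first letter of every space-separated word."""
    return Counter(word[0] for word in text.upper().split()
                   if word[0].isalpha())

initials = initial_letter_counts("para el pueblo, como dice el poeta")
most_common_initial = initials.most_common(1)[0][0]
```

A cipher letter that often starts words but is otherwise uncommon is a natural
candidate for letters such as P or C, which lead Table 5.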


3      Results
As stated above, the ciphertext used had to be bijective and comply with
Kerckhoffs's rules. The decrypted text is shown in Figure 1.


           Four-letter words    Distribution of letters in literary texts
           Word   Frequency     E - 16.78%   R - 4.94%   Y - 1.54%   J - 0.30%
           PARA      67         A - 11.96%   U - 4.80%   Q - 1.53%
           COMO      36         O -  8.69%   I - 4.15%   B - 0.92%
           AYER      25         L -  8.37%   T - 3.31%   H - 0.89%
           ESTE      23         S -  7.88%   C - 2.92%   G - 0.73%
           PERO      18         N -  7.01%   P - 2.77%   F - 0.52%
           ESTA      17         D -  6.87%   M - 2.12%   V - 0.39%
           AÑOS      14
           TODO      11
           SIDO      11
           SOLO      10

                       Table 3. Frequency with four letters


4   Conclusions

We conclude that this decryption method works well, although it still needs some
tuning, because its results depend on the text at hand and on how much ciphertext
is available. We also observed that it only decrypts ciphertexts produced by a
bijective encryption. As the results in Figure 1 show, the work applies several
processes in sequence: first the probabilities of the most frequent letters in
Spanish, then those of the most frequent syllables, and finally those of the most
frequent words, after which the information still missing is handled by the text
analyzer. As Figure 1 shows, a large percentage of the information is decoded but,
as mentioned above, this depends on having enough text to process.
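
The successive passes can be pictured as applying an ever larger partial key and
measuring how much of the text it resolves. The sketch below is one illustrative
way to do this; the placeholder character, the function names and the toy
mappings are assumptions and do not reproduce the authors' program.

```python
def apply_partial(ciphertext, mapping, unknown="_"):
    """Decode with a partial cipher -> plain mapping; mark unresolved letters."""
    return "".join(
        mapping.get(ch, unknown) if ch.isalpha() else ch
        for ch in ciphertext.upper()
    )

def coverage(ciphertext, mapping):
    """Fraction of ciphertext letters that the mapping already resolves."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in mapping for ch in letters) / len(letters)

ciphertext = "XQ ZW XQZ"
first_pass = {"X": "D", "Q": "E"}                 # e.g. from frequent letters
second_pass = {**first_pass, "Z": "L", "W": "A"}  # e.g. from frequent words

after_first = apply_partial(ciphertext, first_pass)
after_second = apply_partial(ciphertext, second_pass)
```

Tracking the coverage after each pass makes the dependence on the amount of text
visible: with little ciphertext, fewer letters can be assigned with confidence
and more placeholders remain.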


References
 1. Liddell and Scott's Greek-English Lexicon. Oxford University Press (1984)
 2. Anaya Multimedia, Códigos y Claves Secretas: Programas en Basic, Basado a su
    Vez en un Estudio Lexicográfico del Diario "El País", Mexico, 1986.
 3. Friedman, William F. and Callimahos, Lambros D., Military Cryptanalytics,
    Cryptographic Series, 1962
 4. Part I - Volume 2, Aegean Park Press, Laguna Hills, CA, 1985
 5. Barker, Wayne G., Cryptograms in Spanish, Aegean Park Press, Laguna Hills, CA,


               A B C D E F G H I JK L M
             A 12 14 54 64 15 5 8 4 10 8 41 30
             B 11           5       14 1 12
             C 39     5    17     8 80    3
             D 32     1 2 84      1 30
             E 20 5 47 26 17 8 21 6 9 3 44 26
             F 2            9       12    1
             G 12          12        5    1
             H 15           3        5
             I 43 8 42 29 40 5 8       1 14 16
             J 4            5
             K              1
             L 44     5 5 35 1 3    28    9 5
             M 32 10       42       30
             N 41 2 33 37 41 10 6 2 28 1  5 4
             O 19 17 28 26 16 6 5 5 4 1 22 33
             P 30     1    16        5    8
             Q
             R 74 1 12 10 94 1 12 45 1 1 6 15
             S 32 2 18 15 57 3 2 4 41 1   5 7
             T 60     1    67       35
             U 13 6 11 5 52 1 3      9    9 6
             V 12        1 15       15
             W 1            1
             X        1     4
             Y 5 1 3 2 5 1 1              1 1

                Table 4. Frequency of digraphs


   letter        P      C      D     E    S    A    L    R    M    N    T
   frequency   1,128  1,081  1,012  989  789  761  435  425  403  346  298

   letter        Q    I    H    U    G    V    F    O    B    J    Y    W    Z    K
   frequency    286  281  230  219  206  183  177  169  124   47   27   19    2    1

              Table 5. Frequency of initial letters


Fig. 1. The texts at each stage of the work: 01 encrypted text, 02 text after the
first pass, 03 text after the second pass, that is, the decrypted original text



</pre>