From: Mok-Kong Shen on 14 Jul 2010 08:59

[Addendum] One could also treat homophones of digrams or even trigrams
in a similar way. The space of trigrams is 26^3=17576. Since 2^16=65536,
a homophonic mapping of trigrams to 16 bits could presumably be done
fairly satisfactorily. However, the large table size of 2^16 is obviously
a substantial disadvantage. A conceivable compromise is first to assign
each individual trigram T a numerical range Tr in [0, 2^16-1], then to
choose a function f(x) that is a bijective pseudo-random mapping on
[0, 2^16-1], and to take f(Tr) as the homophones of T. A permutation
polynomial f(x) mod 2^16 could obviously be chosen to serve this purpose,
noting that f^(-1)(x), for given x, can be computed numerically.

M. K. Shen

From: Maaartin on 14 Jul 2010 12:16

On Jul 14, 2:59 pm, Mok-Kong Shen wrote:
> [Addendum] One could also treat in a similar way homophones of digrams
> or even trigrams.

Continuing this idea you reinvent compression.

From: Mok-Kong Shen on 14 Jul 2010 16:06

> Continuing this idea you reinvent compression.

Homophony goes in the opposite direction to compression!!

M. K. Shen

From: Maaartin on 14 Jul 2010 17:17

On Jul 14, 10:06 pm, Mok-Kong Shen wrote:
> > Continuing this idea you reinvent compression.
>
> Homophony goes in the opposite direction to compression!!

Single-character homophones do in fact expand the text. Trigram
homophones as you described them pack 3 chars into 2; this is
compression, at least when compared to the most straightforward
representation using 3 bytes. Using good compression always leads to
homophony, otherwise the compressed text would still be compressible.

From: Mok-Kong Shen on 14 Jul 2010 17:25

Maaartin wrote:
> Mok-Kong Shen wrote:
>>> Continuing this idea you reinvent compression.
>>
>> Homophony goes in the opposite direction to compression!!
>
> Single-character homophones do in fact expand the text. Trigram
> homophones as you described them pack 3 chars into 2; this is
> compression, at least when compared to the most straightforward
> representation using 3 bytes. Using good compression always leads to
> homophony, otherwise the compressed text would still be compressible.

I wrote "The space of trigrams is 26^3=17576. Since 2^16=65536, ....".
The space of 3 characters of the normal alphabet is 'expanded' to the
full space of 16 bits.

M. K. Shen
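
For concreteness, here is a minimal Python sketch of the trigram scheme
described in the addendum. Several things in it are assumptions made for
illustration and are not from the posts: the ranges Tr are given nearly
equal sizes (a real homophonic cipher would size them by trigram
frequency), the polynomial is a quadratic with arbitrary hypothetical
coefficients A, B, C, and the helper names are made up. The quadratic
A + B*x + C*x^2 permutes the residues mod 2^16 because B is odd and C is
even (Rivest's criterion for permutation polynomials mod 2^n), and
f^(-1) is computed bit by bit rather than from a 2^16-entry table.

```python
import bisect
import secrets

N_TRIGRAMS = 26 ** 3   # 17576 trigrams over A..Z
SPACE = 1 << 16        # 65536 possible 16-bit code values

# Hypothetical coefficients, not from the thread.
# B odd and C even make f a permutation of [0, 2^16 - 1].
A, B, C = 12345, 40503, 2468

# Assumption: near-equal ranges. Trigram i owns [STARTS[i], STARTS[i+1]).
STARTS = [i * SPACE // N_TRIGRAMS for i in range(N_TRIGRAMS + 1)]


def f(x):
    """Permutation polynomial mod 2^16."""
    return (A + B * x + C * x * x) % SPACE


def f_inv(y):
    """Invert f numerically, fixing one bit of x per step.

    f(x) mod 2^k depends only on x mod 2^k, and f is a permutation
    mod every 2^k, so exactly one choice of each bit matches y.
    """
    x = 0
    for k in range(1, 17):
        mask = (1 << k) - 1
        if (f(x) ^ y) & mask:      # low k bits disagree: set bit k-1
            x |= 1 << (k - 1)
    return x


def trigram_index(t):
    """Map 'AAA'..'ZZZ' to 0..17575."""
    a, b, c = (ord(ch) - ord("A") for ch in t)
    return (a * 26 + b) * 26 + c


def index_trigram(i):
    """Inverse of trigram_index."""
    i, c = divmod(i, 26)
    a, b = divmod(i, 26)
    return "".join(chr(ord("A") + v) for v in (a, b, c))


def encode(t):
    """Pick a random point in the trigram's range Tr and return f of it."""
    i = trigram_index(t)
    v = STARTS[i] + secrets.randbelow(STARTS[i + 1] - STARTS[i])
    return f(v)


def decode(code):
    """Undo f, then find which range the value falls in."""
    v = f_inv(code)
    return index_trigram(bisect.bisect_right(STARTS, v) - 1)


if __name__ == "__main__":
    codes = [encode("THE") for _ in range(5)]
    print(codes, [decode(c) for c in codes])
```

Under these assumptions each trigram gets 3 or 4 homophones
(65536/17576 is about 3.73). On the compression point: a code value
takes 16 bits, against 24 bits for three 8-bit characters and about
14.1 bits (3*log2(26)) for a minimal encoding of a trigram, so the
mapping shrinks the byte representation while expanding the trigram
space itself.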