From: bmearns on
Maaartin, thanks so much for the feedback. My responses are below...

On Apr 25, 5:09 pm, Maaartin <grajc...(a)seznam.cz> wrote:
> On Apr 23, 8:49 pm, bmearns <mearn...(a)gmail.com> wrote:
>
> > The deck itself would be the output "digest". Really, this is just two
> > permutations of 26 elements each (each row is a permutation). There's
> > a number of ways you might use this, for instance, adding the two
> > cards in each column to get a sequence of 26 upper-and-lower-case
> > characters. For a full deck of 52 cards, each row can have 26!
> > different arrangements, which is about 88 bits, so the whole deck
> > would be 176 bits.
>
> That's true for the information contained there, but by the column-wise
> addition you lose something, don't you?

That's possible, I'm really not sure. I think it would at least
preserve 88 bits, right? But that's only one way to use the results,
albeit one that is more suitable for doing by hand than some other
ways.
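
For concreteness, here is roughly what I mean by the column-wise
reading, as a Python sketch (the addition rule and the letter mapping
are just one plausible choice, not a fixed part of the scheme):

import math
import random
import string

def columnwise_digest(row1, row2):
    # Add the two cards in each column and map the sum to a letter,
    # a..z then A..Z. (The exact rule is an assumption; the scheme
    # only says "adding the two cards in each column".)
    return ''.join(string.ascii_letters[(a + b) % 52]
                   for a, b in zip(row1, row2))

row1 = random.sample(range(26), 26)   # each row is a permutation
row2 = random.sample(range(26), 26)
print(columnwise_digest(row1, row2))

# Each row carries log2(26!) bits:
print(math.log2(math.factorial(26)))  # ~88.4, so ~176.8 for the deck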

>
> > I've done some preliminary tests with smaller decks (8, 10, and 12
> > card decks). For each test, I create a random 10 character string
> > (about the length of a reasonable password), and hash it. As a test
> > sequence, I perform a number of tests equal to the total number of
> > possible outputs. My understanding is that a perfect hash would
> > produce every possible output exactly once as a result.
>
> IMHO, that's quite improbable, unless you do something completely
> different from what I imagine.

Of course; I wasn't trying to say that any hash actually does this,
but would that be the ideal for the hypothetical perfect hash?
Initially I thought that would be the ideal, but now I'm not so sure.
A perfect hash with N different possible outputs would have an
infinite number of N-element input sets such that each input in the
set produces a different output. However, choosing a set of N random
inputs does not guarantee that it will be one of these sets. The set I
choose could actually overlap several of the "one-to-one" sets, in
which case you would not necessarily get a one-to-one result. So now
I'm not really sure what the "ideal" result is (the result against
which I should measure my hash).
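
One baseline I can state with confidence: a truly random function
sampled with N random inputs should cover only about 1 - 1/e (roughly
63.2%) of its N possible outputs, with per-output counts approximately
Poisson(1), i.e. a stddev of about 1. A quick simulation (sketch):

import random
from collections import Counter

N = 14400  # possible outputs for the 10-card deck
counts = Counter(random.randrange(N) for _ in range(N))

coverage = len(counts) / N                 # outputs hit at least once
occ = [counts.get(i, 0) for i in range(N)]
var = sum((c - 1) ** 2 for c in occ) / N   # the mean is exactly 1
print(coverage, var ** 0.5)                # ~0.632 and ~1.0

Measured against that, the 44% coverage and stddev of 1.527 I
reported (quoted below) both look like real bias.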


>
> > I think the algorithm avalanches well. Looking at the number of bits
> > that change with each value fed into the hash, the histogram makes a
> > pretty decent bell with an average between 48 and 50 bits per hundred,
> > and a stddev of 13 to 14 per hundred (so about 68% of the time,
> > between 34% and 62% of the bits change).
>
> How do you extract the bits from the card sequence? I see no obvious
> way.
>
> In case you do it by a sort of factoradic number system, then counting
> changed bits makes IMHO no sense, does it?

Yes, I was doing factoradic, and I believe that is a fair way to test
the algorithm in general, but not the specific use of the algorithm I
previously described (with column-wise addition).

I treated each row as its own permutation and thus interpreted it in
factoradic notation to produce an integer in [0, 26!-1]. Taking the
two rows together gives a two-digit number in base 26!, giving a value
in [0, (26!)^2-1]. In this way, each unique output of the hash has a
unique corresponding numerical value with the minimum average number
of bits, so I believe that seeing how the bits in this numerical
output change is a legitimate way to test avalanching of the algorithm
in general.
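
What I did is essentially the standard Lehmer code; a sketch:

from math import factorial

def perm_to_int(perm):
    # Lehmer code: map a permutation of 0..n-1 to an integer
    # in [0, n!-1].
    n = len(perm)
    value = 0
    for i in range(n):
        smaller = sum(1 for x in perm[i+1:] if x < perm[i])
        value += smaller * factorial(n - 1 - i)
    return value

def deck_to_int(row1, row2):
    # The two rows form a two-digit number in base 26!, giving a
    # value in [0, (26!)**2 - 1].
    base = factorial(len(row1))
    return perm_to_int(row1) * base + perm_to_int(row2)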

The problem, of course, is that this is not the way I described
earlier for using the algorithm (with column-wise addition), and so
the results are not transferable. I should have been testing the
output of the algorithm as it would actually be used.

>
> > I'm not sure about collision resistance, though. In the test sequence
> > described above (in which I would expect a perfect hash to generate
> > each output exactly once) only about 44% of possible outputs are being
> > produced at all. For a 10-card deck (14400 possible outputs) the most
> > common output occurred 16 times out of 14400 tests. Is this a
> > significant bias? The average number of times each output occurs
> > (including those that occurred 0 times) is consistently 1,
>
> By definition.

Yes, sorry about that. I realized that a few hours after I posted.
Since there are N possible outputs, and I was running N tests, the
average would necessarily be 1 occurrence per output.

>
> > with a
> > stddev of 1.527 (but I'm not sure how relevant the stddev is since
> > it's not remotely a normal distribution).
>
> Standard deviation is important for any distribution. Yours may be
> something like http://en.wikipedia.org/wiki/Poisson_distribution#Properties
> You could do some "Hypothesis testing". Or you could first compare it
> to randomly generated result, so you get some feeling. Just generate
> 14400 random numbers from the set 0..14400-1 and compute the stats.
> Use a good PRNG and repeat it many times.
>
> > Visually inspecting the histogram (how many times each output occurs),
> > it looks pretty random to me. There are some spikes and holes, but
> > they are pretty well distributed within the histogram and for
> > different test sequences (where the PRNG producing the input strings
> > is seeded differently) they seem to occur in different places. The
> > exception is a few strange wells of about 50 consecutive output values
> > that never occurred. There are about 12 of these wells that occur in
> > two groups of 6 each, always in about the same location. Altogether,
> > they make up about 5% of the possible outcomes.
>
> > I'm not sure if these are appropriate tests to be running, or if I'm
> > interpreting the results the right way, or if there are other tests I
> > should run. Any feedback on that would be helpful.
>
> I know only tests for random sequences: google for "diehard battery",
> "ecuyer Test", and "SP800-22rev1.pdf".
> There're many possibilities for generating them from hashes, e.g.,
> h(0), h(1), h(2), ...
> h(0), h(h(0)), h(h(h(0)))

Thank you for the tip. I was planning to run the FIPS-140 tests
against it.
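
The monobit test from that suite is simple enough to sketch (bounds
per FIPS 140-2; the poker, runs, and long-run tests follow the same
pattern):

def fips_monobit(bits):
    # FIPS 140-2 monobit test: over a 20,000-bit sample, the count
    # of ones must satisfy 9725 < count < 10275.
    assert len(bits) == 20000
    return 9725 < sum(bits) < 10275

# Usage: concatenate hash outputs into a 20,000-bit sample first.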

>
> > Most importantly, I'm hoping to get some feedback on whether or not
> > anyone sees any theoretical problems in the algorithm. Specifically,
> > I'm very interested in whether or not it is reversible. I believe that
> > if you don't know the previous state of the deck, then you would not
> > be able to tell which cards were swapped and therefore would not know
> > what data was fed in. Can anyone offer arguments for or against this?
> > Is this sufficient for irreversibility?
>
> I can't tell.

Thanks, Maaartin!

-Brian
From: bmearns on
On Apr 25, 2:05 pm, WTShaw <lure...(a)gmail.com> wrote:
[snip]
>
> Perhaps it is not so simple, but you probably don't want my opinion.

Only if it's on topic. Most of it wasn't (I'm shocked!), but you did
have one useful morsel.

> When I saw this post, I was amused because during a thunderstorm the
> other night the computers were safely down and I did not want to waste
> batteries.  To confirm a counted hash, I had done it hours before by
> hand, not so easy as I have trouble writing.
>
> The hash is doubly processed and the result two steps away from the
> normal alphabetic sequence is
> xjzlacokyungphfsbreiwmvdtq.

WTShaw, it never ceases to amaze me how you never think to actually
describe any of the algorithms you're talking about.



> I doubt that you can do much with it but the lessons are clear, error
> prone if you don't have a way to prove the results.

Hey, you actually posted something relevant and useful! Congrats!

You're right, hashing by hand is error prone, and that's a serious
issue. If it takes 15 minutes to complete a hash only to find out you
got it wrong, it's a pretty terrible waste of time. Worse still if you
don't know whether you got it right or wrong. As it so happens, I went
through my algorithm by hand several times and got it wrong just about
every time. And it wasn't fast. And it wasn't even with a full 52-card
deck. Alas, I'm afraid my attempt was largely fruitless except for the
always useful lessons learned. But I need to discuss that in another
post.

[snip]

-Brian
From: bmearns on
After trying out this algorithm in some slightly more realistic
scenarios, I've decided it's really not that practical to do by hand.
It takes a significant amount of time and is extremely error prone.
I've worked out another one that is incredibly simple, quite fast, and
keeps the cards in your hands at all times (instead of laid out on a
work surface, which is inconvenient and insecure), and I think it's
still strong.

I actually opened my request for review to a slightly broader audience
over the weekend, and so the algorithm is described on the following
webpage: http://brianpmearns.com/bpm/shufflehash

If anyone is willing to take a look and see if anything pops out at
them as good or bad or noteworthy, I would very much appreciate it.

Thanks,
-Brian
From: Maaartin on
On Apr 26, 2:54 pm, bmearns <mearn...(a)gmail.com> wrote:
> That's possible, I'm really not sure. I think it would at least
> preserve 88 bits, right?

IMHO, even more. But 88 bits is surely beyond any security you can
get using a hand hash, so I don't care.

> But that's only one way to use the results,
> albeit one that is more suitable for doing by hand then some other
> ways.

Right, but the addition is AFAIK more error-prone than the card
shuffling, isn't it?
Maybe reading the leftmost card, permuting all cards somehow, and
repeating it all could turn the permutation into an arbitrary sequence
in a less error-prone way?
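
Something like this, maybe (just a sketch; the cut rule here is
arbitrary, any permutation step that is easy to do by hand would
serve):

def read_off(deck, n):
    # Read the leftmost card, then "permute all cards somehow";
    # here the permutation step is a simple cut keyed to the card
    # just read. Purely illustrative.
    deck = list(deck)
    out = []
    for _ in range(n):
        c = deck[0]
        out.append(c)
        k = (c % (len(deck) - 1)) + 1   # cut point in [1, len-1]
        deck = deck[k:] + deck[:k]
    return out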

> Of course; I wasn't trying to say that any hash actually does this,
> but would that be the ideal for the hypothetical perfect hash?
> Initially I thought that would be the ideal, but now I'm not so sure.
> A perfect hash with N different possible outputs would have an
> infinite number of N-element input sets such that each input in the
> set produces a different output. However, choosing a set of N random
> inputs does not guarantee that it will be one of these sets. The set I
> choose could actually overlap several of the "one-to-one" sets, in
> which case you would not necessarily get a one-to-one result. So now
> I'm not really sure what the "ideal" result is (the result against
> which I should measure my hash).

The ideal result is indistinguishable from random. Take a secure RNG
and look at its results.

> > > I think the algorithm avalanches well. Looking at the number of bits
> > > that change with each value fed into the hash, the histogram makes a
> > > pretty decent bell with an average between 48 and 50 bits per hundred,
> > > and a stddev of 13 to 14 per hundred (so about 68% of the time,
> > > between 34% and 62% of the bits change).
>
> > How do you extract the bits from the card sequence? I see no obvious
> > way.
>
> > In case you do it by a sort of factoradic number system, than counting
> > changed bits makes IMHO no sense, do you?
>
> Yes, I was doing factoradic, and I believe that is a fair way to test
> the algorithm in general, but not the specific use of the algorithm I
> previously described (with column-wise addition).

But this is no way to evaluate the avalanche. Otherwise you could
always flip a single bit, run it through MD5, and claim it avalanches
well. My point is that the bits of the factoradic system are not
"natural" to the process, nor are they similar to the way you
interpret the result.

> I treated each row as it's own permutation and thus interpreted it in
> factoradic notation to produce an integer in [0, 26!-1]. Taking the
> two rows together gives a two-digit number in base 26!, giving a value
> in [0, (26!)^2-1]. In this way, each unique output of the hash has a
> unique corresponding numerical value with the minimum average number
> of bits, so I believe that seeing how the bits in this numerical
> output change is a legitimate way to test avalanching of the algorithm
> in general.

By a single swap in a sequence of 6 cards, the corresponding number
may change by 120, which means quite a lot of bits. Changing a single
bit in the number can mean quite a big change in the card order.
That's what I mean by not "natural".
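
It's easy to demonstrate with the Lehmer encoding (sketch):

from math import factorial

def perm_to_int(perm):
    # Lehmer code, as in the earlier sketch.
    n = len(perm)
    return sum(sum(1 for x in perm[i+1:] if x < perm[i])
               * factorial(n - 1 - i) for i in range(n))

a = [0, 1, 2, 3, 4, 5]
b = [1, 0, 2, 3, 4, 5]                 # one adjacent swap in front
print(perm_to_int(a), perm_to_int(b))  # 0 vs. 120 = 5!
# A tiny physical change makes a factorial-sized numeric jump, and
# flipping one low bit of the number reorders cards non-locally.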

> The problem, of course, is that this is not the the way I described
> earlier for using the algorithm (with column-wise addition), and so
> the results are not transferable. I should have been testing the
> output of the algorithm as it would actually be used.

Either this, or maybe use the digits (instead of bits) of the
factoradic system.

On Apr 26, 3:04 pm, bmearns <mearn...(a)gmail.com> wrote:
> > I doubt that you can do much with it but the lessons are clear, error
> > prone if you don't have a way to prove the results.
>
> Hey, you actually posted something relevant and useful! Congrats!

He must have meant something different. Or did he have a bad day?

On Apr 26, 3:09 pm, bmearns <mearn...(a)gmail.com> wrote:
> After trying out this algorithm in some slightly more realistic
> scenarios, I've decided it's really not that practical to do by hand.
> It takes a significant amount of time and is extremely error prone.
> I've worked out another one that is incredibly simple, quite fast,
> keeps the cards in your hands at all times (instead of laid out on a
> work surface which is inconvenient and insecure), and I think it still
> strong.

With cards lying on the table I'd suggest shifting a part of a row.
Such an operation makes quite a large change of the overall state.
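
For instance (a sketch; where the segment bounds and shift amount
come from is left open):

def shift_segment(row, start, end, amount):
    # Cyclically shift the cards in row[start:end] by `amount`
    # places; one reading of "shifting a part of a row".
    seg = row[start:end]
    k = amount % len(seg)
    row[start:end] = seg[k:] + seg[:k]
    return row

print(shift_segment(list(range(10)), 2, 8, 3))
# -> [0, 1, 5, 6, 7, 2, 3, 4, 8, 9]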

> I actually opened my request for review to a slightly broader audience
> over the weekend, and so the algorithm is described on the following
> webpage: http://brianpmearns.com/bpm/shufflehash

I'll have a look.
From: J.D. on
On Apr 26, 9:09 am, bmearns <mearn...(a)gmail.com> wrote:
> I actually opened my request for review to a slightly broader audience
> over the weekend, and so the algorithm is described on the following
> webpage: http://brianpmearns.com/bpm/shufflehash

It seems like it should be somewhat vulnerable to a pre-image attack
for very short message lengths. For example, it is trivially
reversible for messages of just one letter (e.g. if you 'hashed' a
single letter, then the card for the letter right after it ('B' if
you hashed 'A') would be at the bottom of the deck -- I am assuming
the output of the hash is simply the full deck after all shuffles are
completed). And with 52 cards, even after two or three letters have
been added to the hash, most cards remain in their starting order
relative to each other. It seems like it should be easy (easier than
a brute-force pre-image attack) to restrict the guesses to the cards
that are no longer in their original order relative to the other
cards (a small subset of the full deck for very short messages). For
example, I used your code to hash a 3-letter message with a deck of
26 cards. The output is this:
[18, 19, 20, 21, 22, 23, 24, 25, 4, 0, 1, 2, 3, 5, 8, 6, 7, 9, 16, 10,
11, 12, 13, 14, 15, 17]
Note that the cards from 18 to 25 are in relative order, so a good
guess is that the message did not include any number in that range.
Similarly, 0 to 3 is in order, as is 10 to 15. In fact, a superficial
glance reveals that the numbers most radically out of the starting
order are 4, 8, and 16 (which, not incidentally, was the message). A
pre-image attack restricted to those three letters would very quickly
find the message (much faster than a standard brute-force pre-image
attack on 3-letter messages).
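
You can even mechanize this guess-work: the moved cards tend to sit
immediately before a "descent", a position where the value drops (a
toy heuristic, not a full attack; the first descent just marks the
cut):

def descent_suspects(deck):
    # Flag cards that immediately precede a drop in value.
    return [deck[i] for i in range(len(deck) - 1)
            if deck[i] > deck[i + 1]]

out = [18, 19, 20, 21, 22, 23, 24, 25, 4, 0, 1, 2, 3, 5, 8, 6, 7, 9,
       16, 10, 11, 12, 13, 14, 15, 17]
print(descent_suspects(out))   # [25, 4, 8, 16]: 25 marks the cut,
                               # and 4, 8, 16 was the message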

This sort of guess-work becomes much harder after several more
letters are added to the hash -- but a good cryptographic hash
shouldn't be easily reversible no matter how long the message is.
