From: Tim Little on
On 2009-09-05, I.N. Galidakis <morpheus(a)olympus.mons> wrote:
> Now, an obvious note of caution: As far as I can see, every time
> there are k predicted digits, at any stage, I have the freedom to
> either try a bottleneck digit equal to k+1 or add 1<=j<=k new digits
> to L and update and continue.

Yes. You can rule out some values of j immediately, since the only
valid j-digit blocks must start with digit j. That's why I chose my
sequence to recover a prefix ending in 54321: there are then at least
5 valid values for j.

Technically there is one more possibility: that the largest overlap
that you see between the end of LC(L) and the start of L is actually
just a coincidence. No finite amount of data can refute this.


As the number of truncated digits increases, the number of
combinations of possible original messages increases exponentially.

Example: Suppose you lose the first 300 digits of a sufficiently long
message, being (say) the first 65 blocks. The remaining blocks then
give you 235 of those lost digits, and enough overlap with the rest of
the message to be confident it's not just coincidence. For each of
the remaining 65 digits, you probably average somewhere between 1.5 to
1.8 choices based on where you can choose block boundaries. That's at
least tens of billions of choices in the tree, maybe as many as
quadrillions.


- Tim
From: I.N. Galidakis on
Tim Little wrote:

> On 2009-09-05, I.N. Galidakis <morpheus(a)olympus.mons> wrote:
>> Now, an obvious note of caution: As far as I can see, every time
>> there are k predicted digits, at any stage, I have the freedom to
>> either try a bottleneck digit equal to k+1 or add 1<=j<=k new digits
>> to L and update and continue.
>
> Yes. You can rule out some values of j immediately, since the only
> valid j-digit blocks must start with digit j. That's why I chose my
> sequence to recover a prefix ending in 54321: there are then at least
> 5 valid values for j.
>
> Technically there is one more possibility: that the largest overlap
> that you see between the end of LC(L) and the start of L is actually
> just a coincidence. No finite amount of data can refute this.
>
>
> As the number of truncated digits increases, the number of
> combinations of possible original messages increases exponentially.
>
> Example: Suppose you lose the first 300 digits of a sufficiently long
> message, being (say) the first 65 blocks. The remaining blocks then
> give you 235 of those lost digits, and enough overlap with the rest of
> the message to be confident it's not just coincidence. For each of
> the remaining 65 digits, you probably average somewhere between 1.5 to
> 1.8 choices based on where you can choose block boundaries. That's at
> least tens of billions of choices in the tree, maybe as many as
> quadrillions.

Ok. Let me then try to isolate the problem further and call these fixed point
sequences "ambiguous". In other words, an "ambiguous" fixed point is a sequence
which can recover "nested" bottleneck blocks. A "nested" bottleneck block is a
block which has a form similar to 54321, but not only this form: For example,
all the blocks below are "nested":

987697777
59821
4731
4399
329

In other words, a nested bottleneck block is a block for which if I count from
left to right, there are at least two bottleneck digits which match its
positions within the larger bottleneck block. On the first example there are 3:
7,8,9. On the second example there are 2: 1 and 5. On the third: 1 and 4. On the
fourth: 3 and 4 and on the fifth: 2 and 3.

Let me now reverse the question in hopes of getting disjointness, so we can
perhaps figure out a necessary and sufficient condition for a fixed point to
predict a unique sequence:

Can you perhaps find a counter-example for the following?

"A fixed point which predicts no nested bottleneck blocks, predicts a unique
sequence" or:
"A non-ambiguous fixed point, predicts a unique sequence".

Thanks,

> - Tim
--
Ioannis

From: I.N. Galidakis on
I.N. Galidakis wrote:
[snip]

> 987697777
> 59821
> 4731
> 4399
> 329
>
> In other words, a nested bottleneck block is a block for which if I count from
> left to right,

Sorry, typo. Make that: "from right to left".

> there are at least two bottleneck digits which match its
> positions within the larger bottleneck block. On the first example there are
> 3: 7,8,9.

Typo #2: Make that 4: 6,7,8,9.

[snip]

--
Ioannis

From: Tim Little on
On 2009-09-05, I.N. Galidakis <morpheus(a)olympus.mons> wrote:
> Can you perhaps find a counter-example for the following?
>
> "A fixed point which predicts no nested bottleneck blocks, predicts a unique
> sequence" or:
> "A non-ambiguous fixed point, predicts a unique sequence".

Both are true, and there do exist fixed points having this property.
It's not quite so easy to encode arbitrary data into such sequences,
but it is possible.


Of course, it is also possible (and much easier) to remove all
ambiguity by choosing fixed-length blocks instead of having the first
digit specify their length. E.g. with blocks of length 5, encoding
the sequence 635841021863871721799207213279185259441917332264:

6358
6 4102
3 1863
5 8717
8 2179
6 9207
4 2132
1 7918
0 5259
2 4419
3 1733
1 2264.

Given sufficent length, reconstruction from a sequence truncated at a
block boundary is trivial and unambiguous. But perhaps boring.


- Tim
From: I.N. Galidakis on
Tim Little wrote:

> On 2009-09-05, I.N. Galidakis <morpheus(a)olympus.mons> wrote:
>> Can you perhaps find a counter-example for the following?
>>
>> "A fixed point which predicts no nested bottleneck blocks, predicts a unique
>> sequence" or:
>> "A non-ambiguous fixed point, predicts a unique sequence".
>
> Both are true, and there do exist fixed points having this property.
> It's not quite so easy to encode arbitrary data into such sequences,
> but it is possible.
>
>
> Of course, it is also possible (and much easier) to remove all
> ambiguity by choosing fixed-length blocks instead of having the first
> digit specify their length. E.g. with blocks of length 5, encoding
> the sequence 635841021863871721799207213279185259441917332264:
>
> 6358
> 6 4102
> 3 1863
> 5 8717
> 8 2179
> 6 9207
> 4 2132
> 1 7918
> 0 5259
> 2 4419
> 3 1733
> 1 2264.
>
> Given sufficent length, reconstruction from a sequence truncated at a
> block boundary is trivial and unambiguous. But perhaps boring.

Terrific. Many thanks. I think you've debugged the scheme completely. I will now
slowly start updating my web page with the relevant details, citing your name.

> - Tim
--
Ioannis