From: Paul Dunn on
Guga wrote:

> (Not sure if this is what you consider "halt")...I mean, if "halt"
> means hang, break, crash....then the program is not halting. It is
> merelly going on a loop.

As far as I can tell (and I'm no computer-science graduate) the "halting
problem" is something like this:

1. Retrieve input
2. perform calculation or algorithm on that input
3. Do something else

This has a problem - will the calculation in (2) ever return? If it does,
then the code can be disassembled easily. If not, then you don't stand much
chance of predicting if it will return, unless the algorithm/calculation is
trivial. If it gets even remotely complex, it becomes a very hard thing to
figure out.

So, by "emulating" the process, you cannot say that you will ever get to
discover if the code in (3) is ever executed. This is, as far as I can tell,
the "halting problem". Many people have said before that the best way to
discover if (3) is code or data is to follow the code. Of course, you may
end up in a loop in (2), and you have to follow that loop until it exits.
But how many iterations of that loop must you follow? It could end up in the
billions of loops, it may be just one or two.

Where do you put the "cut off" point? What if the loop is infinite (halted)?
How can you tell? If the loop requires some user-input to exit, then without
the precise input given to your "analyser" then you're not going to exit.
And neither is your analyser - it will loop round and round forever, and
will "halt" (never return) by itself. And if it does that, how can you be
certain that the bytes at (4) are code, or data?

I'm having the same problem in my BASIC interpreter when a user executes
some machine code. How do I determine when that code will return to BASIC?

It's a famous problem - yes, you can tell with reasonable certainty that a
given section of code will or will not loop. But you can't tell for *all*
code.

That's the way I see it, any rate.


From: Guga on
"If the loop requires some user-input to exit, then without the precise
input given to your "analyser" then you're not going to exit. And
neither is your analyser - it will loop round and round forever, and
will "halt" (never return) by itself. And if it does that, how can you
be certain that the bytes at (4) are code, or data? "

Ok, but.....If the program needs an input to get out the loop, it means
that this is not a endless loop´anyway. The specied routine where the
loop is happening, is of course happening a loop, but, if the exti
depends of some other commdn to get out, then the ending point of that
loop can be determined.

For example, if you have this:

mov eax
push Data001
L0:
cmp eax 01 | jne @ExitLoop
jmp L0<
@ExitLoop:
xor eax eax
mov eax 01


If we have a condition inside the loop or in anyother part of the code
that will result in access the Offset at @ExitLoop, then it is not an
endless loop that can´t be retrieved. I mean, the data below @ExitLoop
will already be referenced somewhere else in the code (IN case of the
example, with the cmp eax...but it can be several lines before the loop
happens as well. It doesn´t matter, since the Data After @ExitLoop is
being referenced in other part of the program

So, on this situation, if you have a reference there, then the
disassembler must properly identify that routines, regardless the
existance of the loop.

I was talking about no references at all after @ExitLoop.

Also, if you don´t consider an internal conditional jmp or any
internal pointer to that offset, and let´s say that offset is being
pointed by another app (In case if our example, is an dll), then the
@ExitLoop have an external reference defined in the header. Or if the
code is accessed on the same way as the classes in DX, MFC etc, it
should have a Table on the main app (The dll) displaying the Offsets to
be targeted. (At least as far as i can remember...there is an reference
to the pointers to that label anywhere in the file).

But, if we have an app (Dll for example) with those unreferenced
offsets that are accessed only by an external app, then the
unreferecend data of that dll (per se) are nothing more then a bunch of
DB _untill_ they are accessed by an external file that uses them. Then
after being accessed (or during the access) they are interpreted as
code or data because they are afterall being used.

"it will loop round and round forever, and will "halt" (never return)
by itself. And if it does that, how can you be
certain that the bytes at (4) are code, or data? "

Exactly, but if you can't tell when it will stop because the data is
unacessed, neither the processor can tell, untill the data is accessed
(externally on your example)

One way _maybe_ is you keep track of the external programs that where
reponsable to access that specific code externally. I don´t know..it
is just a conjecture (Don´t know if this is the proper term in
english...i mean, a "idea", a "thought").

Best Regards,

Guga

From: Charles A. Crayne on
On 3 Oct 2005 08:48:29 -0700
"randyhyde(a)earthlink.net" <randyhyde(a)earthlink.net> wrote:

:I think that Chuck Crayne has confused a lot of people by attempting to
:claim that these theorems only apply to infinite machines.

Although I do believe the above claim, I am not the originator of it -- I
am merely reporting the claims made by the reference which you provided:

"The undecidability of the halting problem relies on the fact that
algorithms are assumed to have potentially infinite storage: at any one
time they can only store finitely many things, but they can always store
more and they never run out of memory. However, computers that actually
exist are not equivalent to a Turing machine but instead to a linear
bounded automaton, as their memory and external storage of a machine is
limited. In this case, the halting problem for programs running on that
machine can be solved with a very simple general algorithm"

-- Chuck
From: Charles A. Crayne on
On 3 Oct 2005 09:19:27 -0700
"Guga" <mauroteste(a)hotmail.com> wrote:

:I mean, if the processor itself have no problems identifying this, why
:an disassembler (Assuming it to reproduce the same steps as what the
:processor should do) is not possible ?

On any given run of a program, the processor may execute only a few of the
possible paths through the code. The disassembler, on the other hand, must
analyze all possible paths. In some cases, this is easy, but in a few,
extreme cases, it may be very difficult.

-- Chuck
From: Paul Dunn on
pH wrote:
> On Mon, 03 Oct 2005 20:42:47 GMT, "Paul Dunn"
> <paul.dunn4(a)ntlworld.com> wrote:

[Snippage]

>> It's a famous problem - yes, you can tell with reasonable certainty
>> that a given section of code will or will not loop. But you can't
>> tell for *all* code.
>>
>> That's the way I see it, any rate.
>
> Congratulations! Perhaps you could offer a hand at explaining to
> the... uhh... "slower" people here? <g>

You mean I actually got that right?

Bloody hell, there's hope for me yet :-p