From: Linards Ticmanis on
Paul Schlyter wrote:

>> IIRC the S-JiffyDOS patch uses illegal opcodes to speed up GCR decoding.
>> (Jochen, are you reading here and can comment?)
>
> That can also be done without illegal opcodes. The Apple Pascal RWTS
> did it, and was able to read all sectors of a floppy on one revolution
> of the disk -- including GCR decoding on-the-fly.

But isn't Apple GCR a bit simpler than Commodore GCR? Apple's 16-sector
format stores three data bytes per four disk bytes (3:4), while Commodore
fits four data bytes into five disk bytes (4:5), IIRC.
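The two ratios can be sanity-checked with quick arithmetic. A sketch in Python (assuming 256-byte sectors, Apple's 6-and-2 encoding, and Commodore's 4-bit-to-5-bit GCR groups):

```python
# Rough arithmetic on GCR encoding density (256-byte sectors assumed).

SECTOR = 256  # data bytes per sector on both machines

# Apple 6-and-2 ("3:4"): each 8-bit disk nibble carries 6 data bits,
# so 256 bytes become 256 six-bit values plus 86 bytes of 2-bit leftovers.
apple_disk_bytes = 256 + 86            # 342 nibbles, ratio ~4:3

# Commodore GCR ("4:5"): every 4 data bits become a 5-bit group,
# so each data byte costs 10 disk bits.
cbm_disk_bytes = (SECTOR * 10) // 8    # 320 bytes, ratio exactly 5:4

print(apple_disk_bytes, cbm_disk_bytes)                    # 342 320
print(apple_disk_bytes / SECTOR, cbm_disk_bytes / SECTOR)  # ~1.336 1.25
```

So Commodore's scheme is denser on the disk, even though (as discussed below) Apple's is simpler to resynchronize after an error.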

--
Linards Ticmanis
From: Stephen Harris on
John Selck <gpjiweg(a)t-online.de> wrote:

> On 12.05.2006 at 04:21, Michael J. Mahon <mjmahon(a)aol.com> wrote:

> > And in any case, depending on the peculiarities of a particular chip
> > implementation is just asking to be locked out of future improvements.

> As I've stated several times now: You can easily do a processor check and
> use different code. On a plain 6502 you use the faster routine with illegals,
> and on a 65816 etc. you use a normal routine. It's five minutes' work to do that.

And now you've introduced program bloat, dual code paths, extra potential for
bugs...

--
Stephen Harris
usenet(a)spuddy.org
The truth is the truth, and opinion just opinion. But what is what?
My employer pays to ignore my opinions; you get to do it for free.
From: Knut Roll-Lund on
John Selck wrote:
> On 19.05.2006 at 22:34, Eric Smith <eric(a)brouhaha.com> wrote:
>
>> "John Selck" <gpjiweg(a)t-online.de> writes:
>>
>>> And then we would have ended up like the Z80: That a copy loop is faster
>>> than the actual block copy instruction :D
>>
>>
>> Can you cite an example? The block move instructions on the Z80 are
>> actually fairly efficient. They use three memory cycles to move a byte,
>> vs. the theoretical minimum of two. Doing a block copy via a software
>> loop is going to require at least five memory cycles per byte moved, and
>> probably more.
>
>
> LDIR eats 21 clock cycles per iteration; doing the same with other Z80
> instructions can be way faster.
>
> I tried to use the Z80 in the C128 for block copy/fill because I thought
> "hey, it has a block copy command, so I guess it is fast" but it wasn't.
> Then I tried normal opcodes, it was way faster than using LDIR but still
> not faster than copying the stuff with the 8502.
>
> Here's a list of Z80 instructions and their clock cycles:
>
> http://www.ftp83plus.net/Tutorials/z80inset1A.htm

I would like to point out that these are 21 Z80 clocks running at twice the
rate of a 6502's (I don't know if this is true for the C128, but a 1MHz 6502
is comparable to a 2MHz Z80, and the two competed side by side). If you are
going to move only a few bytes, it is more efficient to do so by load,
store, load, store, etc. -- less setup and no looping.

I think you are wrong if you are talking about more data. LDIR is the
most convenient and fastest way to copy a block of data with the Z80. I
think doing any form of loop is slower. (Even if you use the POP
instruction and stay within a page, you end up at 21.5 clocks per byte.)
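For concreteness, the published cycle counts can be lined up in a quick Python sketch (assuming the 2:1 clock normalization from the paragraph above; the T-state figures are the documented ones for LDIR and LDI, and the 6502 numbers are for a classic indexed copy loop):

```python
# Back-of-envelope comparison using documented cycle counts.

ldir_per_byte = 21          # Z80 LDIR: 21 T-states per byte (16 on the last)
ldi_unrolled_per_byte = 16  # Z80: a straight run of LDI instructions, no loop

# 6502 indexed copy loop:
#   LDA abs,X (4) + STA abs,X (5) + INX (2) + BNE taken (3)
loop_6502_per_byte = 4 + 5 + 2 + 3  # 14 cycles per byte

# Normalize a 2MHz Z80 against a 1MHz 6502 by halving the Z80 counts:
print(ldir_per_byte / 2)          # 10.5 "6502-equivalent" cycles per byte
print(ldi_unrolled_per_byte / 2)  # 8.0
print(loop_6502_per_byte)         # 14
```

By these counts LDIR beats a plain 6502 copy loop once clocks are normalized, while a fully unrolled LDI run undercuts LDIR itself; none of this accounts for the C128's particular Z80 bus timing, which is its own question.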

The Z80 is not a memory-oriented processor; its strength is having many
registers (for the mid/late 70s) and efficient instructions for using them.
The 6502 is the opposite: it has almost no registers but does memory access
more efficiently, and it ends up with fairly similar performance. The 6502
might well do a memory block move faster than the Z80.

To say something on this thread's topic: the Z80 also has
undocumented/illegal opcodes, and a few are convenient to use. They are
a result of the instruction decoding and work in all the real Z80s.
Most of the "illegal" instructions are just "advanced" NOPs that take
longer than a normal NOP, some with strange effects on the flags, though.
Getting these right is especially a topic for emulators...

Knut
From: Bruce Tomlin on
In article <446f7e4c$0$4512$9b4e6d93(a)newsread2.arcor-online.net>,
Linards Ticmanis <ticmanis(a)gmx.de> wrote:

> Paul Schlyter wrote:
>
> >> IIRC the S-JiffyDOS patch uses illegal opcodes to speed up GCR decoding.
> >> (Jochen, are you reading here and can comment?)
> >
> > That can also be done without illegal opcodes. The Apple Pascal RWTS
> > did it, and was able to read all sectors of a floppy on one revolution
> > of the disk -- including GCR decoding on-the-fly.
>
> But isn't Apple GCR a bit simpler than Commodore GCR? Apple's 16-sector
> format stores three data bytes per four disk bytes (3:4), while Commodore
> fits four data bytes into five disk bytes (4:5), IIRC.

I've tinkered with code to decode both using a Catweasel board. The
Apple GCR is more robust, but more complicated. It has extra codes used
for data marks (so there is no question where the data begins), it
encodes 5 or 6 bits in 8 for more redundancy (you are more likely to get
a bad nibble and know the sector is bad, with no need for a checksum),
and, more importantly, the encoding relies on every nibble starting with a
'1' bit, so after an error it has a good chance of getting back into
nibble sync (and you might get some useful data from a bad sector).

However, the bits are stored in a crazy order. After reading an Apple
GCR sector, you have to run the 5-bit or 6-bit decoded nibble data
through an unscrambler, which differs between the 13-sector and
16-sector encodings. The 13-sector decoder is a total mess in C.
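The 6-and-2 split itself (leaving aside the real nibble translation table and Apple's bit ordering for the 2-bit fragments, which is where the "crazy order" lives) can be sketched like this: 256 data bytes become 256 six-bit values plus 86 bytes holding the 2-bit leftovers, i.e. 342 disk nibbles.

```python
# Simplified sketch of the Apple 6-and-2 split. The real translation table
# and the actual bit ordering of the 2-bit fragments are omitted; this only
# shows why 256 data bytes turn into 342 disk nibbles.

def split_6_and_2(data):
    """Split 256 bytes into 256 six-bit values plus 86 bytes of 2-bit pairs."""
    assert len(data) == 256
    sixes = [b >> 2 for b in data]   # top 6 bits of each byte
    twos = [0] * 86                  # 86 * 3 = 258 two-bit slots (>= 256)
    for i, b in enumerate(data):
        twos[i % 86] |= (b & 0x03) << (2 * (i // 86))
    return sixes, twos               # 256 + 86 = 342 values

def join_6_and_2(sixes, twos):
    """Inverse of split_6_and_2: rebuild the 256 data bytes."""
    out = []
    for i in range(256):
        lo = (twos[i % 86] >> (2 * (i // 86))) & 0x03
        out.append((sixes[i] << 2) | lo)
    return bytes(out)

data = bytes(range(256))
sixes, twos = split_6_and_2(data)
assert len(sixes) + len(twos) == 342
assert join_6_and_2(sixes, twos) == data  # round-trips
```

Each of the 342 six-bit values then goes through the 64-entry nibble table so that every byte on disk starts with a '1' bit, which is what makes resynchronizing after an error feasible.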

Commodore's GCR relies on long strings of 1s to synchronize, so if there
is a glitch in the gap, you will think you see data, and by the time you
figure it out, you could have gone too far and missed the real sector
header. I had lots of problems keeping sync with the sector headers. I
can't remember exactly, but I know I had to do a lot of tweaking to
avoid false syncs. And nibbles can start with either a 0 or a 1, so if
you get lost, you will remain hopelessly out of sync with your data.
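The false-sync problem can be illustrated with a toy bit-scanner (a sketch; it assumes only the 1541 convention that a run of 10 or more '1' bits counts as sync, and nothing else about the hardware):

```python
# Sketch of Commodore-style sync detection on a raw bitstream: the 1541
# hardware flags SYNC after a run of 10 or more '1' bits. A glitch that
# produces a long run of 1s inside a gap looks exactly like real sync.

SYNC_RUN = 10  # consecutive 1 bits that count as a sync mark

def find_syncs(bits):
    """Return the index of the first data bit after each sync run."""
    syncs, run = [], 0
    for i, b in enumerate(bits):
        if b:
            run += 1
        else:
            if run >= SYNC_RUN:
                syncs.append(i)  # sync ends here; data starts at this bit
            run = 0
    return syncs

# 12 ones (real sync), some data, then a glitchy 10-one run that fakes one:
stream = [1] * 12 + [0, 1, 0, 1] + [1] * 10 + [0, 1, 1, 0]
print(find_syncs(stream))  # [12, 26]
```

Ten stray 1s in a gap are indistinguishable from a genuine sync mark here, which is exactly why hunting for sector headers needed so much tweaking.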
From: Michael J. Mahon on
Jim Brain wrote:
> Michael J. Mahon wrote:
>
>> Not so. The logical elements of the NMOS design are not present in
>> the same form in CMOS. And the instruction decoding was clearly
>> redesigned, as was the control section (changed timings and bus cycle
>> patterns, new instructions).
>
>
> I guess we'll agree to disagree on the point. I say all the design
> elements are there (the 8 bit ALU, decimal mode, the 16 bit instruction
> pointer, etc.)

I was referring to the precharge and transfer gates in the NMOS process.
Different gate designs are preferred in a CMOS process, and so the
circuits are different.

> As for the instruction decoding, Bill just cleaned up
> the don't cares, as I see it. Hardly a huge new design. The physical
> elements changed, yes, but the logical perspective stayed the same. When
> I interviewed him back in 1995, he said as much. He simply wanted to
> clean up the design and lay it out so he could take advantage of not
> only the CMOS process but the ability to shrink the feature size every
> so many years. He hasn't changed his design since the original. He
> just shrinks the design using the newer process and re-fabs. That's how
> he stated he got his speed increases.

The term "logical design" when applied to a processor means the
circuit diagram at the level of gates.

If the same circuit design is implemented in a different process,
the processor behavior will be identical, since it is the same
logical design (same signals, flows, timing, etc.).

If the same *architecture* is re-implemented with a new circuit
design, then only the *documented* behavior is likely to be
identical, since everything else is formally a "don't care".

>> That kind of behavior is of a completely different kind than random
>> bus clashes as multiple data sources are unintentionally gated onto
>> a bus!
>
> I'll agree they are different, but the point was that they are both
> undocumented behavior. Intel found the undocumented behavior was used
> all over, so they had to build in support for it. Therefore, every new
> Intel CPU supports this "undocumented behavior". Of course, it is now
> documented, so it becomes legitimate by virtue of so many people
> exploiting it that it became the std.

*All* undocumented behavior includes "well-behaved but undocumented",
"weird but predictable", and "weird and unpredictable" behavior.

The well-behaved stuff obeys the principle of least surprise,
and may be a candidate for documentation and full support.

The rest is *never* such a candidate. ;-)

And for a designer to be "forced" to support previously undocumented
behavior because it is "used all over" is *exactly* the point I've been
trying to make all along--this is *almost always* a bad thing.

The only exception I can think of is when a sensible design behaves
sensibly in some situations that the designer somehow neglected to
document, but should have. Of course, the only real problem then is
correcting the documentation, since any "reasonable" design will be
likely to support the behavior in any case.

Even in this case, the matter should be broached publicly and an
agreement reached about what part of the originally undocumented
behavior will become documented, and thus sanctioned by the designer
in any future implementations.

>> Long after the 65C02 design was available, lower power, faster, and
>> cheaper to make, the CBM line could not easily make use of it.
>
>
> If it had made economic sense, CBM would have tasked Mensch to add the
> errata into the C02. Apple got Bill to change the 816 timings, and CBM
> had more clout with Bill, since they originally helped set up WDC.

Possibly, but clout is usually dependent on the promise of orders,
not good feelings... ;-)

If WDC didn't get money from CBM, they wouldn't have much influence.

I'm guessing that CBM just kept turning the crank on their proprietary
6502 implementation(s), and were not motivated to pursue the 65C02.

> But, I'll concede the point that it would have required more cash and
> time than the Apple II line needed.
>
> I guess, in principle, I agree with you that undocs fly against the rules
> of programming. However, in the CBM environment, the rules are a bit
> different. As well, regardless of how one views the use of undocs, I'm
> not willing to be harsh on the MOS folks for what they did. They made a
> $20 CPU and got Woz and Bushnell interested in using CPUs, which
> brought all of us to where we are today. I won't let a few space- and
> time-saving details that made perfect sense in the early 1970s cloud that.

I don't fault them for what they did, either--I was just pointing out
that we don't do things that way anymore, and for good reason: we
found out by (sad) experience that programmers cannot be trusted
to "keep off the undocumented grass". ;-)

And I also agree that for any platform with a single, unchanging
implementation, as most retro platforms are today, the use of
stable undocumented behavior is not a practical problem.

My whole point is that it is a *very* bad idea for an evolving
platform, since it can seriously hamper evolution by causing
compatibility breaks.

-michael

Music synthesis for 8-bit Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design
tool--and it is seriously underused."