From: agila61 on
Tristan Mumford wrote:

> I don't think clocking is too important with I2C. Not completely sure how
> that works though.

It may be that it's just that the devices don't use I2C for anything high
speed, so people don't build parts that can use it at high speeds. But
AFAICU, pretty much anything the C64 could crank out could be handled,
so it's not a difference that makes a difference.

But I got the impression that I2C is not for block transfers, either
because it's slow or because it's less robust, so I never bothered
looking hard at it.

> That's true. I mean it's not too hard to use 16 bit data on an 8 bit bus,
> but reading and writing it can be a shade more fiddly.

What I was looking at was the slowdown imposed by the fiddling.

I kind of prototyped some block move routines, and the 8-bit IDE mode
(it's in the Compact Flash spec that it supports 8-bit IDE, because the
original target included things that might have 8-bit microcontrollers
running them) was noticeably quicker. It's like the following, and
you'll see that in the first (8-bit IDE) the PA2 overhead and control
register overhead are not in the loop, while in the second they are. And
it gets even worse if setting PA2 has to be protected against interrupts
(like I said, it's just a sketch).

Mind you, if that's the cheapest IDE interface, that's exactly what I
would have done! I was looking at bit-banging the lines left free by
Daniel Dallman's 9600bps serial port hack, which is really, really
slow, until I realized that writing the UserPort messes with the PC
handshake, so the lines are only free for input, and you need 3 Output
lines (MOSI, clock and select).

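SET_TOKN                ; A=Token, may be used in ,X or ,Y loop
        PHA             ; save token
        LDA UPA
        AND #PA2_0      ; PA2 low
        STA UPA
        LDA #CTRL_DDR   ; set port B direction for the control token
        STA PBDDR
        PLA
        PHA             ; keep a copy of the token for the sign test
        STA UPB         ; put token on port B
        LDA UPA
        ORA #PA2_1      ; PA2 high
        STA UPA
        PLA             ; restore token, N flag = bit 7
        BMI SET_TOK1    ; bit 7 set: port B becomes output
        LDA #0          ; bit 7 clear: port B all input
        STA PBDDR
        RTS
SET_TOK1
        LDA #$FF        ; port B all output
        STA PBDDR
        RTS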

GETBUF                  ; Called when data ready, A=buffer page
        STA W+1
        LDY #0
        STY W
        LDA #RD_IDE0
        JSR SET_TOKN    ; called once, outside the loop
        LDX #2          ; 2 pages = 512 bytes
;
GETBUF1
        LDA UPB         ; one byte per pass
        STA (W),Y
        INY
        BNE GETBUF1
        INC W+1         ; next page
        DEX
        BNE GETBUF1
        TXA             ; IOReturn 0=normal
        RTS

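Versus 16-bit:

GETBUF                  ; Called when data ready, A=buffer page
        STA W+1
        LDY #0
        STY W
        LDX #2          ; 2 pages = 512 bytes
;
GETBUF1
        LDA #RD_IDE0_HI
        JSR SET_TOKN    ; called 256 times
        LDA UPB
        INY
        STA (W),Y       ; high byte goes to the odd address
        LDA #RD_IDE0_LO
        JSR SET_TOKN    ; called 256 times
        LDA UPB
        DEY
        STA (W),Y       ; low byte goes to the even address
        INY
        INY             ; step to the next word
        BNE GETBUF1
        INC W+1         ; next page
        DEX
        BNE GETBUF1
        TXA             ; IOReturn 0=normal
        RTS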
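
To put a rough number on the slowdown, here is a back-of-envelope cycle count for the two sketches. It assumes standard NMOS 6502 timings, absolute addressing for the port registers, W in zero page, and read tokens with bit 7 clear (so the BMI in SET_TOKN is not taken). None of that is stated in the sketches, so treat the result as an estimate only.

```python
# Estimated 6502 cycles for the 8-bit and 16-bit GETBUF sketches.
# Assumptions: NMOS 6502 timings, absolute-addressed port registers,
# W in zero page, read tokens have bit 7 clear (BMI not taken).

JSR = 6

SET_TOKN_READ = (
    3 + 4 + 2 + 4    # PHA, load port A, AND #, store port A
    + 2 + 4          # LDA #CTRL_DDR, STA PBDDR
    + 4 + 3 + 4      # PLA, PHA, store token to port B
    + 4 + 2 + 4      # load port A, ORA #, store port A
    + 4 + 2          # PLA, BMI not taken
    + 2 + 4 + 6      # LDA #0, STA PBDDR, RTS
)

# 8-bit inner loop: LDA UPB, STA (W),Y, INY, BNE taken
CYCLES_PER_BYTE_8BIT = 4 + 6 + 2 + 3

# 16-bit inner loop, one pass moves one 16-bit word
CYCLES_PER_WORD_16BIT = (
    2 + JSR + SET_TOKN_READ + 4 + 2 + 6    # high byte: LDA #, JSR, LDA UPB, INY, STA
    + 2 + JSR + SET_TOKN_READ + 4 + 2 + 6  # low byte: LDA #, JSR, LDA UPB, DEY, STA
    + 2 + 2 + 3                            # INY, INY, BNE taken
)

SECTOR_BYTES = 512
sector_8bit = SECTOR_BYTES * CYCLES_PER_BYTE_8BIT
sector_16bit = (SECTOR_BYTES // 2) * CYCLES_PER_WORD_16BIT
slowdown = sector_16bit / sector_8bit

print(sector_8bit, sector_16bit, round(slowdown, 1))
```

Under those assumptions the 16-bit loop costs roughly five times as many cycles per sector, almost all of it in the two SET_TOKN calls per word.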

From: agila61 on
Tristan Mumford wrote:
> >> I guess what I'm saying is the dotclock is the xtal freq/2, right? Just
> >> asking because that's what I'm running the uC off (but not the USB IC).
> >> if I'm wrong I'm still within the frequency range of the uC, but it
> >> means my delay is wrong.

> > That I don't know. Is it the same crystal for PAL and NTSC? I would
> > assume that they would use the crystal that made the display work.

> Different crystals. I don't remember the values off the top of my head
> though. I can tell you the PAL version is slightly slower though. The end
> effect is that the NTSC ver. runs at a little over 1MHz and the PAL
> slightly under 1MHz.

I guess that makes sense ... PAL is higher resolution, but 50Hz, and
the C64 was designed with the video in mind first, and the rest of the
system following from that.

Offhand, the part I am most interested in is this one,

http://www.newmicros.com/cgi-bin/store/order.cgi?form=prod_detail&part=Tini2131

It supports RS-232C and 2 SPI ports, so it could have an SPI flash RAM
connected at the same time, and still have 8 general purpose I/O lines
available to connect directly to the User PortB and do fast block
transfers with either the serial port or the flash RAM. And SPI flash
RAM comes in sizes up to 16Mb.

High speed serial port and Megabytes of storage, fast ARM offboard
processor and a real time clock to boot. Not a bad upgrade for a $30
board.

From: Tristan Mumford on
On Mon, 11 Dec 2006 20:33:27 -0800, agila61 wrote:

> Tristan Mumford wrote:
>> >> I guess what I'm saying is the dotclock is the xtal freq/2, right? Just
>> >> asking because that's what I'm running the uC off (but not the USB IC).
>> >> if I'm wrong I'm still within the frequency range of the uC, but it
>> >> means my delay is wrong.
>
>> > That I don't know. Is it the same crystal for PAL and NTSC? I would
>> > assume that they would use the crystal that made the display work.
>
>> Different crystals. I don't remember the values off the top of my head
>> though. I can tell you the PAL version is slightly slower though. The end
>> effect is that the NTSC ver. runs at a little over 1MHz and the PAL
>> slightly under 1MHz.
>
> I guess that makes sense ... PAL is higher resolution, but 50Hz, and
> the C64 was designed with the video in mind first, and the rest of the
> system following from that.
>
> Offhand, the part I am most interested in is this one,
>
> http://www.newmicros.com/cgi-bin/store/order.cgi?form=prod_detail&part=Tini2131
>
> It supports RS-232C and 2 SPI ports, so it could have an SPI flash RAM
> connected at the same time, and still have 8 general purpose I/O lines
> available to connect directly to the User PortB and do fast block
> transfers with either the serial port or the flash RAM. And SPI flash
> RAM comes in sizes up to 16Mb.

That's pretty good. And it gives an easy method to play with the ARM
architecture.
>
> High speed serial port and Megabytes of storage, fast ARM offboard
> processor and a real time clock to boot. Not a bad upgrade for a $30
> board

That all reminds me. The uC I'm using does support SPI and I2C. It also
has interrupts for both.

Also speaking of which. The USB thing communicates with the C64 now. I can
get it past detecting to initialising and then it sort of jams. Still
working on it.


--
-----> http://members.dodo.com.au/~izabellion1/tristan/index.html <-----
===== It's not pretty, it's not great, but it is mine. =====

From: agila61 on
Tristan Mumford wrote:
> That all reminds me. The uC I'm using does support SPI and I2C. It also
> has interrupts for both.

I finally got around to browsing for I2C specifications.

The difference between 4 wire (SPI) and 2 wire (I2C) is that with 2
wire, both the clock line and the data line are defined to be
bi-directional on the bus. Multiple bosses, multiple workers, but only
one boss active at a time.

Boss runs the clock, so if the C64 is the bus boss, then the clock line
would be coming out of the C64, and the Serial line would flip between
Input and Output.

Also, because the clock line can change hands, specified data transfer
rates are required. The three rates are Standard (100kbps, or
12.5KBps), Fast (400kbps, or 50KBps), and High Speed (3.4Mbps, or
425KBps).
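
As a quick check on those figures, the sketch below converts each rate to bytes per second. The first number is the plain divide-by-8 used above. The second assumes 9 clock bits per byte (8 data bits plus the acknowledge bit that follows every I2C byte) and still ignores start, stop and address overhead, so it is an upper bound on payload rather than a measured figure.

```python
# I2C bus rates in bits per second, as listed above.
I2C_MODES_BPS = {
    "Standard": 100_000,
    "Fast": 400_000,
    "High Speed": 3_400_000,
}

def raw_bytes_per_sec(bps):
    """Plain bits/8 conversion, as used in the post."""
    return bps / 8

def payload_bytes_per_sec(bps):
    """Assumes 9 clocks per byte: 8 data bits plus 1 acknowledge bit."""
    return bps / 9

for mode, bps in I2C_MODES_BPS.items():
    print(mode, raw_bytes_per_sec(bps), round(payload_bytes_per_sec(bps)))
```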

I'm not sure, but I think the C64 can hit 100kbps pretty close with
synchronous serial.

That is, I think the CIA generates the synchronous serial clock line by
"countdown to 0, then transition the clock line". So from what I guess,
the work of the state machine is distributed between the start/restart
countdown and counter=0 states. It's like if you had the following in
software and then handed it a count of 0:

MOVEUP:
        DEY             ; count down first, so Y=0 wraps to $FF
        LDA (SOURCE),Y
        STA (TARGET),Y
        CPY #0          ; stop when the count reaches 0
        BNE MOVEUP
        RTS

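If you start your countdown at 0, it's like starting your countdown at
256. By the same logic, each transition costs the count plus 1 system
clocks, so even a count of 1 gives 2 clocks per edge. That's why
\_/ \_/ ... would take at least 4 system clocks per serial clock cycle.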
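
The wraparound is easy to see in a tiny simulation of the MOVEUP loop. This is only a model of the 8-bit Y register (the function name is made up for illustration), not of anything inside the CIA.

```python
def moveup_iterations(count):
    """Count the passes through MOVEUP for a given starting Y (8-bit)."""
    y = count & 0xFF
    passes = 0
    while True:
        y = (y - 1) & 0xFF   # DEY, wraps 0 -> 255
        passes += 1          # LDA (SOURCE),Y / STA (TARGET),Y
        if y == 0:           # CPY #0 / BNE MOVEUP falls through
            break
    return passes

print(moveup_iterations(0), moveup_iterations(1), moveup_iterations(255))
```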

If that's right (I **don't** guarantee it!), then
  count 1 = 250kbps (+/- discrepancy of system clock from 1MHz)
  count 2 = 166.7kbps (ditto)
  count 3 = 125kbps (ditto)
  count 4 = 100kbps (ditto)
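
The table follows from the guess above: if each clock edge takes (count + 1) system clocks, a full serial clock cycle takes 2 * (count + 1). The sketch below just reproduces that arithmetic with a nominal 1MHz system clock. It encodes the guessed model, not verified CIA behaviour, and the function name is invented for the example.

```python
def serial_kbps(count, system_hz=1_000_000):
    """Bit rate under the guessed model: 2 * (count + 1) system clocks per bit."""
    cycles_per_bit = 2 * (count + 1)
    return system_hz / cycles_per_bit / 1000

for count in (1, 2, 3, 4):
    print(count, round(serial_kbps(count), 1))
```

Passing the real PAL or NTSC clock as system_hz shows how far each rate drifts from the nominal figure.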

So for the 2 add'l lines you get maybe 2 times the bandwidth, plus the
option for a direct IRQ line alongside the SPI bus.

Here's why I think SPI running at PS1 speed is a real good fit. I got
to the point of working through various parallel port protocols based
on random logic, but with an ARM mini board at up to 60MHz and 16 GPIO
lines, you could just specify what you wanted. I wouldn't need to
solder, if I could find a wire wrap UserPort and a socket for the ARM
miniboard.

So, sleeping on it, with an ARM miniboard with 16 general purpose IO
lines, you could define a PPI as:
* DATA=d0-d7 (PB0-PB7), I/O data lines.
* BDC (PA2), Master direction control, Read/Write (If PA2 DDR bit is
  set to input, PA2 will show as 1, and it is assumed that Read is safer
  than Write).
* CLK (PC3), \_/ ... worker acquires data on _/, holds data for
  delivery until \_
* ALERT (FLAG), worker pulls \__ until status cleared.

Protocol:
This is as fast as possible for block transfers (Note1), so
Reset/Select/Control is by protocol. This relies on a Maximum Data
Packet size of 256 bytes.

Reset: Boss asserts Data_Read (BDC=1). Boss reads until any current
data input is exhausted. When data is exhausted, Worker inputs $0 and
ALERT=0.

Control: After Reset, Boss asserts Data_Write (BDC=0). Worker releases
Alert (ALERT=0). Worker is now in one-shot control mode. Control
address token is placed on DATA, and CLK cycled.

Data: Boss Asserts Data_Write (BDC=0) to write to control address,
Data_Read (BDC=1) to read from control address.

A maximum data packet of 256 and a set "data exhausted" means that you
test for ALERT outside a (W),Y inner loop. If you want to know right
away, set up FLAG so that the NMI gets through to the system, and
handle it with a routine on the NMI vector chain.

So, what is the raw throughput, abstracting from control overhead?

READPAGE:
        LDY #0          ; +2
RPG1:
        LDA (W),Y       ; +5
        STA CIA2PB      ; +4
        INY             ; +2
        BNE RPG1        ; +3, -1 on the final pass (branch not taken)
        RTS             ; +6, plus 6 for the JSR that called it
; overhead 2+6+6-1 = 13, loop 256*14 = 3584, total 3597.

And then, the anti-climax. 1MHz/3597 = 278 pages per second = 71KBps =
569kbps (there's a lot of rounding up here, effective bandwidth would
be lower). The raw bandwidth of the port is a lot more, but around
500kbps is the top speed for a parallel port burst mode using all the
hardware handshaking built in. You'd need a 65816 cartridge to take
advantage of it.
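
The arithmetic can be redone in a few lines. This assumes standard 6502 timings (6 cycles each for JSR and RTS, no page crossing on the indexed load) and a nominal 1MHz clock. It comes to the same 3597 cycles per page.

```python
# Cycles for one 256-byte page through the READPAGE loop.
LOOP_CYCLES = 5 + 4 + 2 + 3      # LDA (W),Y + STA CIA2PB + INY + BNE taken
OVERHEAD = 2 + 6 + 6 - 1         # LDY #0, JSR, RTS, final BNE not taken
PAGE_CYCLES = 256 * LOOP_CYCLES + OVERHEAD

def throughput(system_hz=1_000_000):
    """Return (pages/s, bytes/s, bits/s) for back-to-back page transfers."""
    pages = system_hz / PAGE_CYCLES
    bytes_per_sec = pages * 256
    return pages, bytes_per_sec, bytes_per_sec * 8

pages, bytes_per_sec, bits_per_sec = throughput()
print(PAGE_CYCLES, int(pages), round(bytes_per_sec / 1000), round(bits_per_sec / 1000))
```

Any per-page control overhead (checking ALERT, switching direction) comes off the top of these numbers.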

After realizing that, four independent devices with a maximum bandwidth
of 250kbps and an ability to directly copy from one to the other
started to look pretty good.

The 4-way SPI+IRQ interface could be a UserPort connector and 4 female
DB9 connectors, with common MOSI, MISO, CLK and separate SELECT and
IRQ. The only parts absolutely required are diodes or a hex OR to tie
the 4 IRQ lines to FLAG without cross interference.

And, BTW, with a UP/SPI to PS1 cord, made by cutting off the system
side of a PS1 controller extension cord, you get PS1 controllers or,
with a multitap, 4 controllers and 4 PS1 memory cards.

From: christianlott1 on
agila61(a)netscape.net wrote:
> And then, the anti-climax. 1MHz/3597 = 278 pages per second = 71KBps =
> 569kbps (there's a lot of rounding up here, effective bandwidth would
> be lower). The raw bandwidth of the port is a lot more, but around
> 500kbps is the top speed for a parallel port burst mode using all the
> hardware handshaking built in. You'd need a 65816 cartridge to take
> advantage of it.

71KBps seems plenty fast. You only have 64k in there anyway : ] (I
guess you're wanting to stream video or something...)

Would it be easier to stick 64k and a USB interface to the cart port?
Is it possible to just mirror the whole 64k from the cart port using
its 16 address lines?

Then you would have instant access to the ram without having to copy
bytes with the 6510. DMA. Then maybe you could multiplex and use the
PIC to control access to the ram banks, etc.. I'm sure it'd be too
expensive to add a 512mb slot (a joke, guys).

I don't know, I'm not a hw or even much of a sw guy. (I just got this
commodore, see....)

Sorry, I don't want to come off as a whiner. This is a fascinating
discussion. I know everybody and their dog has something stuck in their
cart port and since there's a zillion projects for it already, they
probably already did what I'm describing.