MOS6526 CIA detailed test vectors and models

There are 44 replies in this thread, which has been viewed 13,053 times. The last post (12 January 2021, 00:20) is by merlintwa.

  • Hello all,

    I'm collecting detailed test vectors from a few MOS6526s I have lying around, using an Arduino to stimulate the pins. My goal is to create Verilog and C models that are as accurate as possible, which can then be refined further using other information such as schematics. A good reference of cycle-by-cycle pin states is very useful for that.

    I'd like to hear from any of you if you have some other links or details to share, about existing test vectors or of the internal working of the CIAs. I've put details I've collected so far down below.

    The tests stimulate the IC cycle by cycle and observe it after each cycle. A readReg(address, cycle) and a writeReg(address, data) function do most of the work. With the cycle parameter you can choose to cycle PHI2 low once or keep it high.

    A very nice trick this IC allows: while you keep PHI2 high, you can read the internal registers combinatorially without advancing any internal state machines (at least that seems to be the case so far). So with some simple Arduino code and some pin toggling you can get a lot of insight into the visible registers after each cycle.
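
    A minimal C sketch of how two such helpers could be structured, assuming a hypothetical pin-access layer. The `cia_pins` callbacks, the `stub_chip` stand-in and the exact order of pin operations are illustrative assumptions, not the actual Arduino code; in particular, whether the optional PHI2 pulse comes before or after sampling the data bus is a guess.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pin-level interface. On an Arduino these callbacks would wrap
   digitalWrite()/digitalRead(); here they can equally drive a software model. */
typedef struct cia_pins {
    void    (*set_phi2)(void *ctx, int level);
    void    (*drive)(void *ctx, uint8_t addr, int csn, int rwn, uint8_t data);
    uint8_t (*sample_data)(void *ctx);
    void    *ctx;
} cia_pins;

/* Read a register with PHI2 high. cycle != 0: pulse PHI2 low once afterwards,
   so the chip advances one clock; cycle == 0: PHI2 stays high. */
uint8_t read_reg(const cia_pins *p, uint8_t addr, int cycle)
{
    assert(addr < 16);                       /* the 6526 has 16 registers */
    p->set_phi2(p->ctx, 1);
    p->drive(p->ctx, addr, 0, 1, 0);         /* CSN=0, RWN=1: read */
    uint8_t value = p->sample_data(p->ctx);
    if (cycle)
        p->set_phi2(p->ctx, 0);              /* falling edge, read still selected */
    p->drive(p->ctx, addr, 1, 1, 0);         /* deselect */
    p->set_phi2(p->ctx, 1);
    return value;
}

/* Write a register: present the data while PHI2 is high, then clock it in. */
void write_reg(const cia_pins *p, uint8_t addr, uint8_t data)
{
    assert(addr < 16);
    p->set_phi2(p->ctx, 1);
    p->drive(p->ctx, addr, 0, 0, data);      /* CSN=0, RWN=0: write */
    p->set_phi2(p->ctx, 0);
    p->drive(p->ctx, addr, 1, 1, 0);
    p->set_phi2(p->ctx, 1);
}

/* Stand-in "chip" so the helpers can be exercised without hardware:
   16 plain registers, written on the PHI2 falling edge. Not a CIA model. */
typedef struct stub_chip {
    uint8_t  regs[16];
    uint8_t  addr, data;
    int      csn, rwn, phi2;
    unsigned cycles;                         /* PHI2 falling edges seen */
} stub_chip;

static void stub_set_phi2(void *ctx, int level)
{
    stub_chip *c = (stub_chip *)ctx;
    if (c->phi2 && !level) {
        c->cycles++;
        if (!c->csn && !c->rwn)
            c->regs[c->addr] = c->data;
    }
    c->phi2 = level;
}

static void stub_drive(void *ctx, uint8_t addr, int csn, int rwn, uint8_t data)
{
    stub_chip *c = (stub_chip *)ctx;
    c->addr = addr & 0x0F; c->csn = csn; c->rwn = rwn; c->data = data;
}

static uint8_t stub_sample(void *ctx)
{
    const stub_chip *c = (const stub_chip *)ctx;
    return c->regs[c->addr];
}

/* Reset the stand-in chip and return a pin interface bound to it. */
cia_pins stub_attach(stub_chip *c)
{
    *c = (stub_chip){0};
    c->csn = 1; c->rwn = 1; c->phi2 = 1;
    cia_pins p = { stub_set_phi2, stub_drive, stub_sample, c };
    return p;
}
```

    With this split, the same test sequence can be replayed against real silicon (callbacks toggling pins) and against a model (callbacks calling into it), which makes the two easy to compare cycle by cycle.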

    So far, I'm quite happy with my progress. But I am now getting to a phase where I'd like to ask if anyone knows of any existing test vector sets, at a lower level than the C64 assembler level tests I've seen in VICE.

    References and details I have been able to find myself so far:

    - VICE has a C model, but it seems too emulator-centric for me to follow exactly.

    - In this forum AndroSID has posted several die shots, but I'm not really sure how to continue from those.

    - I've found 2 different synthesizable models: from the MIST project and from the c65gs project

    The VHDL model seems to have some nice additions, but its synthesized size is around twice what I'd expect. I've not checked the MIST model in detail for synthesis.

    My own cycle-level measurements suggest some differences from both models.

    I've also built my own Verilog model, which differs from both models, but it isn't presentable yet. I've been able to use Xilinx ISE 10.1 to synthesize it for a 5V-tolerant part, the old Spartan II XC2S30; it needs around 280 flip-flops. I've also made the Alarm and TOD latch registers optional for synthesis, but this doesn't help enough to fit it into a smaller part.

    Some low-level bugs/features I've noticed in my tests, not sure how much of this is well-known:

    // Conclusions so far:

    // Output mux from internal latches is just combinatorial, PHI2=1, CSN=0, RWN=1.

    // PA/PB inputs are latched on posedge of PHI2 and can then be read at negedge. Observable as a 1/2 cycle input delay.

    // PA/PBout are latches and can be written multiple times during one PHI2=1 phase. However, outputs only change on the next PHI2 negedge. So PAout and PBout are each 2 latches: you read the first latch back, the 2nd latch goes to the output.

    // Bit 7 (IRQ) of ICR is set one cycle later than the timer IRQ bit is set, and is also cleared one cycle later.

    // Reading ICR seems to reset the ICR the next cycle.

    // Writing ICR seems to also clear the ICR, 2 cycles later, in my 6526R4. (without any read)

    // IRQN output pin changes 0->1 and 1->0 on the negedge of phi2. IRQN output is made inactive immediately when reading ICR.

    // When TA generates an IRQ every cycle (TAL=TAH=0, start=1), a read from ICR will set IRQN high (inactive) for one cycle. But bit 7 of ICR stays high (active).

    // Test mode: A on phi2, B one-shot on A. When A reaches underflow:

    // tal: 01 05 05 04 //

    // tbl: 00 00 02 02 // one-shot reloads, clears start bit in crb.

    // icr: 18 19 9B 9B // B irq one cycle later than A.

    // crb: 49 49 48 48 // start bit is cleared

    // Bit 0 of CRA/CRB is cleared when in one-shot mode on the

    // Timers reload view must be due to reading back a latch halfway through processing.

    // Latch-based counter: can't be built from one latch. So 2 latches are used with a decrement in between. What is read is the PHI2=1 latch; the timer output is then active in PHI2=0.

    // phi2=0

    // phi2=1 latch muxes ta_l, underflow

    // Toggle on PB and CNT in has no immediate same-cycle effect, PB unchanged and TAL unchanged. Both are re-synced.

    // Toggle on PB5 during PHI2=1, has no effect on next PB scan.

    // Toggle on PB5 during PHI2=0, before scan with PHI2=1, has direct effect.
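
    The two-latch port output described in the notes above can be sketched in C as follows. This is only a behavioural illustration of that one observation; the struct and function names are made up here, and the input path and data direction registers are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Two-stage port output, following the note above: the first latch takes CPU
   writes and is what reads back; the second latch drives the pins. */
typedef struct port_out {
    uint8_t first;   /* written while PHI2=1, may be overwritten several times */
    uint8_t second;  /* updated only on the PHI2 falling edge */
} port_out;

/* CPU write during PHI2=1: only the first latch changes. */
void port_write(port_out *p, uint8_t value) { p->first = value; }

/* Read-back returns the first latch, not the pin state. */
uint8_t port_readback(const port_out *p) { return p->first; }

/* Value currently driven on the output pins. */
uint8_t port_pins(const port_out *p) { return p->second; }

/* PHI2 negedge: the last value written becomes visible on the pins. */
void port_phi2_negedge(port_out *p) { p->second = p->first; }
```

    Writing twice within one PHI2=1 phase and then clocking once leaves only the last value on the pins, while a read-back before the clock edge already returns it.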

  • I have also found some other related links to share:

    A software view for emulation: [link: login required]. I'm not sure how accurate the schematics drawn there are from a hardware point of view.

    One thing I have observed: when timer A is in CNT input mode and you pulse the CNT input (0->1->0) while keeping PHI2 high, that is counted as a rising edge of CNT. So I think the CNT input is indeed connected to the Set input of a latch, which is subsequently fed to a second latch. I see the same behaviour on the SDR input clocking.
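
    A small C sketch of that hypothesis: a set-only first latch on CNT, copied into a second latch at a PHI2 edge. Which PHI2 edge does the transfer, and the re-arming rule, are assumptions made for illustration; all names are hypothetical.

```c
#include <assert.h>

typedef struct cnt_capture {
    int level;      /* current level on the CNT pin */
    int captured;   /* first latch: set by any high level on CNT */
    int synced;     /* second latch: copy of the first, taken at the PHI2 edge */
    unsigned edges; /* rising edges counted so far */
} cnt_capture;

/* Drive the CNT pin to a new level at any time within a PHI2 phase. */
void cnt_drive(cnt_capture *c, int level)
{
    c->level = level ? 1 : 0;
    if (c->level)
        c->captured = 1;          /* set-only: a later low level does not clear it */
}

/* PHI2 edge (assumed): transfer to the second latch and detect a 0->1 change. */
void cnt_phi2_edge(cnt_capture *c)
{
    int previous = c->synced;
    c->synced = c->captured;
    if (c->synced && !previous)
        c->edges++;
    c->captured = c->level;       /* re-arm; stays set while CNT is still high */
}
```

    In this sketch a pulse that fits entirely within one PHI2 phase is still counted, and several pulses within the same phase collapse into a single count.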

    The C64 assembly testbenches I found are:

    Wolfgang Lorenz's C64 test suite has some CIA tests; I've not yet found a way to run them against just a model. That probably needs a 6502 and some RAM modelled next to the 6526.

    Programs CIA1TB123 and CIA2TB123 - CIA timer B 1-3 cycles after writing CRB

    Programs CIA1TA to CIA2TB - CIA timers in sysclock mode

    The VICE CIA testbenches: [link: login required]

  • I'd like to hear from any of you if you have some other links or details to share, about existing test vectors or of the internal working of the CIAs.

    Well, androSID has been working on this for years including decapping the chips.

    See this thread here: [link: login required]

  • Yes, I'd seen the die shots androSID has posted, they look great, but I'm not sure how to proceed from the die shots.

    I would hope there is some tool to turn die shots into polygon files (basically one Gerber file per layer), which can then be fed into some automated transistor schematic generation tool.

    When I was an IC designer, we used LVS (Layout-Versus-Schematic) tools that could automatically extract transistor netlists from Gerber files and compare them to the Verilog netlist we were trying to implement. These are highly specific and very expensive tools (millions of dollars per year), though. But I'd imagine making some extraction netlist tool could be automated.

  • Well, androSID has been working on this for years including decapping the chips.

    Well.... not really. ;) At least not currently, as I'm too busy with other, more interesting things. But I haven't abandoned the project and will resume work when the netlist generator is finished; see below.

    Reversing just from die shots is too cumbersome IMHO. I did it a few times (e.g. 6581, 8501 etc.), but it can take ages for even simple chips.

    But I'd imagine making some extraction netlist tool could be automated.

    There are some free tools to do that and I'm currently reinventing the wheel... ok... just joking: I'm actively working on a solution to turn my .SVG files (drawn with Inkscape) into netlists, as the free tools do not match my workflow.

  • Hi androSID, which tools are you re-inventing?

    I've not seen much available. The process I found online here ([link: login required]) seems indeed to be roughly how we worked when I was an IC designer. You set some basic technology parameters, specify VDD and GND, add the labels for every IO pad, and then let the tool do its job. After that, another tool could find cells in the netlist, as long as you provided a set of cells to detect (inverter, latch and so on).

    I'd imagine the test vectors I'm looking for and building, would be quite useful to check if the extracted netlist does indeed match the silicon.

    Cycle-level test vectors are also well suited to SPICE-level simulation.
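
    As a rough illustration of such a check, here is a C sketch that compares recorded per-cycle register values against the output of a model or simulated netlist and reports the first cycle where they differ. The choice of registers in `cia_sample` and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One observation per PHI2 cycle; the register selection is illustrative. */
typedef struct cia_sample {
    uint8_t tal, tbl, icr, crb;
} cia_sample;

/* Return the index of the first cycle where model and silicon disagree,
   or -1 if all cycles match. */
long first_mismatch(const cia_sample *silicon, const cia_sample *model,
                    size_t cycles)
{
    for (size_t i = 0; i < cycles; i++) {
        if (silicon[i].tal != model[i].tal || silicon[i].tbl != model[i].tbl ||
            silicon[i].icr != model[i].icr || silicon[i].crb != model[i].crb)
            return (long)i;
    }
    return -1;
}
```

    Reporting the first differing cycle rather than a plain pass/fail makes it easier to see which internal event (reload, interrupt flag, start bit) the model gets wrong.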

    What is the part of the work with these free tools you don't like?

    From professional use I've found that all the professional CAD tools really are extremely buggy.

    This doesn't stop people using them, as making your own tools is just too much work and introduces even more bugs. At least with existing tools, you have a set of mostly known bugs and workarounds....

    Would you be interested in some SVG-to-other converter?

  • What is the part of the work with these free tools you don't like?

    ......

    Would you be interested in some SVG-to-other converter?

    I cannot recall ATM which limitations I didn't like, but some tools didn't work with .svg files and that's what I use in my workflow.

    Most of the professional tools I tried have *many* more features than I really need for the rather simple MOS/CSG chips. In fact I had to spend more time figuring out stuff like process parameters and W/L dimension settings just to get some simple gates to be detected correctly.

    Therefore I started programming a .svg to netlist converter (75% finished) which produces simple(!)* transistor netlists which I can then feed into my netlist simulator (already working but not very configurable; 95% finished).

    I also used some professional tools (which I rented due to the extremely high price) to convert several original MOS/CSG GDS-II files (e.g. MOS6522 or CIA variant CSG8520R4...) into netlists... but I haven't fed the lists into my simulation yet. That means I don't know if they are correct (yet). :)

    The last tool I want to redo is something like the well-known tools from Peter Monta. In other words, converting the netlists into either synthesizable or (in a first step) simulatable Verilog code (due to the use of *really* bi-directional transmission gates).**

    *No W/L extraction; just transistors+nodes to be used for switch-level simulation.

    ** The transmission gates are used bi-directionally. It's not always clear which side is source and which side is sink. Therefore simulation would be the first step and, if I lose motivation, the last one. LOL

  • PS: Just to make it clear: I do not suffer from the common "not invented here syndrome".... :)

    I'm actually too lazy to re-do everything. I just didn't find the right tools that I could use without spending too much time working around limitations imposed by my already produced works, or finding out things I could not find out easily without trial-and-error (which I detest).

  • Hello,

    The latest MEGA65 CIA is at [link: login required], but it is still incomplete.

    You should talk to Gideon, as he has a much more accurate CIA implementation than we have.

    Yes, CNT can count at up to about 16 MHz on real CIAs, as has been hinted at above.

    Paul.

  • Yes as far as I can remember, we used two different types of tools for layout extraction.

    One type is just for LVS, that would be closest to what is needed here, as all wires and capacitances are ignored. You do get the W/L information - which is just the polygon dimension of the actual gate of the transistor - which is needed in some cases like to know if a transistor is used as a weak pull-up or pull-down rather than as a driver.

    This extraction was fast if I recall. A few minutes I remember. I think Magic and other free EDA tools will have versions of that.

    The resulting LVS transistor SPICE netlist can be converted directly, 1:1, into a switch-level Verilog netlist for digital simulation, using the pmos and nmos Verilog primitives. But mostly we would use some tool to assemble inverters, latches etc. into higher-level cells.
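
    A minimal C sketch of that 1:1 conversion for a single SPICE MOSFET card, assuming the usual `Mname drain gate source bulk model` field order and a model name that starts with N or P. The function name is hypothetical; the bulk terminal and any W/L parameters are simply dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Convert one SPICE MOSFET card "Mname drain gate source bulk model ..." into
   a Verilog switch-level primitive instance. Returns 0 on success, -1 if the
   line is not a recognisable MOSFET card or the output buffer is too small. */
int spice_mos_to_verilog(const char *card, char *out, size_t out_size)
{
    char name[32], d[32], g[32], s[32], b[32], model[32];
    const char *prim;
    int n;

    if (sscanf(card, "%31s %31s %31s %31s %31s %31s",
               name, d, g, s, b, model) != 6)
        return -1;
    if (name[0] != 'M' && name[0] != 'm')
        return -1;

    /* Assumption: the model name starts with N or P (e.g. NMOS, PMOS). */
    if (model[0] == 'N' || model[0] == 'n')
        prim = "nmos";
    else if (model[0] == 'P' || model[0] == 'p')
        prim = "pmos";
    else
        return -1;

    /* Verilog terminal order is (output, input, control): drain, source, gate. */
    n = snprintf(out, out_size, "%s %s (%s, %s, %s);", prim, name, d, s, g);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

    The nmos and pmos primitives are unidirectional; for pass transistors that conduct in both directions, Verilog's `tranif1` and `tranif0` primitives would be the bidirectional counterparts.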

    Then you had the full parasitic extraction, which was then used to create static timing analysis. That could take many hours and is highly dependent on technology files, with full resistance and capacitance of all wires. You certainly don't need that detail.

    We would use verilog netlists for checks on IC scan chains to verify the automatically created production level test vectors work correctly. You can annotate the verilog with timing information, but that is not necessary for things like a test vector check. I think the available open source verilog simulators are fast enough for that, for sure, especially for these designs.

  • ... also, if you do it in VHDL instead of verilog, we'd love to put an improved CIA in the MEGA65 :)

    Paul.

    Verilog & VHDL are like the indenting wars ... 2, 4 or 8 characters.

    I personally don't like VHDL for the same reasons I don't like Java: its extreme verbosity makes it very hard to check whether a piece of code actually matches the intention.

    So for me it's Verilog all the way. Once they introduced the always @* construct, I've never looked back.

    Nevertheless, of course I've had to do a lot of VHDL coding too. In the MIST example I included above, there is a reference to a debugging item in the Work library, used in simulation, called to_hstring. The way that has to be implemented in VHDL looks absolutely nothing like the intention of the function, and whatever I did, I could not convince Xilinx to skip over this function for synthesis. In the end I resorted to manually commenting out all the report statements.

  • The latest MEGA65 CIA is at [link: login required], but it is still incomplete.

    You should talk to Gideon, as he has a much more accurate CIA implementation than we have.

    Yes, CNT can count at up to about 16 MHz on real CIAs, as has been hinted at above.

    Here is the direct link: [link: login required]

    Hmm, I have not seen CNT having an effect twice in one PHI2=1 phase. When I toggled it 0->1->0->1 four times while keeping PHI2=1, it still registered as one count in my test.

    I'd be surprised if CNT worked at 16 MHz while PHI2 runs at 1 MHz. Is that really what you're saying? It could be a mistake in my tests then.

  • Maybe this can spark some inspiration. It did for me and my VIC-II die shot projects, but the VIC-II die shots take forever to vectorize. The metal layer needs to be cleaned away.

    Anyway, this video shows a C++ program simulating the Z80 from vectorized layers. If you know a little C, you can see the routing, charge sharing and measurement test points hard-coded.

    Ps. I made a test cart for my VHDL6526 project. I might share it soon when I have finished the PCB.

  • Ps. I made a test cart for my VHDL6526 project. I might share it soon when I have finished the PCB.

    Cool video! It works out the connections from a couple of PNGs, nice.

    It would be great if you are able to share the tests when ready.

  • There's already a Project going on at Lemon Forum.

    Search for JCIA. Prototypes are already in beta-testing phase.

  • There's already a Project going on at Lemon Forum.

    Search for JCIA. Prototypes are already in beta-testing phase.

    I've seen some information on the J-CIA here: [link: login required]

    But as the IC markings are removed, I doubt very much there will be any open hardware coming out of that.

    I've seen a guess on forum64 that the ICs might be an FPGA and two level-shifter ICs, but who knows. The middle IC looks like a QFN44 package to me; I'm not sure if any FPGAs come in that format. Some MCUs like the ATmega32U4 do come in QFN44 though.

    I'd be hoping for a 5V-tolerant FPGA solution or an MCU-based solution using a C model and no further parts, to keep cost down. Some less used or unused features might be missing in the end.

  • Yes, my understanding is that the CNT pin can be clocked faster than PHI2. I can't remember if I ever did it, though. I think I looked at using it for 64NET for speeding up loading, by allowing the PC to use the shift register to push data over faster than via the parallel interface, but my recollection was that PCs at the time couldn't write to the ISA-connected printer port fast enough to make it beneficial.

    Now, as for making a CIA replacement chip, I'd use a cheap Lattice FPGA/CPLD part, some of which I think are already 5V tolerant, and add the necessary level converters. I'd also be tempted to implement the level converters using transistors rather than buffer chips for the IO lines, so that the full behaviour of the pins can be properly implemented. While you need 2 transistors for every IO pin, they are little and could be placed on both sides of the PCB.

    Regards,

    Paul.

  • Hi Paul, thanks for your reply!

    Yes, my understanding is that the CNT pin can be clocked faster than PHI2. I can't remember if I ever did it, though. I think I looked at using it for 64NET for speeding up loading, by allowing the PC to use the shift register to push data over faster than via the parallel interface, but my recollection was that PCs at the time couldn't write to the ISA-connected printer port fast enough to make it beneficial.

    Now, as for making a CIA replacement chip, I'd use a cheap Lattice FPGA/CPLD part, some of which I think are already 5V tolerant, and add the necessary level converters. I'd also be tempted to implement the level converters using transistors rather than buffer chips for the IO lines, so that the full behaviour of the pins can be properly implemented. While you need 2 transistors for every IO pin, they are little and could be placed on both sides of the PCB.

    I actually built a high-speed PC interface back in the day, based on FCOPY-III, called FCOPY-PC. I remember the serial approach was not working well for me at the time; instead I used the 2 joystick ports to build two 4-bit one-directional connections, each with handshake pins. This left the userport free for the 1541 drive. I clearly remember that at the time, in the 486/Pentium days, PCs could no longer run real-time stuff even under MS-DOS, so they needed bi-directional full handshaking. It had to do with the caches of the 486/Pentiums being stalled whenever you did I/O and/or wrote to an area not in the cache, so it could just be chipset stalling issues.

    Anyway, I have tried in extensive tests to get my 6526 to run faster than 1 MHz on the CNT input. I've found the max speed is 16 PHI2 cycles for 1 SDR change, but not in a reliable way.

    What I have learnt from this:

    - You can toggle CNT 1->0->1 or 0->1->0 once, during either low or high phase of phi2. Any single pulse or edge will always be detected.

    - I can count Timer A at a max rate of phi2/2

    - The SDR input register while in input mode can change at a max rate of once per 16 cycles.

    What I conclude from this is that the CNT input is connected to an input S/R latch, which allows a change only to a value different from the value sampled before the last PHI2 negedge or posedge (unclear which edge). So Set and Reset are both allowed only when different from CNT_RR (2x delayed).

    I don't observe any anti-metastability type of delay here. It's all still 1 or 2 PHI2 phases.

    I imagine this circuit is meant as a kind of slow-input-edge detection to protect against noise. Maybe examining the silicon W/L will show that the input is a kind of Schmitt-trigger trick, where a change of CNT_RR changes the Vth sensitivity of this input latch circuit. That would make CNT very well protected against very slow edges from external circuits. Maybe the TOD input has similar protection; it would probably need it.

    On SDR, when I run at phi2/3 speed I find I can replicate the silicon perfectly by sampling the SP input 3 cycles after the CNT edge. I have set up the testbench to keep the SP pin low on all PHI2 stages except this 3rd clock, and it is taken as a 1. The opposite also works: keep SP at 1 for all cycles, then 0 just on the 3rd cycle, and it is seen as a zero.

    When I clock SDR in any faster than CNT three cycles low, three cycles high, I get unreliable state. Something is still working, but not reliably. This must mean the internal state is meant to advance no faster than every 3 cycles (which is 4 cycles of an external part, which could then occasionally be 3 cycles due to clock drift between systems).
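
    The delayed SP sampling described above could be modelled along these lines in C. This is a sketch under assumptions: the delay is counted as "edge seen in cycle n, SP sampled in cycle n+3", bits are shifted in MSB first, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* PHI2 cycles from the detected CNT edge to the SP sample (assumed convention). */
#define SP_SAMPLE_DELAY 3

typedef struct sdr_in {
    int     countdown;   /* cycles left until SP is sampled; 0 = idle */
    uint8_t shift;       /* bits collected so far, MSB first */
    int     bits;        /* number of bits collected in the current byte */
    uint8_t sdr;         /* last complete byte */
    int     byte_ready;  /* set once a full byte has been transferred */
} sdr_in;

/* Advance one PHI2 cycle.
   cnt_edge: a CNT rising edge was detected in this cycle.
   sp:       level on the SP pin during this cycle. */
void sdr_step(sdr_in *s, int cnt_edge, int sp)
{
    if (s->countdown > 0 && --s->countdown == 0) {
        s->shift = (uint8_t)((s->shift << 1) | (sp ? 1 : 0));
        if (++s->bits == 8) {
            s->sdr = s->shift;
            s->bits = 0;
            s->byte_ready = 1;
        }
    }
    if (cnt_edge)
        s->countdown = SP_SAMPLE_DELAY;
}
```

    Only the SP level in the sampling cycle matters in this model, which mirrors the testbench trick of holding SP at the opposite value in every other cycle.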

    I've tried looking for 5V-tolerant replacement parts, but I've not found many good options. Around 280 registers are required, which is more than the still-affordable CPLDs allow. The only part I've found so far is the Spartan II at maybe 6 USD.

    Using individual FETs is also what I had in mind, given there aren't many better 74xx parts with that many individual, separate output enables.

    Although I'll prototype without any FETs. An open-drain solution with pull-up should be fine, I hope. Maybe with a very short 10ns pull-high pulse followed by a regular weak pull-up phase.

    Do you have a specific reason why FETs should be preferred to open-drain with pull-up?

  • I think either FETs or open-drain with pullup would be fine. In fact, on reflection, I'd most likely go open-drain with pull-up, which is already how we do this on the IEC port etc on the MEGA65.

    Paul.