Virtual 32-bit register and Indexed Instructions

JesperGravgaard · 23 July 2020

Hi Team,

While looking at adding compiler support for 45GS02 I have been mapping the entire instruction set including the flat-memory access and virtual 32-bit register. Both additions to the instruction set seem very useful and it is nice to be able to combine them.

However, I do wonder if it ever makes sense to combine virtual 32-bit register and indexed modes, since the data in X/Y will both be used for indexing and as part of the virtual register. For instance ORQ ($12),Y will both use Y for indexing, but also as an argument to the OR instruction which very rarely makes sense.
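To make the conflict concrete, here is a minimal Python sketch of what ORQ ($12),Y would do if the hardware honoured the Y index. The function names and the flat 64K bytearray are assumptions made for illustration and are not part of any MEGA65 tooling. The sketch assumes Q is built from A (lowest byte), X, Y and Z (highest byte).

```python
def pack_q(a, x, y, z):
    """Combine four 8-bit registers into a 32-bit Q value (A is the low byte)."""
    return (a & 0xFF) | ((x & 0xFF) << 8) | ((y & 0xFF) << 16) | ((z & 0xFF) << 24)


def unpack_q(q):
    """Split a 32-bit Q value back into (A, X, Y, Z)."""
    return q & 0xFF, (q >> 8) & 0xFF, (q >> 16) & 0xFF, (q >> 24) & 0xFF


def orq_indirect_y(memory, zp, a, x, y, z):
    """Hypothetical model of ORQ (zp),Y with the index honoured.

    Y is used twice: as the index added to the pointer, and as bits 16-23 of Q.
    """
    base = memory[zp] | (memory[zp + 1] << 8)   # 16-bit pointer stored in zero page
    addr = (base + y) & 0xFFFF                  # Y acts as the index ...
    operand = int.from_bytes(bytes(memory[addr:addr + 4]), "little")
    return unpack_q(pack_q(a, x, y, z) | operand)  # ... and as part of the operand


memory = bytearray(0x10000)
memory[0x12], memory[0x13] = 0x00, 0x20          # pointer $2000 at $12/$13
memory[0x2004:0x2008] = bytes([0x01, 0x02, 0x03, 0x04])
a, x, y, z = orq_indirect_y(memory, 0x12, 0x10, 0x20, 0x04, 0x40)
```

In this model Y goes in as the index 4 and comes back changed by the OR, so a loop that steps Y through a table would lose its index after the first iteration.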

Are there any use cases where indexed addressing modes make good sense when combined with the virtual 32-bit register?

If not, have you considered only supporting the virtual 32-bit register for the addressing modes that do not include indexing? In practice that would mean not supporting the virtual 32-bit register for the modes zp,X; abs,X; abs,Y; (zp),Y; and (zp,X).

mega65 · 25 July 2020

Hello,

The indexed addressing modes ignore their index when used with the 32-bit pseudo register, or at least should do so. This is documented somewhere, and is in the process of being documented more fully in the MEGA65 Book. The indexed modes are left in, as it is easier to do so than to disable them. But they will probably be officially considered unintended operations, subject to changed behaviour at any time. So you can warn if people try to use them. Or even just forbid them, except for ($ZP),Z / [$ZP],Z, which should instead be rendered and parsed as ($ZP) / [$ZP]. That one is needed so that you can get 32-bit ZP pointers together with the Q pseudo register.

LG

Paul.

Mac Bacon · 26 July 2020

Quote from mega65

The indexed addressing modes ignore their index when used with the 32-bit pseudo register, or at least should do so.

What about ldq (zp, s), y and stq (zp, s), y? Do they become ldq (zp, s) and stq (zp, s)?

And what about the read-modify-write instructions, like LSRQ, ROLQ, INQ, etc.? I guess there the indexed addressing modes still work, right?

I'll update the ACME docs and sources realsoonnow...

mega65 · 26 July 2020

Arg! Corner cases!

We should work out what makes sense, and make sure that reality matches sensibility.

I think you are right that the RMW instructions DO allow the indexing. We need to make a test programme for this.

Dredging through the VHDL, it looks like ,Z is the only indexing that actually gets disabled. The rest remain enabled, and so should be used with extreme care. The ,SP),Y ones should probably just be ,SP), or alternatively ,SP),Q perhaps?

Also, while I think of it, some opcodes, like $0A = ASL accumulator mode, should be rendered as ASL Q, not ASLQ A, I think?

LG

Paul.

mega65 · 26 July 2020

Meanwhile, I also just implemented BITQ.

Documentation updates coming for all this...

LG

Paul.

Mac Bacon · 28 July 2020

Quote from mega65

Also, while I think of it, some opcodes, like $0A = ASL accumulator mode, should be rendered as ASL Q, not ASLQ A, I think?

ACME does not use the "accumulator addressing" syntax, so opcode $0a is written just as "ASL", not as "ASL A".

Therefore, I used "ASLQ" for the quad mode version.

Quote from mega65

Meanwhile, I also just implemented BITQ.

I just added ASRQ and BITQ to ACME, but you need to compile it yourself. I wouldn't want to make an official release for such a small change.

mega65 · 28 July 2020

Thanks! I'll make a note that ASLQ is required.

LG

Paul.

mega65 · 28 July 2020

... also just fixed the ROL Q and other RMW Q instructions to all be ROLQ etc...

LG

Paul.

JesperGravgaard · 29 July 2020

Thank you for the replies, Paul. Interesting corner cases you found, Mac Bacon.

Is this correctly understood as the updated opcodes/addressing modes for BITQ and ASRQ?

My original post suggested initially disabling the indexing options that do not make much sense. That will make it easier to extend the instruction set semantics for NEG NEG once the best way of defining sensible "extended" opcodes is identified. Of course you can always change the semantics later, but as long as the opcode exists I guess someone will find a way to use it in their code and be unhappy if the instructions are then removed or changed.

PS. One idea for utilizing the addressing modes that are not really useful for the Q register could be to let these enable 2 virtual 16-bit registers instead (AX and YZ). If NEG NEG + addressing modes ending in ,Y were to enable 16-bit virtual registers, then for example NEG NEG LDA ($12),Y would become LDAX ($12),YZ. This would mean that we get 16-bit pointer addition much cheaper than otherwise possible, and that arithmetic/binary operations do not destroy the 16-bit index register contents, which enables iteration over data using the 16-bit index register.
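A rough Python sketch of what such a hypothetical LDAX ($12),YZ could compute is shown here. Everything in it is an assumption for illustration: the instruction does not exist, the byte order (A and Y as low bytes, X and Z as high bytes) is a guess, and the address is simply wrapped at 16 bits.

```python
def ldax_indirect_yz(memory, zp, y, z):
    """Hypothetical LDAX (zp),YZ: load 16-bit AX from pointer + 16-bit YZ index."""
    base = memory[zp] | (memory[zp + 1] << 8)   # 16-bit pointer in zero page
    index = y | (z << 8)                        # assumed: Y low byte, Z high byte
    addr = (base + index) & 0xFFFF              # 16-bit pointer addition in one step
    a = memory[addr]                            # assumed: A receives the low byte
    x = memory[(addr + 1) & 0xFFFF]             # assumed: X receives the high byte
    return a, x


memory = bytearray(0x10000)
memory[0x12], memory[0x13] = 0x00, 0x40         # pointer $4000 at $12/$13
memory[0x5234], memory[0x5235] = 0xCD, 0xAB     # 16-bit value $ABCD at $4000+$1234
a, x = ldax_indirect_yz(memory, 0x12, 0x34, 0x12)
```

Because the result lands in A and X only, the YZ index is left untouched, which is what makes iterating with a 16-bit index possible in this scheme.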

mega65 · 29 July 2020

Hello,

Yes, I have been thinking about the 16-bit case, as it indeed has a bunch of use-cases.

What I suggest for both ACME and KickAss is that, for now, you don't allow the indexed quad modes.

I'll update the documentation to mark those opcodes as RES, i.e., REServed.
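On the assembler side, the rule could look roughly like the following Python sketch. The mnemonic set, the mode strings and the function name are all made up for illustration and do not reflect how ACME or KickAss represent addressing modes internally; the mnemonic list is only a partial sample.

```python
# Illustrative subset of quad mnemonics; a real assembler would list them all.
QUAD_MNEMONICS = {"LDQ", "STQ", "ORQ", "BITQ", "ASLQ", "ASRQ", "LSRQ", "ROLQ", "INQ"}

# Indexed modes that are rejected for quad instructions for now.
INDEXED_MODES = {"zp,X", "abs,X", "abs,Y", "(zp),Y", "(zp,X)"}

# The ,Z forms are accepted but rendered without the index.
Z_INDIRECT = {"(zp),Z": "(zp)", "[zp],Z": "[zp]"}


def normalise_quad_mode(mnemonic, mode):
    """Return the mode to assemble, or raise ValueError for a reserved combination."""
    if mnemonic.upper() not in QUAD_MNEMONICS:
        return mode                      # ordinary instructions are unaffected
    if mode in Z_INDIRECT:
        return Z_INDIRECT[mode]          # ($ZP),Z is written and parsed as ($ZP)
    if mode in INDEXED_MODES:
        raise ValueError(f"{mnemonic} {mode}: indexed quad modes are reserved")
    return mode


print(normalise_quad_mode("LDQ", "(zp),Z"))   # (zp)
print(normalise_quad_mode("LDA", "(zp),Y"))   # (zp),Y
```

An assembler could equally emit a warning instead of raising an error; the hard error just keeps code from depending on opcodes whose meaning may change.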

LG

Paul.

mega65 · 29 July 2020

Ok. I have updated the documentation, calling half of these opcodes RESQ and RSVQ, so that the table isn't too tall to fit on a single page.

Let's keep discussing what function we would like them to have.

I think the ,Y ones can certainly be used for 16-bit AX operations in a fairly logical way, analogous to the quad ones.

LG

Paul.

Mac Bacon · 29 July 2020

Quote from mega65

I have updated the documentation calling half of these opcodes RESQ and RSVQ

I think you may have overlooked 424282 STQ ($nn,SP),Y...

mega65 · 29 July 2020

I purposely left those stack relative ones, partly because the load is really handy for C implementations, and partly to remind me to think about how I could solve it for STQ stack relative. I should probably mark it reserved, though.

LG

Paul.

FeralChild · 29 July 2020

If you are extending the instruction set, maybe it's a good time to think about a better MAP? For me, the ideal one:

- would accept a 16-bit pointer to the structure in memory (RAM or ROM, depending on the current mapping), which would describe the actual map

- would not touch or require touching Q (.A, .X, .Y, .Z), or the status register

- would not require EOM afterwards

- would allow access to the whole MEGA65 memory, without calling MAP twice

Currently I need quite lengthy helper subroutines to transparently change the memory mapping: https://github.com/FeralChild6…_m65/mega65_map_helpers.s

mega65 · 29 July 2020

Actually similar issues are coming up as I begin to look at making an efficient MEGA65 backend for the VBCC compiler.

Will need to keep thinking about this, but I am currently thinking that we could have:

MAPL #$xxxx - Set lower half mapping immediately, immediate mode

MAPL $xxxx - Set lower half mapping from memory, absolute mode

MAPL $xxxx,X - Set lower half mapping from memory, absolute mode, X indexed

MAPLI = MAPL, but requires EOM to allow interrupts again

then the same for MAPH*

These instructions would require 5 or 8 cycles, depending on immediate vs other modes, and would not touch the stack, so would be quite a lot faster in practice.

We could also have:

MAPL #$nX -- Set the four bank enable bits, without changing the banking address

(and the other 3 variants, at a cost of only 4 cycles)

Being able to bank one half at a time saves some cycles, but also allows an application to bank the lower-half, while the OS banks the top-half with impunity.

I should also revisit the far JMP, JSR, RTS stuff I did ages ago, to see if it is actually working or not.

It could also be interesting to have a "MAPL/H and Push" and "MAPL/H Pop" instruction pair that makes it more efficient to temporarily bank memory.

What do you think?

LG

Paul.

FeralChild · 29 July 2020

For me only the absolute/immediate modes are useful. Changing low/high mapping separately is not a problem for me at all - it could actually even be useful at some point in the future.

What do you think about PUSH/PULL of the memory map values (PHML, PHMH, PLML, PLMH)?

LGB-Z · 29 July 2020

It increasingly sounds like an I/O-space-based approach would be better, since then you can read the mapping setting back as well, and there is no need for so many special opcodes. The downside, however: you need the I/O space, so the "legacy" $DXXX area must be there, or you must access the MEGA65 I/O area at its mapped address, which again may need some mapping first. Hmmm. Btw, I would even opt for the legacy physical address 0/1 ("CPU I/O port") to be optional (i.e. it can be turned off) and accessible via some I/O area setting, since it's often annoying to have "special functions" mixed in with the normal fast RAM area (the same applies to the 2K "C65 colour RAM" at the end of the first 128K ...). But surely, that's another story. My worry here is that MEGA65 memory handling becomes more and more complex and hard to understand with so many possibilities. Sure, the main force here is the need for compatibility with the original C65 way, even if I don't find that a particularly good solution (but that is what we must deal with, of course), for example the constraint of just two free mapping offsets (for low and high), etc.

FeralChild · 29 July 2020

I/O space mapped in: for one, an additional problem is that there are various personalities of the I/O space: VIC II, VIC III, VIC IV, Ethernet. Now, try to map in another part of the ROM, call a routine which didn't fit in the base ROM, and then restore the old memory layout. That would become really complex (and make the ROM very slow).

I'm only glad that the original C65 ROM messes a little with MAP/EOM itself, so at least for disk I/O the software has to be prepared that its memory mapping might get overridden.

LGB-Z · 29 July 2020

Yeah, that's why it's rather complex; unfortunately the C65 is already just too "messy" in some sense. But at least there could be some special (MEGA65-specific) opcodes that can always address at least part of the I/O space, regardless of mapping. Then it's easier to assign functions to that I/O space ("by address"), with opcodes only needed to access this space in general, than to assign an opcode to every function on its own. Surely, just an idea.

As for the VIC-III ROM mapping, which is mostly what the C65 ROM does ... Hmm, my problem with it is that it's great for the ROM as it is, but not so useful for developing programs of your own. My general problem here is that I tend to like machines with a "generic scheme of banking" rather than overly specific ones that are maybe only good for their own ROM, but that's surely _MY_ problem, what I like and what I don't like as much ;-P

mega65 · 30 July 2020

You can already ask the hypervisor to retrieve the current mapping.

As for pushing and popping the map, this makes some sense. Having a general POPM, POPML, POPMH set of instructions would make sense, coupled with versions of the MAPL type instructions that do the mapping and then immediately push the previous map to the stack. The reason for doing it in this order is that if you mess with the memory where the stack lives, you need the mapping data pushed to the newly active stack, so that you can retrieve it via the POPM instructions.
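The ordering argument can be illustrated with a small Python toy model. It is purely hypothetical: the class and method names are made up, the proposed instructions do not exist yet, and each mapping is simply given its own list standing in for whatever physical memory backs the stack page under that mapping.

```python
class MapModel:
    """Toy model in which each mapping backs the stack page with different memory."""

    def __init__(self, maps):
        self.stacks = {name: [] for name in maps}  # one backing stack per mapping
        self.current = maps[0]

    def map_then_push(self, new_map):
        """Proposed order: switch mapping first, then push the previous map."""
        previous = self.current
        self.current = new_map
        self.stacks[self.current].append(previous)  # lands on the newly active stack

    def push_then_map(self, new_map):
        """Alternative order: push first, then switch mapping."""
        self.stacks[self.current].append(self.current)  # lands on the old stack
        self.current = new_map

    def pop_map(self):
        """Model of a POPM-style restore: read from the currently active stack."""
        self.current = self.stacks[self.current].pop()


good = MapModel(["os", "app"])
good.map_then_push("app")
good.pop_map()                      # finds "os" on the active stack

bad = MapModel(["os", "app"])
bad.push_then_map("app")            # the saved map is now hidden behind the old mapping
```

In the second case the saved value sits in memory that is no longer visible as the stack. This model would raise an IndexError on pop_map(), whereas real hardware would presumably just pull whatever bytes happen to be there.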

Also note that the far JMP/JSR/RTS instructions, if I confirm that I finished them, already give you a very efficient way to call routines in other banks.

Also thanks for the clarification that you don't see a need for the indexed versions of the MAPL/MAPH instructions.

LG

Paul.