
So I have this compiler

  • So I've got this compiler I've been working on. http://cowlark.com/cowgol/ The language is vaguely Ada-inspired, with proper types, 8/16/32-bit arithmetic etc, designed to generate good code for old 8-bit platforms, but it ports reasonably well to other architectures. It currently targets 6502, 65c02, 6303, 8080, Z80, 80386, PDP11, 6502 p-code, C and Basic. The generated code is pretty decent given the size. The compiler is written in its own language so it will self-host on a machine with enough memory.

    Its big feature is that it will actually run _on_ 8-bit architectures: here's a live demo of it compiling a small program on a BBC Micro second-processor system (3MHz 65c02): https://bbc.godbolt.org/?&mode….com/cowgol/assembler.rom Press Shift+F12 to start the compilation script, and then type OUT once it's done to run the program. (Once it's finished, you can see the generated assembler source by doing DRIVE 1 and then EDIT S.TEST. You then almost certainly want to do SHIFT+F5 and select mode 0 to make text scrolling fast.)

    While it would be easy to add the C64 as a cross-compilation target, and in fact I have done so on an earlier version of the language, I've not really thought about it for actually running the compiler on: the 1MHz 6502 on the C64 is pretty slow for raw number-crunching and the disk file API is kinda lacking. But, the Mega65 is rather different. Cowgol should run on this pretty well.

    However, the compiler's written as a set of traditional command-line programs, which need to be run in sequence: the compiler itself is two programs, then there's a linker, and then finally the assembler. This is fine on the BBC Micro as it does provide a command-line environment. But, AFAIK, the C64 and the Mega65 don't. All you get is Basic.

    So, what I need to make this work is a way to write a script which runs a sequence of commands, loaded from disk, each with their own set of command line arguments. The BBC Micro has a OS feature called *EXEC which interprets the contents of a file as keyboard input; this is (almost) ideal. Is there a way to do this on the Mega65? Alternatively, is there a DOS or CP/M like environment which could be loaded instead of Basic which could be used instead?

    Also, it's not a requirement, but it'd help a lot if the Mega65 had a way to seek within SEQ files; the compiler doesn't require seekable files, but it'd speed up the linker a lot.

    Also also, have the 6502 extensions supported by the Mega65 been finalised yet? They'd be great to support in the code generator.

  • Fulgore

    Approved the thread.
  • First of all, good to see someone taking both time and effort to develop a new compiler for our old machines.

    Just a few comments:

    (Sorry if they sound negative. It's not meant to be, just musing ...)

    - Is there a cross-compiler that runs on Windows and generates code for the 6502/Z80?

    - What does the 6502 p-code look like?

    - What about constants?

    - "no recursion"

    Hmm, this I would consider a rather huge disadvantage as it makes writing otherwise simple code (e.g. an assembler with recursive expression parsing) a lot more complicated. Can I assume that all "local" variables are in fact global variables with just an access restriction, i.e. the compiler reserves space for them at a fixed address and writes data directly to this address?

    - "arbitrarily nested subroutines, with access to variables defined in an outer subroutine"

    On the other hand, this is a feature I wouldn't miss as it comes straight out of coding hell. Having access to a local variable from code other than the code it was declared for means creating a new form of global variable, resulting in possibly unwanted side effects and thus destroying the principle of locality. No idea why Wirth put that into his Pascal language and its derivatives. IMHO clean code shouldn't use it. If a subroutine has to pass its local variables to a nested subroutine it should do so by passing them as parameters.

    - "Cowgol supports single-dimensional arrays."

    Why just single-dimensional? It shouldn't be so hard to implement multidimensional arrays...

    BTW is it possible to declare an array of records?

    - Is there a "REPEAT ... UNTIL <cond>" loop (i.e. post-test loop)?

    - A subroutine may have several input and output parameters, but what kind of variable passing is used? Just call by value or also call by reference?

    - How do you handle strings? C-style or Pascal-style? Fixed length? Any string functions?

    - Is there a way to divide code into several different namespaces for example by using a special symbol "library"? For example

    the compiler's written as a set of traditional command-line programs, which need to be run in sequence

    I guess all you need is a little machine-code program (not Basic!) that works exactly like your batch file: it loads the tools one after another and executes them.

  • hjalfi: First of all: Welcome to Forum64! Good to have you here! As a Fluxengine user I can now nag you over here too and am no longer restricted to Github issues. (I was nagging you about the Commodore 1581 there.)


    Secondly: Just out of curiosity: Have you also considered the C128 at 2MHz or the C64 with a SuperCPU as potential platforms alongside the Mega65?

  • So, what I need to make this work is a way to write a script which runs a sequence of commands, loaded from disk, each with their own set of command line arguments.

    Have you considered creating something like a chain of commands using a memory extension such as GeoRAM/NeoRAM for the C64/C128? It could serve as a RAM disk from which to fetch the things you need to execute and the data you want to process. A RAM extension would let you move stuff that is not currently needed out of main memory until it is needed again. But it would probably all need to be in one big "program" where the different tools are more or less stitched together into one big "binary".

  • Grr, can't figure out how to interleave comments, so I'll just have to cut-and-paste:

    - cross-compiler for Windows: no. It'd be easy to do, I just haven't done it yet (I develop on Linux). There's a 6502/Z80 cross-compiler which runs on 16-bit DOS, though, if that works. (It's not had a lot of testing.) (Cowgol's a pure cross compiler, and it builds all the combinations of host and backend currently, but some backends don't have a full file library yet.) (I was rather surprised to see that the compiler running on a 4.77MHz 8086 appears to be slower than on a 3MHz 65c02.)

    - 6502 p-code: it's essentially Forth. See https://github.com/davidgiven/…aster/rt/bbcti/cowgol.cos if you want to see the interpreter. I'm actually thinking of removing it as it doesn't produce much smaller binaries than real machine code, and is way slower.

    - Constants: for the 6502, 16-bit constants are inlined, 32-bit constants are out-of-lined. Most 32-bit operations are performed in little micro-loops which execute four iterations, to save space.

    - Recursion: well, I have just implemented a complete compiler, including parser, code generator and assembler in it, without needing recursion... but the main reason for it is that it avoids the need for stack frames, which are ruinously expensive on many 8-bit platforms. The compiler then walks the call graph and allocates local variables so that variables belonging to subroutines which cannot be called at the same time overlap. This dramatically reduces memory usage. e.g. the compiler backend itself only uses 32 bytes of zero page; and remember that on the 6502 pointers can only live in zero page.

    - Nested variables: saves hugely as you don't need to pass parameters into inner subroutines, as well as providing a very easy way to make state private to a particular subroutine. It also makes function pointers work. As the call graph must be known statically, being able to scope the type of a function pointer means we can control which functions can be called by it, which is vital. As for the style issue... I think the functional programming people would differ with you there!

    - Multidimensional arrays: possible but I haven't done it.

    - REPEAT-UNTIL: no; do it with loop...end loop and break. It'd be nigh trivial to implement, I just haven't done it. Likewise for a for loop.

    - Strings: C-style. Strings suck.

    - Namespaces: not as such, but modules are possible on a per-source-file basis. You need a header with the declarations, and a source file with the implementations, and subroutines can be declared public or private. The linker does global DCE. However, I'm not using this anywhere so it hasn't had a lot of exercise. Formal namespaces would be pretty easy to add.
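
    The call-graph allocation described under "Recursion" above can be sketched roughly as follows. This is a toy model in Python, not Cowgol's actual allocator: the subroutine names, the frame sizes and the allocate helper are all made up for illustration. Each subroutine's locals are placed just above the highest-ending frame among its callers, so two subroutines that can never be active at the same time may share addresses.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def allocate(calls, sizes):
    """Assign a fixed base offset to each subroutine's local variables.

    calls: dict mapping a subroutine to the list of subroutines it calls.
    sizes: dict mapping a subroutine to the bytes of locals it needs.
    The call graph must be acyclic (no recursion); a cycle raises CycleError.
    """
    # Invert the graph: callers[f] is the set of subroutines that call f.
    callers = {name: set() for name in sizes}
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)

    base = {}
    # TopologicalSorter expects node -> predecessors, so callers come out first.
    for name in TopologicalSorter(callers).static_order():
        # Start above every frame that might be live when we are called.
        base[name] = max((base[c] + sizes[c] for c in callers[name]), default=0)

    total = max((base[n] + sizes[n] for n in sizes), default=0)
    return base, total


# Hypothetical program: main calls parse and emit; parse calls lex.
calls = {"main": ["parse", "emit"], "parse": ["lex"]}
sizes = {"main": 4, "parse": 6, "emit": 10, "lex": 2}
base, total = allocate(calls, sizes)
print(base, total)
```

    In this made-up example parse and emit both start at offset 4, because neither can call the other, and the whole program needs 14 bytes rather than the 22 it would need if every subroutine had private storage.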

    I've been considering the C64 family as a target, but TBH the problems with the Basic environment and the slow disk have put me off considering it as a host. The Mega65 seems fast enough and flexible enough for this not to be a problem. Even the 3MHz 65c02 on the BBC Micro second processor is too slow to use this for more than a proof-of-concept. A 40MHz 6502, particularly one with more instructions, should be a different matter.

    I also don't want to have to do a custom C64 version: I want to just build the existing compiler, targeting C64 binaries. I suppose it'd be possible to write a minimal DOS for the C64 which replaces Basic, and gives an empty 64kB address space with a CLI at the top of memory, but I'd much rather use someone else's...

  • Hi hjalfi ,

    I've moved the thread to Software/homebrew coding.

    You can quote individual sentences by just highlighting them (hold the left mouse button and mark the sentence); a popup should then appear above the highlighted sentence.

    Use "Save quote" and then "Insert quote"; then it should work.

    And thanks for the post. Sounds interesting!

  • First of all, thanks for your post! Your compiler looks quite intriguing! In my humble opinion, the MEGA65 is in desperate need of a modern programming language running natively on the system (after all, when the keyboard is *so* damn nice, we might as well do a little programming on the machine...), so the prospect of a compiler generating optimized 45gs02 code is a *very nice thing* :)

    So, what I need to make this work is a way to write a script which runs a sequence of commands, loaded from disk, each with their own set of command line arguments. The BBC Micro has a OS feature called *EXEC which interprets the contents of a file as keyboard input; this is (almost) ideal. Is there a way to do this on the Mega65? Alternatively, is there a DOS or CP/M like environment which could be loaded instead of Basic which could be used instead?

    Unfortunately, not at the moment (one could of course always use the KERNAL routine CHKIN to redirect standard input to a file, but that creates a whole lot of other problems, not the least of which is the fact that there's no standard way of passing command line parameters to a program).

    But it's of course possible to write a simple shell for the MEGA65; shouldn't be much work... we'd only need to establish how to do the parameter passing.

  • ...so I wrote a shell. https://github.com/davidgiven/cshell

    It uses 316 bytes of RAM, and lurks at the top of memory. It has almost no functionality, simply allowing commands to be executed from disk, with a simple protocol for passing command line arguments in and exit status out. I haven't implemented batch commands yet (still debating whether to try and use CHKIN or to keep batches in a REL file, CP/M-wise). It runs with Basic paged out so you get everything up to 0xd000 for your own use. I couldn't find a way to get the I/O area at 0xd000 paged out while still keeping the kernal at 0xe000.

    Programs have to be bespoke, and are simple memory images loaded at 0x800. I'm not terribly happy with this, as it means that CShell programs can't be used if CShell isn't loaded. I might change it so they get loaded at 0x801 and the entry point is at 0x80d, because that's what most Basic-stubbed machine code programs do.

    Question: is there a way to suppress the 'searching for XXX... loading...' message when using LOAD? I've tried disabling control messages with SETMSG but it didn't help.

  • Here's the shell, running the Cowgol compiler itself... it took 55 seconds just to load the binary on a stock emulated C64! I haven't done the file access routines so it's never going to work, plus there's no ASCII-to-PETSCII conversion so the output is weird, but command line argument passing works and it successfully errored out back to the prompt with a brk instruction.

  • So, cshell programs now load at 0x801 and their entry point is at 0x80d, so it's possible to create hybrid programs which can be executed from Basic with LOAD/RUN or from cshell with arguments; a signature is passed in on entry so that programs know which environment they're running under. This allows my compiler to produce 'generic' binaries which can nevertheless use cshell features if they're there. Here's a hybrid program example: https://github.com/davidgiven/cshell/blob/main/hybrid.asm
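
    The 0x801/0x80d layout matches the usual one-line Basic stub, which can be sketched like this. It is a Python illustration of how a build tool might assemble such a file, not how cshell or Cowgol actually do it; basic_stub and make_prg are hypothetical helper names, and the signature that cshell passes on entry is not modelled here.

```python
import struct

LOAD_ADDR = 0x0801  # usual start of Basic program memory on the C64
ENTRY = 0x080D      # first byte after the 12-byte stub built below
SYS_TOKEN = 0x9E    # tokenised Basic keyword SYS


def basic_stub(entry=ENTRY, line_number=10):
    """Return the bytes of the one-line Basic program '<line_number> SYS <entry>'."""
    body = bytes([SYS_TOKEN]) + str(entry).encode("ascii") + b"\x00"
    # Link pointer: address of the end-of-program marker that follows this line.
    next_line = LOAD_ADDR + 2 + 2 + len(body)
    return struct.pack("<HH", next_line, line_number) + body + b"\x00\x00"


def make_prg(code):
    """Prefix machine code with the load address and the Basic stub."""
    stub = basic_stub()
    if LOAD_ADDR + len(stub) != ENTRY:
        raise ValueError("stub does not end at the entry point")
    return struct.pack("<H", LOAD_ADDR) + stub + code


prg = make_prg(b"\x60")  # the machine code here is a lone RTS
print(prg.hex())
```

    Loaded from Basic, RUN executes SYS 2061, which is 0x80d; a shell that knows the convention can jump to the same address directly.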

    Thinking about running scripts, this is harder than it looks. I can't just use CHKIN to redirect input from a file, because any program which does file access is going to call CHKIN to read characters, and then afterwards will put the input stream back to the keyboard, resulting in the script terminating.
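
    The problem can be shown with a toy model. It assumes, as on the C64 KERNAL, a single current input channel that CHKIN selects and CLRCHN resets to the keyboard; the Python class and function names are invented for illustration.

```python
KEYBOARD = 0  # default input device


class Kernal:
    """Toy model: there is exactly one current input channel."""

    def __init__(self):
        self.input_channel = KEYBOARD

    def chkin(self, logical_file):
        # Route subsequent character input to this logical file.
        self.input_channel = logical_file

    def clrchn(self):
        # Restore default I/O: input comes from the keyboard again.
        self.input_channel = KEYBOARD


def program_that_reads_a_file(kernal, data_file=2):
    """Stand-in for any command that does ordinary file input."""
    kernal.chkin(data_file)
    # ... read characters from data_file here ...
    kernal.clrchn()


SCRIPT_FILE = 1
kernal = Kernal()
kernal.chkin(SCRIPT_FILE)          # the shell redirects input to the script
program_that_reads_a_file(kernal)  # the command does its own file access
print(kernal.input_channel == SCRIPT_FILE)  # False: the redirection is gone
```

    After the command returns, input is back on the keyboard, so the shell no longer reads its next line from the script.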

    The way CP/M does it is that when you run a script with the SUBMIT command, the script is read and translated into a record-based file called $$$.SUB, with each record containing one line of the script, in reverse. Then, each time the command shell hits the prompt, it tries to open $$$.SUB, reads the last record, removes it from the file, and executes it. When the file is empty, it's deleted.
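
    A simplified sketch of that pop-the-last-record scheme, written in Python against an ordinary host file. The record layout is a stand-in (null-padded text rather than CP/M's real record format), and submit and next_command are hypothetical names.

```python
import os

RECORD_SIZE = 128  # CP/M files are made of 128-byte records


def submit(script_lines, path):
    """Write one fixed-size record per script line, last line first."""
    with open(path, "wb") as f:
        for line in reversed(script_lines):
            f.write(line.encode("ascii").ljust(RECORD_SIZE, b"\x00"))


def next_command(path):
    """Pop the last record of the pending file; delete the file once empty.

    Returns None when no script is pending.
    """
    if not os.path.exists(path):
        return None
    size = os.path.getsize(path)
    if size < RECORD_SIZE:
        os.remove(path)
        return None
    with open(path, "r+b") as f:
        f.seek(size - RECORD_SIZE)
        record = f.read(RECORD_SIZE)
        f.truncate(size - RECORD_SIZE)  # drop the record just read
    if size == RECORD_SIZE:
        os.remove(path)  # that was the final command
    return record.rstrip(b"\x00").decode("ascii")
```

    Because the lines are stored in reverse, the shell only ever needs to read and shorten the end of the file, and the pending state survives each command being loaded over the top of whatever was in memory.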

    But, Commodore's DOS is somewhat more primitive. I haven't found a way to truncate files, and sequential files can only be appended to and can't be seeked within. One thing I could do is to create a REL file, where the first record tells me which record of the script is being executed, so the command shell will open the file, read the first record and advance it, seek to the appropriate record in the file, and read and execute the script. When we reach the end of the file, we delete it. Except, that seems excessively complex and REL files apparently don't work on some systems.
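
    The REL-file idea could look roughly like the following. This is a Python sketch against a host file with fixed-size records standing in for a REL file; the record size, the text encoding of the index and the function names are assumptions, and lines longer than a record are not handled.

```python
import os

RECORD_SIZE = 64  # hypothetical fixed record length


def _pack(text):
    return text.encode("ascii").ljust(RECORD_SIZE, b"\x00")


def _unpack(record):
    return record.rstrip(b"\x00").decode("ascii")


def create_script(path, lines):
    """Record 0 holds the index of the next line to run; lines follow."""
    with open(path, "wb") as f:
        f.write(_pack("1"))
        for line in lines:
            f.write(_pack(line))


def next_command(path):
    """Fetch the next script line and advance the index in record 0.

    Deletes the file and returns None once the end is reached.
    """
    if not os.path.exists(path):
        return None
    with open(path, "r+b") as f:
        index = int(_unpack(f.read(RECORD_SIZE)))
        f.seek(index * RECORD_SIZE)
        record = f.read(RECORD_SIZE)
        if record:
            f.seek(0)
            f.write(_pack(str(index + 1)))  # advance for the next prompt
    if not record:
        os.remove(path)
        return None
    return _unpack(record)
```

    Unlike the CP/M approach this never truncates the file, at the cost of one extra record read and one record write per command.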

    I suppose one option is to read the entire script into RAM and execute it from there but I kinda don't want to...

  • cshell has had a lot more polishing, I've implemented running scripts, and I've written a bunch of commands. You now get: dir, type, submit, dos and echo. The commands are all tiny, the biggest being submit.com at 596 bytes, and even with an (emulated) C64 disk it loads in about a second. cshell now occupies a whole 401 bytes of RAM.

    There's also brk, which has one instruction in it, and hybrid, which can be run from Basic or cshell and which shows how to detect the runtime environment.

    Scripts are invoked with the submit command; they get loaded into memory just below cshell itself (and MEMTOP is moved down). You can pass parameters. The script is terminated when a command fails, which makes this better than CP/M.

    So this is now good enough to work with my compiler. I'll try and get it running on the emulated C64 and then come back for the Mega65.

    Regarding cshell itself: writing dir.com and submit.com has taught me that the C64 kernal disk API is somewhat... special; just detecting whether OPEN fails or not is hard. What this really needs is an abstraction layer over the top of the kernal API, similar to CP/M's BDOS, which programs can call to do things like parse filenames containing both a device and a drive, open files with error handling, etc. It wouldn't be particularly difficult, even.

    The problem is that this now requires programs to be written targeting cshell, so as to use this API, and I'm not sure whether that's worth it. The advantage as it stands is that cshell can provide optional extras (argument passing etc) on top of a program which can be run normally from Basic. Requiring cshell's presence to run the programs essentially means setting up a whole new software ecosystem.

    OTOH if people want a CP/M-like DOS for the Mega65...

  • Just for the record, not that this simplifies anything: I just did a quick Google search and it turns out there even exists something called "geoShell", a CLI for GEOS: https://www.lyonlabs.org/commo…s_Development_Package.pdf

    I was just wondering if any kind of CLI existed, thought of GEOS, and then found this. GEOS is absolutely not what you want to have to deal with in your context, but I couldn't resist adding it here...

    EDIT: Better PDF: https://commodore.software/dow…oshell-v2-2-user-s-manual

    EDIT2: Then there is stuff like ContikiOS for the C64, LNG and GeckOS. Are all of these projects no longer very alive?




  • Additionally, for as long as you stick with the C64 (or even the C128) for your project I want to point out again that a GeoRAM/NeoRAM RAM extension could be beneficial for your intentions with your shell as there are tools that turn the RAM extension into a virtual floppy disk drive:

    Neoram-Drive V0.40 GPL (https://csdb.dk/release/?id=61122)

    Neoram-Drive 128 V0.40 GPL (https://csdb.dk/release/?id=187853)

    With these tools you may be able to extend your shell to write and read files from a RAM disk drive.

    I think that the Vice emulator easily emulates a GeoRAM/NeoRAM cartridge extension.

  • Howdy,

    I love your work on this, and it would be great to see it running natively on the MEGA65. A few general comments as I go along:

    1. For 32-bit arithmetic, you can use the MEGA65's CPU extensions. The already documented extensions in the MEGA65 Book can be considered "rather unlikely to change". They were partly designed to make compiled high-level languages much more efficient than on a stock 6502.

    2. Yes, CBM DOS sucks quite a lot. You would probably be best to make your own DOS layer that can trace through the file sector chains to allow random seeking. The MEGA65 Hypervisor also provides access to native FAT32 files, but also currently lacks a SEEK call, although one is planned.

    3. You can move ZP on the MEGA65 using the TAB instruction if you need to deal with ZP overflow.
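
    Point 2 can be illustrated with a sketch of seeking by walking a file's sector chain. It is Python operating on a D81 image held in memory, purely to show the idea rather than something that would run on the MEGA65. It assumes the usual CBM layout: 256-byte sectors, 40 sectors per track on a D81, the first two bytes of each sector linking to the next track and sector, and a track link of 0 in the final sector, whose second byte then gives the index of the last used byte. The function names are hypothetical.

```python
SECTOR_SIZE = 256
SECTORS_PER_TRACK = 40  # logical layout of a 1581 disk / D81 image
DATA_PER_SECTOR = 254   # the first two bytes of each sector are the chain link


def sector_offset(track, sector):
    """Byte offset of a sector within a D81 image (tracks count from 1)."""
    return ((track - 1) * SECTORS_PER_TRACK + sector) * SECTOR_SIZE


def read_at(image, start_track, start_sector, position, length):
    """Read up to `length` bytes starting `position` bytes into a file."""
    out = bytearray()
    track, sector = start_track, start_sector
    while track != 0 and len(out) < length:
        off = sector_offset(track, sector)
        next_track, next_sector = image[off], image[off + 1]
        # Final sector: the link track is 0 and the sector byte is the index
        # of the last used byte, so it holds (index - 1) data bytes.
        used = DATA_PER_SECTOR if next_track != 0 else next_sector - 1
        if position < used:
            take = min(used - position, length - len(out))
            start = off + 2 + position
            out += image[start:start + take]
            position = 0  # later sectors are read from their first data byte
        else:
            position -= used  # skip this sector entirely
        track, sector = next_track, next_sector
    return bytes(out)
```

    Each seek costs one sector read per 254 bytes skipped, so a real implementation would probably cache the list of (track, sector) pairs per open file; that part is left out here.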

    Anyway, keep up the great work, and I look forward to seeing it in action on the MEGA65!



  • ...so I've got good news and bad news:

    I have the compiler cross-compiling to the C64, complete with file access. Cowgol's assemblers work fine on small files. However, the compiler itself won't run. I'd misread the amount of zero page it uses; the front end wants 133 bytes and the backend wants 128 bytes, and AFAIK the only safe zero page to use is from 0x02 to 0x7f. It took me ages to figure this out because scribbling over the kernal's zero page made things behave very oddly.
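
    The zero-page shortfall is simple arithmetic, spelled out here in Python using only the figures quoted above (the variable names are just for illustration):

```python
SAFE_FIRST, SAFE_LAST = 0x02, 0x7F       # zero-page range assumed safe above
available = SAFE_LAST - SAFE_FIRST + 1   # 126 bytes

needs = {"front end": 133, "backend": 128}
shortfall = {name: wanted - available for name, wanted in needs.items()}
print(available, shortfall)
```

    So the front end is 7 bytes over and the backend 2 bytes over the 126 bytes available.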

    Also, the DOS has... definite problems. They say that behind every problem is an opportunity, but I suspect some of these opportunities are insurmountable. The most basic one is that I need to be able to seek in files. I could store all my source code in REL files, but it looks like support for REL files is extremely patchy: VICE's virtual filesystem seems not to support them, for example. You also still can't get the file length.

    So, essentially, what I need to make this work on the stock C64 is: a new operating system.

    Some of this would be a non-issue for the Mega65 but after spending way too much time fighting DOS (just detecting whether an OPEN worked or not is painful!) I'm not sure I want to go down the C64 environment route. The trouble is, the obvious thing to do is to run a completely different OS on the Mega65, either a classic one like Acorn's MOS or a bespoke one, and if you're going down that route you might as well load a different core into the FPGA which is easier to work with, and suddenly it's not interesting any more.

    Targeting a C128 might be feasible --- I could move zero page somewhere else, and I'd get much faster disk access. But moving zero page would lose me 256 bytes of RAM and it's already rather cramped. And VICE doesn't support it.

  • Hello,

    Sorry to hear of your battles here. Some thoughts:

    1. MEGA65 can move ZP around the place as well, which should solve your ZP pressure problem.

    2. You don't need a whole new OS, just a DOS library that can read D81 disk images and let you seek around in files.

    3. With (2), and if you don't use the C64/C65 KERNAL calls, you actually can get almost all the ZP back, anyway, which might relieve (1).

    So don't give up hope just yet!



  • I don't want to make things too weird --- I'd prefer to stick with the standard interfaces where at all possible. It'd be quite possible to create some user code which can be uploaded to the disk drive to do seeks... but then it wouldn't work on any other kind of disk drive, or on emulated drives.

    There is a CP/M clone for the 6502 which could be ported, OUP/M, using IEEE block access commands to create a CP/M filesystem on a standard disk, but it would require reimplementing the serial interface as you wouldn't be able to have the kernal in place at the same time. It also wouldn't run OUP/M binaries because the C64 video memory has to go <32kB; you'd have to recompile OUP/M programs to load at 0x0800. That's not actually a problem, as there aren't any! And it wouldn't help anyway as OUP/M uses 128 bytes of zero page...