Thursday, March 11, 2021

Why is CD-i emulation hard, or isn't it?

Recently, it has been argued on Reddit, Twitter and elsewhere that CD-i emulation in MAME is difficult and that this is unlikely to improve in the foreseeable future.

The main reasons given (paraphrased) are that

a) there are different hardware implementations of the CD-i platform, and

b) applications do not directly interact with that hardware but are mediated by the BIOS, therefore

c) hardware access patterns look identical across the entire library of applications.

The conclusion from this is that

d) an emulator developer cannot “more clearly infer certain subtleties of the hardware functionality”.

Note: This article uses the term “BIOS” for what I have called “CD-i system ROM” in the past. The latter is more descriptive, but the former appears to be what many other emulators use.

Some parts of the CD-i hardware are custom chips for which little or only very sketchy documentation is available, so some reverse engineering is required. But is it really hard?

One of the ways to reverse engineer hardware is to look at the software that uses it. There are various ways of doing that; the classic MAME way is to look at the access patterns and the code immediately surrounding them. If you have a sufficient variety of accessing code, you will ultimately develop a good (emulated) approximation of the original hardware through a sometimes lengthy process of trial and error. If this is your goal, the fact that all accesses go through the BIOS can indeed be a problem.

But is having such a good approximation actually necessary to have a working emulator? If every application can directly access the hardware, as is typical for most consoles, certainly. But this is not the case for CD-i!

The fact that all CD-i specific hardware access goes through the BIOS is actually an opportunity when viewed in the right light. It is not necessary to develop a generic “good” approximation of the hardware; the only thing needed is an approximation that is “good enough” for the BIOS!

Now, this goes against the stated goal of the MAME project, which is actually *not* to produce a working emulator but to document the hardware legacy. From mamedev.org, right at the top:

MAME’s purpose is to preserve decades of software history […] by documenting the hardware and how it functions. The source code to MAME serves as this documentation. The fact that the software is usable serves primarily to validate the accuracy of the documentation.

Seen in this light, so-called “high-level emulation” (emulating the visible effects of hardware instead of the actual inner workings) is akin to blasphemy: it goes against the entire philosophy of MAME.

My own goal in developing CD-i Emulator was never to “document the hardware”, though. It was to have a good enough emulation to actually play CD-i discs. For this, it is technically not even necessary to emulate *any* existing CD-i specific hardware, as CD-i is basically a software API standard. If you have a correct implementation of that standard you are done (there are of course some complications).

It is interesting to note that CD-Ice, the first ever (public) CD-i emulator, did exactly this. The author used his knowledge of the CD-i specification (Green Book) to implement the API from scratch, using his source code for “Rise of the Robots” as a working sample. CD-Ice is often treated as a single-game emulator, but that is not in fact correct: it ran many other games as well.

The Green Book states that CD-i players run the CD-RTOS operating system using a “68000-family processor”. This is therefore the minimum amount of *actual* hardware that needs emulation, and it’s exactly what CD-Ice did. All other parts were implemented based on the specification and not on actually existing hardware.

Going the full HLE route like CD-Ice did has its own problems of course, mostly because CD‑RTOS incorporates significant parts of the OS‑9/68000 operating system and a full emulator would therefore need to implement that as well. Some features such as multitasking and interrupts can be particularly hard to tackle with a full HLE strategy. Using full HLE also reintroduces the “many access patterns” issue but on a different level. Since you are emulating the (software) interface against which developers work directly, you can only get it “good enough” by throwing lots of applications against it.
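To make the full-HLE idea a bit more concrete, here is a minimal Python sketch of how an emulator might hand a system call to host code instead of running emulated OS code. Everything in it is an assumption made for illustration: the `Cpu` register model, the handler names and the function codes are hypothetical and are not taken from the Green Book, OS-9 or CD-Ice.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Toy register file holding only what this sketch needs."""
    d: list = field(default_factory=lambda: [0] * 8)  # data registers D0-D7
    carry: bool = False  # used here as the "call failed" flag


class HleKernel:
    """Implements system calls in host code rather than emulating the OS."""

    def __init__(self):
        # Hypothetical function codes, NOT the real OS-9 values.
        self.handlers = {0x01: self.sys_get_ticks, 0x02: self.sys_add}
        self.ticks = 0
        self.unimplemented = []  # calls an application wanted but we lack

    def sys_get_ticks(self, cpu):
        cpu.d[0] = self.ticks

    def sys_add(self, cpu):
        # Stand-in for a real service: D0 = D0 + D1, wrapped to 32 bits.
        cpu.d[0] = (cpu.d[0] + cpu.d[1]) & 0xFFFFFFFF

    def dispatch(self, code, cpu):
        """Called by the CPU core when the application traps into the OS."""
        handler = self.handlers.get(code)
        if handler is None:
            # Record the gap and report an error to the application.
            self.unimplemented.append(code)
            cpu.carry = True
            return
        cpu.carry = False
        handler(cpu)
```

The `unimplemented` list is where the “many access patterns” issue resurfaces in such a design: it only fills up when some application actually exercises a call that has not been written yet.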

For all these reasons, I chose a different approach for CD-i Emulator. Knowing that fully implementing CD-RTOS would be hard, I wanted to use the existing BIOS software. This meant that I would have to emulate some hardware, but only to the extent that the BIOS uses it. And I would not just stare at access patterns and the immediately surrounding (disassembled) BIOS code but take a more holistic approach as described below. Of course, it also meant that I would need to emulate all the hardware versions, which in retrospect has been more work than expected but not intrinsically hard.
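As a rough illustration of emulating hardware “only to the extent that the BIOS uses it”, the Python sketch below models a device that implements just a couple of registers and records every access outside that set. The register offsets, the ready bit and the class name are hypothetical; this is not how CD-i Emulator is actually written.

```python
class MinimalDevice:
    """Emulates only the registers the BIOS driver is known to touch."""

    # Hypothetical register layout, for illustration only.
    STATUS = 0x00
    COMMAND = 0x02
    READY_BIT = 0x8000

    def __init__(self):
        self.regs = {self.STATUS: self.READY_BIT, self.COMMAND: 0}
        self.unknown_accesses = []  # accesses the emulation does not cover

    def read16(self, offset):
        if offset not in self.regs:
            self.unknown_accesses.append(("read", offset))
            return 0
        return self.regs[offset]

    def write16(self, offset, value):
        value &= 0xFFFF  # registers are 16 bits wide in this sketch
        if offset not in self.regs:
            self.unknown_accesses.append(("write", offset, value))
            return
        if offset == self.STATUS:
            return  # status is treated as read-only here
        self.regs[offset] = value
```

If `unknown_accesses` stays empty while the BIOS boots and runs discs, the emulation is “good enough” in the sense used here, even though it says little about what the real chip does with the registers that were left out.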

All the CD-i specific hardware accesses done by the BIOS are located in OS-9 driver modules, which together make up less than 20% of a typical BIOS (counted by module size in bytes). Significantly more than half of that is the video driver, which is mostly taken up by drawing code that is totally uninteresting from the emulation point of view. All in all, only about 5% of the BIOS is concerned with accessing CD-i hardware, which is about 25KB of 68000 machine language, i.e. less than 10 000 lines of machine code.
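The figures in this paragraph can be cross-checked with a little arithmetic. The Python sketch below uses only the approximate numbers quoted above, plus two assumptions of mine: that 1KB means 1024 bytes, and the known fact that 68000 instructions are between 2 and 10 bytes long.

```python
# Figures quoted in the text (all approximate).
HW_CODE_KB = 25        # hardware-access code: "about 25KB"
HW_SHARE_PCT = 5       # "about 5% of the BIOS"
DRIVER_SHARE_PCT = 20  # all drivers: "less than 20%" (upper bound)
LINE_LIMIT = 10_000    # "less than 10 000 lines"

BYTES_PER_KB = 1024    # assumption: binary kilobytes


def implied_bios_kb():
    """BIOS size implied by 25KB being 5% of the total."""
    return HW_CODE_KB * 100 // HW_SHARE_PCT


def driver_budget_kb():
    """Upper bound on the combined size of all driver modules."""
    return implied_bios_kb() * DRIVER_SHARE_PCT // 100


def max_instruction_count():
    """Upper bound: every 68000 instruction is at least 2 bytes long."""
    return HW_CODE_KB * BYTES_PER_KB // 2


def min_average_instruction_bytes():
    """Average instruction length needed to stay under the line limit."""
    return HW_CODE_KB * BYTES_PER_KB / LINE_LIMIT


if __name__ == "__main__":
    print("implied BIOS size (KB):", implied_bios_kb())
    print("driver upper bound (KB):", driver_budget_kb())
    print("max instructions:", max_instruction_count())
    print("min average bytes per instruction:", min_average_instruction_bytes())
```

Under these assumptions the BIOS comes out at roughly 500KB and the drivers at under 100KB. The estimate of less than 10 000 lines holds as long as instructions average more than about 2.56 bytes, which seems plausible given that many 68000 instructions carry extension words.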

The interesting thing about these drivers is that their service interface is specified in the Green Book. Starting from this documented interface you can infer at a high level what the hardware is supposed to be doing as a consequence of all the device register accesses done by the BIOS. Actually figuring out the functions of the individual device bits does require disassembling the drivers and tracing their logic and dataflow but with a symbolic disassembler that is doable.
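One way to connect the documented interface to the register traffic is to tag every device access with the driver service call that was active at the time. The Python sketch below shows the idea; the `AccessTracer` class and the call names used with it are hypothetical and do not correspond to actual Green Book identifiers or to CD-i Emulator internals.

```python
from collections import defaultdict


class AccessTracer:
    """Groups device register accesses by the driver call that caused them."""

    OUTSIDE = "<outside driver>"

    def __init__(self):
        self.current_call = None
        self.by_call = defaultdict(list)

    def enter(self, call_name):
        """Emulator hook: the BIOS has entered a documented driver call."""
        self.current_call = call_name

    def leave(self):
        """Emulator hook: the driver call has returned."""
        self.current_call = None

    def record(self, kind, offset, value):
        """Emulator hook: a device register was read or written."""
        key = self.current_call or self.OUTSIDE
        self.by_call[key].append((kind, offset, value))

    def registers_touched(self, call_name):
        """Sorted register offsets accessed during the given call."""
        accesses = self.by_call.get(call_name, [])
        return sorted({offset for _, offset, _ in accesses})
```

Knowing from the specification what a call is supposed to achieve, and from a trace like this which registers it touches, narrows down what each register can mean before any disassembly starts. The bit-level details still require reading the driver code, as described above.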

Most drivers are (much) less than 2KB of machine language, meaning that the path between the documented software interface and the to-be-emulated hardware is typically no more than a few hundred lines of machine code. This is not hard to trace through.

The only driver size exceptions are the video driver and the CD+Audio driver. The video driver totals about 65KB of machine code, but as said above most of that is drawing code that is not relevant for hardware emulation. Moreover, the video hardware is actually fully documented starting from the Mono-I hardware revision, which uses the Motorola MCD212 VDSC chip. Earlier hardware revisions utilize two Philips SCC66470 VSC chips that are also documented and a back-end processor that is technically undocumented but whose functional aspects follow completely from the VSC documentation and the Green Book specifications. So the video driver is not a major hurdle.

That leaves the CD+Audio driver, arguably the most complex driver of the entire CD-i system. The size of this driver module varies somewhat with the actual chipset used (about 18KB for CDIC, 26KB for DSP and 20KB for CIAP, in all cases excluding downloaded microcode). There is some uninteresting stuff such as EDC/ECC error correcting code in there but most of this driver is actually relevant to emulating the hardware. This driver is therefore the single biggest stumbling block to good CD-i emulation, and it shows in the kinds of bugs and issues that CD-i emulators have.

The fact that the CD+Audio hardware actually has three different versions (in Philips hardware alone; non-Philips players use yet another few versions) makes proper emulation of all player models more work but not much harder. It helps that the actual drivers are descended from each other. Once you have a good understanding of cdapdriv (CDIC), it is not that hard to understand dspdriv (DSP) or ciapdriv (CIAP): the major structure is the same, only the details of driving the hardware differ.

All of the above is predicated on understanding drivers, i.e. *software*, not *hardware*. It therefore goes somewhat against the grain for MAME, which has traditionally followed a hardware-centric view. Also, a relatively deep understanding of the Green Book and parts of OS-9 is required.

Given those prerequisites I would not say that CD-i emulation is hard. It is a lot of work, sure; getting a good emulation requires deep study of a few thousand lines of 68000 assembly language. But very doable.

The approach used by CD-i Emulator as outlined above also explains why I am not interested in non‑HLE emulation of the various microcontrollers and DSP chips. It serves no purpose in playing CD-i discs: getting good HLE emulations of these chips so the BIOS can do its thing is enough. As a not insignificant side bonus to this, no chip decapping or other exotic hardware exercises are required for this approach.

1 comment:

  1. This discussion clearly illustrates the differences between CD-i Emulator and MAME CD-i Emulation, for many of us this is very interesting to read. For archiving purposes and reference, you can read the original words by TheMogMiner in this article: https://cdii.blogspot.com/2021/03/the-developer-behind-mame-cd-i.html - This article by cdifan is an answer to his statements. I am happy it triggered you guys to talk about it though. In the meantime, let us all buy a coffee for cdifan here to show our support for all the hard work: https://www.buymeacoffee.com/cdifan
