A C runtime for Visual C++ to target OS/2

19 Sep 2023

The first step in using Visual C++ to write OS/2 programs is writing a C runtime library. I've written a small one before for Win32, and lots of code from it could be reused on OS/2. Even so, writing a runtime library for an OS is much harder without prior experience working with that OS.

While writing the C runtime, two main challenges emerged.

First, programs expect a memory allocator, but a memory allocator in 16-bit code needs to be segment aware. I ended up writing a "small" allocator that allocates from within a single 64 KB segment, and a "huge" allocator that loops across segments looking for one that can satisfy an allocation. If no existing segment can, it dynamically adds one. As a future enhancement, this allocator should be able to add an unbounded number of segments and free each one on last use, at which point each process can have a single huge allocator and a modern-looking malloc and free.
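To make the two-level design concrete, here is a minimal sketch written as portable C rather than real 16-bit code. It is not the code from the repository: the names small_heap, small_alloc, huge_alloc and MAX_SEGS are hypothetical, malloc stands in for asking the OS for a new segment, and the small allocator is a simple bump allocator with no free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SEG_SIZE 65536u  /* size of one 16-bit segment */
#define MAX_SEGS 8       /* hypothetical fixed limit on tracked segments */

/* Hypothetical per-segment state: a bump pointer over one 64 KB block. */
typedef struct {
    uint8_t *base;   /* stands in for a segment selector */
    uint32_t used;   /* bytes handed out so far */
} small_heap;

static small_heap segs[MAX_SEGS];
static int seg_count;

/* "Small" allocator: carve from one segment, never crossing its end. */
static void *small_alloc(small_heap *h, uint16_t size)
{
    uint32_t need = (size + 3u) & ~3u;   /* round up to a multiple of 4 */
    assert(h->base != NULL);
    if (need == 0 || need > SEG_SIZE - h->used)
        return NULL;
    void *p = h->base + h->used;
    h->used += need;
    return p;
}

/* "Huge" allocator: try each existing segment, then add a new one. */
static void *huge_alloc(uint16_t size)
{
    if (size == 0)
        return NULL;
    for (int i = 0; i < seg_count; i++) {
        void *p = small_alloc(&segs[i], size);
        if (p != NULL)
            return p;
    }
    if (seg_count == MAX_SEGS)
        return NULL;
    /* Assumption: malloc stands in for requesting a segment from the OS. */
    uint8_t *base = malloc(SEG_SIZE);
    if (base == NULL)
        return NULL;
    segs[seg_count].base = base;
    segs[seg_count].used = 0;
    return small_alloc(&segs[seg_count++], size);
}
```

Because the size parameter is 16 bits wide, no single allocation can exceed one segment, which is what lets the huge allocator delegate every request to a small allocator.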

Second, as on Win32, the 16-bit environment requires assembly helper functions for integer math wider than the registers, in this case 32-bit math on 16-bit registers. This code could be lifted and adapted from Win32, but there are differences. The 16-bit model has an optimized path where an assignment operation, such as

x = x * 2;

can be implemented by passing a pointer to the variable to the helper routine, which then reads from and writes to that variable directly.

The implementation here seems odd in that it doesn't exploit the optimization: the DOS/Windows C runtime just loads the variable into registers, computes on registers, and stores the result back, so it's unclear why the compiler couldn't do these reads and writes itself. The optimization adds complexity, because the implementation needs to account for pointer size (far and near data pointers, far and near code pointers). The DOS/Windows C runtime implements this four times, one for each combination. The compiler is smart enough to use near data pointers even in memory models that normally use far pointers, because it knows the ds register is set correctly.
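As an illustration of what these helpers compute, here is a sketch in portable C rather than assembly. The names mul32 and mul32_inplace are hypothetical and are not the runtime's real entry points. The first builds a 32-bit product from 16-bit halves, and the second is the pointer-taking form, which does nothing more than load, compute, and store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit multiply built from 16-bit halves, the way a 16-bit CPU has to
 * do it. Only the low 32 bits of the product are kept. */
static uint32_t mul32(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)a, a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)b, b_hi = (uint16_t)(b >> 16);

    /* 16 x 16 -> 32 bit multiply of the low halves. */
    uint32_t low = (uint32_t)a_lo * b_lo;

    /* The cross terms land in the high word, so only their low 16 bits
     * survive. a_hi * b_hi would shift out of 32 bits entirely. */
    uint16_t cross = (uint16_t)((uint32_t)a_lo * b_hi + (uint32_t)a_hi * b_lo);

    return low + ((uint32_t)cross << 16);
}

/* Pointer-taking form for "x = x * m": load, compute, store back. */
static void mul32_inplace(uint32_t *value, uint32_t multiplier)
{
    assert(value != NULL);
    uint32_t tmp = *value;          /* load into "registers" */
    tmp = mul32(tmp, multiplier);   /* compute on registers */
    *value = tmp;                   /* store the result back */
}
```

In a real 16-bit runtime the pointer argument could be near or far, and the routine itself could be reached by a near or far call, which is where the four variants come from. Flat-model C has no way to show that distinction.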

This leads to another observation, looking at 16-bit memory models decades later. The C model exposes four basic memory models: Small (near code and near data), Medium (far code and near data), Compact (near code and far data), and Large (far code and far data). But as the calls above show, in the end, every function call and every data access is unique and can be made near if the caller knows that the code or data segment is already set to the right segment. The C model didn't give programmers a way to exploit this, for example by declaring that a function can only be invoked from functions in the same segment, or by explicitly switching the data segment so that a bundle of pointer accesses can become near.

I suspect (but haven't checked) that many similar C runtimes of the era were written in assembly in order to express segment locality. For example, a memcpy could take two far pointers, check if they're in the same segment, then copy using near accesses only. This cannot be expressed in C, but is a clear win in assembly.
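The same-segment check can at least be modeled in portable C, even though real near and far accesses cannot. In the sketch below everything is hypothetical: far_ptr is a made-up struct, the segments are plain arrays, and load_segment only counts how often a segment register would have been loaded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a far pointer: segment plus 16-bit offset. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

#define NUM_SEGS 4
static uint8_t memory[NUM_SEGS][65536];  /* simulated segments */
static unsigned seg_loads;               /* simulated segment register loads */

/* Stands in for loading a segment register. */
static uint8_t *load_segment(uint16_t seg)
{
    assert(seg < NUM_SEGS);
    seg_loads++;
    return memory[seg];
}

/* Copy len bytes. As with memcpy, the ranges must not overlap, and
 * neither range may run past the end of its segment. */
static void far_memcpy(far_ptr dst, far_ptr src, uint16_t len)
{
    assert((uint32_t)dst.off + len <= 65536u);
    assert((uint32_t)src.off + len <= 65536u);

    if (dst.seg == src.seg) {
        /* Same segment: one load, then offsets only (the "near" copy). */
        uint8_t *base = load_segment(dst.seg);
        memcpy(base + dst.off, base + src.off, len);
    } else {
        /* Different segments: both must be loaded (the "far" copy). */
        uint8_t *d = load_segment(dst.seg);
        uint8_t *s = load_segment(src.seg);
        memcpy(d + dst.off, s + src.off, len);
    }
}
```

The model only captures the decision. The actual saving would come from the instructions an assembly version can skip once it knows a single segment is involved.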

On a positive note, a Win32 C runtime needs to support Unicode and ANSI functions. It's nice to reduce that matrix and have a single set of string manipulation routines.

Next, porting a real program to the OS/2 Family.

The code so far is available at https://github.com/malxau/os2api/.