What is the difference between 16-bit, 32-bit, and even 64-bit compilers?
July 6, 2007 5:34 PM

programmingfilter: What is the difference between 16-bit, 32-bit, and even 64-bit compilers?

So I am in the midst of learning the C programming language and I have found that just because the language is a defined standard does not mean all compilers implement it the same way. Consider the Miracle C compiler. It is a 16-bit compiler. Most modern compilers, Visual C as an example, are 32-bit compilers. There are even 64-bit compilers. What are the implications of this difference? As I recall, code generated by Miracle C produced larger prediction errors than the same code compiled with Visual C. What difference would 16-bit vs 32-bit make to something like that?

Any links I should look at as a follow up? Thank you for your time.
posted by B(oYo)BIES to computers & internet (16 comments total)
The number of bits refers to the addressing. Basically, in a 16-bit compiler the pointers (pointers into addressable memory) are 16 bit (and probably the ints, etc). In a 32-bit compiler the pointers are 32-bit...and so on.
posted by ill3 at 5:37 PM on July 6, 2007


Well, if we're only talking bit size, the number is the number of bits in an int (well, in a CPU register, to be pedantic). The number of bits in an int is also likely to be (though the Standard doesn't mandate it) the number of bits in a pointer; the more bits in a pointer, the more uniquely addressable memory.


But don't worry about this. Instead, concentrate on writing Standard-conformant portable C, and let the compiler worry about translating that to efficient representations on your target architecture.

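Your compiler knows what the sizes are, because it publishes them in the Standard include file limits.h: CHAR_BIT is the number of bits in a char, and INT_MAX and friends give the range of each integer type. The number of bits in an int is sizeof(int) * CHAR_BIT.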

You can refer to this in your programs if you need to.
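A minimal sketch of what referring to limits.h looks like in practice. Note that the macro is spelled CHAR_BIT, and the function names int_bits and print_int_limits are made up for this illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Storage width of an int, in bits, on whatever target this is compiled for. */
unsigned int_bits(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}

/* Print what this particular compiler/target combination provides. */
void print_int_limits(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);
    printf("int:  %u bits, range %d .. %d\n", int_bits(), INT_MIN, INT_MAX);
    printf("long: %u bits, range %ld .. %ld\n",
           (unsigned)(sizeof(long) * CHAR_BIT), LONG_MIN, LONG_MAX);
}
```

Calling print_int_limits() from main on a 16-bit compiler and on a 32-bit compiler should show different widths and ranges for int, which is the whole difference being asked about.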

If what you're writing relies on bit sizes, use a typedef to make it portable, and then write portable code.

Here's a good summary from a respected C programmer.

If what you're writing depends on register size in other ways, you should probably be writing in assembly language for that particular machine architecture.

Since you're still learning C, make sure you have as Standard-conforming a compiler as you can reasonably get, concentrate on writing Standard and portable C, and don't worry about bit sizes. Actually, it's probably good to try writing code that compiles on several compilers and produces the same outputs given the same inputs.
posted by orthogonality at 5:55 PM on July 6, 2007


ill3's summary is roughly right, but not quite. The bit-ness of the compiler refers to the bit-ness of the machine for which the code is being created, and it does also relate to the physical size of certain standard types.

But it isn't quite as simple as he makes it sound. For instance, a pointer in one 16-bit mode for the x86 was 32 bits. The upper 16 bits were shifted to the right by 12 positions and added to the lower 16 bits to yield a 20-bit address. (That's the mode for the original 8086.)
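The 8086 address arithmetic described above can be sketched in a few lines. This is only an illustration of the calculation, not something you would run on an 8086; the function name real_mode_linear is hypothetical, and the 32-bit value is assumed to hold the segment in its upper half and the offset in its lower half.

```c
#include <stdint.h>

/* Split a 32-bit "far pointer" into segment (upper 16 bits) and
   offset (lower 16 bits), then form the 20-bit linear address. */
uint32_t real_mode_linear(uint32_t far_ptr)
{
    uint32_t segment = far_ptr >> 16;
    uint32_t offset  = far_ptr & 0xFFFFu;

    /* segment * 16 + offset. Moving the upper half right by 12 positions
       is the same thing as (segment << 4). The original 8086 has only
       20 address lines, so the sum wraps at 1 MB. */
    return ((segment << 4) + offset) & 0xFFFFFu;
}
```

One consequence is that many different segment:offset pairs name the same byte: 1234:0005 and 1230:0045 both come out as linear address 0x12345.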

There was a different 16-bit mode for the x86 where pointers were also 32 bits. In that one the top 16 bits were a selector that indexed a hardware descriptor table, which yielded a 24-bit base address, and the lower 16 bits were added to that. (That's an 80286 mode. Gad, the bad old days.)

As a practical matter, saying that a given compiler is "16-bit" or "32-bit" is shorthand that gives you some idea of how big an "int" is, but to find out the details for sure you have to consult the documentation.

This is the reason why most portable code has an include file that typedefs certain sizes to unambiguous names, e.g. int16, int32, int64. When porting to a new machine where sizes change, you do the research to figure out the compiler's way of declaring each of those, and modify the typedefs accordingly (probably with a branch of a conditional compile #if-#else tree).
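A rough sketch of what such an include file might look like. The names int16 and int32 follow the post; uint16, uint32 and the two checking typedefs are made up for this example, and picking the types from limits.h is just one way to build the #if tree (a real project might key on compiler macros instead). C99 compilers ship stdint.h, which does the same job with int16_t, int32_t and so on.

```c
#include <limits.h>

/* Pick a 16-bit type from whatever this compiler provides. */
#if SHRT_MAX == 32767
typedef short          int16;
typedef unsigned short uint16;
#elif INT_MAX == 32767
typedef int            int16;
typedef unsigned int   uint16;
#else
#error "no 16-bit integer type found for this target"
#endif

/* Pick a 32-bit type: int on most 32/64-bit targets, long on 16-bit ones. */
#if INT_MAX == 2147483647
typedef int            int32;
typedef unsigned int   uint32;
#elif LONG_MAX == 2147483647
typedef long           int32;
typedef unsigned long  uint32;
#else
#error "no 32-bit integer type found for this target"
#endif

/* Compile-time sanity check: the array size is -1, a compile error,
   if a typedef does not have the width its name promises. */
typedef char int16_width_check[(sizeof(int16) * CHAR_BIT == 16) ? 1 : -1];
typedef char int32_width_check[(sizeof(int32) * CHAR_BIT == 32) ? 1 : -1];
```

The rest of the program then uses int16 and int32 wherever an exact width matters, and only this one file changes when porting.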
posted by Steven C. Den Beste at 5:59 PM on July 6, 2007


What orthogonality said. You should be writing code that compiles and runs on whatever, although I wouldn't worry about 16 bit compilers unless you have some special hardware you want to run it on.
posted by delmoi at 6:07 PM on July 6, 2007


64-bit computing is really only useful at the moment for very specialized things. You'd probably know it if you were going to be doing it. SSE/SSE2/SSE3 already provide a lot of functionality for dealing with larger chunks of data... a 64 bit compiler with a 64 bit OS just makes it a bit easier. Practically, you don't have to worry about it, /especially/ if you use what's given to you in limits.h (INT_MAX/etc). A big exception to this is network programming, but there's plenty of documentation around that.

And Beste's answer isn't quite right, either... the output from a 64 bit compiler will only run on a 64 bit OS on a 64 bit platform. Fun, eh?

Oh, and your address space gets a lot bigger. Let's hope you don't have to worry about that one.
posted by devilsbrigade at 6:57 PM on July 6, 2007


Maybe it's useful to think of it this way: C may be standardized, but computer architecture is not. New architectures are developed as technology and needs change. So you need to take a constant high-level language and translate it into the variable low-level language of the processor. Somewhere the constant has to be mapped onto the current incarnation of the variable. This is accomplished by having a compiler that can produce code for a given processor's architecture. As bit length of registers is an aspect of architecture, you need different compilers for 16/32/64-bit architectures. (Though note the above-mentioned exceptions and stuff like how 32-bit x86 code can run on 64-bit x86-64 machines.)

People have been saying that you can think of the number of bits as the size of an int. This isn't typically true for 64-bit C compilers, AFAIK, which tend to use 32-bit ints and 64-bit longs and pointers.
posted by epugachev at 7:36 PM on July 6, 2007


Mostly echoing what SCDB says. "32-bit compiler" isn't an informative enough phrase to be really useful unless you're only talking about a small number of possibilities and just need some shorthand.

It's really not about the compiler so much as it is about the target environment, that is, what the compiler is compiling for. (It's common and easy to have a compiler running on a 64-bit system compiling for some other system entirely.) Some interesting features of a given target are how many bits wide a pointer is, how many bits wide an int is, and how many bits wide a long is. That directly affects how you program, because it tells you the range of possible values you can conveniently keep in a variable.

People have gotten pretty used to the "all the world's a VAX/386/whatever" mindset, in which ints are always 32 bits, as are pointers; but this isn't required by the C language standard, any more than it requires that your source code is in ASCII or that your path separator is a forward-slash.

In the specific case of Intel and AMD processors, there's another change which is much more important for most programs than the width of a register, and that is the number of registers available. The 64-bit modes of those CPUs have finally ditched the horrid, inefficient 1970s-era register sets in favor of a much more modern (1980s/90s) setup. This allows the program to spend much more time doing actual computation instead of shuttling things back and forth between memory and registers. However, this change is invisible to the programmer; you just get faster execution if your target environment has a reasonably-sized register file, because the compiler can generate more efficient code for it.
posted by hattifattener at 7:41 PM on July 6, 2007


You might be interested in these pages from a handbook for people moving their code from ILP32 (the typical 32-bit data model) to LP64 (the typical 64-bit data model):

http://docs.hp.com/en/5966-9844/ch03s01.html
http://docs.hp.com/en/5966-9844/ch03s02.html
posted by epugachev at 7:44 PM on July 6, 2007


The major take-home with compiler/architecture bitness is having to worry about overflowing the number of bits you have to work with, i.e. trying to store or jump to a value that is outside what the variable can hold or reference.

On 16-bit machines (literally museum pieces now, but you asked), the overflow happens at 2^16 for signed integers and 2^32 for unsigned integers (and addresses).

From what I heard (being a Mac programmer during this time meant I never had to deal with this Intel BS) various compiler conventions and tricks were used to work-around the 16-bit limitations (NEAR/FAR pointers, inline jump tables to reach addresses outside what a signed 16-bit offset could reach, etc).

32-bit programming puts these overflow conditions at 2GB & 4GB, so you really have to go out of your way to reach them (eg. working with DVD-sized disk images). IME 32-bit compilers can handle 64-bit integer and FP quantities transparently, so even that isn't a big deal nowadays.

If you are learning programming you should not be arsing around with 16-bitness. The last time I saw 16-bit code was 15 years ago.

This allows the program to spend much more time doing actual computation instead of shuttling things back and forth between memory and registers

The [post]-modern Intel/AMD x86 architecture of the past decade has had fancy register files etc. on the back-end to mitigate this busy-work required by the user-visible ISA.
posted by Heywood Mogroot at 7:56 PM on July 6, 2007


the output from a 64 bit compiler will only run on a 64 bit OS on a 64 bit platform.

...unless you're working with an embedded processor, in which case there is no OS.

On 16-bit machines (literally museum pieces now, but you asked), the overflow happens at 2^16 for signed integers and 2^32 for unsigned integers (and addresses).

Um, no. For a 16-bit signed integer, legal values range from -32768 to 32767. Unsigned 16-bit integers range from 0 to 65535. An "int" and an "unsigned int" will be the same size. However, a "long int" can be bigger. (It doesn't have to be, however. I've worked with target processors for which a "short int", a "long int", and an "int" were all the same size.)
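For anyone who wants to check the 16-bit ranges on a modern compiler, here is a small sketch using the C99 fixed-width types from stdint.h (assuming the target provides int16_t and uint16_t, which nearly all do). The helper names next_u16 and fits_in_int16 are made up for the example.

```c
#include <stdint.h>

/* Add 1 to a 16-bit unsigned value. Unsigned arithmetic is defined
   to wrap modulo 2^16, so 65535 + 1 gives 0. */
uint16_t next_u16(uint16_t x)
{
    return (uint16_t)(x + 1u);
}

/* Returns 1 if v can be stored in a signed 16-bit integer, else 0.
   INT16_MIN is -32768 and INT16_MAX is 32767. */
int fits_in_int16(long v)
{
    return v >= INT16_MIN && v <= INT16_MAX;
}
```

So fits_in_int16(32767) is true and fits_in_int16(32768) is false, while next_u16(65535) wraps around to 0.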

If you are learning programming you should not be arsing around with 16-bitness. The last time I saw 16-bit code was 15 years ago.

Or if you're working with embedded code. The Intel 8051 is still a very important processor, even though the instruction set for it was designed 30 years ago. Of course, the modern hardware implementation bears no resemblance whatever to the original silicon, but it's still essentially the same architecture.

And why would someone use such a primitive beast? Because they don't need any more, and because they can buy (or license) the 8051 for less than a buck a-piece.

Embedded software is a huge part of the software business, and it's really a lot different than doing applications for major operating systems. (I'm a bit sensitive about this because I spent most of my career working on embedded code.)
posted by Steven C. Den Beste at 10:42 PM on July 6, 2007


By the way, the best selling 32-bit processor architecture in the world is from ARM, not Intel. ARM doesn't own a fab and they don't sell silicon; they design cores and license them to others. But they're really good cores, small and low power and fast, with clean architectures, and they don't charge ridiculous amounts.

My former employer Qualcomm ships more 32-bit processors per year than Intel does. That's because they licensed an ARM core and put one (or two) of them in every ASIC Qualcomm produces and sells. And Qualcomm isn't ARM's biggest customer (though it's right up there).

ARMs are invariably used in embedded applications. (Or in PDAs, which I consider to be embedded apps.)
posted by Steven C. Den Beste at 10:56 PM on July 6, 2007


This thread has lots of theory and such, what it boils down to when programming in C is what the int variable actually means.

The code:
int foo; isn't defined very clearly in the official C spec. On 32 bit computers (almost all of them, minus the really old ones and the mostly new 64 bit type) it means that foo gets 32 bits of storage. But C doesn't force that: it could be 16 bits or 64 (the Standard only guarantees at least 16), all depending on what the compiler thinks it should be.

This has a few practical implications:
1) When you make numbers, you can overflow them without realizing. If you add 1 to INT_MAX, on most machines you get INT_MIN as the output and the program keeps humming along (strictly speaking, signed overflow is undefined behavior, so the compiler is allowed to do anything). The int32 and similar typedefs that give you an exact number of bits at least tell you where that limit is.
2) When you use malloc to assign memory you need to be careful. For instance, an array of 10 integers is how many bytes? Well, it depends. The solution is to use sizeof(int). sizeof is an operator built into the compiler, not a normal function call. Apart from C99 variable-length arrays, it gets evaluated only at compile time, and never again.
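Both points can be sketched in a few lines of C. This is only an illustration; checked_add and alloc_ints are hypothetical helper names, not standard library functions.

```c
#include <limits.h>
#include <stdlib.h>

/* Point 1: test for overflow before adding, using the limits.h macros
   INT_MAX and INT_MIN. Stores a + b in *out and returns 1, or returns 0
   (leaving *out alone) if the sum would not fit in an int. */
int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}

/* Point 2: let sizeof work out the byte count instead of assuming
   that an int is 2 or 4 bytes. Returns NULL on failure. */
int *alloc_ints(size_t n)
{
    /* Refuse empty requests and sizes whose byte count would overflow size_t. */
    if (n == 0 || n > (size_t)-1 / sizeof(int))
        return NULL;
    return malloc(n * sizeof(int));
}
```

alloc_ints(10) asks for 20 bytes where int is 16 bits and 40 bytes where it is 32 bits, without the source changing. The caller still has to free() the result.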
posted by cschneid at 11:16 PM on July 6, 2007


int foo; isn't defined very clearly in the official C spec.

The specification for the C language says that the compiler writers should use a length for an "int" which permits the compiler to produce the most efficient code.
posted by Steven C. Den Beste at 12:05 AM on July 7, 2007


I don't know anything about 16 bit compilers. For modern C compilers you need to know, as has already been mentioned, how many bits are in the various types, and that is underspecified by "x bit compiler". As I understand it:

32 bit Windows and Linux both use 32 bits for int, long, and pointer types.

64 bit Windows uses 32 bit int, 32 bit long, and 64 bit pointer types.

64 bit Unix uses 32 bit int, 64 bit long, and 64 bit pointer types.

All the other types remain the same (char = 8 bits, short = 16 bits) regardless of OS or pointer size.

Note that int is consistently 32 bits in length, but pointers can be 32 or 64 bits long. Since there is some disagreement about the size of a long, you probably should not use one.
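If you want to see which of these combinations your own compiler uses, something like the following sketch would report it. The function name data_model is made up; ILP32 and LP64 are the names used in the HP handbook linked above, and LLP64 is the name commonly given to the 64 bit Windows combination.

```c
#include <limits.h>
#include <stddef.h>

/* Classify the target by the widths of int, long, and pointers. */
const char *data_model(void)
{
    size_t i = sizeof(int) * CHAR_BIT;
    size_t l = sizeof(long) * CHAR_BIT;
    size_t p = sizeof(void *) * CHAR_BIT;

    if (i == 32 && l == 32 && p == 32) return "ILP32"; /* 32 bit Windows and Linux */
    if (i == 32 && l == 32 && p == 64) return "LLP64"; /* 64 bit Windows */
    if (i == 32 && l == 64 && p == 64) return "LP64";  /* 64 bit Unix */
    return "other";                                    /* e.g. a 16 bit target */
}
```

Printing data_model() from main is a quick way to find out what "x bit compiler" means for the one you actually have.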

What should you cast pointers into or vice versa? size_t
What if you want to subtract two pointers? ptrdiff_t
What if you care how many bits your integers have? int8_t int16_t int32_t int64_t uint8_t uint16_t uint32_t uint64_t

You will often see code that casts between an int and a pointer. That code is wrong. Use size_t.
posted by treeshade at 6:51 AM on July 7, 2007


Listen to treeshade. It will save you pain.
posted by amery at 10:59 PM on July 7, 2007


Actually, treeshade is wrong about that detail. If you want to hold a pointer, use intptr_t (which comes from stdint.h, and so is also available through inttypes.h along with the other types he mentions). size_t has a slightly different purpose, though on any sane architecture they'll be typedefs of the same underlying type.
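A tiny sketch of the intptr_t round trip, assuming a C99 compiler that provides intptr_t (the type is technically optional, though nearly every hosted compiler has it). The helper names ptr_to_int and int_to_ptr are made up.

```c
#include <stdint.h>

/* Store a pointer in an integer wide enough to hold it. */
intptr_t ptr_to_int(void *p)
{
    return (intptr_t)p;
}

/* Convert it back; the result compares equal to the original pointer. */
void *int_to_ptr(intptr_t v)
{
    return (void *)v;
}
```

Doing the same thing through a plain int happens to work where int and pointers are both 32 bits, and silently loses the top half of the address on the 64 bit models described above.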

However, there are many non-sane architectures in the world. Most of them seemed like a good idea at the time.
posted by hattifattener at 10:01 PM on July 12, 2007

