THE PSYCHOTIC STATE
Number 10
June 24, 1999

Why I Still Like C

This month I am more cranky than psychotic. Perhaps that's wrong: in my case psychosis and peevishness are always closely related. What I mean actually is that right now I am especially petulant, even more than normal for me, and more likely to say things that grate.

This month I'll be more controversial than usual. I am going to tell you why I still like programming in C.

I've had a pretty long love-hate relationship with these infernal computing machines. Like most geeks of my age, across the years I've picked up a certain familiarity with different programming languages. I've slung my share of PL/I, and I've labored through RPG II. I am, therefore, old enough to be cranky.

Here are some crotchety-old-man statements:

*) At some point in their life, programmers should learn to code assembler.
*) Ideology should be left to politics, not programming.
*) I still like C.

Wow, these are not popular positions nowadays. (There may be those among you who have noticed the correlation between the subject of this month's diatribe and the impending OOAD class. Read on, my position is more subtle than it appears so far.)

Let me take these three statements in turn, and you'll gain some insight into my deluded mindset.

*) Assembler: Suppose your car got towed into an auto mechanic who told you that the way he did his repairs was to plug in his computer and let the machine diagnose the problem. He told you that he really didn't understand what all those piston thingys did, and was baffled by a carburetor. You'd be pretty unimpressed, right? For a taxi driver to say these things is fine - all he has to do is drive - but you'd kind of like a mechanic to be a bit more familiar with what's going on under the hood.

I've met programmers who don't really know what it is that a computer actually does.
Programmers like abstraction, and sometimes they deal with the machine at such a high level that they have no idea what it actually is.

I learned C++ only about 3 years ago. About the same time I hired a programmer who was very steeped in object-oriented programming. It was amazing to watch him code. He would go to the whiteboard and draw a bunch of circles and lines. He would stare at these for quite a while before typing anything. I guess these were all the "IS A"s and "HAS A"s being charted out. I was impressed and a little bit intimidated.

Over time, we got into discussions. It turns out that this fellow never really understood why C++ had types. Why are there things called int and double and char? Why isn't everything just an object? I realized that he really didn't understand that a computer had registers, and that really, these didn't deal with objects, but with int and double and char. A computer's bus is 32 bits wide (or at least, his computer's was) and that meant that not all data was created equal.

Over time, the mutual holes in our knowledge became evident to both of us. He taught me some C++, and I lent him a book about the guts of the IBM PC.

Those who have spoken to me over the last few months know that strings have always been a peeve of mine. In BASIC and perl and Java, you are encouraged to forget what a character string really is. You don't pay attention to the fact that some operations will result in copying lots of stuff from here to there; other things will result in memory being allocated and deallocated.

Here's the point: if you've ever written a program in assembler, you know that there are some things that are hard for the computer to do, and some are easy. You'll have a pretty good idea what sorts of operations are generated when you write certain types of statements in the higher level language.

In my worldview, learning assembler is like knowing the operation of the four strokes of the internal combustion engine.
Most of the time they're not important, but if you're a really good mechanic they're always in the back of your mind, even when you're just driving.

*) Ideology: In the 1960s there was something of a revolution in computer science. It was called "structured programming", and I am just old enough to remember the tail end of the debate. On the one side there were the inventors of Algol, and later, Pascal, who thought that the language needed a limited number of flow control statements, such as while loops, if-then-else blocks, and the like. On the other side were the traditionalists (who at that time included Donald Knuth) who did all this with simple GOTO and CALL statements. The traditionalists felt that structured programs would be woefully inefficient.

The end of the story is that the traditionalists were vanquished, more or less. Every programming language nowadays, even FORTRAN 90, has these sorts of flow control blocks, not just GOTO statements.

I say "more or less", however, because the most ardent philosophies of structured programming also wound up in the dustbin of programming history. (To be garbage collected?) The early, ideological versions of structured programming were Maoist in their severity. The authors of Pascal not only frowned on GOTOs, they left them out of the language altogether. Furthermore, you weren't supposed to leave a loop anywhere but from the top or bottom, so the language didn't include any structures for doing so. The idea was that to allow such revisionist things would yield "buggy" code. There never was too much research to back up this contention, but I guess in the heat of battle with the traditionalists the revolutionaries went a bit overboard.

C was invented a bit later, and both these constructs were added back (e.g. "break" in C and "last" in perl). C then represents a later, more moderate evolution of programming style, freed from some of the strictures of the ideology.

Most programming languages come with a touch of ideological flavor. After all, if one thinks that all is right with the world, that the existing programming languages are fine, what's the point of designing a new one? BASIC was originally conceived to make FORTRAN "easier", though in the end I'd venture to guess that most people would find pure FORTRAN easier than the tangle that BASIC evolved into.

C++ was invented by Bjarne Stroustrup. If you read his book "The C++ Programming Language" you'll note that not all of it is devoted to the definition or explication of the language itself. There's a good deal of ideology in there. As OOP conquers the world, most of what Stroustrup put in C++ is being universally adopted. Java throws some out, however, and I wouldn't be surprised to see other things fall by the wayside as well.

*) C: What I like about C is that you can't really learn it without understanding what the computer is doing. To really get the language, you have to comprehend just exactly what is going on under the hood. Because of that, it's not an easy language for a novice to understand.

C makes you confront memory and pointers in a way utterly unlike other languages. In fact, that's one of the biggest complaints about it: it forces you to make sure that everything you do with the pointers is valid. Java, for example, completely removes pointers from the language. Except that it doesn't: quietly it makes _everything_ a pointer, and then sort of hides that fact from you. You make a trade: you don't have to worry about allocating and freeing memory, or about array references out of bounds. Instead, you inevitably have to concern yourself with the even more subtle question of how and when the automatic garbage collector operates. You forget totally that _every_ operation on a member of a class involves a complicated pointer dereference. Except when the performance is terrible; then you have to worry through it all over again.

In my view, C comes without too much ideology.
It's essentially just what the computer does - assembler - with a structured programming interface on it, to keep us from having to worry through loading and unloading registers and pushing and popping from the machine's stack.

So where does this all leave us? Am I really so much of a curmudgeon that I advocate coding everything in C?

Not at all. I'd be pretty displeased if we adopted C as AGENCY.COM's methodology. It's pretty hard to use. It's like a very sharp knife: you can cut yourself as well as whatever it is you're trying to slice. It's really easy to write broken or inflexible code in it.

But that's not the point. A programmer who is good at C is usually a good programmer, or at least s/he can become one. One can build a solid understanding of programming on top of a foundation of understanding what a computer does, what a pointer is, how to allocate and free memory. I think that everyone should strive to be good at C, even if we never use it.

Let me give another example. I believe that everyone who codes software for the web should understand HTTP. It's remarkable how often it comes up: I can think of at least two times in the last two months where I could not solve a problem without telnetting to port 80, typing

    GET / HTTP/1.0

and hitting return twice. Is it necessary to know how to do this? No. But you should; somewhere in the back of your mind it's just good to know what the web browser is really doing.

OOP is _a_ way to look at programming problems. It's a great way, a powerful way. We should all learn it. We will use it frequently. It is, however, not _the_ way to look at programs. Mathematicians, for example, don't think of data that way. Nor is data stored as objects in databases.

Learn C. Whenever you code, in whatever language, keep in mind what it is that you're telling the computer to do. Then use the right tool for the job.

Postscript: Two C puzzles:

*) Relatively easy: What is the difference between the variables p, q, and r?

    const char *p = "hello, world!";
    char * const q = "hello, world!";
    const char * const r = "hello, world!";

*) This should be easy, but I swear I've had huge arguments with people who should know better. If a program declares

    int i[10];

what is the meaning of the construction &i?