Programmer Musings: December 2004 Archives

December 12, 2004

Origin of The One, Right Place

Back in September, I talked a bit about The One, Right Place and what a useful concept it is. I'm now reading the second edition of Code Complete and ran across this concept once again. More importantly, McConnell references the book where I first read about the concept: Programming on Purpose: Essays on Software Design. This was the first of three books collecting together articles by P. J. Plauger from Computer Language magazine.

The fact that I had forgotten the origin of the One, Right Place tells me that it's time to re-read these books. Hopefully, I'll get a chance soon and can review them properly for my site.

Posted by GWade at 05:22 PM.

December 11, 2004

On Programming Languages

I find that I have a different viewpoint about programming languages than many programmers I've met. I feel that a programming language embodies three different things:

a tool for solving problems
an approach to thinking about solutions
a notation

I don't believe that there is any language that is the best or most powerful in all situations. Saying one language is better than another is like saying a screwdriver is better than a hammer: without knowing what problem you want to solve, the conclusion is meaningless. Each language has strengths and weaknesses. Although you can write any program in any language, some programs are easier to write in one language than another. That is one reason why learning multiple languages is a good thing: more tools for your toolbox.

More importantly, I think that learning many different kinds of programming languages improves your programming by making your mind more flexible. I believe that knowing several different kinds of programming language makes you a better programmer than knowing any one particular language does.

Many people in the programming field who are more experienced and probably smarter than I am have recommended learning multiple different kinds of languages to make yourself a better programmer. In many cases, I have seen people discard this advice based on the conclusion that all you need is to pick the best language and use it for everything. This approach falls apart when you realize that part of the purpose of using different languages is to improve how you think.

In my mind, it's the multiple kinds of languages that are important, not the particular language. The corollary is that knowing only one language does not really improve your programming ability, no matter what language it is. Knowing only Java is not much better than knowing only Fortran 77. In both cases, you are limited to only one view of the world.

This is one of the disagreements that I had with Paul Graham's book Hackers & Painters. He argues that since Eric Raymond says that learning Lisp will make you a better programmer, you should always use Lisp since it is the best tool for the job. I did learn Lisp and found it to be extremely enlightening. Not because I was more effective in Lisp, because I wasn't. But learning to program well in Lisp gave me a new perspective on approaching problems that I could take back to whichever language I was using.

The useful thing about Lisp is that it approaches problems from such a different direction that you are forced to see problems and solutions in a completely new light. This new approach drastically improved my ability to solve problems in both Perl and C++, since both of these languages provided the appropriate tools to allow me to exploit this new viewpoint. Amusingly enough, the macro concept that Graham makes such a big deal about in his book was not new to me. I'm an old Forth programmer, and that sort of approach is standard fare in Forth.

So programming languages are tools that we use and different languages encourage different ways of looking at problems and solutions. The third face of a programming language is notation. The language provides a way of expressing a solution. Not all notations, or ways of expressing a solution, are equivalent.

When you are doing massive amounts of text processing, regular expressions are a wonderful thing. Although regular expressions may look somewhat arcane, they are an extremely succinct and (sometimes) readable way to express a search for a pattern in a piece of text. It's not that you couldn't code the same search without regular expressions. You could write large amounts of special-purpose code to search for a particular pattern, but the regular expression syntax will almost always beat the handcrafted code in usefulness because the notation is expressive and succinct.
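As a rough illustration of the difference in notation, here is a small Python sketch. The task (finding times such as "05:22 PM" in a line of text) and the names find_times_regex and find_times_by_hand are made up for this example. Both functions are meant to return the same matches; the point is only how much more the handwritten version has to spell out.

```python
import re

DIGITS = "0123456789"

# The whole search, stated as a pattern: two digits, a colon,
# two digits, a space, then AM or PM.
TIME_RE = re.compile(r"[0-9]{2}:[0-9]{2} [AP]M")


def find_times_regex(text):
    """Return every non-overlapping time found by the pattern."""
    return TIME_RE.findall(text)


def find_times_by_hand(text):
    """The same search, written out as special-purpose code."""
    found = []
    i = 0
    while i + 8 <= len(text):
        c = text[i:i + 8]
        if (c[0] in DIGITS and c[1] in DIGITS and c[2] == ":"
                and c[3] in DIGITS and c[4] in DIGITS and c[5] == " "
                and c[6] in "AP" and c[7] == "M"):
            found.append(c)
            i += 8  # skip past the match, as the regex version does
        else:
            i += 1
    return found
```

Changing what counts as a time (say, allowing a single-digit hour) is a small edit to the pattern, but it means reworking the index arithmetic in the handwritten loop.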

You should learn multiple programming languages, and use each one enough to really understand it. There are three major reasons to do this.

Multiple tools allow you to select the best tool for a job.
Different languages encourage different approaches to problems and solutions.
Different notations help you express different solutions clearly.

The last two, of course, explain the first. All programming languages would be equal if they provided neither differing viewpoints nor different notations.

Posted by GWade at 11:24 PM.

December 01, 2004

Review of Hackers & Painters

Hackers & Painters
Paul Graham
O'Reilly, 2004

I was really looking forward to reading this book. I had read a few of Graham's essays in the past and found his ideas to be thought-provoking. I expected to find more of the same.

Instead, this book ranged over a lot of topics, not all of them big ideas. Some of the chapters did make me think, and some insights were definitely worth considering. Unfortunately, some of the chapters were based on extremely simplified views of the past couple of decades.

Graham's chapters on business and making money ignore the cut-throat business practices that many companies thrive on. He suggests that CEOs really are worth 100 times as much as the average employee, because they direct the company as a whole. Unfortunately, all you have to do is look at the news over the last few years to see that this belief is flawed. He describes Microsoft as being run by a brilliant and lucky technologist. In the process, he ignores the way they have often used legal clout and monopolistic practices to stay on top.

Graham's assertion that all applications will move to the web and all data will reside on the servers also seems a bit premature. He asserts that servers will be better maintained, backed up, and protected from viruses. We have seen plenty of cases where that wasn't true. He also dismisses the security implications of someone else owning your data as if it were a minor annoyance.

Graham is obviously enamored with Lisp. I agree it is a powerful language, but I'm not sure that it is the most powerful language, as he suggests. I find it funny that he says the ability to provide code for read-time, compile-time, and run-time is a feature in no other language except Lisp, because I used that ability for years in Forth. He also makes the mistake of assuming that the power of a language is a single dimension, without taking into account that some languages are more powerful than others for a particular job.

All in all, I found this book to be a disappointment. If you were already interested in Graham's views, I wouldn't try to convince you not to read the book. But, I'm afraid that I would not recommend the book.

Posted by GWade at 09:01 PM.

Mutexes Protect Resources

This is a very simple concept that seems to escape many people. In fact, most of the multi-threaded code I've seen shows a distinct misunderstanding of this basic fact. Very often people seem to think that a mutex should protect a section of code. I think this misconception may have been caused by the term critical section or critical region.

In my computer science courses, the concept of a critical region was used to describe a section of the code that must not be interrupted. Somewhat later, this concept was weakened to say that we did not want certain other sections of code to execute at the same time as this one. In some forms of programming, like embedded systems, real-time systems, and some kernel development, you really have sections of code that are timing sensitive and cannot be interrupted. Most of the rest of us don't really have those kinds of limitations.

In most cases, what we really need is a way to prevent two threads from performing conflicting operations on some resource. The aim is the same, but the focus is different. This change in focus helps solve many of the mutex-related problems I have seen in multi-threaded code.

Other than just forgetting to unlock a locked mutex, almost all of the mutex-related problems I've seen fall into three major categories:

Not locking everywhere locking is needed.
Trying to do too much inside the lock.
Smearing mutexes on some code to make it thread-safe.

Each of these bugs can be traced directly back to not connecting the mutexes with a resource. The first problem is usually caused by locking code that changes a resource without properly locking the code that reads the resource. I've also seen two (or more) mutexes used to lock different forms of access to the resource. This makes sense if you are preventing a particular piece of code from executing in two threads at once, but it does not properly protect the resource.
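A minimal Python sketch of the fix for that first category follows. RunningStats is a hypothetical class invented for this example, not code from any real project. The resource is the pair of fields, and a single mutex guards every access to it, reads included.

```python
import threading


class RunningStats:
    """Hypothetical shared resource: a count and a total that must stay in step."""

    def __init__(self):
        self._lock = threading.Lock()  # the one mutex for this resource
        self._count = 0
        self._total = 0.0

    def add(self, value):
        # Writer: both fields change together under the lock.
        with self._lock:
            self._count += 1
            self._total += value

    def mean(self):
        # Reader: takes the same lock, so it never pairs a new count
        # with an old total.
        with self._lock:
            if self._count == 0:
                return 0.0
            return self._total / self._count
```

If mean() skipped the lock, or used a separate "read" mutex, it could run between the two updates in add() and report a value that never existed.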

The second problem is often shown by either acquiring too many mutexes or attempting to execute a large amount of code without interruption by holding a mutex. Another symptom of this problem is acquiring the mutex in one function/method and then having to remember to call another function to release it. This is often caused by trying to protect code instead of a resource.
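One way to avoid both symptoms, sketched in Python under the assumption of a hypothetical ReportCache class (the name and its methods are invented for illustration): do the slow work before taking the lock, and let a with block pair the acquire and release inside a single method.

```python
import threading


class ReportCache:
    """Hypothetical cache of rendered reports, shared between threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reports = {}

    def store(self, name, rows):
        # The slow part touches no shared state, so it runs outside the lock.
        rendered = "\n".join(", ".join(str(cell) for cell in row) for row in rows)
        # The lock is held only for the dictionary update. The with
        # statement releases it on every exit path, including exceptions,
        # so no caller has to remember a matching unlock call.
        with self._lock:
            self._reports[name] = rendered

    def fetch(self, name):
        with self._lock:
            return self._reports.get(name)
```

The mutex here protects the dictionary, not the rendering code, which is why the rendering can stay outside it.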

The last problem mostly affects people who only vaguely understand what goes on in a multi-threaded program. These programmers start out by adding a mutex to prevent some problem that they are seeing, then acquire and release that mutex in various places until the program seems to work. When another problem comes along, they repeat this "successful" strategy to solve the new problem. Eventually, the code is peppered with mutexes, locks, and unlocks, and no one can predict how it will work.

In trying to solve these problems, you normally have to apply some relatively simple rules to your use of mutexes.

Only acquire the lock when necessary.
Acquire the lock everywhere that is necessary.
Don't hold the lock longer than necessary.
Don't acquire more locks than you need.

By focusing on the resource instead of the code that we don't want to interrupt, a simpler set of rules arises.

Only one mutex per shared resource.
Acquire the lock everywhere the shared resource is accessed.
Don't hold the lock longer than necessary.

The first important point is that only shared resources need protection. If the resource is never accessed from more than one thread, it doesn't need this kind of protection. The second important point is that access to the shared resource should be controlled through a small number of methods. This allows you to control the acquisition/release of the mutex much more carefully. If you can't control access to the shared resource and must have acquire/release code scattered around, then you need to revisit your design.
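Both points can be sketched in a few lines of Python. Everything named here (WordTally, count_words, tally_in_threads) is hypothetical and exists only for this example. Each worker counts into a dictionary that no other thread can see, so that dictionary needs no mutex. The shared tally is reachable only through two methods, and those are the only places its lock appears.

```python
import threading


class WordTally:
    """Hypothetical shared resource with exactly two access methods."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def merge(self, local_counts):
        with self._lock:
            for word, n in local_counts.items():
                self._counts[word] = self._counts.get(word, 0) + n

    def snapshot(self):
        # Return a copy so callers never hold a reference to the
        # protected dictionary itself.
        with self._lock:
            return dict(self._counts)


def count_words(lines, tally):
    # local_counts is visible to this thread only, so it needs no mutex.
    local_counts = {}
    for line in lines:
        for word in line.split():
            local_counts[word] = local_counts.get(word, 0) + 1
    # The shared tally is touched once, through its own method.
    tally.merge(local_counts)


def tally_in_threads(chunks):
    """Count words in each chunk on its own thread and return the totals."""
    tally = WordTally()
    workers = [threading.Thread(target=count_words, args=(chunk, tally))
               for chunk in chunks]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return tally.snapshot()
```

Because no code outside WordTally can reach the lock or the dictionary, checking that every access is protected means reading two short methods rather than searching the whole program.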

In my experience, the simpler the use of shared resources and their associated mutexes, the more likely you are to get them right. In single-threaded code, a simpler design and implementation is more likely to be correct. Simplicity is even more important when multiple threads are involved. Using mutexes only to protect resources seems to generate a simpler design.

Posted by GWade at 08:28 AM.