Anomaly ~ G. Wade Johnson

December 20, 2005

Review of Secure Coding in C and C++

Secure Coding in C and C++
Robert C. Seacord
Addison-Wesley, 2006

One very real problem in software today is the rise in security exploits of one kind or another. Gone are the days when we can just assume that no user of our software will try to break it, or use the software to compromise an entire system. The more immediate problem is that most of us have no training in preventing security vulnerabilities in our code.

This book does a fairly good job of covering a number of sources of security problems and explaining how they can be exploited. Using this information and the recommended practices in the book, you can make your code much more secure. The book has a chapter devoted to each of several vulnerabilities. The author examines the reason for the problem, how it is likely to manifest, and the kinds of exploits that can be applied. He then makes suggestions for tools and techniques to use to reduce these problems.

The book covers topics such as strings, integers, dynamic memory, and formatted I/O, as well as others. In each case, the book carefully explains where the potential problems lie. In some cases, the author shows actual examples from code that was in live use. Although the delivery can be a bit dry at times, the material itself is sometimes scary in its implications.

Possibly the most important chapter in the book is the final one, Recommended Practices. This chapter covers more than just techniques for solving a particular kind of pointer bug. It covers overall strategies, such as threat modeling, data sanitization, and defense in depth. If you have any background in computer security, these concepts will probably be familiar. If not, they are the most important things for you to learn from the entire book.

This book should be a requirement for anyone who develops software that will be used by more than just his co-workers. This includes software available over the web.

Posted by GWade at 09:56 PM. Email comments | Comments (0)

October 10, 2005

Review of Beyond the C++ Standard Library

Beyond the C++ Standard Library
Bjorn Karlsson
Addison-Wesley, 2006

If you've been programming in C++ in the past few years you've probably heard of the Boost library. Boost is a large peer-reviewed set of classes and libraries designed to augment the C++ standard library. Beyond the C++ Standard Library serves as an introduction to some of the classes and libraries available through the Boost project.

Although the book does not cover every library developed by the Boost community, it does introduce many of the most generally useful. Several of the libraries covered in this book have already been accepted by the C++ Standard committee as official extensions to the standard library under Technical Report 1 (TR1). So these libraries will soon be coming to a C++ compiler near you.

Almost every chapter begins with two important sections:

  • How Does the X Library Improve Your Programs?
  • How Does the X Fit with the Standard Library?

The first section shows the benefits that you and your programs will gain from using this library. This section serves to describe the library's reason for existence. In some cases (like the shared_ptr classes), the purpose of the library is to replace hundreds of slightly incompatible, possibly broken, shared pointer implementations with a set that really works. In other cases (like the Lambda library), the purpose may be to provide you with functionality that you may not have even believed was possible in C++.

The second section shows how the library integrates with or enhances the functionality in the standard library. This may be the biggest difference between the Boost libraries and those written in your average programming shop. Many of the people who work with the Boost project were part of the C++ standards process. This means that they truly understand the standard library. These sections describe difficulties with the standard library that the Boost library is meant to overcome, or ways in which the standard library can be improved with the benefit of almost a decade of real-world use.

These two sections are probably the most important part of each chapter. The rest of each chapter provides an overview of the library and some examples of usage. The overview lists the important members of each class with descriptions. The examples are generally useful and clear.

This book is definitely more than a rehash of the documentation at the Boost site. Each library is covered with enough information and examples to get you started working with the library.

If you are thinking about using the Boost libraries in any of your projects, I would recommend this book. I would also recommend the book if you are interested in becoming familiar with the TR1 additions that will soon be part of the standard.

Posted by GWade at 08:25 PM. Email comments | Comments (0)

August 18, 2005

Review of C++ Common Knowledge

C++ Common Knowledge
Stephen C. Dewhurst
Addison-Wesley, 2005

The subtitle of this book sums it up nicely: Essential Intermediate Programming. If someone has not mastered, or at least understood, the material in this book, he or she is still a junior C++ programmer. Although this material is necessary, it is not sufficient to make someone an intermediate-level C++ programmer.

As explained in the preface, Dewhurst wrote this book partially to save himself from explaining these same topics every time he deals with a new set of programmers. He also explains that he has not covered every important topic in the book. In order to make the book more usable, it has been reduced to 63 core points that are either central to your understanding of C++ or often misunderstood.

Although he does not go into extreme depth on every one of these subjects, Dewhurst does capture enough of the details to help you understand why the point matters and why it works the way it does. I have read almost all of these items in other books, sometimes in more detail. But, there were still a few points that I feel I understand better after his explanations.

Dewhurst begins with some topics that anyone who programs in C++ should be at least somewhat familiar with ("Data Abstraction", "Polymorphism", etc.) and works up through "Template Argument Deduction" and "Generic Algorithms". None of these chapters can be considered the definitive, be-all-end-all explanation of its topic. However, each is concise and covers the minimum you need to understand about that topic.

The only reason I found this book to be less useful than many of the books I've read recently is that I already understand most of the topics well. A few of the template chapters extended my understanding a bit, but the rest were covered in more detail elsewhere. That being said, I can see this book being of real use to any junior or intermediate level C++ programmer. If you are a senior level programmer, you might find this book useful as a reference for the more junior programmers you work with. I also think this book helps a more senior programmer recognize some of the points where a junior programmer is likely to have problems.

Posted by GWade at 05:25 PM. Email comments | Comments (0)

August 12, 2005

Review of Exceptional C++ Style

Exceptional C++ Style
Herb Sutter
Addison-Wesley, 2005

Once again, Herb Sutter provides us with a set of problems that teach important lessons in C++. Each problem in the book covers some problem that a C++ programmer might see in a particular program or design. As Sutter solves each problem, he gives insight into the concepts surrounding the problem and the pitfalls that may trip up an unwary programmer.

As usual for one of his Exceptional C++ books, Sutter spends time covering some areas of C++, like exceptions, memory management, and templates, where programmers often have problems. But, unlike many books that teach the syntax of the language, he goes deeper to improve your understanding of how various features work and why. His explanation of exception specifications and why you should not use them is extremely well done. Sutter also explains why the standard streams could be considered a step backward from printf, but that there is hope on the horizon for solutions that support the best of both worlds.

Sutter ends the book with a critique of the std::string class, showing how it could have been better designed based on what we now know of C++. For many programmers, this section alone should be worth the price of the book. The author goes through many of the design tradeoffs with an eye towards simplifying the interface without loss of functionality or efficiency. It is rare that you get a chance to sit with an expert programmer and get him to explain the design of a non-trivial class.

In addition to hard-core technical information, there is a fair amount of style advice and fun examples (how many '+' characters can you write in a row in a legal C++ program?), all of which give you more insight into this powerful language. And throughout the whole book, Sutter's interesting humor lightens what could otherwise be a very heavy read.

If you are trying to improve your understanding of C++, this book will explain parts of the language that were never quite clear. Although I would not recommend this book for a novice C++ programmer, I think any intermediate to senior C++ programmer would be well-served by reading it. I plan to recommend it to the C++ programmers I know.

Posted by GWade at 08:44 PM. Email comments | Comments (0)

July 10, 2005

Review of Effective C++, Third Edition

Effective C++, Third Edition
Scott Meyers
Addison-Wesley, 2005

I was really excited when I found out that Scott Meyers was releasing a new edition of Effective C++. The first edition provided a major step on the path for many of us from writing code in C++ to actually programming in the language. What surprised me was the fact that this edition was a complete rewrite of the original. As Meyers puts it, the things that programmers needed to know fifteen years ago to be effective are not the same things we need to know now. Many of the items from the original are no longer new or different; they are the accepted ways to program in C++.

As always, Meyers provides practical advice and sound explanations of his reasoning. Meyers also has an extremely readable writing style that does not get boring after the first chapter. In the first edition, some of the advice went against standard practice of the day, but Meyers did such a good job of explaining his rationale that you had to consider his position. In the latest edition, I found less of his advice to be surprising, but every bit as important. Even though many others have explored some of this territory, I see lots of examples of programmers who violate many of these rules and later regret them. Like the earlier editions, Effective C++, third edition serves as a great description of best practices for C++. Furthermore, if you haven't seen these best practices before, or you were not convinced by seeing some of this elsewhere, Meyers will make a good attempt to convince you.

As with the earlier editions, each item covers a specific aspect of programming in C++ that you must be aware of in order to make effective use of the language. Although it would be possible to gain some of this information by carefully reading the standard reference works, it would be hard to beat the clarity and focus of this book.

One of my favorite items in the book is number 1, where Meyers describes the richness of C++ and some of the pain that comes from dealing with the different facets of this complex language. He suggests treating C++ more as a set of related languages than as a single entity. In the process, he manages to reduce some of the syntactic and semantic confusion by showing consistency within each sublanguage. I am not doing his description justice; you need to read Meyers' version to be properly enlightened.

Meyers does not just focus on usage of the C++ language; he also touches upon important idioms applying to the standard library. He spends some time on classes expected to join the standard library in the near future, like the Technical Report 1 (TR1) extensions, and suggests checking out the classes available on Boost.org as a way to see where the language is going.

Another point that impressed me about this book is the level of professionalism in the editing. Lately it seems that spelling and grammar errors have become the norm in technical books. Personally, I find these kinds of errors distracting. I found one typo in the entire book and only a couple of spots where I needed to reread the text to understand what Meyers meant. In today's environment of hundreds of computer books coming out in a year, it is particularly nice to see this kind of attention to detail.

If you program in C++, you need to read this book. Contrary to what you might expect, it is not a simple rehash of the earlier editions. Instead it is more of a completely new book in the series. Novice programmers can learn the correct ways to use the language. Experienced programmers can gain better arguments for best practices they are trying to establish and insights into practices they may not be as familiar with.

I can't possibly recommend this book too highly.

Posted by GWade at 10:38 PM. Email comments | Comments (0)

May 30, 2005

Joel on Exceptions

As I said in Joel on Wrong-Looking Code, I find Joel on Software to be a good source of information on programming. However, I don't always agree with him. In Joel on Software - Making Wrong Code Look Wrong, Joel states that he does not like exceptions. He also refers to a previous article, Joel on Software - Exceptions, where he first talked about his dislike of exceptions.

Rather than attempting to explain his rationale, I suggest that you read his essays before continuing. I think he makes some valid points and argues his case well, even though I disagree with his conclusion. I do not intend to try to prove him wrong, but to give a different view of exceptions based on my experience and understanding.

Joel's arguments against exceptions fall into five categories:

  • the goto argument
  • too many exit points
  • it's hard to see when you should handle exceptions
  • not for mission critical code
  • they're harder to get right than error returns

And, just to show that I'm not blindly defending exceptions, here are a few that he didn't cover.

  • not safe for cross-language or cross-library programming
  • possibly inconsistent behavior in threaded programs
  • lack of programmer experience

Let's go over these points one by one.

The goto Argument

Let's start with the goto argument. Joel states that exceptions are bad because they create an abrupt jump from one point of code to another (from Exceptions, above). He does have a point. However, the if, else, switch, for, and while all create jumps to another point in the code. For that matter, so does return.

The original letter about the go to did not refer to every kind of branch in code. In fact, Dijkstra stated that limited forms of branching like if-else statements, or repetitions like while or for, don't necessarily have the same problems as the go to. His point was that unconstrained jumps to arbitrary locations make reading the code almost impossible.

The thing that separates an exception from a go to is the limits placed on where the exception can transfer control. A thrown exception is, in many ways, just a variation of return. It may return through multiple levels (unwinding the stack as it goes), but an exception does not jump arbitrarily in your code.

Obviously, we wouldn't be willing to give up if in order to remove an abrupt jump to another place in the code. The difference between the goto and the if is limits. The if is constrained in where it is allowed to jump. Likewise, any exception is only allowed to transfer control back up the call stack.

While it is obvious that throwing an exception is more powerful than an if, it should also be obvious that it is much more constrained than an arbitrary go to construct. So the comparison of exceptions to go to is at best an appeal to emotion. Unfortunately, it is an argument I've seen far too often.

It's Hard to See When You Should Handle Exceptions

In Joel on Software - Making Wrong Code Look Wrong, he states that one of the reasons he doesn't like exceptions is because you can't look at the code and see if it is exception-safe. Given his earlier statement about how important it is to learn to see clean, I find this statement particularly interesting.

Although I understand Joel's commentary about the difficulty of making certain that code is exception-safe, I maintain that good use of exceptions can help create cleaner code that is easier to maintain. The key to creating exception-safe code is using the guarantees provided by the exception mechanism. Unlike Java, C++ makes a hard guarantee: any local variables are destructed properly when an exception leaves the scope they were created in. This has led to the resource acquisition is initialization (RAII) idiom. I'll use Joel's example to explain:


dosomething();
cleanup();

This actually jumps out at me as incorrect code. What am I cleaning up? This sounds like either a resource cleanup issue or a dumping ground for a collection of things that happen to be done at this time. If it's the first case, the code probably looks more like this:


setup();
...
dosomething();
cleanup();

In which case, the RAII approach would be to create an object that manages the resource. The equivalent of setup() is the object's constructor. The equivalent of cleanup() is the object's destructor. This changes the code to:


Resource res;
...
dosomething();

This code is safe in the face of exceptions and cleaner to boot. More importantly, in my opinion, the setup and cleanup of the resource is now defined in the one right place, the definition of this class. Now if I ever use the resource, I'm guaranteed that the cleanup code will be called at the appropriate time. There's no need to remember to call cleanup() at the right times.
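
Here is a minimal sketch of what such a class might look like. Joel's example does not say what the resource is, so the body of Resource, the events log, and the useResource() wrapper are all hypothetical stand-ins that only record the order of calls.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log so the order of calls is visible; not part of the idiom.
std::vector<std::string> events;

// Illustrative RAII wrapper: the constructor plays the role of setup(),
// the destructor plays the role of cleanup().
class Resource {
public:
    Resource() { events.push_back("setup"); }
    ~Resource() { events.push_back("cleanup"); }

    // An object that owns a resource normally should not be copied.
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};

// Stand-in for dosomething(); throws when asked to fail.
void dosomething(bool fail) {
    events.push_back("dosomething");
    if (fail) {
        throw std::runtime_error("dosomething failed");
    }
}

// Returns true if dosomething() completed and false if it threw.
// Either way, the destructor of res runs when the try block is left.
bool useResource(bool fail) {
    try {
        Resource res;
        dosomething(fail);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

Whether dosomething() returns normally or throws, the log ends with the cleanup entry, and there is no explicit cleanup call anywhere in useResource().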

Unfortunately, this idiom will not work for Java, since that language specifically does not promise timely destruction of objects. In that case, as Joel points out, you are required to use the try ... finally mechanism to ensure cleanup. For this reason, Java does not always allow exceptions to clean up the code.

Not For Mission Critical Code

Joel goes on to suggest that exceptions would not be useful on larger or more critical systems. In fact, a large portion of my career has been focused on long running programs (including systems with 24x7 uptime requirements). What I've found is that the code without exceptions tends to fail in strange ways because error returns are tested inconsistently, and error codes are handled differently in different places in the code. Keeping all of these different places consistent requires a large investment in time as the code changes.

Now, one could make the argument that if all of the error returns were tested and if all of the errors were handled consistently and properly, the error-return based programs would be perfectly robust. Then again, you could also make the argument that if the exceptions were all handled correctly, the code would be perfectly robust, too.

The main difference here is that if the programmer does nothing, error returns are lost by default and exceptions terminate the code by default. For this reason, exceptions tend to be more likely to be handled because they would otherwise terminate the code. If an error return check is accidentally left out, no one may notice and testing may not point out the problem. If an exception is not caught, the program is likely to terminate during testing and, therefore, the problem will be found.
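
A small sketch of that difference in defaults, using made-up function names (saveWithErrorCode, saveWithException, and the two careless callers are hypothetical, chosen only for illustration):

```cpp
#include <stdexcept>

enum ErrorType { SUCCESS, FAILURE };

// Hypothetical operation that reports failure through its return value.
ErrorType saveWithErrorCode(bool fail) {
    return fail ? FAILURE : SUCCESS;
}

// The same operation reporting failure through an exception.
void saveWithException(bool fail) {
    if (fail) {
        throw std::runtime_error("save failed");
    }
}

// The caller forgot to check the return value: the failure is silently lost
// and this function happily reports success.
bool carelessWithErrorCode() {
    saveWithErrorCode(true);   // return value ignored
    return true;
}

// The caller forgot to handle the exception: the failure is not lost.
// It propagates to whoever called this function, or terminates the
// program if nobody catches it.
bool carelessWithException() {
    saveWithException(true);   // throws
    return true;               // never reached
}
```

In both cases the programmer did nothing, but only the error-return version continues as if the save had worked.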

I do realize that either error returns or exceptions can be ignored or handled badly by any particular programmer. But in my experience, exceptions have been harder to ignore. While I probably would not say that exception code is easier to get right than error return handling, I would say that error return code is easier to get wrong.

Harder to Get Right Than Error Returns

Handling error returns explicitly in the code tends to increase the amount of error handling code spread throughout the program, to the point that it is hard to see the actual code logic for the error handling logic. I agree with Joel that your best programming tool is the pattern recognition engine between your ears. I find it much easier to understand and fix code logic if I can see it, without a large amount of extraneous code in the way.

In many ways, error return checking and handling code (if done completely) can obscure the main logic of the code to the point that maintenance becomes a nightmare. To extend Joel's cleanup example a bit, let's see what error return checking code does when you need to perform cleanup.


setup();
doSomething();
doSomethingElse();
doOneMoreThing();
cleanup();

Let's say that each of the functions above returns an error condition that we need to check. Let's assume further that we need to return the error from this routine. Finally, we will need to perform the cleanup in any case. So the code might end up looking like this.


ErrorType err = SUCCESS;
if(SUCCESS != (err = setup()))
{
    // no need to clean up.
    return err;
}
if(SUCCESS != (err = doSomething()))
{
    cleanup();
    return err;
}
if(SUCCESS != (err = doSomethingElse()))
{
    cleanup();
    return err;
}
if(SUCCESS != (err = doOneMoreThing()))
{
    cleanup();
    return err;
}
cleanup();

or like this


ErrorType err = SUCCESS;
if(SUCCESS != (err = setup()))
{
    // no need to clean up.
    return err;
}
if(SUCCESS != (err = doSomething()) ||
   SUCCESS != (err = doSomethingElse()) ||
   SUCCESS != (err = doOneMoreThing())
)
{
    cleanup();
    return err;
}
cleanup();

or any of several variations on this theme. In any case, the call to cleanup() ends up duplicated.

In my experience, anything that is duplicated will probably be left out at some point during maintenance. More importantly, if the cleanup is more than one statement instead of a single function call, the different places where it is written are likely to get out of sync. Careful code reviews can reduce this problem, but it still hangs over the process.

More importantly, to my mind, is the fact that the actual logic of the program is now obscured by the mechanics of the error handling. This is the problem that exceptions were created to reduce.

If we once again apply the RAII idiom, this code becomes:


Resource res;

doSomething();
doSomethingElse();
doOneMoreThing();

In this code, it is much easier to see the main line of the logic. What you can't see is exactly what exceptions might be thrown. But the original code did nothing specific with the error returns; it just passed the error condition on to the calling code. This code does the same; it just does it in terms of exceptions.
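
To make the comparison concrete, here is a sketch of the exception version with the pieces filled in. The trace log, the step() helper, and the failAt parameter are hypothetical additions that let us pick which call fails and see what ran; they are not part of the original example.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log of what ran, in order.
std::vector<std::string> trace;

// Constructor stands in for setup(), destructor for cleanup().
class Resource {
public:
    Resource() { trace.push_back("setup"); }
    ~Resource() { trace.push_back("cleanup"); }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;
};

// Illustrative helper: records the step and throws if told to fail.
void step(const std::string& name, bool fail) {
    trace.push_back(name);
    if (fail) {
        throw std::runtime_error(name + " failed");
    }
}

// The routine from the text. failAt selects which step throws (0 = none).
// There is no error handling code here: a failure simply propagates to the
// caller, and res is destroyed on the way out.
void routine(int failAt) {
    Resource res;

    step("doSomething", failAt == 1);
    step("doSomethingElse", failAt == 2);
    step("doOneMoreThing", failAt == 3);
}
```

A caller that wraps routine(2) in a try block receives the error from doSomethingElse, doOneMoreThing never runs, and the cleanup entry still appears in the log. That is the same behavior the error-return versions had to spell out by hand.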

Cross-Language/Library Programming

Unfortunately, the C++ standard cannot promise compatibility of exceptions across language boundaries. There are almost no guarantees if your C++ code calls a C routine that uses a C++ callback that throws an exception. Obviously, this is not a situation you'll run into every day, but it is a problem. This issue may also apply to JNI code that uses C++.

A bigger problem is the fact that exceptions are not necessarily binary-compatible between modules, even in C++. It is possible for a library compiled with different options to produce exceptions that do not quite match the exceptions from the rest of the code. (See C++ Coding Standards for the details.)

In the worst case, this just means dealing with error codes at the boundaries of these two cases. So it is no worse than dealing with error returns normally.
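
One common way to do this is to catch everything in the function that sits on the boundary and translate it into an error code. The sketch below assumes a made-up internal routine (parsePositive) and made-up error codes; only the shape of the boundary function matters.

```cpp
#include <new>
#include <stdexcept>

// Hypothetical error codes exposed to callers on the other side of the boundary.
enum ErrorType {
    SUCCESS = 0,
    ERR_INVALID_ARGUMENT,
    ERR_OUT_OF_MEMORY,
    ERR_UNKNOWN
};

// Internal C++ routine (illustrative); free to throw.
int parsePositive(int value) {
    if (value <= 0) {
        throw std::invalid_argument("value must be positive");
    }
    return value * 2;
}

// Boundary function with C linkage. No exception is allowed to leave it:
// every failure is converted to an error code before returning.
extern "C" int parse_positive(int value, int* result) {
    if (result == nullptr) {
        return ERR_INVALID_ARGUMENT;
    }
    try {
        *result = parsePositive(value);
        return SUCCESS;
    } catch (const std::invalid_argument&) {
        return ERR_INVALID_ARGUMENT;
    } catch (const std::bad_alloc&) {
        return ERR_OUT_OF_MEMORY;
    } catch (...) {
        return ERR_UNKNOWN;
    }
}
```

Inside the library the code can still use exceptions freely. Only this thin layer needs to deal in error returns.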

Behavior in Threaded Programs

Some C++ implementations may have difficulties with exceptions thrown in threaded code. In the past, I've seen exceptions improperly propagated into the wrong thread. Most modern C++ compilers and libraries should have solved these problems. As far as I know this has never been a problem in Java.

Lack of Programmer Experience

This is probably the biggest problem with exceptions. A large number of programmers are not very experienced in using exceptions correctly. I've seen exceptions used in cases where an if or switch would have been much more appropriate. I've seen cases where an exception was caught as a generic exception and then a series of type tests were performed in the catch block. I've seen people use empty catch blocks to ignore exceptions.

The solution to this problem is education and experience. Unfortunately, this one takes time. Fortunately, many of the bad practices above can be found relatively easily in code reviews.
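
As an example of the kind of thing a review can steer people toward, here is a sketch of letting the catch clauses select on type instead of catching a generic exception and testing its type by hand. The load() function and its mode argument are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical operation that can fail in different ways.
void load(int mode) {
    if (mode == 1) {
        throw std::out_of_range("index out of range");
    }
    if (mode == 2) {
        throw std::runtime_error("I/O failure");
    }
}

// Each kind of failure gets its own handler. Handlers are tried in order,
// so the general std::exception handler goes last.
std::string describe(int mode) {
    try {
        load(mode);
        return "ok";
    } catch (const std::out_of_range&) {
        return "bad index";
    } catch (const std::runtime_error&) {
        return "runtime problem";
    } catch (const std::exception&) {
        return "other failure";
    }
}
```

The language already does the type dispatch, so there is no need for a chain of type tests inside a single catch block.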

Conclusion

I would like to conclude this (overly long) essay with a summary. My goal in this essay was not to prove Joel Spolsky wrong in his dislike of exceptions, but to give an alternate view. My experience has shown (to me at least) that there is quite a bit of truth in the arguments against exceptions. On the other hand, I have seen several cases of exception usage increasing the robustness and correctness of large and small programs.

To link this back to Joel's essay, the main issue here is the need to learn to see well-done exception handling. This is something that our industry needs a lot of practice with. However, exceptions are a powerful tool and ignoring them may not be a viable technique in the long run.

Posted by GWade at 04:00 PM. Email comments | Comments (0)

May 23, 2005

Joel on Wrong-Looking Code

I want to preface this article with the comment that I really enjoy reading Joel on Software. I find his essays to be knowledgeable, well-thought-out, and well presented. While I don't always agree with his conclusions, I always feel that any of his articles is worth a read.

So, recently I ran across Joel on Software - Making Wrong Code Look Wrong. The title looks like something I've tried to explain to junior programmers before, so I figured I'd take a look. If nothing else, I thought he would give me new arguments to use. You should definitely read his essay before continuing. I want to be sure not to ruin his set up and delivery. I also want to make sure that you understand his points before I explain where and why I disagree with him.

The essay started off with some commentary on code cleanliness and an interesting anecdote from his early work life. So far, so good. Then, he throws in his curve ball. In this essay, he plans to defend Hungarian Notation and criticize exceptions. I have to admit that I thoroughly dislike Hungarian Notation and I find exceptions to be a pretty good idea. If anyone else had stated this as a goal, I probably would have just left the page. I've got plenty of other stuff to read. But, this I had to see.

After an interesting example, Joel goes on to explain that what we all know as Hungarian Notation is actually a corruption of the original Hungarian Notation. The original does not add a wart for type, which is useless and leads to bad code. It adds a prefix for the higher level concept that the variable represents. A good simple example is using col on the front of any variable referring to a column, and row on variables referring to rows. This makes it obviously wrong when you assign a row value to a column. The goal is to add semantic information into the variable name. Unlike syntactic information, there is no way for the compiler to help you with the semantics. This version of Hungarian Notation actually has the possibility to help programmers, rather than just creating unreadable variables.
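
A tiny sketch of the row and col prefixes in use. The function names and the width parameter are my own invention for illustration, not taken from Joel's essay.

```cpp
// All of these values are plain ints, so the compiler accepts any mix of them.
// The row and col prefixes are the only record of what each one means.
int cellOffset(int rowCell, int colCell, int width) {
    // Reads correctly: a row times the grid width, plus a column.
    return rowCell * width + colCell;
}

int nextColumn(int colCurrent) {
    int colNext = colCurrent + 1;      // col computed from col: looks right
    // int colNext = rowCurrent + 1;   // col computed from row: looks wrong at a glance
    return colNext;
}
```

The commented-out line would compile without complaint if rowCurrent existed, which is exactly why the prefix has to carry the information.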

The funny thing from my point of view is that this idea is the only vestige of Hungarian Notation that I kept from a brief stint of using it years ago. Apparently, I (and probably loads of other programmers) accidentally stumbled across what Simonyi had originally intended, despite loads of literature and examples misusing it. So, by the end of this part of the article, I find myself agreeing with Joel, despite the fact that I was adamantly against what I thought was his position.

As the article continues, Joel goes on to bash exceptions (his words, not mine). In keeping with the topic of his essay, he states that one of the reasons he doesn't like exceptions is because you can't look at the code and see if it is exception-safe. Given his earlier statement about how important it is to learn to see clean, I find this statement particularly interesting. Since it would take another whole article to refute all of his points, I'll save that for another day.

Posted by GWade at 07:02 AM. Email comments | Comments (0)

July 06, 2004

Review of Compiler Design in C

Compiler Design in C
Allen I. Holub
Prentice Hall, 1990

I decided to take a break from the relatively new books I've been reviewing and hit a real classic.

Over a decade ago, I saw Compiler Design in C when I was interested in little languages. A quick look through the book convinced me that it might be worth the price. I am glad I took the chance. This book describes the whole process of compiling from a programmer's point of view. It is light on theory and heavy on demonstration. The book gave an address where you could order the source code. (This was pre-Web.) All of the source was in the book and could be typed in if you had more time than money.

Holub does a wonderful job of explaining and demonstrating how a compiler works. He also implements alternate versions of the classic tools lex and yacc with different tradeoffs and characteristics. This contrast allows you to really begin to understand how these tools work and how much help they supply.

The coolest part for me was the Visible Parser mode. Compilers built with this mode displayed a multi-pane user interface that allowed you to watch a parse as it happened. This mode serves as an interactive debugger for understanding what your parser is doing. This quickly made me move from vaguely knowing how a parser works to really understanding the process.

Many years later, I took a basic compilers course in computer science and the theory connected quite well with what I learned from this book. Although the Dragon Book covers the theory quite well, I wouldn't consider it as fun to read. More importantly, nothing in the class I took was nearly as effective as the Visible Parser in helping me to understand the rules and conflicts that could arise.

Although this book is quite old, I would recommend it very highly for anyone who wants to understand how parsers work, in general. Even if you've read the Dragon Book cover to cover and can build FAs in your sleep, this book will probably still surprise you with some fundamentally useful information.

The book appears to be out of print, but there are still copies lurking around. If you stumble across one, grab it.

Posted by GWade at 10:29 PM. Email comments | Comments (0)

March 07, 2004

Review of Modern C++ Design

Modern C++ Design
Andrei Alexandrescu
Addison-Wesley, 2001

This book just blew me away. I've had access to compile-time programming in other languages and had worked pretty hard to understand templates. I felt I had a better than average grasp of how C++ templates work and are used. The techniques in this book were astounding. I have since found many sites devoted to these techniques, but I remain impressed with the way Alexandrescu explains the basics of these techniques.

Warning: This book is definitely not for everyone. But if you really want to push the limits of what you can do with C++, you need to read this book.

Posted by GWade at 01:17 PM.

February 11, 2004

Object Death

What does it mean for an object to die? In C++, there are several distinct and well-defined stages in the death of an object. Other languages do this a little differently, but the general concepts remain the same.

This is the basic chain of events for an item on the stack.

  1. Object becomes inaccessible
  2. Object destructor is called
  3. Memory is freed

Once an object goes out of scope it begins the process of dying. The first step in that process is calling the object's destructor. (To simplify the discussion, we will ignore the destructors of any ancestor classes.) The destructor should undo anything done by the object's constructor. Finally, after all of the destruction of the object is completed, the system gets an opportunity to recover the memory taken by the object.
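The chain of events above can be watched directly. The following is a minimal sketch, not a definitive demonstration: `Tracer` and `run_scope_demo` are hypothetical names invented for this example. Each `Tracer` records its construction and destruction in a log, so the order of events at the end of a scope becomes visible.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: records its own construction and destruction in a log.
class Tracer {
public:
    Tracer(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("construct " + name_);
    }
    ~Tracer() { log_.push_back("destroy " + name_); }

    // Copying would blur the one-object, one-lifetime picture, so forbid it.
    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Runs a nested scope and returns the events in the order they happened.
std::vector<std::string> run_scope_demo() {
    std::vector<std::string> log;
    {
        Tracer a("a", log);
        Tracer b("b", log);
        log.push_back("leaving scope");
    }  // b's destructor runs here, then a's; only then is their storage reclaimed
    log.push_back("after scope");
    return log;
}
```

Calling `run_scope_demo()` yields the two constructions, the "leaving scope" marker, the destruction of `b`, the destruction of `a`, and finally "after scope". Stack objects are destroyed in the reverse of their construction order.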

In some other languages, a garbage collection system handles recovering memory. Some systems guarantee destruction when the object leaves scope, even with automatic garbage collection. However, some of them focus so hard on memory recovery that they provide no guarantees about when, or even if, destruction of the object will occur.

Although many people pay a lot of attention to the memory recovery part of this process, it seems to be the least interesting part of the process to me. The destruction of the object often plays a vital role in the lifetime of the object. This destruction often involves releasing resources acquired by the object. Sometimes, memory is the only thing to be cleaned up, but many times other resources must be released. Some examples include

  • closing a file
  • releasing a semaphore or mutex
  • closing a socket
  • closing/releasing a database handle
  • terminating a thread

These are all issues that we would like to take care of as soon as possible. Each one also has real consequences if the cleanup step is forgotten or missed, such as a lock that is never released or a handle that leaks until the process exits.

Anytime I have a resource that must be initialized or acquired and then shut down or released, I immediately think of a class that wraps that functionality in the constructor and destructor. This pattern is often known as resource acquisition is initialization (RAII). Following this pattern gives you an easy way to tell when the resource is yours: your ownership of the resource corresponds to the lifetime of the object. You can't forget to clean up, because it is done automatically by the destruction of the object. Most importantly, the resource is even cleaned up in the face of exceptions.
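As a rough sketch of such a wrapper class, assume a made-up resource, `HandlePool`, that only counts how many handles are checked out. `HandlePool`, `ScopedHandle`, and `in_use_after_failed_work` are all hypothetical names for illustration. A real wrapper would acquire a file, socket, or lock instead.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: a pool that only counts handles currently checked out.
struct HandlePool {
    int in_use = 0;
    void acquire() { ++in_use; }
    void release() { --in_use; }
};

// Wrapper: acquires in the constructor, releases in the destructor.
class ScopedHandle {
public:
    explicit ScopedHandle(HandlePool& pool) : pool_(pool) { pool_.acquire(); }
    ~ScopedHandle() { pool_.release(); }

    // One owner per handle: copying would release the same handle twice.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

private:
    HandlePool& pool_;
};

// Does some "work" that throws while a handle is held, then reports how many
// handles are still checked out afterwards.
int in_use_after_failed_work(HandlePool& pool) {
    try {
        ScopedHandle handle(pool);
        throw std::runtime_error("work failed");
    } catch (const std::runtime_error&) {
        // Stack unwinding has already run ~ScopedHandle by the time we get here.
    }
    return pool.in_use;
}
```

The function returns 0 even though the work threw, because the handle's lifetime ended during stack unwinding. The standard library follows the same approach with types such as `std::lock_guard` and `std::unique_ptr`.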

In the systems where destruction may be postponed indefinitely, this very useful concept of object death and the related concept of object lifetime are discarded.

Posted by GWade at 05:49 PM.

February 07, 2004

The Forgotten OO Principle

When talking about Object Oriented Programming, there are several principles that are normally associated with the paradigm: polymorphism, inheritance, encapsulation, etc.

I feel that people tend to forget the first, most important principle of OO: object lifetime. One of the first things that struck me when I was learning OO programming in C++ over a decade ago was something very simple. Constructors build objects and destructors clean them up. This seems obvious, but like many obvious concepts, it has subtleties that make it worth studying.

In a class with well-done constructors, you can rely on something very important. If the object is constructed, it is valid. This means that you generally don't have to do a lot of grunt work to make sure the object is set up properly before you start using it. If you've only worked with well-done objects, this point may not be obvious. Those of us who programmed before OO got popular remember the redundant validation code that needed to go in a lot of places to make certain that our data structures were set up properly.
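One way to get this guarantee is for the constructor to refuse to produce an object it cannot make valid. The sketch below assumes a hypothetical `Range` class whose only rule is that the low bound must not exceed the high bound. `can_construct` is likewise a helper invented for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical example class: a closed integer range with low <= high.
class Range {
public:
    // Either the invariant holds when the constructor returns,
    // or no object exists at all.
    Range(int low, int high) : low_(low), high_(high) {
        if (low_ > high_) {
            throw std::invalid_argument("Range: low must not exceed high");
        }
    }

    // No method needs to re-check the invariant before doing its job.
    bool contains(int value) const { return low_ <= value && value <= high_; }

private:
    int low_;
    int high_;
};

// Reports whether a Range with these bounds can be constructed.
bool can_construct(int low, int high) {
    try {
        Range range(low, high);
        (void)range;  // only construction matters here
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

With this design, `can_construct(1, 5)` is true and `can_construct(5, 1)` is false. Client code that holds a `Range` never has to ask whether it was set up properly.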

Since that time, I have seen many systems where the programmers forgot this basic guarantee. Every time this guarantee is violated in the class, all of the client programmers who use this class have a lot more work on their hands.

I'm talking about the kind of class where you must call an initialize method or a series of set methods on the object immediately after construction, otherwise you aren't guaranteed useful or reliable results. Among other things, these kinds of objects are very hard for new programmers to understand. After all, what is actually required to be set up before the object is valid? There's almost no way to tell, short of reading all of the source of the class and many of the places where it is used.

What tends to happen in these cases is the new client programmer copies code from somewhere else that works and tweaks it to do what he/she needs it to do. This form of voodoo programming is one of the things that OO was supposed to protect us from. Where this really begins to hurt is when a change must be made to the class to add some form of initialization. How are you going to fix all of the client code written with it? Granted, modern IDEs can make some of this a little easier, but the point is that I, as the client of the class, will need to change the usage of the object possibly many times if the class implementation changes.

That being said, it is still possible to do some forms of lazy initialization that save time at construction time. But, the guarantee must still apply for a good class. After construction, the object must be valid and usable. If it's not, you don't have an object, you have a mass of data and behavior.
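Here is a small sketch of lazy initialization that keeps the guarantee. `Samples` is a hypothetical class: it is usable the moment the constructor returns, and the expensive part (here just a sum) is computed on first request and cached. The caching shown is not thread-safe. That simplification is an assumption of the example.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical class: valid right after construction; the total is computed
// lazily on first use and then cached.
class Samples {
public:
    explicit Samples(std::vector<int> values) : values_(std::move(values)) {}

    long total() const {
        if (!have_total_) {
            total_ = std::accumulate(values_.begin(), values_.end(), 0L);
            have_total_ = true;
            ++computations_;
        }
        return total_;
    }

    // Exposed only so the example can show that the work happens once.
    int computations() const { return computations_; }

private:
    std::vector<int> values_;
    // Cache state is mutable so that total() can remain const for callers.
    mutable bool have_total_ = false;
    mutable long total_ = 0;
    mutable int computations_ = 0;
};
```

The client never calls an initialize method and cannot observe a half-built object. The deferred work is an implementation detail hidden behind `total()`.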

The other end of the object's lifetime is handled by a destructor. When an object reaches the end of its life, the destructor is called, undoing any work done by the constructor. In the case of objects that hold resources, the destructor returns those resources to the system. Usually, the resource is memory. But, sometimes there are other resources, such as files, database handles, semaphores, mutexes, etc.

If the object is not properly destroyed, then the object may not be accessible, but it doesn't really die. Instead, it becomes kind of an undead object. It haunts the memory and resource space of the process until recovered by the death of the whole process. I know, it's a little corny. But, I kind of like the imagery.

This concept also explains one of the problems I have with some forms of garbage collection. Garbage collection tends to assume that the only thing associated with an object is memory. And, as long as the memory is returned before you need it again, it doesn't really matter when the object dies. This means that we will have many of these undead objects in the system at any time. They are not really alive, but not yet fully dead. In some cases, you are not even guaranteed that the destructor, or finalizer, will be called. As a result, the client programmer has to do all of the end-of-object cleanup explicitly. This once again encourages voodoo programming as we have to copy the shutdown code from usage to usage throughout the system.

So keep in mind the importance of the lifetime of your objects. This is a fundamental feature of object oriented programming that simplifies the use of your classes, and increases their usefulness.

Posted by GWade at 12:16 PM.

January 14, 2004

The Smite Class

In attempting to do Test Driven Development, we noticed that one of the problems with testing object validation code was the necessity to have broken objects to test with. This is particularly important in cases where the internals of an object may come from somewhere uncontrolled. For instance, objects that may be read from disk could be restored from a damaged file, resulting in objects that should not occur in normal practice.

In many cases, you would just generate an isValid() type method that could be used to detect the invalid condition and let the user of the object deal with the situation. The question remains, how do you validate the object validation code?

Obviously, you do not want to expose your private data to access from the outside world. You may not even want to expose it to your descendants. You certainly do not want to expose methods publicly that could be used to generate an invalid object. That defeats one of the purposes of having a class.

A Smite class is a derived class that accesses a protected interface and has the ability to damage an object or generate inconsistent state in the object. This class would not be part of the public hierarchy, but it would be available for testing.

You might ask why we called it the Smite class.

One definition of smite is "to inflict a heavy blow on..." It may also mean "to kill."

The particular image I have of the smite is from an old Far Side cartoon that was labelled God at his computer. In the picture, God in the form of a bearded, white-haired old man is looking at a computer screen. On the display, some poor slob is walking down the sidewalk and is about to pass underneath a piano hanging from a rope. "God's" finger is poised over a key labelled smite.

A Smite class is a derived class that can inflict heavy damage on the internal data of the object. When this object is used as its base type, it should be damaged or inconsistent. This allows for testing of validation and/or recovery code.
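A minimal sketch of the idea follows, under stated assumptions. `Inventory`, its protected `setCountUnchecked` hook, `InventorySmite`, and `detects_damage` are all hypothetical names made up for this example, not code from the original project. The base class exposes a protected method that skips validation, and the test-only Smite class uses it to break the invariant so that `isValid()` can be exercised.

```cpp
#include <cassert>

// Hypothetical class under test: count must stay within [0, capacity].
class Inventory {
public:
    explicit Inventory(int capacity)
        : capacity_(capacity < 0 ? 0 : capacity), count_(0) {}
    virtual ~Inventory() = default;

    // The public interface can never produce an invalid object.
    bool add(int n) {
        if (n < 0 || count_ + n > capacity_) return false;
        count_ += n;
        return true;
    }

    int count() const { return count_; }
    bool isValid() const { return count_ >= 0 && count_ <= capacity_; }

protected:
    // Protected interface: bypasses validation, reachable only by derived classes.
    void setCountUnchecked(int n) { count_ = n; }

private:
    int capacity_;
    int count_;
};

// Test-only Smite class: kept out of the public hierarchy and shipped only
// with the test code.
class InventorySmite : public Inventory {
public:
    using Inventory::Inventory;
    void smiteCount(int bogus) { setCountUnchecked(bogus); }
};

// Damages an object, then checks it through its base type.
bool detects_damage() {
    InventorySmite smitten(10);
    smitten.add(3);
    smitten.smiteCount(-1);
    const Inventory& asBase = smitten;  // used as its base type
    return !asBase.isValid();
}
```

In this sketch the damage is visible only through the validation code, which is exactly what the test wants to exercise. The design choice is that the hook is protected, so ordinary clients still cannot corrupt an `Inventory`.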

Posted by GWade at 08:16 PM.