Programmer Musings: February 2005 Archives

February 27, 2005

Review of Code Complete, Second Edition

Code Complete
Steve McConnell
Microsoft Press, 2004

Ever since I learned that a new edition of Code Complete was to be released, I have been looking forward to it. The first edition has been one of the most comprehensive books on the actual practice of writing code. We see a large number of books on methodologies and the high-level aspects of software development. But, if there is to be any software at all, someone must write it. This book covers that part of the process.

The second edition brings the information from the first edition up to date with current practice. Many practices have emerged in the ten years since the first edition, and McConnell covers them extensively. Unlike a lot of books in our field that are filled with opinion and generalities, McConnell backs up what he says with hard numbers where possible and loads of research in any case. If you are looking for an authoritative reference, the bibliography of Code Complete fills 21 pages, including several hundred references that span almost fifty years of research.

In the book, McConnell not only gives rational explanations for his recommendations, he also tells you where to look if you wish to dig further. While this is not unusual in academic literature, it is definitely not the norm for our field. McConnell covers a large percentage of best practices in the field. He is also not too proud to admit mistakes. In a few places in the text, he points out recommendations from the first edition that he says did not work out as well as he would have liked in the intervening years. He also points out a few mistakes, as well as changes in the industry that require modifications to his older advice.

Although I don't necessarily agree with everything McConnell says, the points we disagree on are so minor as to not warrant discussion. If I could only recommend one book to a new or intermediate-level programmer, this would be the book. I would be willing to live with the minor disagreements, if the programmers I work with would learn and apply the rest of the book.

If you are learning to program, get this book and read it. I realize that it is huge (over 900 pages), but learning the lessons in this book will greatly improve your skills and knowledge in software development. If you are a senior developer and haven't read the first edition, this book will still help you improve your skills. If you read the first edition, this book is still worth a read.

I don't think I could recommend this book too highly.

Posted by GWade at 03:06 PM.

February 24, 2005

Review of Software Exorcism

Software Exorcism
Bill Blunden
Apress, 2003

I was really looking forward to this book, having spent a fair number of years maintaining legacy code in different languages and environments. This made the disappointment even more acute when the book did not live up to its subtitle: A Handbook for Debugging and Optimizing Legacy Code.

The beginning of the book looks promising. The author begins by looking at some of the non-technical forces that influence the success of a software project. He covers the need for sign-off and the importance of a paper trail. He talks about scope creep and complexity. This seems like a really good introduction to the parts of software development that most developers don't think about nearly enough.

The last chapter of the book ties back to these concepts, extending the list of non-technical challenges that hamper getting work done. Topics like fashionable technology, the lack of privacy at work, marketing hype, and other issues that we technical types tend to ignore have a huge impact on your ability to get work done. This chapter does a decent job of making you aware of each of these issues without digging into too much depth. Some of these issues are touched on in other parts of the book as well.

Unfortunately, the author does not develop these points further in the book. These topics would be a wonderful education for people coming into software development. They would also serve as vindication for those of us who have seen more of this and realize how much these issues overwhelm the technical issues. What really bugs me is that this isn't the material the book claims to cover.

The stretch from the middle of chapter 1 through chapter 6 is what should be the meat of the book. The author's discussion of coding techniques and tools is weak. The book contains technical errors. Some of the source code has major flaws, like refactorings that generate code that is as bad as the original. The best example is the memory pool in chapter 5, which doesn't even use the pool code he wrote.

The author regularly states his opinions as fact without supplying any proof or hard statistics to back up these claims. I was amused by the assertion that the original developer would know enough to make changes to the project without having to relearn everything, but they are never around. He also points out that there wouldn't be bugs in the programs if the original developers had just done things correctly. Actual experience does not support these claims.

I think my favorite unsupported claim is the suggestion that virtual memory is an anachronism that is unnecessary now that we can get machines with 2GB of RAM. He seems to be forgetting that modern multi-tasking operating systems would be pretty much useless without virtual memory.

In chapter 1, the author develops a logging library, a unit testing library, and a profiling tool. Yet, he spends no time talking about free tools that already fill these needs. He devotes a significant number of pages to constructing some rudimentary debuggers to show how debuggers work, but spends almost no time discussing features of real debuggers that could actually help a maintenance programmer solve problems.

His debugging advice is relatively straightforward. The best of his advice is covered more completely by other books (like Code Complete or The Pragmatic Programmer), some of which have been out for many years. His optimization advice is better ignored. After suggesting that minor tweaks are a bad idea, he spends around a hundred pages telling how to save 5 machine instructions here and 10 over there. After all of this, he mentions the concept of picking better algorithms as a quick fix.

All in all, the book does not do a very good job at what it claims to do. The debugging and optimizing information just isn't there. It does have some good information about the non-technical challenges of development, but I can't see wading through the rest of the book for these nuggets. Hopefully, the author will combine the first part of chapter 1 with chapter 7 as the core of a really good book on the non-technical part of development.

I really hoped I would be able to recommend this book, but I can't.

Posted by GWade at 10:29 PM.

February 13, 2005

Conversion to Subversion: Tags

In the first article of this series, Conversion to Subversion, Part I, I described the problem I found in trying to convert a project from my CVS repository to Subversion. In my last article, Conversion to Subversion: The Project's Trunk, I described the solution that I used to convert a basic project with no tags or branches. This time I'll discuss converting the tags on a project from CVS to Subversion.

As stated before, the tag directory structure generated by cvs2svn was not what I wanted. Assuming a CVS module of project1 and a tag of FIRST_RELEASE, the dump file would have a directory structure of tags/FIRST_RELEASE/project1. I wanted project1/tags/FIRST_RELEASE.
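The path change I wanted can be sketched in a few lines of Python. This is only an illustration, not part of my actual fix-up script. The function name remap_tag_path is hypothetical, and the sketch assumes every tagged path has the form tags/<tag>/<module>/...

```python
def remap_tag_path(path: str) -> str:
    """Map 'tags/<tag>/<module>[/rest]' to '<module>/tags/<tag>[/rest]'.

    Paths that do not match that layout are returned unchanged.
    (Hypothetical helper; the layout is an assumption.)
    """
    parts = path.split("/")
    if len(parts) >= 3 and parts[0] == "tags":
        tag, module, rest = parts[1], parts[2], parts[3:]
        return "/".join([module, "tags", tag, *rest])
    return path


if __name__ == "__main__":
    print(remap_tag_path("tags/FIRST_RELEASE/project1"))
```

As the rest of this article shows, the dump file did not actually contain paths in this form, so a simple remapping like this was not enough on its own.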

At first, I thought I could use the same approach that I had used for the main project. I would:

1. Filter the dump file to keep only the project and tags I care about.

2. Copy the project and the tags over.

3. Update the project path as before.

4. Change the tag directory from tags/FIRST_RELEASE/project1 to project1/tags/FIRST_RELEASE.

Unfortunately, searching through the dump file did not turn up a tags/FIRST_RELEASE/project1. So I began looking at the dump file a little harder. The result was a little confusing. Apparently, cvs2svn treated each tag as if the entire repository had been tagged (everything in trunk was copied to tags/FIRST_RELEASE). Then, everything except the project1 directory was deleted. This generated a large number of extraneous revisions that do not accurately reflect what happened in the repository. The end result would have been correct in the repository with the old directory structure; but it wouldn't work with the new structure.

I modified the initial copy from trunk to copy from project1/trunk to project1/tags/FIRST_RELEASE in the dump file. Then, I deleted all of the extraneous delete directory commands in the dump file.
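A rough Python sketch of those two changes follows. It is not my actual script. It assumes the node records of the dump file have already been parsed into dictionaries keyed by the dump format's header names (Node-path, Node-action, Node-copyfrom-path); the parsing itself, the function name fix_tag_nodes, and its parameters are all hypothetical. Revision numbers are left untouched.

```python
def fix_tag_nodes(nodes, project, tag):
    """Return a new list of node-header dicts with the tag copy redirected
    and the clean-up deletes removed.

    nodes   -- list of dicts holding dump-file node headers (assumed input)
    project -- module name, e.g. "project1"
    tag     -- tag name, e.g. "FIRST_RELEASE"
    """
    old_tag_root = f"tags/{tag}"
    new_tag_root = f"{project}/tags/{tag}"
    fixed = []
    for node in nodes:
        path = node["Node-path"]
        action = node["Node-action"]
        copied_from_trunk = node.get("Node-copyfrom-path") == "trunk"
        if path == old_tag_root and action == "add" and copied_from_trunk:
            # Redirect the whole-trunk copy to a copy of the project's trunk.
            node = dict(node)  # leave the caller's data unmodified
            node["Node-path"] = new_tag_root
            node["Node-copyfrom-path"] = f"{project}/trunk"
            fixed.append(node)
        elif path.startswith(old_tag_root + "/") and action == "delete":
            # Drop the deletes that trimmed the tag down to one project.
            continue
        else:
            fixed.append(node)
    return fixed
```

Working on parsed headers keeps the sketch short; a real script would also have to read and write the dump file's record layout and make sure the project's tags directory exists before the copy.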

The modified dump file would build the project with the tags I required. Just as importantly, the extraneous manipulation used to clean up the initial strange tagging request has been removed. This also solves the problem that would have been caused by attempting to change directories that had been filtered out of the dump.

I incorporated this change into my script that fixes up the dump file before I do the load. It seems to be working quite well. The new projects I've added with these changes appear to be intact with the appropriate tags in place. If I had any branches I wanted to keep, I could apply an equivalent approach to fix up the branches before loading.

Update

This entry is expanded on in Conversion to Subversion: Tags Revisited, which answers questions I've received by email.

Posted by GWade at 09:45 PM.