How do I become a better programmer?
February 5, 2007 7:24 AM

How do I learn to be a better programmer (not just learn a particular language)?

I've been hacking around a lot with Ruby on Rails, and I've gotten fairly good at reading working code and making changes to it to do what I want. I can also write fairly simple applications from scratch that work well enough. But I feel like I'm missing a lot of the 'fundamentals' about how to write polished, secure and stable applications. I also don't feel like much that I've learned is translatable to other programming languages.

So what I'd like is possibly some recommendations for general programming books, but also particularly good specialist books (for example, something on Object Oriented programming). What I'm not looking for are 'reference books' about particular languages, because those are pretty easy to find.
posted by empath to Computers & Internet (36 answers total) 37 users marked this as a favorite
 
I would suggest reading a book on design patterns, as a lot of the information in such books is relevant to all programming languages, and comes from people who've been doing it a lot longer than you or I.

You may want to investigate such concepts as defensive programming and defence in depth, or even try doing some actual hacking (on your own kit... :-)

Finally, just code more. Lots and lots. Read other people's code, ask questions of the authors, get involved with more experienced people.

IMHO Ruby on Rails is a good place to start; Ruby seems to cover a lot of the important aspects of programming (objects, typing, exceptions, inheritance, scope, debugging, etc.). You could do a lot worse as a place to start.

For some reading:

http://www.rubycentral.com/book/
http://www.zend.com/php/design/patterns1.php

Just a starting point.
posted by gaby at 7:45 AM on February 5, 2007 [1 favorite]


Design Patterns. The book on design patterns, which are high-level object-oriented solutions to problems.

Also, keep in mind that the Rails part of Ruby on Rails has a lot of the fundamentals baked in, which is why people use it. Something like Model-View-Controller is a fundamental good practice but Rails already does it.
posted by smackfu at 7:47 AM on February 5, 2007 [1 favorite]


I suggest checking out these threads. Especially that last one; the title is "From scripting to compiling...".

Or, for the short version: The Art of Computer Programming series is highly recommended, and recommended often.
posted by philomathoholic at 7:55 AM on February 5, 2007 [2 favorites]


You should learn C. You should get a book on (or practice writing) multi-threaded programs. You should try to learn some algorithm design and analysis. Doing any of these three things will make you a better programmer.

Stuff like design patterns is interesting and informative, but a lot of it is just software engineering wankery. I don't think knowing the adapter pattern will make you a better programmer.
posted by chunking express at 7:58 AM on February 5, 2007


The “Gang of Four” [Gamma, et al.] Design Patterns book should be on every professional programmer’s bookshelf. When you first read it, however, you may think “this is so obvious.” The true lesson of the book, in my experience, is that OOP is fundamentally not about inheritance or encapsulation, but that modularity, polymorphism, and object composition are the critical paradigms.

One rule of thumb that I have is that if your method parameters tend to be scalar values (e.g., integers and strings), then you probably aren't creating a composition of objects.
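A toy sketch of the contrast (Python, with names invented purely for the example; not from any particular book):

```python
from dataclasses import dataclass

# Scalar-heavy style: the "order" concept is smeared across loose values.
def total_price_scalar(unit_price, quantity, tax_rate):
    return unit_price * quantity * (1 + tax_rate)

# Composed style: small objects collaborate, and behaviour lives with the data.
@dataclass
class LineItem:
    unit_price: float
    quantity: int

    def subtotal(self):
        return self.unit_price * self.quantity

@dataclass
class Order:
    items: list
    tax_rate: float

    def total(self):
        return sum(item.subtotal() for item in self.items) * (1 + self.tax_rate)

order = Order([LineItem(10.0, 2), LineItem(5.0, 1)], tax_rate=0.1)
print(order.total())  # 27.5
```

In the composed version, methods take and return objects (LineItem, Order) rather than loose integers and strings, which is the symptom the rule of thumb is checking for.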
posted by ijoshua at 8:03 AM on February 5, 2007 [1 favorite]


For more general algorithmic design and analysis, The Structure and Interpretation of Computer Programs goes from explaining the basic mechanics of typing in expressions, through creating data abstractions, and all the way to compiler design. Along the way you learn important lessons like how to analyze an algorithm’s order of growth.
posted by ijoshua at 8:09 AM on February 5, 2007


Write code.
posted by mitocan at 8:15 AM on February 5, 2007


Code Complete - a fabulous reference for writing clearer code; it helps you understand how some of your chosen language's constructs can be misconstrued or misused. I doubt there are any Ruby specifics in the book, but you can learn lots from it, guaranteed.
posted by mmascolino at 8:19 AM on February 5, 2007


Read programming.reddit.com.
posted by PenDevil at 8:20 AM on February 5, 2007 [1 favorite]


Since design patterns have already been mentioned, I'll say that you should get familiar with data structures and algorithms. They really are the fundamentals of programming in any language, and having a better grasp of them will make you a better programmer in any language. It really pays off to know which data structure is best suited to a given situation. Knowing how computationally expensive your procedures are to run will allow you to design cleaner, more efficient programs.
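To make the point concrete, here's a quick Python sketch (timings are illustrative and vary by machine):

```python
# Choosing the right structure changes the cost of an operation:
# membership in a list is O(n); in a set it is O(1) on average.
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

needle = n - 1  # worst case for the list scan
list_time = timeit.timeit(lambda: needle in as_list, number=100)
set_time = timeit.timeit(lambda: needle in as_set, number=100)
print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

Same data, same question, wildly different cost; that's the kind of tradeoff knowing your data structures buys you.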

As mentioned above, SICP is a great book, but to get the most out of it, you'll have to learn Scheme. Scheme is a cool language, but it's not everyone's cup of tea. For something in the OO paradigm, you may want to look at something like this. The book uses Java, but the material is easily applicable to any OO language. I took a course that used that book, and I found that it did a good job of explaining the various data structures and algorithms that were covered, as well as how to analyze algorithms, and the idea of abstract data types and interfaces. It also comes with tons of sample code.

One easy thing that I think you'll really benefit from, is making good use of encapsulation. Things can get to be a real mess when it's not used properly/enough.

Finally, people have mentioned that you need to write code, and this is true. I found that I became a much better programmer when I moved away from modifying existing code and hacking around, and started writing more programs from scratch, even if they weren't very complicated.
posted by benign at 8:29 AM on February 5, 2007


Not a resource per se, but pick up another language, even if it's just casually. If your only reference is language X, it's tough to decouple good practices from the specific grammar of language X. Seeing how things are done in other languages will make you better at every language, since you're not stuck in a "this is how it's done in Ruby" mindset.
posted by sonofslim at 9:02 AM on February 5, 2007


As far as I'm concerned, these two books are just about perfect for what you're asking:

The Practice of Programming
by Brian W. Kernighan, Rob Pike

The Pragmatic Programmer: From Journeyman to Master
by Andrew Hunt, David Thomas

I would summarize both books as "These are the things you'll learn on your own in the first five years of being a programmer." It's fun to read these books even if you have more than five years of experience, because they'll make you be a little introspective about what you know. You'll find yourself thinking, "I learned THAT one the hard way... I learned THAT one the hard way..."

But I feel like I'm missing a lot of the 'fundamentals' about how to write polished, secure and stable applications.

Ignore the people who tell you to read "The Art of Computer Programming"; those are great books, but they have nothing to do with your question.
posted by IvyMike at 9:27 AM on February 5, 2007 [1 favorite]


I second Code Complete. While all the design books are great when designing an application, Code Complete deals with ideas like properly naming variables, testing, and how to handle loops efficiently. It may seem trivial, but it takes a lot of practice to write good and easily understandable code.
posted by bored at 9:41 AM on February 5, 2007 [1 favorite]


Besides the references here, the best ways I've found to become a better programmer are to (1) write a lot of code and use your own code a lot (2) read/use a lot of code written by a better programmer than you (3) reread your old code after you've learned new things.

These three elements can be summed up as "practice, exposure, and reflection" which are key elements of learning and growth on pretty much any subject.
posted by plinth at 10:04 AM on February 5, 2007


I dunno if you can read your way to better coding at this point. Like any craft, you have to actually do it to learn all the real pitfalls.

When you say that your applications "work well enough" is that in your own opinion, or have you released them to other people? Writing applications for use by random joe user is a massive step up from writing applications for your own use.
posted by tkolar at 10:07 AM on February 5, 2007


Take a class or two, or take MIT's OCW classes. Not many web guys know the basics of algorithm analysis. Big-O analysis sucks and is probably overkill for 99% of that stuff, but you should know efficiency vs. simplicity vs. space, and the best uses for arrays, hashes, linked lists, and search & sort methods (binary search, quicksort, and so on). Know whether strings are immutable or not, and learn exactly how pointers are implemented in your language and on your system.
(I'm about 1/2 way through a CS degree, and that's what I've found useful so far)
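As a concrete sketch of one of those search methods (Python, purely illustrative):

```python
# Binary search: O(log n) lookups on a sorted array, versus O(n) for a scan.
def binary_search(sorted_items, target):
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found

data = [2, 3, 5, 7, 11, 13, 17]
print(binary_search(data, 11))  # 4
print(binary_search(data, 4))   # -1
```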
posted by tmcw at 10:08 AM on February 5, 2007


Oh, and stay off of reddit, digg, etc for a while. Shiny new things are nice to use and language wars are entertaining, but if you want to learn fundamentals a scholastic approach is probably best.
posted by tmcw at 10:10 AM on February 5, 2007


Spend a lot of time browsing the WikiWikiWeb.

Read Design Patterns, The Practice of Programming, Code Complete, and The Pragmatic Programmer to learn about the practical aspects of getting software built.

Read The Structure and Interpretation of Computer Programs to get a deeper understanding of what programs are, and how to think about them.

Learn C, from K&R, to get closer to the machine than you ever will with Ruby. Even if you rarely write things in C, knowing it will make you a better programmer. (This is also true of Lisp, for different reasons, but you'll get that from reading SICP anyway.)

Listen to the CS61A, CS61B, and CS61C podcasts from UC Berkeley--it's not the same as taking the classes, but just listening to them will teach you a lot. (Also, two of the textbooks they use are SICP and K&R, so they'll make that reading easier to follow). As a programmer with no formal Computer Science education, this is probably the most useful thing I've done in the past few years for my understanding of programming.

If you find that you like Lisp, McCarthy's original paper on it is a really great read, and pretty short.

The Art of Computer Programming is an incredible reference book, but I wouldn't try to learn programming from it.

Teach Yourself Programming in Ten Years has some good advice.

How to be a Programmer is a great (and long!) list of specific pieces of advice about both programming and the job of being a programmer.
posted by moss at 10:25 AM on February 5, 2007


Reading won't do jack for you if you don't practice.
posted by crinklebat at 12:39 PM on February 5, 2007


Reading won't do jack for you if you don't practice.

This bears repeating.

But also: practicing and reading will teach you a lot more than just practicing.
posted by moss at 12:49 PM on February 5, 2007


If you want to reduce your bugs, keep a log of all the bugs you make. Everyone has those few bugs they keep on writing. Making a note each time you make one, and going over your log, will prevent them in the future.
posted by wireless at 8:37 PM on February 5, 2007


I concur with a lot of advice above and I'm coming in late -- but I've been at this for some three decades now so I get the advantage of seniority (I think).

I'm actually going to answer a different question -- how to be a better software engineer. No one really cares how good your programming is :-D -- they care how well it works. 'Course, if your programming is bad it won't work but...


Fundamentals, fundamentals, fundamentals of algorithms (sorting, hashing, searching...) and data structures (stacks, queues, trees, tries, sparse matrices). You should know the O() running time of any algorithm you use. Schools and the books above will teach you all of these. These are like learning scales to an instrumentalist. You have to practice these endlessly by writing and rewriting small programs.
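As one small practice exercise of the kind described, here's a minimal trie, one of the structures mentioned above (Python, illustrative only):

```python
# A minimal trie ("prefix tree"): each node maps a character to a child node.
class Trie:
    def __init__(self):
        self.children = {}
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def contains(self, word):
        node = self
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

t = Trie()
for w in ("cat", "car", "cart"):
    t.insert(w)
print(t.contains("car"))   # True
print(t.contains("ca"))    # False (a prefix, not an inserted word)
```

Both insert and lookup run in O(length of the word), independent of how many words are stored.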


I'd have to say that test-driven development, writing tests as you write the code, is the difference between a talented hacker and an engineer. Beyond a certain, almost trivial level of complexity, it is impossible to even know whether you have written the code correctly without a set of clear and unambiguous tests -- which also serve as a specification of sorts for your code.

If you don't write tests, I don't want anything to do with you, no matter how great your code is. The best software engineers I've ever seen wrote copious, intricate tests for their code; you should too.
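A minimal sketch of the style, using Python's stdlib unittest (the slugify function and its contract are invented for the example):

```python
import unittest

def slugify(title):
    # Written to satisfy the tests below -- test-first, the contract came first.
    return "-".join(title.lower().split())

class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_joins(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  a   b "), "a-b")

# Run the suite programmatically so the example is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all green:", result.wasSuccessful())
```

The two test methods double as a readable specification of what slugify promises to do.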


You should master one "grown-up" language completely if you want to be a master programmer.

Even though I've been spending most of the last few years doing C++ and loving it, I believe that the best choice is Java because you can do it for free on any OS. You can get Eclipse for nothing, and it's a splendid and very professional development system that will encourage excellent programming habits (for example, JUnit encourages the test-driven development style I mentioned above...)

I learned Java by getting the Java Language Specification and reading it all the way through, or at least trying to, repeatedly until I got all of it. (There are some very good hidden jokes there, for example in the chapter on "Definite Assignment".)

Eventually if you are going to be a master programmer you are going to have to go to the Language Specification -- Java has the advantage that this document is quite readable and comparatively short (it's still hundreds of pages but there's a lot of excitement and mystery in there if you like this sort of thing, check out the new sections on "erasure"...)


Master your tools. You need a good debugger and a good editor and you need to learn them in detail. Eclipse is a good debugger and a good editor! When you get new tools, always take the time to master them to some level before using them.


Bugs chew up astonishingly large amounts of time. Learn to avoid them. Test-driven development will help -- re-reading each line of your code before you decide it's ready will help a lot more. If you have to read it out loud to make sure you don't skip any characters, then do that. If there's a bug in your code, why not stop and write the documentation for the area of the code you suspect might have the problem?


Do it right -- don't slack off. Document everything. Document the top of each file heavily as to why and how anyone would need it. Document each method and also name each method appropriately so you'll know what it does when you see it again. Name your variables so clearly that they never need to be documented, even if they are very long (learn to use your editor so you never have to type things more than once...) Avoid abbreviations almost always -- you'll have to read the code in six months. Write full sentences -- it costs you very little more time and they are faster to read than sentence fragments.


Finally, there's a knack I don't have a good word for: picking an appropriate target, figuring out exactly what it means to hit that target, hitting it, sealing it up, and moving on to the next.

Set very specific goals, each step as small as possible. Make sure that each goal is concretely verifiable, perhaps that you can automatically verify the goal with tests (this prevents you from accidentally breaking that code again later).

Dividing and subdividing your programming into very specific steps prevents any number of terrible problems, chiefly wandering around aimlessly and achieving nothing.


If you perform the steps above, other things like good use of resources will naturally happen as you write, or at the worst be fixable with a high-level change to an algorithm.


Hah. Hope it's useful!
posted by lupus_yonderboy at 9:46 PM on February 5, 2007 [2 favorites]


I'll give my standard contra-advice, and say avoid taking OO and design pattern wankery too seriously. People, even some smart ones, use it, but it doesn't have to be the first or only answer to every problem. It's just one possible tool. If you're using OO without a good justification and scope, you can waste a ton of time writing nonsense glue code with no value.

Some people swear by unit tests; some don't.

Peoples' other advice is pretty universally good so far: focus on good algorithms, data structures, thin APIs, data-drivenness, abstraction (which is _not_ synonymous with 'data hiding',) consistent conventions, self-documenting code (with good names and no magic numbers) and _preventative_ programming techniques (and definitely NOT 'defensive' techniques.)

Preventative (GOOD): signal an error or unexpected condition as soon as possible; use asserts wherever possible to ensure your code is in an expected state. If you can't find what's causing a bug, add more asserts and checks up the chain, and leave them in.

Defensive (BAD): 'handle' errors by sweeping them under the rug, and trying to silently pick up the pieces. Wonder why your code crashes ten minutes later.
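A small Python sketch of the contrast (function names invented for the example):

```python
# Preventative: fail loudly at the point of surprise.
def set_volume(level):
    assert 0 <= level <= 100, f"volume out of range: {level!r}"
    return level

# Defensive (the bad kind): silently "fix" bad input and carry on.
def set_volume_defensive(level):
    try:
        return min(max(int(level), 0), 100)
    except (TypeError, ValueError):
        return 0  # the caller's bug is now invisible

print(set_volume(50))             # 50
print(set_volume_defensive("x"))  # 0 -- and nobody ever learns why
```

The preventative version crashes at the call site with the bad value in hand; the defensive one hands back a plausible-looking number and postpones the mystery.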
posted by blenderfish at 12:11 AM on February 6, 2007


Preventative (GOOD): signal an error or unexpected condition as soon as possible; use asserts wherever possible to ensure your code is in an expected state. If you can't find what's causing a bug, add more asserts and checks up the chain, and leave them in.

Or as we called it at my last gig, "Crash early, crash often."
posted by tkolar at 12:53 AM on February 6, 2007


Some people swear by unit tests; some don't.

Ignore that. :-D

It is foolish to build a program of any non-trivial size without automated testing. For example, it's impossible to know if one bug fix will reopen an old bug anywhere else in the system without unit testing, and it's very risky to change code that one has not written oneself.

Unit testing and modern refactoring editors allow programmers to make dramatic changes to the structure of a program in just a few minutes with little possibility of error occurring (almost no possibility if it's a refactoring).

Modern systems are extremely large and have many moving parts. It's literally impossible for an individual programmer to keep in mind all the constraints that a system requires for healthy functioning. Large engineering firms like Google require there to be unit tests for each piece of production code.

If you are going to be a real programmer, you are going to spend a lot of your time writing unit tests. Get good now and you'll save yourself hundreds of hours debugging later.
posted by lupus_yonderboy at 5:06 PM on February 6, 2007


Defensive (BAD)

Nothing at all wrong with defensive code. You often write such code in servers, where it is important that the server not crash, and very important that the data it owns be kept in a consistent state. In the past, wrong code I wrote caused a well-known web application to crash and return no result -- perhaps even for some reader of this page. Because there are many moving parts, your control over the form of the data you receive might not be good.

However, sweeping errors under the rug is defensive code in the same way that taping over your speedometer is defensive driving.
posted by lupus_yonderboy at 5:18 PM on February 6, 2007


Some people swear by unit tests; some don't.
Ignore that. :-D


Bah, humbug.

Any unit small enough to be tested by a plausible unit test isn't go to break anway. Any unit large enough to benefit from a unit test requires test infrastructure almost as large and complex as the unit itself.

Furthermore, a huge proportion of the problems in complex systems result from interactions between units.

Save yourself time and annoyance -- write a comprehensive system test suite and screw the unit tests.
posted by tkolar at 8:13 PM on February 6, 2007


umm, "isn't going to break anyway."
posted by tkolar at 8:15 PM on February 6, 2007


Any unit small enough to be tested by a plausible unit test isn't go to break anway.

I completely disagree. Any part of the system is liable to break during maintenance. In particular, subtle expectations as to behaviour might change when new functionality is added.

With unit tests, clients of your code know that the behaviour expressed in the unit tests is guaranteed to work. You can edit code you don't know with confidence that you won't break anything.

In unit tests, you can test out those "edge cases" that might only appear very rarely in actual code and might never appear in a large-scale test, no matter how comprehensive.

Every hour spent on testing is three hours less spent on debugging. Moreover, it's much easier to predict how quickly you'll write tests than how quickly you'll debug a problem.

And the unit test documents the behaviour of your part -- the user doesn't have to worry "how does it work on empty strings?" or "how does it work for very large numbers?" because there is a test that demonstrates exactly what does happen. Unlike documentation, the test is executable and therefore self-validating -- you can't change the code without changing the test or it will break (you can't tell if documentation is broken without reading the code!)
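For instance, a handful of assertions can pin down exactly those edge cases (Python; word_count is a hypothetical function invented for the example):

```python
# Tests as executable documentation: what a hypothetical word_count
# does on the awkward inputs, stated in code rather than prose.
def word_count(text):
    return len(text.split())

assert word_count("") == 0              # empty string -> zero words
assert word_count("   ") == 0           # whitespace only -> zero words
assert word_count("one two") == 2
assert word_count("a " * 100_000) == 100_000  # very large input, still exact
print("edge-case behaviour documented and verified")
```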


Save yourself time and annoyance -- write a comprehensive system test suite and screw the unit tests.

You will notice that I actually talked about "automated testing".

However, with all due respect, I would have to believe that you haven't worked on any very large or very complex systems for you to claim that system testing is your first line of defense.

"Comprehensive system test suites" -- system integration tests or regression tests -- are a necessary evil. But in a large system, a breakage in these large scale tests tells you little about exactly what has gone wrong.

Worse, the number of integration tests increases as time goes on, and again as the number of developers increases. Yet each developer has to run all the integration tests for each change, because every test exercises the whole system. The testing load thus grows as the square of the number of developers times the development time -- and the chance that some test is broken at any given moment grows with the number of developers. Eventually, these large-scale tests can remain broken for long stretches because no one owns them.

This is hardly theoretical. I have a broken regression test nagging me in my mailbox right now. I made a very small change to a default flag, I checked it in, I didn't run all the tests, and one broke.

Because it's a regression test, I should have expected that this would happen -- the behaviour did actually change. Now I have to go in and edit a huge data file representing the data output and change it for my default flag. It's a pain in the ass.

Worse, if you weren't me, you wouldn't have any idea what the connection is between that little flag here and that regression test failure there. (The testing system is pretty clever: it runs tests automatically and then automatically pins the blame on the victim, but it of course cannot tell you the causal relationship.) So if you had to change that flag, you'd have no idea how to verify whether the regression test breakage was insignificant (meaning you can mindlessly replace the old test output with the new, changed output, as in my case) or significant (you've actually broken something rather than made a beneficial change to the program).


Some of the largest systems in the world, services you use every day, are buttressed with extensive unit tests at every level. Of course, there is a continuum of these tests, and I'm sure some of the largest tests could be called "comprehensive system test suites".

At this point in my life, I don't really consider a component "done" unless I have a complete set of unit tests for it -- and if I see someone else's component, I expect to see unit tests that exercise its functionality.


So :-D ignore what the man says. Figure out how to write tests for all your code at all levels. Learn to wrap code into small, clear, standalone, documented and unit-tested packages.

It's standard industry folklore that, if you work in a team, "productionizing" code takes nine times as long as just writing it. This rather shameful figure is due to a lot of things, but if you group the various debugging-related tasks and documenting interfaces together, you can see that a lot of the nine-times multiple is due to the fact that the quality of that initial piece of code is on average none too good and its behaviour is not adequately understood by anyone. Unit tests fit nicely into this space.


Testing at all levels, from unit testing through regression testing to full scale system integration testing, is one of the keys to being a master programmer... the three keys being good algorithms, good code and good tests.



(PS: I do understand that it's extremely hard to write unit tests for UI components and graphics and that one sometimes has to cut corners... even in those cases I believe that intelligent attention to testing in the early phases of development can make some sort of low-level "unit" testing fairly doable... but needs must when the devil drives.)
posted by lupus_yonderboy at 4:53 PM on February 7, 2007


Since I wrote the above, I finished some code and then created the following unit test:

class ZipcodeAreasUnitTest(googletest.TestCase):
  def DoTestArea(self, zipcode, area):
    self.assertEqual(zipcode_to_area.ZipcodeToArea(zipcode), area)

  def testAll(self):
    self.DoTestArea('79508', 'Abilene, TX MSA')
    self.DoTestArea('79607', 'Abilene, TX MSA')
    self.DoTestArea('79566', 'Abilene, TX MSA')
    self.DoTestArea('47955', 'Lafayette, IN MSA')
    self.DoTestArea('85333', 'Yuma, AZ MSA')
    self.DoTestArea('00000', zipcode_to_area.NONE)
    self.DoTestArea(' 79566', zipcode_to_area.NONE)
I had several failures before I got this to work -- compilation errors in the code, a problem initializing an internal data structure.

Now I'm going to put it into the main body of my code. I expect it to work right the first time. I'll report back.
posted by lupus_yonderboy at 6:55 PM on February 7, 2007


Worked right the first time. Bwahahahahahaha!

The source was 2571 lines of code, of which about 2500 lines were one large variable assignment.

The whole thing is a small part of a massively huge system with hundreds of millions of pieces -- but now I can check it in with confidence (after a code review).
posted by lupus_yonderboy at 7:11 PM on February 7, 2007


However, with all due respect, I would have to believe that you haven't worked on any very large or very complex systems for you to claim that system testing is your first line of defense.

Think again.

I've worked on systems of such complexity that unit tests were the least of our problems. Once again: the majority of problems that develop happen at the system level, not the unit.

Furthermore, while I'm glad that your unit could be tested with such a simple interface, most of the troublesome units in this world aren't that easy. Consider, for example, networking stacks. Entire companies have been formed around writing the test software for those: it's not just a matter of "I make a function call and I get the right return value". There are timers, asynchronous events, malicious attacks. In fact, in order to truly test a TCP stack you basically need an entirely separate TCP stack to run it against.

If your unit is so simple that it can be tested with "make a function call, check the result", then frankly it isn't a big candidate for failure to begin with. You can write unit tests for it if it makes you feel better, but you'll get a hell of a lot more bang for your buck by investing in system testing.

...three keys being good algorithms, good code and good tests.

There is a fourth key, which is knowing what to invest your time in. Large scale coding is never done in a vacuum, and you need to make choices about what is important.

If you have the time to write a test to make sure your hash function is still hashing correctly, then by all means go ahead. But the truth is, barring some form of code rot (which does happen) your hash function has a very low probability of breaking. Focus your energy on things that are likely to break instead.
posted by tkolar at 10:24 PM on February 7, 2007


The source was 2571 lines of code where about 2500 lines of it was one large variable assignment.

Yikes. You should have taken the time you spent writing a unit test and written better, data driven, code instead.
posted by blenderfish at 10:37 PM on February 7, 2007


You should have taken the time you spent writing a unit test and written better, data driven, code instead.

Why, it is exactly such. The actual code is very short: it's followed by 2500 lines of data. There's no external data file per se; I have the data declarations inline so the compiler does the work of reading and translating. There are a few lines of code to invert the index, and then it's just a table lookup.

Since the data very rarely changes, putting it inline to the code is a much better solution than having to carry some separate file around with the code.
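A rough Python sketch of the shape being described (all names and data invented, mirroring the unit test earlier in the thread; the real module isn't shown):

```python
# A short lookup function followed by inline data, with the index
# inverted once at load time.

NONE = "NONE"

# In the real file, this inline table runs for ~2500 lines.
_AREA_TO_ZIPCODES = {
    "Abilene, TX MSA": ["79508", "79607", "79566"],
    "Lafayette, IN MSA": ["47955"],
    "Yuma, AZ MSA": ["85333"],
}

# Invert once: zipcode -> area.
_ZIPCODE_TO_AREA = {
    zipcode: area
    for area, zipcodes in _AREA_TO_ZIPCODES.items()
    for zipcode in zipcodes
}

def ZipcodeToArea(zipcode):
    return _ZIPCODE_TO_AREA.get(zipcode, NONE)

print(ZipcodeToArea("85333"))   # Yuma, AZ MSA
print(ZipcodeToArea("00000"))   # NONE
```

Because the data lives in the source file, the compiler does the parsing, and every lookup after the one-time inversion is a dictionary hit.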
posted by lupus_yonderboy at 10:09 AM on February 8, 2007


I've worked on systems of such complexity that unit tests were the least of our problems. Once again: the majority of problems that develop happen at the system level, not the unit.

Perhaps that indicates that your testing procedure is at fault?

As I keep reminding you, there are also automated tests that are not system tests and not unit tests; regression tests, crosstests between pairs of components and that sort of thing.

Tests are necessary at all levels. The lower level and more specific tests you can write, the stronger your code will be.


Furthermore, while I'm glad that your unit could be tested with such a simple interface,

It was a simple example -- it just happened that I had to write a unit test right after I wrote the article.


most of the troublesome units in this world aren't that easy. Consider, for example, networking stacks. Entire companies have been formed around writing the test software for those: it's not just a matter of "I make a function call and I get the right return value". There are timers, asynchronous events, malicious attacks. In fact, in order to truly test a TCP stack you basically need an entirely separate TCP stack to run it against.

Certainly -- this is the reason that test harnesses exist: so that test writers can have a framework that creates mocks of whatever other services they need to correctly test their individual features.

It should be the case that if you add a little feature to the TCP stack, you should also be able to add a little test for that feature as well.
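In miniature, that can look like the following, using the stdlib's unittest.mock to stand in for the wire (the transport and feature are invented for the example):

```python
# A harness sketch: replace the real network with a mock so a single
# feature can be tested in isolation, with no sockets involved.
from unittest import mock

def fetch_greeting(transport):
    # Feature under test: prepend a protocol header to whatever arrives.
    return "HELLO " + transport.recv()

fake_transport = mock.Mock()
fake_transport.recv.return_value = "world"

assert fetch_greeting(fake_transport) == "HELLO world"
fake_transport.recv.assert_called_once()
print("feature verified against a mocked transport")
```

A real harness would mock timers and asynchronous events the same way; the principle is identical.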

If your unit is so simple that it can be tested with "make a function call, check the result",

When did I say that testability implies "make a function call, check the result?" I'm sorry I posted that example -- it was just funny to me that I finished writing this then immediately had to write a test.

then frankly it isn't a big candidate for failure to begin with.

Failure is inherent in all compound things.

You can write unit tests for it if it makes you feel better,

We are required to write unit tests for more or less everything: it's not a matter of personal ya-yas.

but you'll get a hell of a lot more bang for your buck by investing in system testing.

You said that already. To summarize my rebuttal:

1. System tests increase in overall complexity with the square of the size of the program and the number of developers.

2. It is difficult to go from a broken system test to the specific component or components that are causing it. This problem is multiplied when you are a developer who is not familiar with the area in question.

3. Since you are only testing the whole system, it is difficult to perform adequate edge-case testing on individual components and subsystems, and therefore it is difficult to predict what a component or subsystem will do when presented with valid but unusual data.

Again: overall system integration testing is necessary; it's your last line of defense; but you really don't want to be working in a place where those overall system integration tests are failing all the time.


Again, I'm sorry for posting the example. It was timely -- plus, I did in fact find a non-obvious edge-case bug with the unit test for what appeared to be almost trivial code.

The time it'd have taken me to debug it in production would have been about the same as the time it took me to write the unit test, except that the next person can see exactly what I intended to do, and that bug won't ever come back.


There is a fourth key, which is knowing what to invest your time in. Large scale coding is never done in a vacuum, and you need to make choices about what is important.

That's a great key! Good use of time, good algorithms, good tests, good code. Is there a fifth one?

If you have the time to write a test to make sure your hash function is still hashing correctly, then by all means go ahead. But the truth is, barring some form of code rot (which does happen) your hash function has a very low probability of breaking.

Funny example -- I have seen some of the best software engineers in the world spend a lot of time tracking down a large scale problem that turned out to be due to a subtle deficiency in a very standard open source system hashing function. Worse, the hashing function wasn't even "wrong" -- it satisfied its contract, it just had a particular behaviour on an "edge case" that any reasonable engineer would think of as "bad" (and that could be easily fixed too) -- which meant that suddenly one day thousands of jobs slowed to a crawl or ran out of memory and died with no other obvious symptoms.

Tests at all levels are necessary. (And if you're just getting started in programming, writing unit tests as you go is an excellent way to move from being a journeyman to a master -- you'll be surprised how many bugs you unearth in your own code that way.)
posted by lupus_yonderboy at 10:58 AM on February 8, 2007


I wrote: It should be the case that if you add a little feature to the TCP stack, you should also be able to add a little test for that feature as well.

(which doesn't mean that you don't also have to test it at higher levels. It's hard to imagine a "small" change to a TCP stack...)
posted by lupus_yonderboy at 11:00 AM on February 8, 2007


This thread is closed to new comments.