Getting to know new code
January 7, 2010 5:23 AM

What are your tricks for understanding a large piece of new code quickly?

So this year I’d like to get involved in some open source projects and spend more time reading other people’s code. Some projects are good with documentation, others...not so much. I worry that I’m going to spend all my time getting only glimpses of the application as I trace down Bug A or Feature Q, and have no understanding of the project as a whole and how things fit together. I think to myself: if only I had a bigger picture…

So how do you get your big picture? Is it just from the accumulated knowledge of working on it for long enough? Do you just read and map out what’s there? Do you have special, secret (until now) tricks?

posted by ADoubtfulTrout to Technology (14 answers total) 18 users marked this as a favorite

This topic comes up a lot in the wonderful Coders At Work book. It's worth hearing what some of the great minds do. I'm not a great mind and I don't have any big secrets. Make enough commits and you'll eventually start to see the bigger picture.
posted by yeoldefortran at 5:51 AM on January 7, 2010

My first step is usually to develop a good understanding of the application from a front-end perspective. Then I work on mapping code very broadly to visible functionality. Once I know that, for example, files A, B and C are something to do with exporting text to PDF, I can decide whether I want to look at them in more detail.

For me, understanding the code in overview is not really a requirement. The code is a set of (hopefully) self-contained building blocks that serve the requirements of the application. The only view that matters for me code-wise is the detail view.
posted by le morte de bea arthur at 5:57 AM on January 7, 2010 [1 favorite]

The last large codebase I worked on (the site for an F500 bank) had 13,000 classes and an enormous range of functionality; while I was useful within the first couple of weeks, it took me about a year to come up to speed on the whole thing to the point where I intuitively knew what packages did what, who owned them, and what the architecture of each section was.

Even though it's been a few years since I worked there, I could probably still give you the dollar tour of the codebase (including pretty solid guesses as to the name of the person who wrote the bulk of each class). That kind of intimacy with the structure and implementation of the code comes from only one thing: reading lots of code. In my case, I was responsible for reviewing every checkin, so I wound up reading a lot of code fairly quickly and puzzling out what it did on a daily basis.

In your case: poke around more.
posted by majick at 6:03 AM on January 7, 2010 [1 favorite]

I use an editor that can browse the code. Something that understands the syntax of whatever language it is. I start reading the code and branch into functions if they're not clear ... semi-randomly building up a picture of what it does by sort of simulating a run through the code.

Sometimes, if it's really weird code, I do a quick reimplementation of parts of it in Java. I use Java only because it's very refactorable and code sketches can be automatically expanded in the editor.

If it's C or C++ code, I tend to use Eclipse or Textamate to follow the path. If it's Java, I like Eclipse and IntelliJ. I'd probably use Eclipse or Rubymine for Ruby code.
posted by krilli at 6:57 AM on January 7, 2010

Correction: "Textamate" is Textmate.

Textamate is some kind of Mayan herbal tea I think.
posted by krilli at 6:58 AM on January 7, 2010 [2 favorites]

How about a real trick for understanding complexity? Run each class through a script that removes everything but the curly braces, then check the length of each resulting file. The result roughly tracks the length of the original file, but simple functions without many loops or conditionals come out shorter, and more complex ones come out longer. You can quickly pick out the most complicated sections of the code this way.
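
A minimal sketch of the brace-counting idea in Python. The function names and the list of file suffixes are illustrative assumptions, not something from the post. It is deliberately crude: braces inside strings and comments are counted too.

```python
from pathlib import Path


def brace_skeleton(source):
    """Keep only the curly braces, discarding every other character."""
    return "".join(ch for ch in source if ch in "{}")


def rank_by_braces(root, suffixes=(".java", ".c", ".cpp", ".h")):
    """Return (brace_count, path) pairs, most brace-heavy files first.

    The suffix list is an assumption; adjust it for the project at hand.
    """
    results = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            results.append((len(brace_skeleton(text)), str(path)))
    return sorted(results, reverse=True)
```

Calling `rank_by_braces("path/to/project")` and reading the top few entries gives a rough shortlist of files worth a closer look. Comparing the brace count against the file's total length would separate "long but flat" files from "short but deeply nested" ones.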

Otherwise, hope things are named reasonably, and start at the high level. Go through every directory and see what files are there. Then go into whatever sounds interesting, look at the classes, and read the comments at the top of each file. Then you can drill down into the functions and methods themselves. Basically, do a breadth-first search of the code instead of a depth-first one.
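
The breadth-first pass could be sketched like this in Python. All names here are hypothetical, and the comment markers are an assumption about the languages involved. It lists directories level by level and shows the first line of each file when that line looks like a comment.

```python
from collections import deque
from pathlib import Path


def breadth_first_dirs(root):
    """Yield (depth, directory) pairs level by level, shallowest first."""
    queue = deque([(0, Path(root))])
    while queue:
        depth, directory = queue.popleft()
        yield depth, directory
        for child in sorted(directory.iterdir()):
            # Skip hidden directories (such as .git) and symlinks to avoid loops.
            if (child.is_dir() and not child.is_symlink()
                    and not child.name.startswith(".")):
                queue.append((depth + 1, child))


def header_comment(path, markers=("//", "/*", "#")):
    """Return the first non-blank line if it looks like a comment, else ''."""
    try:
        with open(path, encoding="utf-8", errors="ignore") as handle:
            for line in handle:
                stripped = line.strip()
                if stripped:
                    return stripped if stripped.startswith(markers) else ""
    except OSError:
        pass
    return ""


def overview(root):
    """Return one line per directory and file, directories in breadth-first order."""
    lines = []
    for depth, directory in breadth_first_dirs(root):
        lines.append(f"[depth {depth}] {directory.relative_to(root)}/")
        for child in sorted(directory.iterdir()):
            if child.is_file():
                lines.append(f"    {child.name}  {header_comment(child)}".rstrip())
    return "\n".join(lines)
```

Printing `overview("path/to/project")` gives the whole top level before anything deeper, which matches the "see what's there first, drill down later" order described above.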
posted by cschneid at 7:33 AM on January 7, 2010

The best way is a tour by an expert, or having an expert you can ask questions. That of course isn't usually a possibility for an open source product. If your toolset can auto-navigate the codebase (i.e. show you everywhere this variable/function/class is used), then I like using it to give me a flavor of how interdependent a particular piece of code is. From there it is just time and patience.
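
If no IDE is at hand, a plain text search gives a rough version of "show me everywhere this is used". The Python sketch below is an assumption-laden stand-in, not a real cross-referencer: it matches whole words only, knows nothing about scoping or imports, and the function name and suffix list are made up for illustration.

```python
import re
from pathlib import Path


def find_usages(root, identifier,
                suffixes=(".java", ".py", ".c", ".cpp", ".h", ".rb")):
    """Return {file path: [line numbers]} for whole-word matches of identifier."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    usages = {}
    for path in sorted(Path(root).rglob("*")):
        if not (path.is_file() and path.suffix in suffixes):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = [number
                for number, line in enumerate(text.splitlines(), start=1)
                if pattern.search(line)]
        if hits:
            usages[str(path)] = hits
    return usages
```

The number of distinct files in the result is a quick proxy for how interdependent a class or function is: a name that shows up in two files is far easier to change than one that shows up in two hundred.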
posted by mmascolino at 7:36 AM on January 7, 2010

I am employed as a maintenance developer, so I often have to work on new projects.

My first step is to pull the project down from CVS, get it building, running and deploying. These are generally Java web apps running on Tomcat and developed in either Eclipse or JBuilder, so this process is fairly familiar and consistent even across wildly different apps.

Then, I fire it up in debug mode, throw some breakpoints around and try to get a sense of the end-to-end flow of the code, tracing my way through the app as it handles requests. What I'm really looking for is where "the rubber meets the road" and the important stuff actually happens. Then, it's pretty much time to dive in and try to duplicate and fix the bug I'm hunting.
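
The same "follow one request end to end" idea can be approximated without a debugger. This is a Python sketch rather than the Java/Tomcat setup described above, and `trace_calls` is a hypothetical helper: it uses the standard `sys.setprofile` hook to record which Python functions run, and at what nesting depth, while a single entry point executes.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, calls), where calls is a list of
    (depth, function_name) pairs in the order the functions were entered."""
    calls = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":          # a Python function was entered
            calls.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":      # a Python function is exiting
            depth -= 1
        # c_call / c_return events (built-ins) are ignored on purpose

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)         # always remove the hook again
    return result, calls
```

Printing each entry indented by its depth gives a rough call tree for one request, which is often enough to spot where the important work actually happens before setting breakpoints there.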

Of course it is extremely helpful to have someone who is an expert on the app that you can ask questions of, but this is not always an option.
posted by utsutsu at 8:56 AM on January 7, 2010

I am not sure about the big picture, but I do remember something about reading the code word by word instead of skimming.
posted by kaizen at 9:47 AM on January 7, 2010

For me, understanding the code in overview is not really a requirement. The code is a set of (hopefully) self-contained building blocks that serve the requirements of the application. The only view that matters for me code-wise is the detail view.

AGREED!

You are not doing this for work, so why waste time on a sloppy code base? If the classes are well defined, you should be able to work on them in isolation -- or at least you should be able to work on a sub-system of classes without understanding the whole. When I code, I keep that at the forefront of my mind. If someone needs to understand my DB-query classes to understand my GUI classes, I am not doing my job right.
posted by grumblebee at 10:04 AM on January 7, 2010

Functional diagramming has always been my biggest helper. When I have to understand a complex piece of code, or several interrelated pieces, I go through the code line by line and schematize, as precisely as I can, its functional logic. I like process flow diagrams because they're easy to draft and, used properly, can help you quickly work through the trickier aspects of the business logic of some code.

The basic idea is to read the code line by line as quickly and thoroughly as possible and, as you go, document in a flow diagram what the code seems to be doing in logical design terms. Describe how the code works in terms of the software domain, not in terms of the business domain. (For example, you might define a process flow along the following lines: "Process 1: Declare and initialize working variables --> Process 2: Instantiate Cursor Loop of Records to be Processed --> Process 3: For Each Record in Loop, Perform Processing Step A... --> Etc.", with conditional tests diagrammed as decision points in the diagram flow.)

Diagramming the functional design of units of code as you review them, correcting and clarifying the salient details as you go, makes it a lot easier (in my experience, anyway) to grasp the big picture in a way that allows you to focus development efforts where they'll count the most. The best way to do it is to start at the most abstract level, conceptually, define the broad outlines of the functionality, and then work your way down.
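
For anyone who prefers to keep such diagrams as text, here is a small Python sketch that turns an ordered list of steps into Graphviz DOT. The function name is hypothetical, and it only handles a straight-line flow; decision points would need extra nodes and labeled edges added by hand.

```python
def flow_to_dot(steps, name="flow"):
    """Render an ordered list of step descriptions as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{", "  node [shape=box];"]
    for index, label in enumerate(steps, start=1):
        escaped = label.replace('"', '\\"')   # keep quotes valid inside DOT labels
        lines.append(f'  p{index} [label="Process {index}: {escaped}"];')
    for index in range(1, len(steps)):
        lines.append(f"  p{index} -> p{index + 1};")   # edge to the next step
    lines.append("}")
    return "\n".join(lines)


# The example flow from the post, written out as data.
example_steps = [
    "Declare and initialize working variables",
    "Instantiate cursor loop of records to be processed",
    "For each record in loop, perform processing step A",
]
```

`print(flow_to_dot(example_steps))` produces text that Graphviz's `dot` tool can render, so the diagram can be updated alongside notes as the reading progresses.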
posted by saulgoodman at 10:35 AM on January 7, 2010

"If someone needs to understand my DB-query classes to understand my GUI classes, I am not doing my job right."

Agreed, and I admire your intentions but please do keep in mind that extant code is frequently just that bad, if not usually worse. Not to mention that, if I need to add a button to your GUI, I probably need to make it do something, so I'm going to want to understand both anyway. An integrated understanding is okay for head-down feature programming, but pretty much all other activities like profiling, optimizing, debugging, ripping-out-the-hideous-and-broken-outsource-code-and-reimplementing and so on are going to require a holistic view to some extent.

If you don't understand a large codebase from a systemic standpoint, you're never going to be able to do more than add cruft.
posted by majick at 10:50 AM on January 7, 2010

Actually, what I intended to say is: "...a contained understanding is okay for head-down feature programming..."
posted by majick at 10:52 AM on January 7, 2010

Well, non-open source code, to quote Google Talks, starts with having a future business model in place "when considering the operators": the onus of your code.
Personally, as a musician, I see myself constantly songwriting (private) and then performing (public).
So, knowing it's "open source", may I suggest that you are songwriting and then hoping to perform it in some capacity.
In music there are many "unsung heroes": just because you are good doesn't mean that you will get the recognition you deserve.
There has to be some acceptance of your existence and your ability to promote as well as attract.
How does this have anything to do with deciding on what open source projects you wish to either uphold or to begin?
I dunno. What are you doing it for? How is that project's attribution policy? Do you like to be thanked? Or better yet, how often are you in the position to say "you're welcome" and mean it?
Lastly, if you think big, start small. God will then do the rest, as Einstein acknowledged soon after God's children acknowledged him ;-)
posted by jpeek345 at 5:45 PM on December 28, 2010


Ask MetaFilter
