Software engineering: where is the data?
April 25, 2019 9:24 AM Subscribe
I'm a programmer (and long-ago scientist) looking into software engineering best practices. As a field of scientific study, it seems... well, let's say, lean. But, I'm an outsider coming into a conversation already in progress. What resources are there to catch up on what's going on in the field?
This all started with a discussion of unit testing with a colleague (after fighting with some ridiculous mocking library). The question became "Is there actual proof that unit testing works?". The internet responds with "Absolutely yes/Absolutely no, in my experience, here's an anecdote, blah, blah, blah", and soon devolves into metaphorical screaming and flinging poo. (People selling their books are the worst.)
From there, I've been in a black hole of "Wait, why do we do X?". Actual data seems really thin on the ground.
I've been staring at papers like Belief & Evidence in Empirical Software Engineering (programmers trust their experience over the data, though the paper has some issues) and A Large Scale Study of Programming Languages and Code Quality in Github (your choice of language matters to defect rates... but much, much, much less than you think). I'm slowly grinding through the references for them. And I feel like I'm not getting the whole picture, just the parts.
Is there a better overview of the field out there, that actually is supported by data?
Please note: in particular, I'm not looking for Code Complete, Programming Pearls, the gang of four, SICP, the Pragmatic Programmer, or other prescriptive texts.
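For concreteness, here's a minimal sketch of the kind of unit test under discussion, using Python's built-in unittest.mock. (The fetch_user/db names are hypothetical, just for illustration; this is the general technique, not any specific library from the question.)

```python
from unittest import mock

def fetch_user(db, user_id):
    # The "unit" under test: formats whatever the database layer returns.
    row = db.get(user_id)
    return {"id": user_id, "name": row["name"].title()}

def test_fetch_user():
    # Replace the real database with a mock so the test exercises
    # only fetch_user's own logic.
    db = mock.Mock()
    db.get.return_value = {"name": "ada lovelace"}

    result = fetch_user(db, 7)

    db.get.assert_called_once_with(7)
    assert result == {"id": 7, "name": "Ada Lovelace"}
```

Whether tests like this measurably reduce defects is exactly the empirical question; the test only demonstrates that the unit behaves as specified in isolation.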
Best answer: It hasn't had any updates for a couple of years, but It Will Never Work In Theory included quite a lot of empirical SE papers. You could chase links from there and see what else the same authors/conferences have been doing more recently.
posted by offog at 10:57 AM on April 25, 2019 [3 favorites]
It's quite old, but I enjoyed Dreaming in Code, and I think a lot of the book is spent discussing the question: why, after so many decades, is software engineering not really like other engineering disciplines in terms of predictable timelines, quality, best practices, etc.? I can't remember whether it cites many studies, though.
posted by JonB at 12:08 PM on April 25, 2019
You might like Accelerate: The Science of Lean Software and DevOps.
posted by neushoorn at 12:16 PM on April 25, 2019
Empirical Studies of Software Engineering: A Roadmap is a 10-page paper from ICSE-2000 (that's International Conference on Software Engineering).
posted by JonJacky at 1:17 PM on April 25, 2019 [1 favorite]
Science and substance: a challenge to software engineers ran in IEEE Software in 1994.
posted by JonJacky at 1:24 PM on April 25, 2019
Best answer: Not the big picture, but follow @hillelogram on twitter. Just from the past week alone: here's a recent thread reviewing the empirical evidence that code review helps, and here's another citing empirical literature to show that all the practices software engineers advocate for or against have second-order effects on code quality compared to ... sleeping well.
posted by caek at 1:31 PM on April 25, 2019 [8 favorites]
Two Solitudes goes some way toward explaining why the research and the practice often seem to have little to do with each other.
posted by clawsoon at 1:47 PM on April 25, 2019 [1 favorite]
Perhaps Making Software: What Really Works, and Why We Believe It? I haven't read it, but it seems to be focused on the question you're asking.
posted by clawsoon at 2:01 PM on April 25, 2019 [1 favorite]
I think you are going to have a hard time finding data for this, because terms like 'unit test' are not scientific; they're vague. It also really depends on what you mean by 'works'. Are your teams delivering code that won't compile in another testing environment? Then you can reasonably say that unit testing doesn't work where you work. Is the code compiling, but not meeting all the business rules? Then it's technically failing, but that may be due to other components not working correctly (i.e., bad input). I'd say the amount of delivered code that won't compile in most orgs is extremely low, so unit testing 'works'.
posted by The_Vegetables at 3:03 PM on April 25, 2019
Seconding Making Software and following Hillelogram on twitter.
posted by silentbicycle at 6:31 PM on April 25, 2019
It has been a while since I read it, but I seem to recall that The Mythical Man-Month cited academic research.
posted by mmascolino at 8:05 PM on April 25, 2019
This is a great question. My impression has always been that there isn't enough research into what works, but as The_Vegetables noted, it's hard to even define 'works' in a consistent way - every project's goals are different.
Further, in my own brief attempts to work with software engineering researchers, I learned how difficult it can be to get professional programmers to agree to have their work studied! They don't get much out of it, and there's always the potential to be embarrassed. And rare is the for-profit software company who wants to share enough info to get any real insight into their processes…
Since you're looking for hard data instead of anecdotes and philosophy, you might want to look into the "empirical software engineering" subfield, starting with the proceedings of the International Symposium on Empirical Software Engineering and Measurement, and see where those folks have also published...
One big name in the ESEM field, Marv Zelkowitz at UMD, wrote What have we learned about software engineering? in the Communications of the ACM a while back; that might be interesting. Zelkowitz and his UMD colleague Victor Basili have published a lot in this area, and you could start from their work too.
posted by mmc at 8:37 PM on April 25, 2019
Did you read the thread in the Blue about the 737MAX? That has some interesting insights on software engineering and maybe some links you might find useful.
posted by Fukiyama at 8:34 AM on April 26, 2019
Please consider the empirical work in DeMarco & Lister's 1989 paper Software Development: State Of The Art Vs. State Of The Practice, which I found quite concise, readable, and insightful.
posted by daveliepmann at 12:37 PM on April 26, 2019
Hillel just (today!) gave a talk with the following description:
Official Description: There are many things in software we believe are true but very little we know. Maybe testing reduces bugs, or maybe it's just superstition. If we want to improve our craft, we need a way to distinguish fact from fallacy. We need to look for evidence, placing our trust in the hard data over our opinions.
Empirical Software Engineering is the study of what actually works in programming. Instead of trusting our instincts we collect data, run studies, and peer-review our results. This talk is all about how we empirically find the facts in software and some of the challenges we face, with a particular focus on software defects and productivity.
Actual Description: Nothing is real, we don't understand what we're doing, and the only way to write good software is to stop drinking coffee. Burn it all down. Burn it to the ground.
That link is to the talk's sources (including the infamous "Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials"), but the talk itself isn't online yet, so if you weren't already following him, now would be a good time, as I'm sure he'll link it.
posted by caek at 3:58 PM on April 29, 2019 [1 favorite]
Yeah - I want things with less "woo" myself
posted by jkaczor at 10:47 AM on April 25, 2019
This thread is closed to new comments.