How To Troubleshoot Anything (but mostly IT)
July 29, 2016 8:34 AM

I'm looking for books, websites, software, cheatsheets or even training programs that focus on the troubleshooting process in general. This is for a small software support team, but I'm not necessarily looking for anything that specific.

It could be a book or an article on triage processes at ERs, or a cheatsheet "trouble tree" diagram that you found useful. Or a book on the philosophies or methods of solving problems. I'm looking for any kind of media that you felt made you a better problem solver or helped you get better at thinking through complex problems when you don't have a lot of upfront data.
posted by Doleful Creature to Technology (15 answers total) 29 users marked this as a favorite
 
I've always been partial to simple things like this.
posted by koolkat at 8:44 AM on July 29, 2016 [2 favorites]




This Scott Hanselman post is related to problem solving without specifically being about problem solving.
posted by cnc at 9:11 AM on July 29, 2016 [1 favorite]


Binary search. Look for an experiment you can do that splits the problem in half. E.g., "If I just hard-code some data here, and my dummy data gets printed there, the problem's on the server. Else it's on the client."
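A minimal sketch of that halving idea, assuming the request path can be written down as an ordered list of stages and that you have some way to check whether the data is already wrong at a given stage. The stage names and the `output_is_wrong` check are made up for illustration; in practice the check is whatever experiment you can run at that point (hard-coded data, a log line, a breakpoint).

```python
from typing import Callable, List, Sequence


def first_bad(stages: Sequence[str], is_bad: Callable[[int], bool]) -> str:
    """Return the first stage at which the problem is visible.

    Assumes every stage before the culprit looks fine, every stage from the
    culprit onward looks wrong, and the last stage is known to be wrong.
    """
    lo, hi = 0, len(stages) - 1  # invariant: the culprit is in stages[lo..hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid        # already wrong here: culprit is at mid or earlier
        else:
            lo = mid + 1    # still fine here: culprit is later
    return stages[lo]


# Hypothetical request path; we pretend the data goes wrong in the serializer.
stages = ["database", "query layer", "serializer", "HTTP response", "client render"]
checks_run: List[str] = []


def output_is_wrong(i: int) -> bool:
    checks_run.append(stages[i])  # record which experiments we had to run
    return i >= stages.index("serializer")


culprit = first_bad(stages, output_is_wrong)
print(culprit, "found after", len(checks_run), "checks")
```

With five stages this takes two experiments instead of up to five, and the gap widens as the system gets bigger.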

Also, when you rubber-duck, you'll find yourself making assumptions. Spot them, then prove the assumption (50% of what I say when I'm the duck seems to be "can you prove that?"). Hard bugs hide in reasonable assumptions. Are you sure you're connecting to the right database?
posted by Leon at 9:40 AM on July 29, 2016 [2 favorites]


Root cause analysis is one of the terms of art for problem diagnosis, at least in my field. It's not the whole package, but understanding the problem is typically the first step. Lots has been written on the subject, so this may be a useful search term for you.
posted by bonehead at 9:54 AM on July 29, 2016 [2 favorites]


Polya's How to Solve It is evergreen. What is known? What is unknown? What do we need to find, show, or change?
posted by clew at 10:24 AM on July 29, 2016 [1 favorite]


I've been programming for a long time and have not seen anything like that. There are some insightful definitions in the Jargon File such as the heisenbug, the Bohr bug, the schroedinbug, and the mandelbug.

I mostly use a runtime debugger. It works well for the programs I write, but you have to keep in mind that use of the debugger can change the order of operations especially when an event fires.

If there is one principle that I learned for myself, it is that in a new program, the fault is usually a mistake in the code. When a program has been running for a while, it's more likely bad data, or perhaps a new edge case that isn't handled properly.
posted by SemiSalt at 10:33 AM on July 29, 2016


GEGeek has a lot of stuff that would qualify for this.
posted by deezil at 10:47 AM on July 29, 2016 [1 favorite]


This is more directly applicable to electronic equipment, but the Navy teaches a six-step troubleshooting process:
1. Symptom recognition.
What's the equipment doing? What should it be doing?
2. Symptom elaboration.
What else is going wrong, beyond what the user initially reported?
3. Listing probable faulty functions.
Of all the building blocks in this complex system, which ones could be causing the problem?
4. Localizing the faulty function.
Which building blocks are actually causing the problem, and which ones are working OK?
5. Localizing trouble to the circuit.
Drill down as far as possible to isolate the problem.
6. Failure analysis.
Figure out why that component failed (including root causes), fix the problem, and test the repair you made.
There are more detailed presentations here from IEEE (PDF) and a technical school.
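For software, steps 3 to 5 can be roughed out as a table of suspects with one check each. This is only a sketch: the block names and the canned check results below are invented for illustration, and real checks would be things like pinging a host or parsing a config file.

```python
from typing import Callable, Dict, List


def localize(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run one check per building block and return the blocks that fail."""
    faulty = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that blows up counts as a failure too
        if not ok:
            faulty.append(name)
    return faulty


# Step 3: list the probable faulty functions (hypothetical blocks and results).
suspects = {
    "power supply": lambda: True,
    "network link": lambda: True,
    "config file": lambda: False,   # pretend the config fails validation
    "disk": lambda: 1 / 0 > 0,      # pretend this check itself crashes
}

# Step 4: localize the faulty function by actually testing each suspect.
faulty_blocks = localize(suspects)
print(faulty_blocks)
```

Step 5 is then the same exercise repeated inside whichever block failed, with a finer-grained list of suspects.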
posted by haltingproblemsolved at 11:09 AM on July 29, 2016 [6 favorites]


XKCD
posted by rhizome at 12:53 PM on July 29, 2016


To be proactive, to prepare yourself before trouble occurs, try some FMEA (Failure Modes and Effects Analysis).

You can generate a list of possible failures, with likely root causes. It can shorten your troubleshooting time.
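As a rough illustration, FMEA worksheets commonly rate each failure mode from 1 to 10 on severity, occurrence, and difficulty of detection, then multiply the three into a risk priority number (RPN) to decide what to look at first. The failure modes and ratings below are invented examples, not data from any real system.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    likely_cause: str
    severity: int    # 1 (minor) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (caught easily) to 10 (hard to detect)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical entries for a small support team's worksheet.
modes = [
    FailureMode("Login fails", "expired TLS certificate", 8, 3, 2),
    FailureMode("Reports show stale data", "nightly sync job did not run", 5, 6, 7),
    FailureMode("App crashes on import", "malformed CSV row", 6, 4, 3),
]

# Highest RPN first: these are the failures worth preparing for.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(m.rpn, m.description, "<-", m.likely_cause)
```

When a ticket comes in, the ranked list doubles as a checklist of likely root causes to rule out first.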
posted by yesster at 12:55 PM on July 29, 2016 [1 favorite]


Does your software support team have any computer science education? That might seem obvious, but previously at my company, the support department was not made up of programmers. They had knowledge in our field and were skilled with our software. They would diagnose whether an issue was in the code or in the config, fix the config if that was the problem, and then pass the bugs to the second-level support, who were programmers. A few years ago the company changed to hiring software engineers for their support department. Now bugs and config issues are both fixed by one department, and everything gets done a lot faster. (Disclaimer: I am one of these engineers.) We still have a couple of members of the previous team attached to my department and they just don't have the troubleshooting skills/techniques that I and my compsci colleagues got from school, specifically from building small applications. They're great in their positions, but there's a clear difference.

I don't know what kind of resources you have, but would it be possible for the team to build some software together? A basic app they can use to track tickets or something like that, something simple but something they have to make. I feel like they'd learn a ton from that experience about troubleshooting. If possible, the app should be in the same (or main) language the software they are supporting uses.
posted by possibilityleft at 5:51 PM on July 29, 2016


zamboni, clew, and haltingproblemsolved hit on what I consider the most important problem-solving tools. For times when you need inspiration, or you just get stuck, try something like Oblique Strategies, a deck of ideas for moving past creative roadblocks. Many aren't especially relevant to engineering/programming, but you can always customize your own cards.

The "what should it be doing" step that haltingproblemsolved mentions is all about making a mental model of how the system works. For a catalog of useful mental models, Gabriel Weinberg of the DuckDuckGo search engine recently posted on Medium about mental models he commonly uses. This is a list that might send you down the Google/Wikipedia rabbit hole in the best of ways.
posted by orangewired at 10:01 PM on July 29, 2016


The novel Zen and the Art of Motorcycle Maintenance by Robert Pirsig. Just skip to the bits about fixing motorcycles.

ITIL
posted by yoHighness at 9:20 AM on July 31, 2016


Thanks everyone for the answers. The posts I marked as "best answer" were especially helpful. I didn't use all the material, but your suggestions helped me create an outline for a training that I gave to my entire company, tech and non-tech departments alike. Afterwards, many of my colleagues told me it was the best training they had received this year!

For the record, I found the "5 whys" model the most useful and it has helped me and my team think through some sticky problems in ways we hadn't tried before.
posted by Doleful Creature at 5:47 AM on August 29, 2016 [1 favorite]

