How should we integrate this “case scenario” into our database admin/developer interview?
December 4, 2008 5:54 AM
Inspired by the responses to this question, those of us on the hiring committee for our new DBA have been working up an interview protocol for our hopefuls. But I need more specifics on how to set up the case scenario (aka, “okay hotshot, show us the inside of your brain!”), how to have it unfold, and what follow-up questions to include in order to get the most bang out of it. Please help!
So, our team particularly liked the idea of using a real-world business problem as a case scenario for our prospective DB guru to work through. And, we’ve come up with a good (and reasonably complex) example that by necessity would require any DBA worth her/his salt to demonstrate knowledge of multiple types of joins, think hard about indexing strategies, etc. I’ve been charged with drafting this scenario into a usable form for the interview. I now have a decent sketch of the overall logic/workflow of the data problem the database would be designed to help solve that makes it pretty clear what fields need to be checked against what at each stage of the process and what decisions get made based on those. I am also planning to draft an overall statement of the problem, its significance to our work, and a general description of what the size/nature of incoming data might look like.
But...how should we plan to work this into the interview? There is some debate amongst our little hiring committee as to how to time and structure the candidates’ exposure to the information. Some think it should be sent out in advance, while others think it should be given out cold. The scenario is just complicated enough that I think it would be fair at minimum to give the prospect at least 15-20 minutes’ alone time just to digest before even starting to lay out table structures, etc. One approach I’m thinking of is:
(a) introduce general problem and ask how they would go about designing a database solution for it. Deliberately underspecify the issues. Use response to assess overall organization of their thinking, approach to projects, communication skills, etc.;
(b) share logic flow diagram with them, and allow them to ask more specific questions about it;
(c) leave them for x amount of time to review and start sketching out proposed DB design and at least pseudocode for queries on whiteboard;
(d) return, see how far they got, grill about details, probe their thinking further, etc.
Or...some other, better, hive-mind-inspired idea. The case scenario will most likely happen after our general and technical questions but before the touchy-feely workstyle stuff.
I’m looking for two things:
1. Recommended structures for introducing and querying the prospect around the case scenario. Is my proposed approach a good one? How could it be improved? What do we need to keep in mind to get the most out of it? Are there other models that would work equally well or better?
2. We think this and our other planned interview questions are sufficient to separate the DB posers from the at-least-moderately-competent. Any specific follow-up questions as we’re discussing their response to the case study that would help to separate those folks from the true OMG-must-hire-rockstars?
As always, specifics and examples super-appreciated!!!
Is it practical to compromise slightly? I don't know your setup, but I'd be inclined to write to the candidate a few days in advance to let them know that the interview will include this exercise and roughly what they'll need to do. Don't give any specifics that could let them start writing answers out until the interview itself. This way they're not coming in completely cold -- after all, in the real job they'll always start off with a rough idea of what structures they'll be dealing with and be able to do appropriate research -- but you still get to watch them actually apply this background knowledge under some time pressure, to see how their brains work.
posted by metaBugs at 6:14 AM on December 4, 2008
You might want to do the initial one or two interviews with candidates you think you're less likely to want to hire; it sounds like whatever process you choose will have some kinks that will need working out, and the initial interviewees may be at an unfair disadvantage (or advantage) because of this.
posted by amtho at 6:41 AM on December 4, 2008
Most DB analyst type people that I have worked with will ask to see the data you are working with. You should have some available.
posted by mkb at 6:50 AM on December 4, 2008
Give the problem out early, to everyone, at the same time (so nobody's at a disadvantage). I'm not a DBA, but I am a developer. Interviews are stressful enough; I have a hard time thinking through problems with someone standing over my shoulder, questioning my every thought and watching every little thing I scribble down. Unless, of course, you need a guy whose strength is thinking on his feet. The problem you're specifying is not one he can go out and google the answer to; it's one he's either going to be able to give an educated/knowledgeable solution to or not, depending on his skillset. Give them time to think about the problem, on their own terms. You want someone who can come up with a working, smart solution, not a quick but full-of-holes solution.
Once they're in the interview, have them discuss their solution, the process they used to come to their final proposal, the assumptions they made, etc. If they can't talk about it, then they probably got their rockstar DBA friend to help them out.
posted by cgg at 7:50 AM on December 4, 2008
Hmm....great thoughts here. Please keep them coming! Anyone care to respond with some specific follow-up questions to pose when the candidates present their designs?
posted by shelbaroo at 8:33 AM on December 4, 2008
cgg, that's a helpful distinction - we do want someone smart/flexible enough to roll with our everyday punches, but it's true that in day-to-day practice the person would be given the time needed to come up with and evaluate solid work. So...if everyone receives the case scenario at some time before the interview, what length of time is enough but not too much?
posted by shelbaroo at 8:37 AM on December 4, 2008
cgg:
"Interviews are stressful enough; I have a hard time thinking through problems with someone standing over my shoulder, questioning my every thought and watching every little thing I scribble down. Unless, of course, you need a guy whose strength is thinking on his feet. The problem you're specifying is not one he can go out and google the answer to; it's one he's either going to be able to give an educated/knowledgeable solution to or not, depending on his skillset. Give them time to think about the problem, on their own terms. You want someone who can come up with a working, smart solution, not a quick but full-of-holes solution."
I agree that interviews are stressful, and it's hard to construct a correct solution to a problem on a whiteboard, with an interviewer watching, but I think an interview is more than an opportunity to administer a test to an applicant under controlled conditions.
When interviewing, you aren't just looking for someone who can come up with a working, smart solution; you're looking for someone you can work with. You want someone who, even when a working solution is difficult or impossible, demonstrates correct and positive thinking while ascertaining this. Also, the ability to communicate about one's thoughts while working on a problem is a valuable skill.
I'm a developer too, and the best interviews I've had did involve "someone standing over my shoulder, questioning my every thought and watching every little thing I scribble down." It's stressful, and I usually can't come up with a working, smart solution - there are always bugs. Our discipline is hard, hard problems are common, and everyone encounters them; so it's good to see how people approach hard problems before you hire them: not just the solution they construct, but how they get there.
shelbaroo:
Re: follow-up questions: look at the sketched-out DDL and SQL and identify places where NULLs may occur. Ternary logic is sticky; you want an applicant who recognizes the stickiness and knows about all of: the specified behavior of NULL, the actual behavior of NULL on a specific database, and the points in his or her solution where NULL could be a problem. See: http://en.wikipedia.org/wiki/Null_(SQL).
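To make the NULL point concrete, here is a minimal sketch of the kind of pitfalls an interviewer could probe. It assumes SQLite (through Python's standard sqlite3 module) purely for convenience, and the candidate table and its referrer_id column are hypothetical, not part of the scenario described in this thread. Other databases may differ in the details, which is the point of asking.

```python
import sqlite3

# Hypothetical table: each candidate may have been referred by another one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidate (id INTEGER PRIMARY KEY, referrer_id INTEGER)")
conn.executemany(
    "INSERT INTO candidate (id, referrer_id) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 2)],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# "= NULL" evaluates to unknown, never true, so it matches no rows.
equals_null = scalar("SELECT COUNT(*) FROM candidate WHERE referrer_id = NULL")
# IS NULL is the test that actually finds the missing referrer.
is_null = scalar("SELECT COUNT(*) FROM candidate WHERE referrer_id IS NULL")
# "<> 1" silently drops the NULL row as well: only candidate 3 qualifies.
not_equal = scalar("SELECT COUNT(*) FROM candidate WHERE referrer_id <> 1")
# NOT IN against a set that contains NULL can never be true.
not_in = scalar(
    "SELECT COUNT(*) FROM candidate "
    "WHERE id NOT IN (SELECT referrer_id FROM candidate)"
)
# NOT EXISTS asks the same question without the NULL trap.
not_exists = scalar(
    "SELECT COUNT(*) FROM candidate AS c "
    "WHERE NOT EXISTS (SELECT 1 FROM candidate AS r WHERE r.referrer_id = c.id)"
)

print(equals_null, is_null, not_equal, not_in, not_exists)
```

A candidate who can explain why the NOT IN and NOT EXISTS counts differ is probably comfortable with ternary logic.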
Also, though I'm a late-comer to the thread, I'd like to say I think it could be worthwhile to restructure your interview protocol to eliminate the 'black box' aspect, so you instead can get a handle on how your applicant thinks. You're hiring the person, after all, not the solution.
posted by doteatop at 9:53 AM on December 4, 2008
doteatop - when you say eliminate the 'black box' aspect, are you suggesting that we don't leave them alone while they get a jump start on thinking through their approach? Or do you mean something else?
I'm fascinated by the responses so far...there are really valid points supporting vastly different approaches. I'm not sure what I think so far, but I know our team meeting tomorrow is certainly going to benefit from all this cogent mefi input. Thanks all - I'll come back and mark best answers later.
posted by shelbaroo at 11:13 AM on December 4, 2008
"doteatop - when you say eliminate the 'black box' aspect, are you suggesting that we don't leave them alone while they get a jump start on thinking through their approach? Or do you mean something else?"
Yes. An interview is a relatively brief window of time to learn about a person. I think it would be a shame to leave any of it unused. Stay with the applicant, communicate the question clearly, and remain available for your applicant to ask follow-up questions. No specification is ever complete and unambiguous. Refinement of your requirements will be needed, and is a reasonable thing for your applicant to expect. In return, expect your applicant to communicate his or her thoughts to you throughout the problem-solving process. If the applicant says something unclear, feel free to ask for clarifications or more detail.
If your applicants are good, you should be learning from them about your questions, and learning more about the problem you are asking them to solve each time. Take the time to study the question you are asking beforehand, since you haven't used this one before. Get to know your problem. Weigh multiple approaches. When interviewing many people, you are likely to see a lot of new ideas - be as prepared as possible to evaluate them intelligently so you aren't caught flat-footed. If you have some good friends who are skilled DBAs, you can consider asking for a few sample solutions to your problem, to help you get to know it.
posted by doteatop at 11:57 AM on December 4, 2008
First, by "DBA" do you mean database designer?
"DBA" is a term that people use to cover a variety of responsibilities. Is database design your main focus, or will the new hire also be responsible for the several other things people mean when they say "DBA": selecting hardware and software, setting up the database server, writing SQL queries, and optimizing queries and structures?
All databases are collections of entities, entity attributes, and relations among entities. If you're looking for design skills, a knowledge of (at least) Third Normal Form is mandatory.
A good db designer knows that all databases are imperfect models of reality -- the art of database design lies in knowing what needs to be modelled and what is superfluous.
(Example: an employee database and a genealogical database will have "person" entities. But these are not the same person entity. Eye color is almost certainly not something you want in the employee database, but you might well want it in the genealogical database.)
Once that's known (and it's always imperfectly known), it's knowing how to model these entities, attributes, and relations that matters. This is rarely straightforward, as a database usually serves different clients, even if there's only one program that accesses it: programmers, users, and program managers all have different goals, and database design will make some goals easier and some harder.
Easy inserts work against easy updates; plasticity and openness to modification and generality work against speed of the database and speed of development, and all those needs have to be weighed against each other.
(Consider the design of metafilter: as I understand it, at one time the "front page", askmefi, and metatalk used different tables for posts and comments, even though each has the same basic structure: users, posts, and comments. Separate tables make the database slightly faster, and remove some amount of update contention. But it means that new features for each sub-site, insofar as they depend on the database structures, have to be coded separately and redundantly.)
More fundamentally, certain structures are more complex (and so slower) but allow for greater flexibility for future changes. Even more fundamentally, choices of data structures influence choices of algorithm, and choices of data structures make some things easier and some things harder. This can be as simple as fixed vs. variable column widths, or as complex as many-to-many relations (which require an extra table and extra keys) vs. simpler parent-child relationships.
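As a rough illustration of that last trade-off, the sketch below puts a parent-child relation next to a many-to-many one. It assumes SQLite via Python's standard sqlite3 module, and every table and column name (post, comment, tag, post_tag) is hypothetical: it is not a description of how metafilter or any other real site is built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Parent-child: each comment carries one foreign key to its post.
CREATE TABLE comment (
    id INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES post(id),
    body TEXT NOT NULL
);

-- Many-to-many: the relation gets its own table and a composite key.
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE post_tag (
    post_id INTEGER NOT NULL REFERENCES post(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (post_id, tag_id)
);

INSERT INTO post (id, title) VALUES (1, 'Interviewing a DBA'), (2, 'Index tuning');
INSERT INTO comment (post_id, body) VALUES (1, 'Ask about NULLs'), (1, 'Show real data');
INSERT INTO tag (id, name) VALUES (1, 'hiring'), (2, 'sql');
INSERT INTO post_tag (post_id, tag_id) VALUES (1, 1), (1, 2), (2, 2);
""")

# One join answers the parent-child question.
comments_for_post = [
    body
    for (body,) in conn.execute(
        "SELECT c.body FROM comment AS c JOIN post AS p ON p.id = c.post_id "
        "WHERE p.title = ? ORDER BY c.id",
        ("Interviewing a DBA",),
    )
]

# The many-to-many question needs a second join through the extra table.
posts_for_tag = [
    title
    for (title,) in conn.execute(
        "SELECT p.title FROM post AS p "
        "JOIN post_tag AS pt ON pt.post_id = p.id "
        "JOIN tag AS t ON t.id = pt.tag_id "
        "WHERE t.name = ? ORDER BY p.title",
        ("sql",),
    )
]

print(comments_for_post)
print(posts_for_tag)
```

The extra table buys flexibility (a post can have any number of tags) at the cost of one more join and one more key to maintain.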
Some things are simply difficult to do in a database: the obvious and recurrent theme is that recursiveness is hard; in particular, tree-like structures of arbitrary height can be represented but are hard to query to arbitrary depth. Features that require or imply arbitrary height trees (or arbitrary nesting, which is the same thing) mean making trade offs up front.
It's the designer's job to identify these "traps" and prepare for them.
But this is difficult if not impossible to do in an interview. In an interview, a good designer can give you a rough and preliminary draft of a design, but a design of complexity requires that he wave his hands and say, "of course all this is hypothetical".
Your interview should concentrate on eliciting from the interviewee a discussion of these issues.
To separate the good from the really good, here are a few questions worth asking (in the previous askmefi you linked to, I listed several "entry-level" and "journeyman" questions, so I won't repeat them here; these are harder questions, for the "journeyman" and "master" level database designer).
* How would he handle a graph of arbitrary depth? For example, given a genealogical database where each person entity has a reference to a "mother" and a "father" entity, identify all of an entity's grandchildren, or great-grandchildren, or n-th generation grandchildren. (There are at least two ways to do this purely in the database, but each has a fundamental drawback; otherwise the recursion needs to be done in a client program that accesses the database. Have interviewees identify the way to do this outside the database, and two ways in the database, the drawbacks of doing it in the database, why these are drawbacks, and ways to mitigate the drawbacks.)
* Discuss the possible designs of a temporal (6th normal form) database. This is a hard problem, especially in relation to identifying what entities are current. (See my askmefi answer here for a brief discussion of temporal databases, and some examples of the subtlety of modelling questions.)
* To make it really hard, discuss making a currently non-temporal database temporal, with the object of minimizing changes to the existing database clients and the existing database interface.
* Segue into the general case of minimizing interface changes while changing implementation: can object-oriented programming techniques such as encapsulation, information-hiding, abstraction, and inheritance be employed in database design? What about functional decomposition? For each of these, how?
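For the genealogy question in the list above, here is a minimal sketch of one in-database approach, assuming an RDBMS that supports recursive common table expressions (SQLite, driven from Python's standard sqlite3 module, is used here only so the example runs anywhere). The person table, its columns, and the sample names are hypothetical, and this may or may not be one of the two approaches the comment has in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    mother_id INTEGER REFERENCES person(id),
    father_id INTEGER REFERENCES person(id)
);
INSERT INTO person (id, name, mother_id, father_id) VALUES
    (1, 'Ada', NULL, NULL),
    (2, 'Ben', NULL, NULL),
    (3, 'Cy',  1,    2),
    (4, 'Dee', NULL, NULL),
    (5, 'Eve', 4,    3),
    (6, 'Flo', 5,    NULL),
    (7, 'Gus', 5,    NULL);
""")

# Walk down from one ancestor, one generation per recursion step, and stop
# once the requested depth has been reached.
DESCENDANTS_SQL = """
WITH RECURSIVE descendant(id, generation) AS (
    SELECT id, 0 FROM person WHERE id = :ancestor
    UNION
    SELECT p.id, d.generation + 1
    FROM person AS p
    JOIN descendant AS d
      ON p.mother_id = d.id OR p.father_id = d.id
    WHERE d.generation < :depth
)
SELECT person.name
FROM descendant
JOIN person ON person.id = descendant.id
WHERE descendant.generation = :depth
ORDER BY person.name
"""


def descendants_at(ancestor_id, depth):
    """Names of descendants exactly `depth` generations below the ancestor."""
    rows = conn.execute(DESCENDANTS_SQL, {"ancestor": ancestor_id, "depth": depth})
    return [name for (name,) in rows]


print(descendants_at(1, 2))  # Ada's grandchildren
print(descendants_at(1, 3))  # Ada's great-grandchildren
```

A natural follow-up is to ask what this costs on a large table, how the OR in the join interacts with indexes on mother_id and father_id, and what the candidate would do on a database without recursive queries.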
A good modeler is not necessarily good at SQL, or vice versa. Since, according to your question, this guy will be your only database guy, he'll need to be good at SQL too. These are easier questions; you might want to ask them first.
* I'd ask the 454 Retail Table question I alluded to in the askmefi you linked to; the "right" answer requires a three table join in which the join predicate depends on mathematical calculations using operands from all three tables.
* I'd also ask some good questions about group bys, group bys and calculated columns, and the interaction of group bys and views. For an example, see my answer to this askmefi.
* Pose this question about sparse data too; if the interviewee has an Oracle background, he'll mention that Oracle has a pseudo-table that does this, so then ask how he'd do it in another RDBMS.
You want to hear the interviewee saying "view" a lot; if he's saying stored procedure often or, god forbid, "cursor" at all, be very dubious.
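To show why "view" is the word to listen for, here is a small sketch that answers the same question twice: once with a set-based view using GROUP BY and a calculated column, and once with a row-by-row loop. It assumes SQLite via Python's standard sqlite3 module; the sale table and its columns are hypothetical, and the Python loop only stands in for a server-side cursor, which it resembles in spirit rather than in mechanics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (
    id INTEGER PRIMARY KEY,
    region TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    unit_price_cents INTEGER NOT NULL
);
INSERT INTO sale (region, quantity, unit_price_cents) VALUES
    ('east', 2, 500),
    ('east', 1, 250),
    ('west', 4, 100);

-- Set-based: the grouping and the calculated column live in one view.
CREATE VIEW region_revenue AS
SELECT region,
       COUNT(*) AS sale_count,
       SUM(quantity * unit_price_cents) AS revenue_cents
FROM sale
GROUP BY region;
""")

# The database does the aggregation; the client just reads the result.
view_totals = dict(
    conn.execute("SELECT region, revenue_cents FROM region_revenue ORDER BY region")
)

# Cursor-style: fetch every row and accumulate by hand, one row at a time.
loop_totals = {}
for region, quantity, unit_price_cents in conn.execute(
    "SELECT region, quantity, unit_price_cents FROM sale"
):
    loop_totals[region] = loop_totals.get(region, 0) + quantity * unit_price_cents

print(view_totals)
print(loop_totals)
```

Both give the same totals, but the view states the intent once, can be reused by other queries, and leaves the optimizer free to choose how to compute it.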
I'm purposely leaving out optimization questions; they depend far too much on which RDBMS, which version, and what the competing trade-offs are. Either the questions are super general, or they're ones you can only solve after looking at a showplan/explain query.
Finally, I like your approach except for "(c) leave them for x amount of time to review" -- that's too easy to interpret as lack of interest on your part. Let the interview be an interview; if you want to give them time to think, let both of you "take a break" and he'll use his break to think on the questions without feeling like he's been left alone in a room.
Oh, and consider posting your job on MefiJobs; I'd like to interview for it.
posted by orthogonality at 12:54 PM on December 5, 2008
Yes, people, I'm giving out "best answers" like Halloween candy because these are all about the most awesome, helpful responses I could possibly have imagined. Once again MeFites come through and they come through in a big way.
Our team met today and, with the help of all of the answers so far, put together the following plan: (1) as metaBugs suggested, we're informing candidates that there will be a database design/administration (yep, orthogonality, it's both -- like I said, we're small and scrappy!) technical case scenario in which they will be expected to walk through a real-world problem. That seemed fair. (2) However, as doteatop and orthogonality suggested, we're not giving out any specifics beforehand and we're staying with them through the whole process. We're allowing a big chunk of time...hopefully this will. Oh, and we're also planning to follow amtho's advice and schedule the lower-scoring candidates for the earlier interviews.
I'm looking forward to seeing how this plays out and posting an update.
Orthogonality, check your mefimail.
posted by shelbaroo at 4:04 PM on December 5, 2008
This thread is closed to new comments.
posted by shelbaroo at 5:58 AM on December 4, 2008