DIY Portfolio Management
November 3, 2010 9:31 AM

Database Filter: Building a database for project portfolio management- The project table must contain a record of every member of the project team as distinct people so that I can query which projects John Smith is working on as well as all of the team members of Project ABC. Project teams have between 3 and 20 people. The best design I can come up with is 20 fields for resources (ie, R_1, R_2, etc). Doable, but this will make my queries pretty bulky. Is there a better way to do what I'm trying to do?

I'm designing this database as a mockup to present the other side of a build vs. buy debate we're having about how to best get the portfolio data we want. I'm well aware of the many Project Enterprises and Clarities out there. Thanks for your help!
posted by bluejayway to Computers & Internet (14 answers total) 2 users marked this as a favorite
Yes - you need to make a relational database. One table would have People, and one would have Projects. You would need an intermediary table to relate these two - say you call it Project Participation. Each time someone participates in a project, a record goes into Project Participation linking PersonX with ProjectY.
posted by beyond_pink at 9:36 AM on November 3, 2010


Someone may have a better solution, but you probably want three tables:

1) Project table. One row for each project, have all the fields you need to describe a project but leave out people.

2) People table. One row for each person, have all the fields you need to describe a person.

3) Project-people table. One row for each person-to-project assignment. This is the table that keeps track of who's assigned to a project. It has two columns: one for a Project ID that refers back to a row in the project table, and one for a Person ID that refers back to a row in the person ID.

Let's say you have a project A with Bob, Mary and Carla, and a project B with Ted and Carla. Your Project table has two rows describing A and B, and your Person table has four rows describing Bob, Mary, Carla and Ted. Here's what your third table would look like:
------------------------
| projectID | personID |
------------------------
| A         | Bob      |
| A         | Mary     |
| A         | Carla    |
| B         | Ted      |
| B         | Carla    |
------------------------

posted by chrominance at 9:40 AM on November 3, 2010


Oh, also, this is more advanced stuff, but: if you want to store data specific to the project-person relationship—say, you want to list roles like Carla is the leader of project B—you can add that as an additional column to the project-person table.
posted by chrominance at 9:42 AM on November 3, 2010


Thanks for the replies so far. I didn't mention that I do have tables for status, people, projects, budgets, and more, but I don't have a project-people table. That's the ticket, I think.
posted by bluejayway at 9:50 AM on November 3, 2010


Also, it never hurts to have a guaranteed-unique numeric ID for each row of each table. Add a field for that. And then your ProjectPeople table can have tiny rows with just ProjectPersonID, ProjectId, PersonID fields. Ifyou want to do the roles thing, have a Roles table with RoleID (unique numeric), RoleName and a bunch of description fields, and stick a RoleID in each row of the ProjectPeople table as well.
posted by flabdablet at 9:57 AM on November 3, 2010


Oh yeah, I thought I'd missed something. flabdablet is right, have a unique ID on all the rows of your ProjectPeople table as well.
posted by chrominance at 10:26 AM on November 3, 2010


Another small tip...

These relationships probably have a timespan, so quite likely you'll want to have start and end dates for the projects and also start and end dates for people's participation in projects.

That way you will be able to handle queries such as "What is John Smith working on now?" or "What is John's workload over the next three months?"


Re Build v Buy

There's also free and open source. A couple of examples are Redmine and ProjectPier.

From the nature of your question, my suspicion would be that you're seriously underestimating what it would take to build a really useful project management application.
posted by philipy at 11:01 AM on November 3, 2010


Thanks for the Redmine tip, I will look into it. To clarify, this is for portfolio management, not project management. "Really useful" is in the eye of the beholder, and the whole purpose of this build vs buy exercise is to see if we can provide the limited data the executive leadership wants without using a Clarity-sized tool. For what it's worth, I'm in the buy camp. But buying, even if it's free, has its own set of complications and associated costs - not the least of which is getting it past our Kim Jong-Il style IT department (backward and totalitarian).

I've built something like this before at another company and know that it's doable. The last time was Access/SharePoint based, but didn't include resource tracking. I *know* that something even as relatively low-cost as Project Enterprise will accomplish this, but I have to present something that's built in addition to something that's bought.
posted by bluejayway at 11:21 AM on November 3, 2010


Do you have access to a library? What you are trying to do is a standard example in many books on databases. IIRC it is treated in some detail in Elmasri & Navathe - you definitely need to check it out before re-inventing the wheel.

Appart from that, a couple of minor to trivial points:

Yes - you need to make a relational database.

This is not strictly true. You want a database to model relations among entities, which is not necessarily a relational database.

Roughly speaking, relational databases are the ones where the data are modeled in a way based on algebraic relations, i.e. subsets of the cartesian product of sets of possible values. This terminology is greatly confusing, because non-relational databases (of which you won't find many around, at least not in a form you would immediately identify as a database) can also represent relations among data.

A good term to google for relations in the sense the OP is interested in is the Entity-Relationship model and accompanying Entity-Relationship diagrams.

Note that all this is of mostly academic interest to the OP, as most general-purpose database software these days is based on the relational model.

Also, it never hurts to have a guaranteed-unique numeric ID for each row of each table.

It may not hurt , but it often encourages poor design. The main problem is that a numeric ID has no other semantics appart from identification, and will allow you to represent things in your database that make no sense or hide problems in the data if you rely too much on it. On the other hand a simple, single-field key is usually a great convenience, as flabdablet rightly points out, so you will probably end up adding plenty of them, but don't get carried away thinking they will solve all your problems.

As an example, if you key every person record by their unique PersonID, and don't spend too much time on data validation, it is now valid to have two records with the same first name, last name and personal details but different IDs. Are they different people? A numeric ID will let you sidestep the problem of not being able to distinguish between the two records, but now you don't know if there are indeed two people with identical names or this is a double entry. If your data contains a practical primary key, it is often a better idea to use that instead of or in addition to an arbitrary unique ID.
posted by Dr Dracator at 12:07 PM on November 3, 2010


You are attempting to design a database. Your options are to (a) continue stumbling along, asking for help with the simplest of concepts, (b) take a class or read a book and learn how to do it, (c) hand it to somebody in the organization who knows how to do it, or (d) hire a consultant/temp who does this for a living.
posted by exphysicist345 at 10:04 PM on November 3, 2010


Also, it never hurts to have a guaranteed-unique numeric ID for each row of each table.

This is wrong. While we could argue the details of database design and natural vs. surrogate keys, I'll flat out say that you specifically don't want to do this on your mapping tables (the project-people table in your example). That sort of thing should have a composite primary key -
CREATE TABLE projects_people (
  person_id integer not null references people(id),
  project_id integer not null references projects(id),
  primary key (person_id, project_id)
);
where the you have no extraneous identity column, and simultaneously ensure that a b0rked app doesn't somehow make one person a member of some project multiple times.
posted by russm at 4:37 AM on November 4, 2010


As an example, if you key every person record by their unique PersonID, and don't spend too much time on data validation, it is now valid to have two records with the same first name, last name and personal details but different IDs. Are they different people?

I've seen a naive user of a popular genealogy database with a mediocre UI become horribly confused by exactly that issue. Sometimes Grandpa was married, sometimes not. Took a while to untangle.
posted by flabdablet at 5:02 PM on November 4, 2010


Also, russm's point about the desirability of composite primary keys in mapping tables is fair and reasonable.

The main reason I personally prefer using simple numeric keys even in mapping tables is to limit the number of places I need to change something when something needs to change. For example, if the database design makes (person_id, project_id) the primary key for projects_people, it won't allow any given person to be attached to any given project more than once. That's either a feature or a bug, depending on policy.

If policy says there must be only one relationship between any given person and any given project, it's not hard to tell the DB to enforce the uniqueness of each (person_id, project_id) tuple within projects_people explicitly, rather than making it a side effect of using that as the primary key; and there's an argument that if you want a uniqueness restriction like that, it's better to make it explicit.

On the other hand, it might be quite legitimate for a person to be associated with a project multiple times; perhaps a person can have multiple roles relative to a project. One reasonably natural way to deal with this would be to add a role_id column to projects_people. But now the primary key for projects_people has to be changed to (person_id, project_id, role_id), any database table or code that attempts to reference projects_people by primary key needs to change, and we now have (person_id, project_id, role_id) tuples appearing all over the place. If projects_people were instead keyed by a unique project_person_id, that wouldn't happen.

Naturally, whether or not it's good to force all that other stuff to get re-examined, even if only to the cursory extent necessary to fix up the primary key references, is a judgment call in its own right.

These are exactly the kinds of issues that inexperienced database designers find themselves coming unstuck on, and there are a lot of those issues; exphysicist345's suggestion that you consider availing yourself of onsite expertise is quite sound.
posted by flabdablet at 6:16 PM on November 4, 2010


Another consideration: if there are certain roles that all projects always have (e.g. customer, lead developer, QA coordinator) and those are the only roles you care about, and there is nothing you need to capture about those roles apart from their existence, then rather than have a roles table you might be better off with a column in the projects table for each such role, containing the person_id for the person who has that role. The projects_people table would of course still be needed to track the peons other team members.

Depending on the kinds of query you need, it may or may not be appropriate to create projects_people entries for each project's customer, lead developer and QA manager as well as the others involved. This would create data consistency issues you'd need to keep on top of, though.
posted by flabdablet at 6:37 PM on November 4, 2010


« Older How do you get your kid out the door in the...   |   Best way to research an area before buying a house... Newer »
This thread is closed to new comments.