Also, I think the database is so shoddily constructed that it can't handle the kind of programmed cleaning that you're suggesting. It appears that this job will have to be done manually. (Then again, it is important to note: none of us really knows how to do that sort of thing.)

Well, keep in mind that "programmed cleaning" almost always consists of someone who knows SQL poking vigorously at the database for a week or three, running queries that generate lists of 'likely duplicates', refining criteria, etc. That kind of work isn't necessarily limited by the system you're using or the design of your database; these are the methods you use to migrate the data and massage it into a new structure when "OK, print it all out and type it back in" is unacceptable.
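To make the "likely duplicates" idea concrete, here's a minimal sketch of one such query, run through Python's built-in sqlite3 module. The `contacts` table, its columns, and the sample rows are all made up for illustration; your real schema and matching criteria will differ, and this is only one of many ways to do a first pass.

```python
import sqlite3

# Hypothetical schema for illustration only: contacts(id, name, email).
def build_sample_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO contacts (id, name, email) VALUES (?, ?, ?)",
        [
            (1, "Jane Doe", "jane@example.org"),
            (2, "  jane doe ", "JANE@example.org"),   # same person, sloppy entry
            (3, "John Smith", "john@example.org"),
            (4, "Jane Doe", "jane.doe@example.org"),  # same name, different email
        ],
    )
    return conn

def likely_duplicates(conn):
    """List groups of rows whose trimmed, lowercased name and email match."""
    rows = conn.execute(
        """
        SELECT lower(trim(name)), lower(trim(email)), GROUP_CONCAT(id)
        FROM contacts
        GROUP BY lower(trim(name)), lower(trim(email))
        HAVING COUNT(*) > 1
        """
    ).fetchall()
    # GROUP_CONCAT returns a comma-separated string; turn it into sorted ids.
    return [
        (name, email, sorted(int(i) for i in ids.split(",")))
        for name, email, ids in rows
    ]

if __name__ == "__main__":
    for name, email, ids in likely_duplicates(build_sample_db()):
        print(name, email, ids)
```

The output is a list for a human to review, not an automatic merge. "Refining criteria" then means loosening or tightening the GROUP BY: grouping on name alone, for instance, would also pull in row 4, which may or may not be the same person.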
1) information integrity is a way of life, not a project goal. Beware the lazy user -- make provisions for ongoing de-duplication & the like.
2) decide what matters most: a perfect database, or one that's good enough. Although lazy users are a source of ongoing damage, you face a separate risk from people who will insist that the database be perfect. This will result in the database never being used.
...these two points are in tension with each other, but such is the natural order of things.
(actually, I have a third remark but it may already be too late: beware your implementation consultants. Be absolutely goddamn certain that the scope of work has been nailed down. Nailed down, not "commonly agreed upon" or anything like that. Nailed down with big scary words and tediously-exact statements.)
posted by aramaic at 11:57 AM on May 5 [4 favorites]