Seeking recommendations for records management database stuff
May 27, 2013 11:25 AM

I'm about to start a new job (!) as a records manager with a mid-sized company specializing in law and engineering. My background is in academic libraries, not IT, and I would like to run my half-baked ideas past you all before plunging into the abyss.

Basically they want a way to manage their documents and media files, so some kind of database is in order, right? I want to stick with an open source/free or nearly free solution, but it's not mandatory.

- Each project has thousands of media files attached to it, so storage space is important. (i.e., we won't be able to use the free version of Dropbox.)

- They have several offices and may open more, so any solution should be flexible and able to be used from multiple locations. I think the right term is "scalable" but I could be off.

- Access to the files needs to be secure and authenticated. I need to be able to "freeze" project contents so no changes can be made after a certain point, such as after receiving a subpoena.

- Is DSpace a good option here? Or should I propose something proprietary, like MS Access?

- I have no experience setting up a server or using Linux. Am I f#cked? If so, how much? I'd like to learn new IT skills as part of this job, and I hope I'm not being too naive about my ability to cut it.

Thanks in advance!
posted by wowbobwow to Computers & Internet (9 answers total) 6 users marked this as a favorite
 
If this list of requirements is exhaustive*, I'd consider setting up a web server and making the documents accessible through WebDAV. One folder per project, with appropriately structured subfolders in each project to store the media files and other files. The WebDAV store could be accessed over HTTPS and require authentication. The web server's access logs would also give you a basic audit trail. Best of all, you can map WebDAV stores as network drives in Windows (and probably on OS X and I'm sure in Linux), so all of the projects will just appear to users as folders in their Z:\ drive (or whatever letter you choose). So, no learning new software or interfaces for them.

Snapshots would be simple for you. While logged in to your server,

tar cvf project69_snapshot20130527.tar /path/to/webdav/store/project69

Your snapshot is now stored as project69_snapshot20130527.tar
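
If you end up taking snapshots regularly, the same idea can be wrapped in a small script. This is only a rough sketch, not a tested procedure: the function name snapshot_project and the output folder are made up for illustration, and the checksum step is my own addition (it assumes GNU sha256sum is installed; on OS X you would probably use shasum -a 256 instead). The checksum lets you show later that the archive has not changed since it was made.

```shell
#!/bin/sh
# snapshot_project STORE NAME OUTDIR   (hypothetical helper, not a standard tool)
# Archives STORE/NAME into OUTDIR/NAME_snapshotYYYYMMDD.tar, writes a
# SHA-256 checksum file next to it, and prints the archive path.
snapshot_project() {
    store=$1
    name=$2
    outdir=$3
    archive="$outdir/${name}_snapshot$(date +%Y%m%d).tar"

    # -C switches into the store first, so paths inside the archive are
    # relative (project69/...) rather than absolute.
    tar -cf "$archive" -C "$store" "$name" || return 1

    # Record the checksum using the bare file name so it can be verified
    # from inside OUTDIR later with: sha256sum -c FILE.sha256
    (
        cd "$outdir" &&
        sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256"
    ) || return 1

    echo "$archive"
}
```

You would call it as something like snapshot_project /path/to/webdav/store project69 /path/to/snapshots, where the last folder is whatever location you choose for keeping snapshots.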

If you need to actually freeze the project itself, I'm pretty sure you can just remove write access to the folder.
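
A minimal sketch of what removing write access could look like at the filesystem level, assuming the WebDAV store is an ordinary directory on a Linux server. The function names and the throwaway demo folder are hypothetical. Note that root can still modify the files, and depending on how the web server is configured you may also need to restrict writes in the server's own settings.

```shell
#!/bin/sh
# freeze_project DIR: strip write permission from DIR and everything under it,
# for owner, group, and others alike.
freeze_project() {
    chmod -R a-w "$1"
}

# unfreeze_project DIR: give write permission back to the owner only.
unfreeze_project() {
    chmod -R u+w "$1"
}

# Demo on a temporary directory standing in for a real project folder.
demo=$(mktemp -d)
mkdir "$demo/project69"
echo "draft" > "$demo/project69/report.txt"

freeze_project "$demo/project69"

# The permission column should no longer show any "w" for the frozen files.
ls -l "$demo/project69"
```

Try it on a copy first; once you are comfortable, point freeze_project at the real project folder.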

I have no experience setting up a server or using Linux. Am I f#cked?

Yes. If you want to unfuck yourself, install Linux today (today, not tomorrow) and start learning. I'd recommend Debian or CentOS, as those are Linux distributions commonly used for servers. Download VirtualBox and get a Linux distro set up in a VM, or if you want to get started even more quickly, go to Digital Ocean and get a VPS set up. The cheapest option is about $6/month.


*It never is. Clients/bosses always think of additional requirements.
posted by jingzuo at 11:44 AM on May 27, 2013 [1 favorite]


One open source solution to check out is Alfresco. It's DOD certified for records management.

It's a bit overwhelming to set up and run without dedicated IT staff, so proceed cautiously if you don't have IT support.
posted by advicepig at 11:45 AM on May 27, 2013 [1 favorite]


Best answer: The fact that you even mentioned Dropbox for storing legal documents is crazy talk. The fact that you're considering Access, which is a desktop solution, not an enterprise solution, is crazy talk. Yes, I think you are _way_ in over your head on the technology side.

You need better requirements. You need to educate yourself on what the words these folks are using mean. You need to learn what risk tolerance the business has towards technology. You need to find out what IT requires for purchasing and installing new software.

Most importantly, you need to start reading about the product space and stop hewing to what you know.

In short, if this is actually a mid-sized company, go in and learn how many decisions you'll actually have to make on your own about this stuff, do some research, and come back with a more specific ask when ready. Don't go in all cowboy pretending to know what you don't.

And seriously, don't believe anyone that posts specific solutions before you have a better requirements list. This is barely sufficient to start a conversation about potential vendor solutions.
posted by bfranklin at 11:46 AM on May 27, 2013 [6 favorites]


Best answer: - Each project has thousands of media files attached to it, so storage space is important. (ie: won't be able to use the free version of Dropbox.)

Storage is cheap - it's the organization, access interface and robustness that make or break this kind of project.

- They have several offices and may open more, so any solution should be flexible and able to be used from multiple locations. I think the right term is "scalable" but I could be off.

What is the network setup like? Are all the sites connected in some way? What level of access do the remote sites have to your main facility? You need to be talking to your IT people about this - the options available to you are vastly different depending on the infrastructure and central management capabilities available.


- Is DSpace a good option here? Or should I propose something proprietary, like MS Access?

I don't have much experience with DSpace except as an end user, but MS Access is rarely a good answer, unless you are organizing your personal library or wine collection. You can use it for evil, soul-sucking actual production work, but it will definitely bite you in the ass at some point down the road.

- I have no experience setting up a server or using Linux. Am I f#cked? If so, how much? I'd like to learn new IT skills as part of this job, and I hope I'm not being too naive about my ability to cut it.

You are kind of fscked in a relative sense: if your employers expect you to set up and administer a Linux server, and do not realize you do not have the relevant skills, your employers are clueless. That being said, if you are generally competent in IT matters you will find basic system administration is much simpler than you would expect, with one big caveat: you must always align yourself with the correct Tao. If you are following instructions from somewhere and find you need to fiddle with things or ignore some error messages, you will probably come to regret it later.
posted by Dr Dracator at 11:51 AM on May 27, 2013 [1 favorite]


Response by poster: What is the network setup like? Are all the sites connected in some way? What level of access do the remote sites have to your main facility? You need to be talking to your IT people about this - the options available to you are vastly different depending on the infrastructure and central management capabilities available.

They don't expect any sort of server/DBA service from me; they basically just want a document librarian to help them organize things. Currently they just have a shared network drive. The offices are not connected in any way, IT-wise, and they didn't mention connecting them as a priority. I guess I was just thinking "it would be nice" to have that someday. I am probably over-thinking how to organize their stuff for them, and will know a lot more once I actually start. Methinks a more tailored AskMe is in order when I cross that bridge, as bfranklin suggested.
posted by wowbobwow at 12:02 PM on May 27, 2013


If you want recommendations about commercial products to try and to avoid, contact me via MeMail. Very few organizations I know of from 500-50,000 seats use anything open source due to the lack of supportability and accountability (which is not to say that there aren't some open source solutions out there that will let you buy a support contract), though many of the larger vendors do leverage some open source technology.

The problem is that it sounds like you are being asked to play the following roles, which are usually occupied by different people in the organization:
1) Records Manager. - This is the person responsible for working with the business to determine what their needs are, not the technical specifications. This person would help them define retention policies and groups.

2) Sales Engineer. - This is the person responsible for selling the business on a SPECIFIC product or suite of products, framing the presentation in such a way that it speaks to the needs of the business as defined above.

3) Engagement Manager. - This person would come up with a plan for how to implement the product that was decided upon by the business. Think of this role as the architect of the product. This person would specify what infrastructure (server, network, etc.) resources would be needed and how they would all fit together. This person would also be responsible for providing documentation to the customer about their specific configuration.

4) Implementation Engineer. - This person would work with the Engagement Manager to help build the infrastructure as designed by the Engagement Manager.

5) Support Specialist. - This person would be handed a functioning records management infrastructure and be responsible for maintaining the servers, patching the software, and responding to the business and Records Manager when issues arise.

It sounds like you need to focus on the Records Manager role immediately. You need to work with the business to define their needs or they risk potential civil liabilities for failing to follow a defined and enforced retention policy and you risk your job.
posted by MonsieurBon at 12:52 PM on May 27, 2013 [1 favorite]


Seconding what bfranklin and Dr. Dracator said; from my very limited experience with these sorts of problems, Alfresco and DSpace/Fedora Commons both sound like plausible solutions (in general, maybe not for your specific company), but not for you right now and possibly not at any point -- Alfresco, at least, is very enterprisey.

That said, it doesn't seem like your employer wants more than you can give yet, and getting the network drive organized (and securely backed up! including off-site backups!), then investigating how to provide access to it, sounds like it'll be both immediately useful and a good way to start learning by doing with IT.
posted by snarkout at 6:00 PM on May 27, 2013 [1 favorite]


Good luck if you choose Alfresco. I spent about a year trying to upgrade our installation to their latest version but the software was so seriously buggy and tech support was so lacking (I had to actually teach some of their tech support people how to use it), that we have started investigating other solutions. People badmouth SharePoint a lot, but so far it seems like a heavenly dream in comparison.
posted by jenh526 at 8:19 PM on May 28, 2013 [1 favorite]


@wowbobwow I guess you are wiser now than when you first put your question to the forum. I agree with @MonsieurBon that you have to define your role more precisely.

One aspect that has not been addressed too much is your database needs: you are going to want to store an awful lot of data and access it quickly, as needed. Looking at your background, you might want to look at NoSQL-based, scalable options.

In a recent search, I came across this discussion of some scalable, NoSQL hosted services:

These are all NoSQL-based solutions: Redis Cloud (though it's not recommended for technical rookies) and two others (from Amazon and Mongolabs).
posted by DanielaSchuster at 2:53 AM on November 6, 2013

