Best practices for version control with binary files
June 12, 2015 8:01 AM

I am working with a non-technical distributed team that is drowning in different versions of binary files (Microsoft Office, PDFs, small icons). I want to suggest using version control like Git to simplify things and am willing to take on the responsibility for training the team. What are some of the best practices we should implement as a team in order to make this work smoothly?
posted by philosophygeek to Computers & Internet (9 answers total) 7 users marked this as a favorite
 
Are people all editing the same files, or does each member have responsibility for a file/file-set, and just needs to share those files with the rest of the team ?

With binary files that don't automatically/easily diff, how do you handle merges and collisions ?

Do you want to be able to diff things? If so, SharePoint handles a number of formats (as opposed to relying on Word/Office's "track changes").

Are folks mature enough to put relevant details in the commit log message?
posted by k5.user at 8:09 AM on June 12, 2015


Response by poster: Are people all editing the same files, or does each member have responsibility for a file/file-set, and just needs to share those files with the rest of the team ?

More of the latter, and we can try to design a workflow that encourages that sort of activity. My assumption is that we'll have to do a system like that, but I wanted to see if there were other best practices. Or, if there are technical tools that help enforce that practice, that could be useful too. Our current sharing system is a disaster because we need to be able to look back at previous versions, and that results in dozens of folders or files named "XYZ - edit - final - really final 002".
posted by philosophygeek at 8:28 AM on June 12, 2015


Besides version control, your PDFs can carry metadata in the file itself. That should already include the file's creation date, and any decent PDF software should also let you add metadata for the version, perhaps in the Title field?
posted by plinth at 8:51 AM on June 12, 2015


Alfresco would probably suit you very well. They've got a (pseudo?) Open-source product that is free and requires at least one techie person to set up and maintain. Everything else is done through a GUI. They also have totally managed solutions with custom development and tech support.

Teamwork.com has file versioning, comments and revision history. People are responsible for files, and can send notifications when updates are uploaded. Organizing files could be a bit better. To be clear Teamwork is doing a whole bunch of other program management stuff, and file management is not a main feature.

And to throw out another suggestion, Basecamp. Project managementy stuff, with document managementy stuff too.
posted by fontophilic at 9:42 AM on June 12, 2015


SharePoint?
posted by pharm at 10:35 AM on June 12, 2015


Seconding Box.com. Simple UI with familiar sharing and permissions model, and handles a version history. Has good web preview for common file types, and handles large files well. I just used it for final hand-off of assets to a client and uploaded some zipped design assets of ~300 MB no problem.

Also, you may want to agree on some naming convention best practices and share those with the organization. Is

Sales Presentation 20150502.ppt

ok or do you share with people who may confuse May 2 with Feb. 5, etc.
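
One way to sidestep the May 2 vs. Feb. 5 confusion is to always write the date as year-month-day with hyphens (the ISO 8601 order), which also sorts chronologically in a file listing. A minimal sketch of that idea follows; the helper name dated_filename and the exact "Title YYYY-MM-DD.ext" layout are assumptions for illustration, not a convention anyone in this thread prescribed.

```python
from datetime import date


def dated_filename(title: str, day: date, extension: str) -> str:
    """Build 'Title YYYY-MM-DD.ext' (hypothetical layout, ISO 8601 date order)."""
    # isoformat() always yields year-month-day, so May 2 cannot be read as Feb. 5.
    return f"{title} {day.isoformat()}.{extension.lstrip('.')}"


print(dated_filename("Sales Presentation", date(2015, 5, 2), "ppt"))
# prints: Sales Presentation 2015-05-02.ppt
```

Whether the date goes before or after the title is a team decision; putting it first makes a folder sort by date, putting it last makes it sort by document.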
posted by freecellwizard at 10:55 AM on June 12, 2015


It automatically creates a version history each time the file is changed. The old versions can be pulled up directly in the interface and reviewed, then you can make old versions the new current version, if you need to roll back.

Dropbox for Business does this as well. Also if you use MS Office a lot, it has some integration where you can see if you're simultaneously editing with someone else and if they save any changes. You can also leave comments on any changes made.
posted by bluefly at 12:44 PM on June 12, 2015


Git is great for source code and other text content where a computer can meaningfully track changes without understanding the underlying document format, but sucks for binary files - if the data can't be delta-compressed well against previous versions (and already-compressed formats like Office documents and many PDFs usually can't), then every clone carries close to a full copy of every version of every file, and the repo ends up growing huge very quickly. If you want to use a traditional VCS instead of something like Box or SharePoint, you'd be better off (as much as I hate to say it) going with Subversion.
posted by russm at 5:54 PM on June 12, 2015


Low-tech, but I use my folders and filenames for this manually. In the shared files, XYZ isn't a document, it's a folder. In that folder, there's a file and two more folders. The file is the most current "released" version. The folders are "Old revisions" and "working."

In "working" I never save over old versions. Every time I save I give it a new name, like "150612a XYZ revised with data from heathers email" so the file name sorts chronologically and also has a commit-log-type summary of what I did.

When I get to where it's ready to release I move the top level one into the old revisions folder, copy the new finished version to the top level (without the remark part in the file name), and wipe out everything in the working folder to start all over on the next version.

It sounds cumbersome, but it's actually pretty quick. It's easy to go back to a previous version, and it's easy for other people to not get confused which one is the "real" one.
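
For anyone who would rather script the "release" step than do it by hand, here is a rough sketch of the same routine. The function name release and its parameters are made up for illustration; the folder names "Old revisions" and "working" come from the description above, and the script assumes each released file gets a distinct name so that nothing in "Old revisions" is overwritten.

```python
import shutil
from pathlib import Path


def release(doc_folder: Path, finished: Path, released_name: str) -> Path:
    """Promote a finished working file to be the current released version."""
    old = doc_folder / "Old revisions"
    working = doc_folder / "working"
    old.mkdir(parents=True, exist_ok=True)
    working.mkdir(parents=True, exist_ok=True)

    # Move the currently released file(s) at the top level into "Old revisions".
    # Assumption: released names differ between versions, so nothing is clobbered.
    for item in list(doc_folder.iterdir()):
        if item.is_file():
            shutil.move(str(item), str(old / item.name))

    # Copy the finished version to the top level under its clean name.
    target = doc_folder / released_name
    shutil.copy2(finished, target)

    # Wipe the working folder so the next round starts fresh.
    for item in list(working.iterdir()):
        if item.is_file():
            item.unlink()

    return target
```

A call might look like release(Path("XYZ"), Path("XYZ/working/150612a XYZ revised with data from heathers email.docx"), "XYZ rev 2.docx"), where the paths and the "rev 2" name are placeholders. Since the script deletes the working drafts, try it on a copy of the folder first.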
posted by ctmf at 8:51 PM on June 12, 2015

