I need a library catalog for digital files.
May 21, 2024 10:00 PM

I have a lot of digital files that I need to organize and associate quite a bit of metadata with. What's the best way to do so without learning how to build a database from scratch or paying $$$ for academic library software?

For context, I have several hundred files (mostly photos but not exclusively) that I want to preserve and document in a way that, when I go back and look at them again in 10 years, I will know key information about them, like their creator, date made, and the context for why the item in the photo was created. Ideally this information would also be easily sorted and filtered, and the files displayed simultaneously with their metadata (or at least linked in a clickable way).

Is there a more elegant solution than a very large Excel spreadsheet? I have used Koha in the past and loved it, but standing up my own Koha catalogue and AWS server for file storage would be a lot of work for just this collection. There seem to be plenty of platforms for handling digital media cataloguing, but they're all aimed at large universities with $$ and librarians. What I'm looking for is something more like Zotero or Mendeley, but for digital files. It does not need to be public-facing; this is for my personal collection.

Your digital preservation help is greatly appreciated!

posted by WidgetAlley to Computers & Internet (17 answers total) 11 users marked this as a favorite

Zotero has options for artwork, audio recordings and other types of files. One of the file types may work well enough for your purposes. I've used the notes feature to add additional contextual info for my weird file types (including photos!)
posted by spamandkimchi at 10:27 PM on May 21 [1 favorite]

Koha is absolutely the wrong tool. It's meant for library cataloging, which means MARC, which is a great steaming pile of overkill. I like the idea of using Zotero.

CollectionBuilder or Omeka might also float your boat. For Omeka you probably want the spreadsheet import/export plugin. Webforms for metadata just suck.
posted by humbug at 4:41 AM on May 22 [2 favorites]

I've used Filemaker Pro for this professionally. I had no skill for it before a project but was able to cobble something together that was quite stable (yet changeable) and useable by others in just a few days of experimentation. It's been 8 years since I put it together and it's still good. The cost may be more than you want to pay, but it felt like a good investment for my team.
posted by knile at 4:55 AM on May 22

Check out the demo for ResourceSpace and see if it does what you want.
posted by nonane at 5:27 AM on May 22

If I were starting this project today, I would use spreadsheet software and save as a CSV. If not that, I would make very sure that the database could be saved in a portable, non-proprietary format.

* Yes, Koha is the wrong tool. Too heavy, too much work to run.

* Zotero is a lovely tool (I use the web and standalone versions), and I could see it working. I would personally find it too flexible and too heavy. It is open source and has historically been well supported, which is good. It's a product run out of the Corporation for Digital Scholarship, which is itself run by people I trust, but ten years is a while in digital scholarship-land. Everything from the 2029 collapse of U.S. higher education to the 2032 economic crisis could cause them severe problems, and you'd be left figuring out how to support it on your own, or with the help of the Zotero community (which is, to be fair, really solid).

* Excel is Microsoft, a well-known quantity. It's hated by many, sometimes for good reasons. I don't feel particularly sure that it won't have increasingly horrible AI and privacy aspects in the immediate future, and many would say that it already does. If you don't like MS, tho, then OpenOffice, LibreOffice, Sheets, etc. can all handle CSV files.

* Why not more fully featured database software? I have twice encountered real headaches trying to access (or Access, if you will) old databases that had been saved in proprietary formats and weren't easy to open on the device/platform I was using.
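
A minimal sketch of the spreadsheet-saved-as-CSV idea, using only Python's standard `csv` module. The column names (`filename`, `creator`, `date_made`, `context`) and the sample rows are made up for illustration and are not from this thread; the point is only that a plain CSV can be written, read back, filtered, and sorted without any special software.

```python
import csv
import io

# Hypothetical column names; pick whatever fields matter for your collection.
FIELDS = ["filename", "creator", "date_made", "context"]

# Made-up sample rows, for illustration only.
SAMPLE_ROWS = [
    {"filename": "quilt_front.jpg", "creator": "Grandma",
     "date_made": "1987", "context": "Wedding gift"},
    {"filename": "letter_scan.pdf", "creator": "Uncle Joe",
     "date_made": "1962", "context": "Sent from overseas"},
    {"filename": "quilt_back.jpg", "creator": "Grandma",
     "date_made": "1987", "context": "Shows the signature patch"},
]


def write_catalog(rows, stream):
    """Write catalog rows as CSV with a header line."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_catalog(stream):
    """Read the CSV back into a list of dicts, one per file."""
    return list(csv.DictReader(stream))


def filter_by(rows, field, value):
    """Return the rows whose `field` equals `value` exactly."""
    return [row for row in rows if row[field] == value]


# Round-trip through an in-memory buffer. For a real file, use
# open("catalog.csv", "w", newline="") instead of io.StringIO().
buffer = io.StringIO()
write_catalog(SAMPLE_ROWS, buffer)
buffer.seek(0)
catalog = read_catalog(buffer)

# Filter on one field, then sort the matches by filename.
grandma_items = sorted(filter_by(catalog, "creator", "Grandma"),
                       key=lambda row: row["filename"])
for item in grandma_items:
    print(item["filename"], "-", item["context"])
```

The same file opens in Excel, LibreOffice, or Sheets, so the script is optional rather than a dependency.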

I am not a tech wiz, so you may not find any of what I say a barrier.
posted by cupcakeninja at 5:33 AM on May 22 [2 favorites]

Lightroom, Darktable and Digikam are for managing large image libraries and their metadata. The EXIF record in each image can carry a lot of metadata and would avoid creating a database beyond a folder hierarchy.
posted by k3ninho at 5:35 AM on May 22 [2 favorites]

This is a complicated problem for a lot of reasons, but at least two big ones are that you have mixed document types (so photo-only tools don't work) and you're looking for long-term access/archiving. If this is purely a personal project and you don't really need a lot of sharing to others or the public, something like DEVONThink might be the answer? It's been around a long, long time, the tech underneath is all standards-based and accessible in markup or plaintext...it's effectively a file system and relational database product all wrapped into one. It is Mac only, though there are probably Windows rough equivalents....
posted by griffey at 5:55 AM on May 22

Like k3ninho, I absolutely think it's worth separating out the photographs and using a dedicated tool intended for managing an image library. This is pretty much as close to a solved problem as you're going to get in this space, as there are lots of people who have this need, and the solutions are robust and well-maintained. If you use something else in order to also capture the non-photo files in the same system I think you're going to be compromising a lot on the photos, which you both describe as "most of" the collection and use as the example for your metadata.

(Bona fides: I'm a librarian who does stuff with metadata)
posted by pullayup at 6:02 AM on May 22 [4 favorites]

Going to the opposite extreme, if you're a bit technical-minded and want something that's free, simple (for certain definitions of "simple"), absolutely future-proof, and platform/software independent, you could write your own schema and simply create a corresponding plain text metadata file for each of your items, validate it against your schema, and bob's your uncle. This could be done in JSON or XML, for example.
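
A rough sketch of what that could look like with JSON, assuming one metadata file per item. The field names and the `.json` sidecar naming are hypothetical, and the check below is a small hand-rolled validator (required fields and types), not the JSON Schema standard; a real JSON Schema validator could be swapped in.

```python
import json

# Hypothetical schema: field name -> Python type expected after JSON parsing.
SCHEMA = {
    "file": str,
    "creator": str,
    "date_made": str,
    "context": str,
    "tags": list,
}


def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


def sidecar_name(filename):
    """Metadata file that sits next to an item: photo.jpg -> photo.jpg.json."""
    return filename + ".json"


# Made-up example record.
good = {
    "file": "quilt_front.jpg",
    "creator": "Grandma",
    "date_made": "1987",
    "context": "Wedding gift",
    "tags": ["quilt", "family"],
}

# Serialize and parse again to mimic what would be stored on disk.
text = json.dumps(good, indent=2)
print(sidecar_name(good["file"]))
print(validate(json.loads(text)))

# A record with a wrong type and missing fields is reported, not rejected silently.
bad = {"file": "letter_scan.pdf", "creator": 42}
print(validate(bad))
```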
posted by pullayup at 6:14 AM on May 22 [1 favorite]

I would probably use Greenstone for this, it's open source library software that will do what you want. I haven't used it in about 15 years but it looks like it was last updated in 2023; I'm glad to see it's still out there.
posted by twelve cent archie at 7:17 AM on May 22 [1 favorite]

oh dang Greenstone's been updated? I thought it was dead as a doornail.

Welp, that's gonna change my digital-collections syllabus this summer...
posted by humbug at 7:31 AM on May 22

You may want to look at LibraryThing. It's a website used by thousands of personal and specialized collections.
posted by Enid Lareg at 7:36 AM on May 22

Notion might do what you want, but I share the opinion that it might be best to go with something designed for this.

I'd also make sure it can export to some non-proprietary format, just in case -- and then make sure you DO the export. You don't want to have all this work and all these memories get lost when some company goes out of business or discontinues a product.
posted by librarina at 11:43 AM on May 22 [1 favorite]

Nthing the suggestion to go with a tool built for use with image collections - but if you want to use something that's like a spreadsheet but a bit more robust, you might want to take a look at Grist, which has an open source version with binaries. You can install it on your own computer (the cloud version is optional). Here's a simple example template for images with info - Memes.
posted by kristi at 12:33 PM on May 22

Seconding cupcakeninja re: keeping the approach as simple as it can be. I'm not very techie either, and my internet is not always reliable, so I'm wary of SaaS (and dislike rent-seeking). I do have some things in Access etc., but if data is more texty it's faster to find in more ways.

Most of my information is very cross-disciplinary and hard to file, and I never found a good way of filing until I thought of adding a random number to the filename (I use VSCode's random number tool) preceded with s__ (s double underline doesn't occur anywhere on my system).

Most of my documents have been found via websearch and I start a search using this system too, naming the file after a hypothesis or search target:

Coanda effect search for applications in stormwater sediment settling s__9oOjYpV4.txt

For successful search items I enter the random number in the filename.

While I search I keep a .txt file to track search term efficacy (and I can rerun my searches this way, which saves a lot of time). For files I save, I enter the saved system filename:

Buer 2020 TSS Coanda as a recent addition to hard_BMPs P stormwater SUDS s__9oOjYpV4.pdf

and its doi, meta/comments, if I've discussed paper with author, and their contact dets. etc.

If I take an image that relates to the coded topic I add that to filename.
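
A small sketch of this tagging scheme in Python, for anyone who wants to script it. It assumes the tag is `s__` followed by 8 letters or digits, as in the example filenames above; the generator here uses Python's `secrets` module rather than the VSCode tool mentioned, and the function names are made up for illustration.

```python
import re
import secrets
import string
from collections import defaultdict

ALPHABET = string.ascii_letters + string.digits
# Matches the s__ prefix followed by the ID, as in "s__9oOjYpV4".
TAG_PATTERN = re.compile(r"s__([A-Za-z0-9]+)")


def new_tag(length=8):
    """Return a fresh tag: 's__' plus `length` random letters and digits."""
    return "s__" + "".join(secrets.choice(ALPHABET) for _ in range(length))


def extract_tag(filename):
    """Return the s__ tag found in a filename, or None if there is none."""
    match = TAG_PATTERN.search(filename)
    return match.group(0) if match else None


def group_by_tag(filenames):
    """Map each tag to the list of filenames that carry it."""
    groups = defaultdict(list)
    for name in filenames:
        tag = extract_tag(name)
        if tag is not None:
            groups[tag].append(name)
    return dict(groups)


# The two tagged names are the examples from this answer; the third is untagged.
names = [
    "Coanda effect search for applications in stormwater sediment settling s__9oOjYpV4.txt",
    "Buer 2020 TSS Coanda as a recent addition to hard_BMPs P stormwater SUDS s__9oOjYpV4.pdf",
    "holiday photo.jpg",
]

groups = group_by_tag(names)
for tag, files in groups.items():
    print(tag, len(files))

# Start a new topic with a new tag.
print("New topic notes " + new_tag() + ".txt")
```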

I search using Everything, and Ransack. Everything is dramatically faster than Windows search, and way more nuanced, like you can control path depth e.g. Depth:5 to limit search at five folders deep.

My data is either on my PC or my (local) NAS.
If I were on a Mac I'd seriously consider DEVONthink.

I was surprised to see Greenstone is still alive; my wife completed her degree in his lab.
posted by unearthed at 1:48 PM on May 22

The best database/asset manager that I have used for photos and a very wide array of files is NeoFinder. The developer is highly active and responsive.

It is advertised for photos and music BUT can be used for practically any data file, including archives. An example is ebooks, where I keep track of more than a terabyte of reading material with all the attached metadata, images and related files without a problem.
posted by jadepearl at 8:15 AM on May 25

It would probably be possible to do something like this with Obsidian, and a directory with a bunch of files and descriptive markdown files as the catalog. This might be cumbersome to set up, but once you have a process in place it shouldn't be too much work to keep it going. A bunch of markdown files are easily interoperable if anything happens to them, and Obsidian has a robust community with a bunch of plug-ins for different functions, which might make this even easier.
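
One possible way to generate those descriptive markdown files, sketched in Python. It assumes one note per item, with metadata in a YAML front matter block at the top and a link to the file underneath; the metadata keys, the example path, and the function names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def note_name(file_path):
    """Catalog note name for an item: photos/quilt.jpg -> quilt.jpg.md."""
    return PurePosixPath(file_path).name + ".md"


def render_note(file_path, metadata, description):
    """Build a markdown note: YAML front matter, a link to the file, free text."""
    lines = ["---"]
    for key, value in metadata.items():
        # Quote values so colons or other punctuation do not break the YAML.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{escaped}"')
    lines.append("---")
    lines.append("")
    # Percent-encode the path so spaces do not break the markdown link.
    lines.append(f"[{PurePosixPath(file_path).name}]({quote(file_path)})")
    lines.append("")
    lines.append(description)
    return "\n".join(lines) + "\n"


# Made-up example item.
path = "photos/quilt front.jpg"
note = render_note(
    path,
    {"creator": "Grandma", "date_made": "1987"},
    "Wedding gift; the signature patch is on the back.",
)
print(note_name(path))
print(note)
```

Because each note is plain text, the catalog stays readable in any editor even without Obsidian.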
posted by taltalim at 7:55 AM on May 28


Ask MetaFilter
