What is the best way to digitize all my old notes from school?
August 7, 2012 9:57 AM

I'm a second year dental student, and I'm looking to make all of my handwritten notes digital. After one year of taking notes and receiving handouts I can't take the volume of paper that has accumulated on my bookshelves. Currently, I want to make all my notes/handouts/old exams into digital files.

I thought scanning into PDFs might be an option, with some OCR integrated into it. I'm willing to spend some cash in order to make this happen. I'm also open to completely different ideas on how to do it that I haven't thought of.

Anyway, here is my setup. I have a Mac, iPad, iPhone, and a basic HP scanner. I would like to make the files universally compatible with all computers so that I can throw things on Dropbox and look at them anywhere if I need to, and so that if I switch to another platform 10 years down the road I won't lose compatibility. As far as I know, that pretty much limits me to PDFs, which is fine. (As I mentioned, I'm open to other file formats as well.)

In summary, I need to know the best way to begin the scanning process, what software to use on a Mac, how to cut down on file size with minimum quality loss, and whether I need a new scanner to do this well.

Thank you so much!
posted by jModug to Education (8 answers total) 7 users marked this as a favorite
First step: I would seriously ask yourself if you really need all those notes/handouts. I would be willing to bet that 80% of them can be tossed--you will never look at them again.

How many of your notes from undergrad are useful now? When's the last time you looked at an old exam from then, or longed to review some course handout? Or even during undergrad, how often did you look back at notes from previous years?

The material you're learning this year and next year will build on what you've already learned and reinforce it. And for important licensure exams in the future, you're going to have specialized study guides and courses anyhow. If you cut the amount you're converting to the 20% of the really important stuff, it will make your life a lot easier.
posted by shivohum at 10:15 AM on August 7, 2012 [2 favorites]

The Fujitsu ScanSnap scanners are really popular for this sort of thing. They're really expensive for a consumer device, but they'll probably cut your scan time down to 1/5 or 1/10 of what a regular HP scanner takes.

The fanciest ScanSnaps scan (supposedly) at 20 pages per minute, with a 50-sheet tray and duplex scanning in one pass. With a normal desktop scanner that has a top loader, you'll be lucky to get 6 pages per minute, duplex will be a second pass, and you'll have to reassemble the pages manually. The OCR is likely to be worse as well.

I looked into doing this at a FedEx Office or other copy center, and it was absurdly expensive per page. You might also be able to scan at work or school if you have access to a full-sized copier.
posted by cnc at 10:30 AM on August 7, 2012 [1 favorite]

Best answer: Please don't do this with a basic flatbed scanner - it will be so very painful. As cnc mentioned, I would go ask your favorite research librarian and department administrative assistants what scan services they have on offer, because for your sanity's sake you need a single-pass auto-duplex scanner with a sheet tray, so that you can load a bunch and let it run. Many modern commercial all-in-one printer units have something like this built in and will typically email the scans to you as PDFs, and most academic departments and libraries will have one. Find one that your school pays for, and go monopolize it overnight or during very, very off hours so you don't piss people off. I would play with the settings and probably scan at as high a DPI as possible, but you'll probably need a few test runs with the contrast/brightness settings on your notes to get good results from an unattended run. Also consider how you will organize your files. Handouts separate, or chronological inside the notes, etc.? That's all easiest to change at this point.

Yeah, then use an after-scan OCR - I know Acrobat X Pro has that feature built in, but I don't know how it stacks up to the competition. Also consider taking some time and going through and making linkable indices and clickable links/citations for further use.

You probably want some sort of smart indexing/library service after the fact as well, for keeping track of the files and accessing them, rather than just a big folder o' PDFs. If you work with lots of peer-reviewed journals, you could use something like Mendeley or Papers to do this and incorporate the notes into your current personal journal article database. You can do some cloud syncs with these and have the files accessible via mobile clients as well, and they have annotation/metadata sync options. Meh, actually, these would work pretty well even if you don't currently keep lots of journal articles electronically for reference.

If you have textbooks and want to incorporate them as well, and are a bit handy, there are some neat Instructables online that involve making book scanners for rapid book scanning (something like this, for example). IANAL, IANYL, but I believe that as long as you own the book and use the scan for personal use/backup, this is fair use.
posted by McSwaggers at 12:20 PM on August 7, 2012 [1 favorite]

Came in here to recommend the Scansnap! I have one, and *love* it. I did a large volume of notes from school using it and now use it regularly for work. A+++
posted by chiefthe at 12:49 PM on August 7, 2012

and a basic HP scanner

That ain't gonna work. Unless you enjoy pain and suffering on a level unusual for most people, you're going to need to get a real scanner. I'd recommend either one of the higher-end Kodaks (sometimes available used for a decent price, but be sure you have a return policy) or a Fujitsu.

Then you need to prep the documents. This actually takes longer than scanning them, or rather it should. Well-prepped documents should keep the scanner busy, running at its full speed. You'll want to unbind and unstaple everything that can be unbound and unstapled, and then either separate items by document (I like to turn alternating documents 90 degrees to each other) or insert separator sheets. Many scan applications can be configured to break documents on blank sheets, so you can use those as separators.
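
To illustrate the blank-sheet separator idea, here is a minimal sketch of how a batch could be split into documents. It is not how any particular scan application does it. The "ink fraction" per page (share of dark pixels) and the 0.005 threshold are assumptions made up for the example.

```python
def split_on_blank_pages(ink_fractions, blank_threshold=0.005):
    """Group 1-based page numbers into documents, breaking at blank pages.

    ink_fractions: one number per scanned page, the share of dark pixels
    on that page (hypothetical input; a real scan app measures this itself).
    blank_threshold: pages below this are treated as separators (assumed value).
    """
    documents = []
    current = []
    for page_number, ink in enumerate(ink_fractions, start=1):
        if ink < blank_threshold:
            # A blank sheet ends the current document and is itself dropped.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page_number)
    if current:
        documents.append(current)
    return documents


# Pages 3, 5 and 6 are blank separators in this made-up batch.
batch = [0.08, 0.11, 0.0, 0.07, 0.001, 0.001, 0.09, 0.10, 0.12]
print(split_on_blank_pages(batch))  # [[1, 2], [4], [7, 8, 9]]
```

Note that in this sketch the blank back of a single-sided page scanned in duplex would also count as a separator, which is one reason to do a few test runs first.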

Then you scan and index. I'm not familiar with the Fujitsu software, but the Kodak "Capture" software isn't too terrible. By which I mean it won't make you completely despair and want to end your life immediately. (Most scan software is pretty bad, if not actually malevolent and evil -- looking at you, HP.) Kodak's is less bad, in that it's designed for batch work and doesn't get too much in your way; you can scan batches of documents, and it will present you with the first page of each and let you type in the document name and hit enter before continuing. Very fast once it gets going. It will assemble them into PDFs only when you have completed a batch. I assume Fujitsu's software is similar.

If the documents don't have any important color information, I'd scan at 200 DPI bitonal if the text is average size. Color will be slow and the resulting files will be large. Sometimes I will do color pages at a very low DPI, like 100 or 150, and then scan the same page at 200 DPI bitonal and put them immediately after each other in the output document. (This will screw up the page ordering if you ever print the document again, but I generally don't care.)
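
To put rough numbers on why color is so much heavier than bitonal, here is a small sketch that computes the raw, uncompressed size of one page. The US Letter page size (8.5 x 11 inches) is an assumption, and real PDFs will be much smaller because scans are compressed, so treat these as relative sizes only.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed page size: US Letter, 8.5 x 11 inches.
LETTER = (8.5, 11)

bitonal_200 = raw_page_bytes(*LETTER, dpi=200, bits_per_pixel=1)
color_200 = raw_page_bytes(*LETTER, dpi=200, bits_per_pixel=24)
color_100 = raw_page_bytes(*LETTER, dpi=100, bits_per_pixel=24)

print(bitonal_200)  # 467500
print(color_200)    # 11220000
print(color_100)    # 2805000
```

Before compression, 24-bit color at the same DPI is 24 times the size of bitonal, and even color at 100 DPI is still six times the size of a 200 DPI bitonal page.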

I agree with the decision to output to PDF. The only other widely-compatible format would be multipage bitonal TIFF, but I'm not sure there's any compelling reason to go that route today. At least PDFs can be easily OCRed and have the text and layout embedded in them.

If you can afford the software, I'd get a standalone OCR package license rather than trying to do the OCR at the same time that you're scanning everything. OCR tends to be slow, and it sucks to lose a 100+ page batch because the OCR engine crashed. But I'm not sure how much you want to invest in this project; OCR software is unpleasantly expensive in standalone versions for some reason.
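
The point of keeping OCR as a separate pass can be sketched like this: the scans are already saved, so a crash costs one file's OCR rather than the whole batch. This is only an illustration. `ocr_one` is a hypothetical stand-in for whatever OCR package you end up with, not a real API.

```python
def ocr_in_separate_pass(scanned_files, ocr_one):
    """Run OCR file by file over scans that are already saved to disk.

    ocr_one: hypothetical callable that OCRs a single file and raises
    an exception on failure. A failure is recorded and the loop moves on,
    so the scans themselves are never at risk.
    """
    done, failed = [], []
    for name in scanned_files:
        try:
            ocr_one(name)
        except Exception:
            failed.append(name)
        else:
            done.append(name)
    return done, failed


# Made-up stand-in that "crashes" on one file, for demonstration only.
def fake_ocr(name):
    if name == "anatomy-week3.pdf":
        raise RuntimeError("OCR engine crashed")


done, failed = ocr_in_separate_pass(
    ["anatomy-week1.pdf", "anatomy-week3.pdf", "histology-exam.pdf"], fake_ocr
)
print(done)    # ['anatomy-week1.pdf', 'histology-exam.pdf']
print(failed)  # ['anatomy-week3.pdf']
```

The failed files can simply be retried later, since the original scans are untouched.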
posted by Kadin2048 at 1:34 PM on August 7, 2012 [1 favorite]

I'll just add that I wouldn't expect OCR to work for your handwritten notes.
posted by Good Brain at 1:57 PM on August 7, 2012 [1 favorite]

I <3 my ScanSnap. (And DevonThink Pro Office for Mac.)
posted by Brian Puccio at 3:47 PM on August 7, 2012

Many large Xerox printer/copiers can also scan documents to PDF in a fairly automated fashion. Just load them up in the top tray and use the email mode. For handwritten documents (especially if you used pencil), it's better to turn background suppression OFF. Also make sure to set it to two-sided mode, since the ones I've used default to single sided scanning. Also, if you scan too many pages at once, you may run into email attachment size limitations.
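
For the attachment size limit, here is a rough sketch of how to work out batch sizes ahead of time. The 10 MB limit and 50 KB per page are made-up figures; check your own mail system and a test scan. The 4/3 factor is the usual base64 growth of email attachments, ignoring line breaks and headers.

```python
def pages_per_email(limit_bytes, bytes_per_page):
    """How many scanned pages fit in one email under the size limit.

    base64 turns every 3 bytes into 4 characters, so the attachment
    grows by roughly 4/3 in transit. Integer math avoids rounding surprises.
    """
    return (limit_bytes * 3) // (bytes_per_page * 4)


def plan_batches(total_pages, per_batch):
    """Return (first_page, last_page) ranges, 1-based and inclusive."""
    if per_batch < 1:
        raise ValueError("a single page already exceeds the limit")
    return [
        (start, min(start + per_batch - 1, total_pages))
        for start in range(1, total_pages + 1, per_batch)
    ]


# Assumed figures: 10 MB mail limit, about 50 KB per scanned page.
per_batch = pages_per_email(10_000_000, 50_000)
print(per_batch)                     # 150
print(plan_batches(400, per_batch))  # [(1, 150), (151, 300), (301, 400)]
```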

I digitized most of my notes from my undergrad degree this way.
posted by cosmic.osmo at 7:19 PM on August 7, 2012
