Document import handlers
March 8, 2011 7:52 PM

I need to import multiple documents into a CouchDB database. I am behind a corporate firewall, and we're not going to be able to open a port for contributors to access an admin tool on CouchDB. How can I ensure that the docs we get have the fields we are expecting?

So far I've considered:
  • a Word template (downside - handling many different versions of Word)
  • a Java app that provides an interface and emails it to us (downside - might not be able to get to the email client if they are using web mail)
  • a web app hosted outside the firewall that emails it to us (downside - corp IT might not be OK with this).

Are there any other options that I should consider? My criteria are:
  • ease/quickness of development
  • ease of use for end-users
  • consistency of input
  • security

Complicating this is that some users will have large numbers of docs to send at one time.
posted by awfurby to Technology (2 answers total)
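
For the "large numbers of docs" concern, one option on the receiving side is to load documents in batches rather than one at a time. CouchDB's `_bulk_docs` endpoint accepts a JSON body of the form `{"docs": [...]}`. The sketch below only builds those request bodies; the function name `bulk_payloads` and the batch size of 500 are assumptions for illustration, not anything from the question.

```python
import json


def bulk_payloads(docs, batch_size=500):
    """Yield JSON request bodies for POST /{db}/_bulk_docs, one per batch.

    batch_size is an arbitrary illustrative default, not a CouchDB limit.
    """
    for start in range(0, len(docs), batch_size):
        # Each body wraps one slice of the documents in the {"docs": [...]} envelope.
        yield json.dumps({"docs": docs[start:start + batch_size]})
```

Each yielded string could then be POSTed from inside the firewall by whatever ingest script picks up the submissions.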
    "a web app hosted outside the firewall that emails it to us (downside - corp IT might not be OK with this)."

    That's probably the easiest way to validate the document before sending it on to your db.
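
As a rough illustration of what "validate the document before sending it on" could look like, here is a minimal sketch of a required-field check. The field names in `REQUIRED_FIELDS` are hypothetical; the thread never says which fields are expected, so substitute your own.

```python
# Hypothetical schema: these field names and types are assumptions, not from the thread.
REQUIRED_FIELDS = {"title": str, "author": str, "body": str}


def validate_doc(doc):
    """Return a list of problems; an empty list means the doc passed."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in doc:
            errors.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected_type):
            errors.append(f"wrong type for {name}: expected {expected_type.__name__}")
        elif expected_type is str and not doc[name].strip():
            errors.append(f"empty field: {name}")
    # Flag anything the schema does not know about, in a stable order.
    for name in sorted(set(doc) - set(REQUIRED_FIELDS)):
        errors.append(f"unexpected field: {name}")
    return errors
```

The web app (or any other front end) would refuse to submit until `validate_doc` returns an empty list, and the ingest side could run the same check again before writing to the database.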
    posted by Blazecock Pileon at 8:46 PM on March 8, 2011

    We've had to deal with similar requirements on a recent project. We tried Word templates, YAML, hand-edited XML, and local browser applications, but the end users had difficulty using each of those technologies. We settled on a Java Swing application that generates both human-readable text and a non-readable "data dump" (really encoded XML), which the user can then cut and paste into the mailer of their choice. You can only generate the text/data dump once all client-side validation rules have been met. The ingest process picks up the data dump from the inbound e-mail, makes sure it matches the human-readable form, and pushes it into the database.
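
The round trip described above (readable text plus an opaque dump, with the ingest step checking that the two agree) might be sketched like this. Note the swaps: the tool in this comment was a Java Swing app emitting encoded XML, while this sketch uses Python with base64-encoded JSON, and every function name here is made up for illustration.

```python
import base64
import json


def render_text(record):
    """Human-readable form: one 'key: value' line per field, sorted by key."""
    return "\n".join(f"{key}: {record[key]}" for key in sorted(record))


def make_dump(record):
    """Opaque dump: base64 of JSON (an assumed stand-in for the encoded XML)."""
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    return base64.b64encode(canonical).decode("ascii")


def ingest(text, dump):
    """Decode the dump and accept it only if it still matches the readable text."""
    record = json.loads(base64.b64decode(dump).decode("utf-8"))
    if render_text(record).strip() != text.strip():
        raise ValueError("human-readable form does not match the data dump")
    return record
```

A check like the one in `ingest` is what would catch hand-edits to the readable text: the dump still decodes to the original record, so the regenerated text no longer matches what arrived in the e-mail.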

    This has proven to be a reliable method, but there's been a learning curve for the end users. Early on, many messages came through without the data dump; later, there was a major uptick in validation errors as users got comfortable enough to start hand-editing the human-readable form.
    posted by backupjesus at 9:14 PM on March 8, 2011
