uploading/downloading huge files
August 26, 2008 8:10 AM Subscribe
How to code a Perl web app to upload/download HUGE files?
A straight HTTP upload/download for my app is untenable because I'm dealing with files of hundreds of megs each. I'd like to see a solution that would offer progress bars, and (more importantly) the ability to restart incomplete uploads or downloads.
The existing app is Apache 1.3/mod_perl/perl 5.8. Uploads require a login; downloads don't, but are limited inasmuch as they're keyed by UUIDs that are only considered valid in a short time-window. (I don't want the downloaders to see the real filename or location, but I suppose I could make a link named for the UUID that gets deleted after its valid lifetime if I pursue an FTPish solution.)
I'm open to ditching doing the uploading and downloading in the browser if need be and requiring users to install client software to upload/download, but the download solution would need to be simple, easy, and available for all of Windows, Mac, and Linux. The upload solution would need to be simple and available for Windows.
But ideally, something AJAXy on the client-side and Perl (or plays-well-with-Perl) on the server-side can save the day and allow me to stick with the browser. I haven't done anything like this before. Anyone have any suggestions?
(3rd-party hosting isn't an option.)
Your users' web browsers should be putting HTTP's 'Range' header in their GET requests when resuming failed or partially completed downloads. Apache should honor that header when it processes a request. If your users never resume downloads, or their browsers don't support the header, you should be able to write some AJAX to download the file in chunks; that way you can set the headers in the XMLHttpRequest yourself, and use cookies to track how many chunks the user has downloaded. Or something.
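If the download handler ends up sending the bytes itself (say, because the UUID lookup happens in application code), it would have to interpret the Range header on its own. Here is a rough sketch of that parsing step, written in Python for brevity rather than Perl. The function name `parse_byte_range` is hypothetical, and it only handles a single range, not a comma-separated list.

```python
import re

def parse_byte_range(header, file_size):
    """Return (start, end), inclusive, for a single 'bytes=' range.

    Returns None when the header is malformed, lists several ranges,
    or cannot be satisfied for a file of file_size bytes.
    """
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not match or file_size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        return (max(file_size - length, 0), file_size - 1)
    start = int(first)
    if start >= file_size:
        return None
    # Open-ended form "bytes=N-" runs to the end of the file.
    end = int(last) if last else file_size - 1
    if end < start:
        return None
    return (start, min(end, file_size - 1))
```

A resuming client that already has the first 100 bytes of a 1000-byte file would send `bytes=100-`, which this maps to `(100, 999)`. The handler would then reply with status 206 and a matching Content-Range header, or 416 when the function returns None.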
I have no idea how to handle uploading of files. I know there are WebDAV extensions for Apache that add support for the Content-Range header in PUT requests; maybe that would help somehow? Maybe use the same AJAX and cookies method, but in the opposite direction?
Man, I don't think I could be less helpful if I tried. I'm an interface developer for desktop apps; all you Web people are off in your own weird little universe.
posted by xbonesgt at 11:05 AM on August 26, 2008
Well, I wouldn't call hundreds of MB "HUGE," but here you go...
Just stream the data to disk. If, at any point, you get a failure, detect it and attempt to retry from the last position. If the program aborts, it should also be able to retry from where it left off by using the HTTP Range header, as described above.
I personally try to stay away from Perl and can't help you with the specifics, but a cursory Google found me this streaming MP3 client:
http://www.perlfect.com/articles/streaming.shtml
Basically start from there, but instead of streaming to an MP3 player, stream it to disk.
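For what it's worth, the stream-to-disk-and-resume idea could look something like the sketch below. It's Python rather than Perl, it uses only the standard library, and the function names are made up for illustration. It assumes the server answers a Range request with status 206; if it answers 200 instead, the sketch starts the file over.

```python
import os
import shutil
import urllib.request

def build_resume_request(url, dest_path):
    """Build a GET request that asks only for the bytes not yet on disk."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    request = urllib.request.Request(url)
    if offset:
        request.add_header("Range", f"bytes={offset}-")
    return request, offset

def resume_download(url, dest_path, chunk_size=64 * 1024):
    """Stream url to dest_path, appending to a partial file when possible."""
    request, offset = build_resume_request(url, dest_path)
    with urllib.request.urlopen(request) as response:
        # 206 means the server honored the range, so append.
        # Anything else (normally 200) is the whole file, so overwrite.
        mode = "ab" if offset and response.status == 206 else "wb"
        with open(dest_path, mode) as out:
            shutil.copyfileobj(response, out, chunk_size)
```

Calling `resume_download` again after a dropped connection picks up from the current file size. If the file on disk is already complete, the server will typically answer 416, which `urlopen` raises as an `HTTPError`, so a real client would want to catch that.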
posted by atomly at 11:45 AM on August 26, 2008
This thread is closed to new comments.
There are also a variety of "download managers" available for web browsers. Many/most of these support download resuming, and handle large files fine. I don't know if any of them do uploading, I don't really use them much, but it's an option I suppose.
For upload, I've made applications in the past that split the file up into pieces and upload each separately - in fact this is how I upload full-size images to my website. Once they're all uploaded I re-constitute them into one file. Supporting upload stop/start would mostly be a matter of detecting which ones had already been uploaded - I don't do this myself because most of my files are 5 megs or less, so individual success is largely guaranteed. I have a simple application that runs on the client side. Doing this in any scripting language would be pretty easy. (I did it in Tcl/Tk)
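To give a rough idea of the split/detect/re-constitute steps, here is a sketch in Python (not the Tcl/Tk tool mentioned above). The function names and the `000000.part` naming scheme are invented for the example, and the actual transfer of each piece is left out; any plain HTTP POST per piece would do.

```python
import os

def split_file(src_path, chunk_dir, chunk_size):
    """Write src_path into numbered pieces in chunk_dir; return the piece count."""
    os.makedirs(chunk_dir, exist_ok=True)
    count = 0
    with open(src_path, "rb") as src:
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            with open(os.path.join(chunk_dir, f"{count:06d}.part"), "wb") as piece:
                piece.write(data)
            count += 1
    return count

def missing_pieces(received_dir, total):
    """List the piece numbers that have not arrived yet."""
    have = set(os.listdir(received_dir)) if os.path.isdir(received_dir) else set()
    return [i for i in range(total) if f"{i:06d}.part" not in have]

def reassemble(received_dir, total, dest_path):
    """Concatenate all pieces, in order, into dest_path."""
    if missing_pieces(received_dir, total):
        raise ValueError("cannot reassemble: some pieces are missing")
    with open(dest_path, "wb") as dest:
        for i in range(total):
            with open(os.path.join(received_dir, f"{i:06d}.part"), "rb") as piece:
                dest.write(piece.read())
```

Restarting an interrupted upload then means asking the server for `missing_pieces(...)` and sending only those. Note that this only checks for a file's presence, so a piece that was cut off mid-transfer would need a size or checksum check on top.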
posted by RustyBrooks at 9:37 AM on August 26, 2008