Editing XML files in report production workflow
May 27, 2005 4:58 AM
Editing the data in XML documents via a web browser.
I'm currently contracting in the report-producing department of a big financial company. The biggest bugbear is manually amending copy in many of the six million pages we produce each year. I'm researching into a possible project using XML to make everybody's lives easier...
Currently documents are produced in QuarkXpress with some automation with accounts produced in Excel.
I'm drafting a project plan with the main focus being that the various financial & secretarial types (writers) dotted all over the world who write, check & amend reports are able to log in to some kind of browser front end, call up the text from a section of a document and update the text. The text (as XML files) is passed back & forth between writers and production (who import & export the XML files into & out of their styled QuarkXpress documents).
I've got the concept for the production end using an XML import/export XTension called Atomic Roundtrip but I need some advice regarding the browser front end.
How can I display the text of an XML document in a browser page (ideally with some basic formatting to differentiate between headings, subheadings etc - using XSL?), allow the user to edit the text and then save an updated version of the file?
I want to be able to set up a very basic demo on a single machine to show exporting text from a QuarkXpress document, opening that text in a browser, editing it, saving a new version & then re-importing that back into the QuarkXpress document.
Any ideas? At this point any suggestions, no matter how tangential, could be useful...I'm good at the design & production side but laughable at the XML & web stuff.
It sounds like a wiki hosted on a secure server is the best option for multiple-user updates. There are companies like SocialText which concentrate on offering this approach, and they will probably be able to help with the translation to other formats.
posted by yclipse at 5:49 AM on May 27, 2005
Response by poster: ...parse it with a server-side technology and display forms to the user that they can use to edit the text...
...if you've got users posting data changes in a web environment, you're going to need to use forms...
That sounds reasonable. Excuse my aforementioned laughable web skillz but how do these forms you speak of work??
posted by i_cola at 6:03 AM on May 27, 2005
Response by poster: ... and if I want to produce a simple demo am I going to have to start attempting to code them..?
yclipse: Good link - thanks.
posted by i_cola at 6:07 AM on May 27, 2005
Gut reaction: perfectly do-able. In fact, if you've already got the XML <> Xpress transfer sorted then you've done the tricky part.
I agree with frufry - you don't need to store the data as XML, you simply need to format the data as XML in order to pass it to Quark. I think you'd be much better off storing the data in a database and running a process along these lines:
1) Store the data in a database
2) Put a web-based front end onto this database to allow users to edit the data via a form
3) Write some script that will turn a record in the database into an XML page
4) Import the XML page into Quark
Parts 1&2 are a very common use for web apps and you shouldn't have any problems there. I'm almost certain that scripts will already exist to handle part 3 - if not, they'd be very easy to write in something like PHP / ASP .NET.
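As a rough illustration of part 3, here is a minimal sketch in Python (rather than PHP or ASP .NET) that turns one database record into an XML page. The `sections` table, its columns and the element names are assumptions made up for the example; the real structure would have to match whatever the Quark import XTension expects.

```python
import sqlite3
import xml.etree.ElementTree as ET


def record_to_xml(conn, section_id):
    """Build an XML string for one row of the (hypothetical) sections table."""
    row = conn.execute(
        "SELECT id, heading, body FROM sections WHERE id = ?", (section_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no section with id {section_id}")
    # Element and attribute names are illustrative assumptions.
    root = ET.Element("section", id=str(row[0]))
    ET.SubElement(root, "heading").text = row[1]
    ET.SubElement(root, "body").text = row[2]
    return ET.tostring(root, encoding="unicode")


def demo():
    """Create a throwaway in-memory database and export one record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sections (id INTEGER PRIMARY KEY, heading TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT INTO sections VALUES (?, ?, ?)",
        (1, "Chairman's statement", "Profits & outlook"),
    )
    return record_to_xml(conn, 1)


if __name__ == "__main__":
    print(demo())
```

The XML library escapes characters such as `&` on the way out, so writers can type ordinary text into the database without breaking the file.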
Have you tested Roundtrip with a sample Quark template and a hand-written XML page? Before you go too far down this route, you'll need to make certain that part works.
Other random thoughts:
One difficulty I can foresee at this stage is the size of your database - 6 million pages is a lot of XML. I'd guess that would be too big for one of the common "prosumer" databases and you might need to consider looking at something like Oracle. Some of the db experts on here should be able to help with that.
Data organisation will also become a headache at this size - you'll have to think very carefully about how the pages will be filed, structured and organised.
Also, consider how you're going to get all of your existing pages into the database unless you're planning on running the two systems in parallel for a while. Is this system going to be used solely for new jobs or will you need to convert all of your current pages into the new format? If so, that's a hell of a job. Think about outsourcing it or at least hiring a small army of temps to do it.
Disclaimer: I could very well be missing the point, here, but this approach has worked well for us in the past for similar jobs. Sadly I can't make tomorrow evening but I'll drop you an email in case you want to go over anything in more detail.
posted by blag at 6:08 AM on May 27, 2005
I'd steer clear of a wiki since data structure is incredibly important to XML and wikis, by nature, don't have any fixed structure. You'll end up with users completely bollocksing up your nicely-formatted data.
posted by blag at 6:10 AM on May 27, 2005
I'd suggest that you just build html mockups for the demo. Just a couple of html page(s) with form elements that show what the "edit" page might look like if it was really interacting with the data.
First, if you go all the way and build most of the functionality, that's not much of a demo; you've pretty much got the real thing.
Second, I think you're in over your head anyway. Understanding html forms is the easy part. On top of that you'd need to learn enough of a technology like php to be able to parse and build Xml. That's not going to be trivial for someone who has never done any of it before.
Stick with mockups if time is short, and hire someone to do the coding when the time comes.
If you really want to know more, though, keep asking. I (or someone) will be happy to explain more.
posted by frufry at 6:24 AM on May 27, 2005
so your problem is how to edit the xml in a browser in a user-friendly way without the help of quarkxpress.
i think you could do it with javascript and xsl, but it's going to be a fair amount of work and require a good browser - can you specify that everyone has to use firefox, for example?
you would associate an xsl stylesheet with the xml data. the browser would transform the xml using the xsl and display it. so the xsl would need to take the xml and generate an html page with editable elements for all the xml data. then to transmit the data back you need to go the other way. that should be possible using javascript and another xsl stylesheet, i guess, but as far as i know it's not standard (in other words, the xml to html bit is what xsl "was made for", but going backwards (so you'd need to be careful it was xhtml and not just html) is not something people normally do).
you might also look at xforms - i think that may be the "right way" to do this in the future, but i don't know much about it or what support it has.
another approach would be to use xsl and xml to generate the html page as described above, but construct the html as a html form, and send the form fields back to the server. the server would then generate the xml from the form fields. that would be pretty easy to program with sax, for example. so you serve xml+xsl, the browser converts that to an html form, the user edits, then the browser posts the form fields back to the server, which re-assembles the form fields into a new xml document. the only hard part is working out a system for auto-naming the form fields so that each field has a name that clearly identifies where it came from in the xml. that shouldn't be too hard.
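one possible naming scheme, sketched in python with the standard library's ElementTree rather than sax or xsl: each leaf element gets a field name built from its position in the tree (for example `0/1` for the second child of the first child), and the server uses the same names to write the posted text back. the function names and the sample document are made up for illustration, and the sketch assumes plain text leaves with no inline markup inside them.

```python
import copy
import xml.etree.ElementTree as ET


def fields_from_xml(root):
    """Map a positional path such as '0/1' to the text of each leaf element."""
    fields = {}

    def walk(elem, path):
        children = list(elem)
        if not children:
            fields[path] = elem.text or ""
        for i, child in enumerate(children):
            walk(child, f"{path}/{i}" if path else str(i))

    walk(root, "")
    return fields


def apply_fields(root, posted):
    """Return a copy of root with leaf text replaced from posted form fields."""
    new_root = copy.deepcopy(root)
    for name, value in posted.items():
        elem = new_root
        # An empty name refers to the root itself; otherwise follow the indexes.
        for index in filter(None, name.split("/")):
            elem = elem[int(index)]
        elem.text = value
    return new_root


# Hypothetical sample document standing in for an exported section.
SOURCE = (
    "<report><section><heading>Overview</heading>"
    "<para>Old text</para></section></report>"
)

root = ET.fromstring(SOURCE)
fields = fields_from_xml(root)      # names to use for the form fields
fields["0/1"] = "New text"          # pretend the user edited one field
updated = ET.tostring(apply_fields(root, fields), encoding="unicode")
```

a real server would also need to check the posted names before trusting them, since anything arriving from a browser can be tampered with.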
one final word of warning - xsl is cool, but some (many?) find it has a very steep learning curve. it's not famous for being user-friendly.
[i skimmed other answers but maybe i'm repeating what others said. if so, sorry.]
posted by andrew cooke at 7:03 AM on May 27, 2005
If you're using XML (or something like reST, which is transformed into XML), then you might as well use XSL-FO and RenderX's XEP to create PDFs and HTML.
You may wish to abandon Quark and use Ventura, which has both a wicked database transformation tool and a decent XML transformation tool; or Framemaker, which does XML very well.
The old-style TROFF processing appears to be coming back into popularity. LaTeX is also a biggie. And, finally, it's worth noting that OpenOffice's native format is zipped XML, which offers many possibilities for transformation.
posted by five fresh fish at 8:50 AM on May 27, 2005
I used to generate XML for import into Quark using Movable Type when I worked at a newspaper. We only had a few necessary fields, so we didn't have to extend MT's data model, and then I could just make a template that output XML in exactly the format that was needed to import. Email me if that's something you'd wanna try.
(Disclaimer: I work for the company that makes MT now.)
posted by anildash at 12:16 AM on May 28, 2005
That's a good idea. It would save a lot of time and it's a very common platform so there should be loads of plugins available. Anil - how robust is MT when populated with 6 million articles?
posted by blag at 6:41 AM on May 30, 2005
This thread is closed to new comments.
Use Xml to get the data back and forth, but when some Xml comes in from the server, parse it with a server-side technology and display forms to the user that they can use to edit the text, in whatever way makes sense.
In the end, if you've got users posting data changes in a web environment, you're going to need to use forms (unless you use flash or something). Trying to get forms to work with Xsl seems like it would be quite difficult indeed, though I'm not an Xsl expert by any measure.
Also, Xsl is really just for formatting display. Even if you could successfully embed a form into Xsl, how would it actually update the Xml doc? Not to mention post it back to the server?
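To make the "parse it server-side and display forms" idea concrete, here is a minimal sketch in Python, assuming a flat XML document where each child of the root is one editable chunk of text. The `/save` address, the `field0`-style names and the function name are all hypothetical; a real version would sit behind a web server that receives the posted fields and writes them back into the XML.

```python
import html
import xml.etree.ElementTree as ET


def render_edit_form(xml_text, action="/save"):
    """Return an HTML form with one textarea per child element of the root."""
    root = ET.fromstring(xml_text)
    rows = []
    for index, elem in enumerate(root):
        # Escape both the label and the value so the text cannot break the page.
        label = html.escape(elem.tag)
        value = html.escape(elem.text or "")
        rows.append(
            f'<label>{label}<textarea name="field{index}">{value}</textarea></label>'
        )
    return (
        f'<form method="post" action="{html.escape(action)}">'
        + "".join(rows)
        + '<input type="submit" value="Save"></form>'
    )


if __name__ == "__main__":
    sample = (
        "<section><heading>Results</heading>"
        "<body>Up 5% &amp; rising</body></section>"
    )
    print(render_edit_form(sample))
```

When the user presses Save, the browser sends each `fieldN` value to the server, which is the point where the XML document actually gets updated.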
posted by frufry at 5:44 AM on May 27, 2005