xsl:saint-augustine -- help me get to grips with XSLT
February 14, 2006 3:13 PM
I've been making this joke for a couple of years now. "Oh Lord, help me understand XSLT ... but not yet!".
But the time has come.
Part One: can someone help me find the best material online or in a book that made XSLT really fall into place for you? Particularly if you're a coder who was at first baffled by its syntax and kept trying to understand an XSL Template in terms of traditional programming concepts like "while" and "foreach" and "select * from foo where bar = baz". I figure that's the thing that's been holding me back. I'm trying to wrestle XSL into the stuff I already know and am comfortable with.
As an example, it drives me crazy that the code doesn't nest the way I expect it to. I want it to be like
while outerloop
  while innerloop
    while yetanotherloop
    end
  end
end
but XSLT does that "apply-templates" thing and you're roaming around the page looking for the next bit of code.
Part Two: when you imagine XSLT transformations happening in your mind's eye, what does it look like? I'm a very visual person, and I've never really managed to get a vision which works for me of what's actually going on. I can "see" other programming languages working in my mind and it really helps. I'm hoping someone will post "XSLT is like using a ____ to do _____, you have to _____ or else you get _____" and I'll get an "Aha!" moment out of it.
The closest equivalent of while is for-each, which loops over a set of nodes. You don't even have to use templates; your entire XSLT could be filled with lines like:
<xsl:for-each select="something">
  <xsl:for-each select="something-else">
    <!-- do stuff -->
  </xsl:for-each>
</xsl:for-each>
However, if you're doing anything complicated, you're going to need some form of abstraction. That's where templates come in: they serve the same purpose as functions in other languages. In fact, if you ignore apply-templates, templates work exactly like functions: they can have a name and they can have parameters (but you have to use an annoying call-template/with-param combination to call them).
So the only new thing compared to traditional languages is that you can give templates a "match" option, and that you have an apply-templates to apply any matching template (but note that even here you have some control over which templates are evaluated).
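To make the function comparison concrete, here is a minimal sketch of both kinds of template side by side. The element names (people, person), the name attribute, and the template name greet are made up for illustration; only the xsl: instructions come from XSLT 1.0 itself.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Works like a function: it has a name and one parameter. -->
  <xsl:template name="greet">
    <xsl:param name="who"/>
    <xsl:text>Hello, </xsl:text>
    <xsl:value-of select="$who"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Chosen by pattern: runs for every person that apply-templates reaches. -->
  <xsl:template match="person">
    <xsl:call-template name="greet">
      <xsl:with-param name="who" select="@name"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Entry point: hand each person child to whichever template matches it. -->
  <xsl:template match="/people">
    <xsl:apply-templates select="person"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against an input such as [people][person name="Ada"/][person name="Bob"/][/people], this should print one "Hello, ..." line per person. The named template is called explicitly, while the match templates are picked by the processor.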
My main problem with XSLT is the incredible verbosity. A small trick I use is using namespaces like this:
<stylesheet version="1.0"
xmlns:a="http://www.w3.org/2005/Atom"
xmlns:h="http://www.w3.org/1999/xhtml"
xmlns="http://www.w3.org/1999/XSL/Transform">
That way you can drop the annoying xsl: prefix, but of course you gain an annoying prefix on your output tags.
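For what it's worth, a complete stylesheet written this way might look like the sketch below. It assumes an Atom feed as input (a:feed, a:entry, a:title) and produces an XHTML list; the choice of output is only an illustration. The key point is that the default namespace has to be the XSLT one for the unprefixed instructions to be recognized.

```xml
<?xml version="1.0"?>
<stylesheet version="1.0"
    xmlns="http://www.w3.org/1999/XSL/Transform"
    xmlns:a="http://www.w3.org/2005/Atom"
    xmlns:h="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="a">

  <!-- Unprefixed elements are XSLT instructions; h: elements are output. -->
  <template match="/a:feed">
    <h:ul>
      <for-each select="a:entry">
        <h:li><value-of select="a:title"/></h:li>
      </for-each>
    </h:ul>
  </template>
</stylesheet>
```

The output elements come out in the XHTML namespace, just with an h: prefix attached, which is the trade-off mentioned above.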
posted by reynaert at 4:06 PM on February 14, 2006
I used the word "annoying" too much. But that's what XSLT is.
posted by reynaert at 4:07 PM on February 14, 2006
Thinking of it in terms of recursion helped a bit for me, although I like the tree image too.
posted by SuperSquirrel at 4:07 PM on February 14, 2006
I had to do some XSLT for a school project last year and I was having the same issues you were. It confused the hell out of me too.
I downloaded the trial version of Stylus Studio XSLT Editor and finished the stuff I had to do in a few hours, plus it helped me understand what was going on.
I'm no expert on XSLT now, but I'm not completely lost anymore.
posted by closetgeekshow at 6:52 PM on February 14, 2006
XSL feels really backwards at first, but it's worth the effort.
There are two ways to write XSL templates:
Top-down: you write one massive template which matches the root node and contains lots of logic (using for-each, choose, etc.) to handle any sub-nodes. You want to use this approach only when the input structure is very predictable (i.e. you know every [foo] contains only [bar]s), and when you need to completely restructure it for output.
Bottom-up: you write lots of separate templates to transform individual nodes, and use a generic pass-through template along the lines of
[xsl:template match="*"]
[xsl:copy]
[xsl:copy-of select="@*"/]
[xsl:apply-templates/]
[/xsl:copy]
[/xsl:template]
to deal with everything else (and generally include a [xsl:apply-templates/] somewhere in each of your other templates to make sure the parser gets hold of any sub-nodes.) This is most useful when you only need to tweak some of the input and leave the rest as is, or when the structure of the input is unpredictable and you need to allow for [foo] to contain [bar], [baz], another [foo], arbitrary xhtml, or some new tag you don't know about yet.
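As a rough illustration of the bottom-up style in real angle-bracket syntax, the sketch below pairs the pass-through template with a single specific one. The rule "turn every foo into a div" is an assumption picked for the example, not something from a real project.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pass-through: copy any element and its attributes, then
       hand its children to whichever templates match them. -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- The one tweak: every foo becomes a div, at any depth. -->
  <xsl:template match="foo">
    <div class="foo">
      <xsl:apply-templates/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

The match="foo" template wins over match="*" because a name pattern has a higher default priority than the wildcard, so no explicit ordering is needed. Everything that is not a foo, including tags you have never seen, is copied through untouched.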
XSLT novices tend to default to the top-down approach all the time, and try to treat templates as though they're function calls because that's how other languages work and they don't really "get" xsl yet. This is where you (and obviously reynaert) are getting stuck. Sometimes the top-down approach is appropriate, but more often it's the long way around; you wind up doing a lot of work that the parser will do for you automatically if you let it.
Often you'll wind up using a combination of both approaches, but even in those cases I find that separating out my code into "this is top-down stuff, which starts at node X and does a bunch of stuff to its contents" and "this is bottom-up stuff, which just works on a single node at a time and passes off anything it doesn't care about to another template" is a really useful way to keep things organized.
When I'm starting a new XSL task, I almost always begin with the pass-through template above. (If that's the only template you use, your output will be essentially identical to your input, minus any comments and processing instructions.) Then I generally write the simplest templates first, for nodes at or near the bottom of the input xml tree. Then I work my way up from there. Only when I know the input is absolutely regular in structure do I bite off a big chunk to do as a top-down template; otherwise that chunk will have to contain a ton of logic to deal with all the possible structures that could be contained in it.
One hint: if your templates are getting really convoluted, or if you're writing lots of templates to handle the same type of node, you're probably not using XPath effectively. A well-chosen square-bracket clause (an XPath predicate) in a "match" attribute can save you a ton of if/then/for-each/etc work.
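Here is a small sketch of what that looks like in practice. It assumes a hypothetical input of item elements, some of which carry status="done"; instead of one template with an xsl:choose inside, there are two templates and the predicate decides between them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Ignore whitespace-only text between input elements. -->
  <xsl:strip-space elements="*"/>

  <!-- More specific pattern: only items whose status attribute is 'done'. -->
  <xsl:template match="item[@status='done']">
    <xsl:text>[x] </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Fallback for every other item. -->
  <xsl:template match="item">
    <xsl:text>[ ] </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

A pattern with a predicate gets a higher default priority than a bare element name, so the processor picks the "done" template whenever both match. Adding a third case later means adding a third template rather than editing a conditional.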
Finally, you must, must, must be absolutely 100% comfortable with recursion to do anything nontrivial with XSLT. I don't know of any good way to learn recursion other than to pound your head against the problem until it sinks in. XSLT is an excellent environment in which to do that pounding :)
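For anyone who wants a first thing to pound on, the sketch below is about the smallest recursive template I can think of: it repeats a string a given number of times. The template name repeat and its two parameters are invented for the example. Recursion is needed here because XSLT 1.0 variables cannot be reassigned, so there is no loop counter to increment.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Emit $text, then call ourselves with $count reduced by one. -->
  <xsl:template name="repeat">
    <xsl:param name="text"/>
    <xsl:param name="count"/>
    <!-- Base case: when $count reaches 0 the template emits nothing. -->
    <xsl:if test="$count &gt; 0">
      <xsl:value-of select="$text"/>
      <xsl:call-template name="repeat">
        <xsl:with-param name="text" select="$text"/>
        <xsl:with-param name="count" select="$count - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Entry point: print five asterisks for any input document. -->
  <xsl:template match="/">
    <xsl:call-template name="repeat">
      <xsl:with-param name="text" select="'*'"/>
      <xsl:with-param name="count" select="5"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Applied to any well-formed input, this outputs *****. Note the spaces around the minus sign in $count - 1; without them the hyphen would be read as part of the variable name.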
Good luck, and don't get discouraged: it's seriously worth the effort; XSL is one of those tools that once you finally learn how to use it, you'll want to use it for everything.
posted by ook at 7:38 PM on February 14, 2006
What does it look like? Well, I think of it like Atari Missile Command.
Imagine a swarm of data in the sky bearing down on those Missile Command cannons. The pattern matches gunning up and attacking every pattern they identify as branches explode and disappear in a perfect white circle. Sometimes they'll convert child branches only to have the parents gunned down and the children are left to fall. It's hell. XSLT is hell.
Other parts of XSLT remind me of this. I don't just mean writing recursive XSLT, but writing XSLT that generates XSLT that generates XSLT unto forever.
Keep with it. XSLT is fantastic once you get it. You've read all of the Dave Pawson FAQ?
posted by holloway at 7:44 PM on February 14, 2006
I forgot to include the visual analogy. Dunno if this'll help, but it's how I visualize it:
Think of your input XML as a big protein molecule, and all your input templates as little enzymes that are floating around and want to latch onto bits of that protein and do something to it. The "match" attribute on the template controls what parts of the protein the enzymes can latch onto.
Some of those enzymes will prevent any others from touching any part of what they've grabbed -- that'd be a top-down template -- while others will expose branches of what they're working on (using [xsl:apply-templates /]) for other enzymes to deal with.
I know, that's probably way too science-geeky to be helpful, and not nearly science-geeky enough to satisfy someone (unlike me) who actually understands enzymes and proteins. Shrug.
posted by ook at 7:49 PM on February 14, 2006
Here are some things I learned in a year of doing intense XSLT work:
1. You almost never want 'for-each'.
2. Each time you edit your stylesheet it should get shorter, not longer.
3. You probably don't want 'choose' or 'if' either.
4. Understand when to use 'mode'.
ook is totally right about the importance of using XPath effectively. You want lots of match templates, with very specific XPath statements. I think applying templates with modes is better than conditional branching, if it comes to that. Try to make each template as atomic as possible, and let as much functionality pass through to lower templates passively.
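A minimal sketch of the mode idea, assuming a hypothetical input of the form doc / section / title + para: the same section elements are processed twice, once in a "toc" mode to build a list of titles and once in the default mode to render the body, with no conditional anywhere.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/doc">
    <html>
      <body>
        <!-- First pass over the sections: table of contents. -->
        <ul>
          <xsl:apply-templates select="section" mode="toc"/>
        </ul>
        <!-- Second pass over the same nodes: full content. -->
        <xsl:apply-templates select="section"/>
      </body>
    </html>
  </xsl:template>

  <!-- Only used when templates are applied with mode="toc". -->
  <xsl:template match="section" mode="toc">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <!-- Default mode: heading plus paragraphs. -->
  <xsl:template match="section">
    <h2><xsl:value-of select="title"/></h2>
    <xsl:apply-templates select="para"/>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Each template stays small and does one thing, which is roughly what "as atomic as possible" means here.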
posted by nev at 8:14 PM on February 14, 2006
Zvon.org and D. Pawson's FAQ did it for me. Things really clicked once I understood it's effectively a Functional Language, and did some reading on what FLs are all about.
Basically, throw out all you know from previous programming experience: XSL uses a different paradigm.
posted by five fresh fish at 9:21 PM on February 14, 2006
Also, I found it very useful to use LEO, the Literate Editor with Outlines, to write XSL.
posted by five fresh fish at 9:28 PM on February 14, 2006
Thank you all for your help. I marked those two Best Answers because the top-down vs bottom-up approach was very helpful and did indeed give me something of an "Aha!" moment, and because the Missile Command image was both useful and amusing.
I'm on Mac OS X by the way, and finding TestXSL, a straightforward application by Marc Liyanage, very useful.
posted by AmbroseChapel at 1:40 PM on February 15, 2006
I'm late, but one of the more valuable "aha!" moments I had with XSLT was the realization of how to do inheritance. I once developed a site for which every single page was rendered using the same XSLT stylesheet (it seemed like a good idea at the time). The differences between pages were generated by template matching against variations in the input XML. I later realized that it's much better to have many different XSLT stylesheets that import (not include!) a common base one, and simply override the relevant templates (thanks to import precedence), such as the template for the main content area. If you're able to choose the applicable stylesheet dynamically, this is much more likely to be the correct way to go.
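A stripped-down sketch of that arrangement follows. The file names base.xsl and article.xsl and the input shape (page / title + content) are assumptions made for the example. First the shared base stylesheet:

```xml
<?xml version="1.0"?>
<!-- base.xsl (hypothetical file name): the frame every page shares. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/page">
    <html>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="content"/>
      </body>
    </html>
  </xsl:template>

  <!-- Default rendering of the main content area. -->
  <xsl:template match="content">
    <div><xsl:apply-templates/></div>
  </xsl:template>
</xsl:stylesheet>
```

Then a page-specific stylesheet that imports it and overrides one template:

```xml
<?xml version="1.0"?>
<!-- article.xsl (hypothetical file name): overrides only the content area. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:import must come before any other top-level element. -->
  <xsl:import href="base.xsl"/>

  <!-- Takes precedence over the imported match="content" template. -->
  <xsl:template match="content">
    <div class="article">
      <!-- Optionally reuse the base behaviour, like calling "super". -->
      <xsl:apply-imports/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

With xsl:include the two match="content" templates would simply conflict, since included templates get the same precedence as local ones. With xsl:import the importing stylesheet's template wins, which is what makes the override pattern work.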
posted by gsteff at 10:00 PM on February 15, 2006
posted by jeb at 3:35 PM on February 14, 2006