s/\s+/ /sg collapses every run of whitespace to a single space, but if you want to treat attribute text differently (is that what you mean?), then yes, you need a parser.
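// Needs java.io.*, javax.xml.parsers.*, javax.xml.transform.* (plus .dom and .stream), org.w3c.dom.Document, org.xml.sax.InputSource
String normalize(String html) throws Exception {
    html = html.replace('\n', ' ').replace('\t', ' '); // newlines and tabs become spaces, so word breaks survive
    // Document is an interface, so it has to come from a parser; this assumes well-formed XHTML
    Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(html)));
    dom.normalize(); // merges adjacent text nodes; it does not collapse runs of spaces
    StringWriter out = new StringWriter();
    // dom.toString() is not a reliable serializer, so use an identity transform
    TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(dom), new StreamResult(out));
    return out.toString();
}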
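If the point is to leave attribute text alone, something like the following might do it: parse, walk the tree, and apply the s/\s+/ / idea to text nodes only. This is a rough sketch, not a tested recipe. The class and method names (WhitespaceCollapser, collapseTextNodes, collapse) are made up for illustration, it assumes the input is well-formed XHTML that the stock JAXP parser accepts, and it will also squash whitespace inside elements like pre unless you skip them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: names are illustrative, not from any library.
class WhitespaceCollapser {

    // Collapse whitespace runs in text nodes only.
    // Attributes are not children in the DOM, so this walk never touches them.
    static void collapseTextNodes(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(node.getNodeValue().replaceAll("\\s+", " "));
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collapseTextNodes(child);
        }
    }

    // Assumes well-formed XHTML; real-world tag soup needs a lenient HTML parser instead.
    static String collapse(String xhtml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        dom.normalize();          // merge adjacent text nodes first
        collapseTextNodes(dom);   // then squeeze whitespace inside each one

        // Serialize with an identity transform, without the XML declaration.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The double space inside title="..." should survive; the text run should not.
        System.out.println(collapse("<p title=\"a  b\">one \n\t two</p>"));
    }
}
```

The regex alone can't tell the two cases apart, which is why the parser earns its keep here: once the document is a tree, "text" and "attribute value" are different kinds of node and you only rewrite the kind you care about.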
posted by majick at 12:03 PM on July 19