Ensuring Well-Formed Markup
We’ve all seen bad markup in our day. I’m talking about markup that
doesn’t bother to close tags. I’m talking about misnested
<p> tags and content put into elements where it doesn’t belong.
Nokogiri corrects bad markup like a boss, similarly to how a browser would before rendering.
And Nokogiri will even keep track of what the errors were, as long as the parse options NOERRORS and NOWARNINGS are turned off (the default for XML documents). The collected errors are available from the document’s errors method.
Thus, you could use errors.empty? to determine whether the document was well-formed.
Being friendly and fixing markup is all well and good, but sometimes you need to be a Markup Nazi.
If you demand compliance from your XML, then you can configure Nokogiri into “strict” parsing mode, in which it will raise a Nokogiri::XML::SyntaxError at the first sign of malformedness.