I struggled a bit when trying to parse some XHTML with Groovy's XmlSlurper (and XmlParser). I was receiving the following:
Caught: java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
It turns out that the guys from W3C got sick of dealing with the excessive traffic for their DTDs, so now they return a Service Unavailable (HTTP 503) when they detect requests coming from parsers.
To solve the problem I had to turn off the loading of external DTDs by setting the parser feature below to false. Here's the code.
def slurper = new XmlSlurper()
slurper.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
def results = slurper.parseText(htmlResponse)
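For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same fix. The sample document is made up and simply stands in for htmlResponse, and the import assumes Groovy 3 or later (on older versions, drop the import, since groovy.util.XmlSlurper is available by default). The feature URI is specific to Xerces-style parsers, which as far as I know includes the one bundled with the JDK.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; remove this line on older versions

// Hypothetical sample input standing in for htmlResponse.
// The DOCTYPE points at the W3C DTD that triggered the 503.
def htmlResponse = '''<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>'''

def slurper = new XmlSlurper()
// Tell the underlying parser not to fetch the external DTD at all
slurper.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)
def results = slurper.parseText(htmlResponse)

// Navigate the parsed document as usual
println results.head.title.text()
println results.body.p.text()
```

XmlParser exposes the same setFeature method, so the same line should work there too. One thing to watch for: with the DTD skipped, named entities that the DTD would have defined are no longer declared, so documents relying on them may need extra handling.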
Googling for the answer wasn't extremely helpful. This blog post helped (I think it's in Japanese). This post also helped. Thanks guys!
I decided to re-post the solution since it took me a while of googling to find the answer.