Groovy web page scraping the easy way. I found this approach in an example at http://www.maclovin.de/2010/02/robust-html-parsing-the-groovy-way/ and it works quite well even today on Grails 2. It uses TagSoup 1.2.1 and Groovy's XmlSlurper.
In about 10 lines of code I can scrape the form fields off a web page (this version only collects input and select elements that have an id attribute):
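TagSoup has to be on the classpath before any of this works. A minimal sketch of how that might look in a Grails 2 app, assuming the dependency is resolved from Maven Central through grails-app/conf/BuildConfig.groovy (your existing file will have more in it than this fragment):

```groovy
// Fragment of grails-app/conf/BuildConfig.groovy (assumed setup, adjust to your project)
grails.project.dependency.resolution = {
    repositories {
        mavenCentral()
    }
    dependencies {
        // TagSoup 1.2.1, the version mentioned in this post
        compile 'org.ccil.cowan.tagsoup:tagsoup:1.2.1'
    }
}
```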
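// TagSoup is a SAX parser that tolerates malformed, real-world HTML
def tagsoupParser = new org.ccil.cowan.tagsoup.Parser()
def slurper = new XmlSlurper(tagsoupParser)
// config.clientUrl holds the address of the page to scrape
def htmlParser = slurper.parse(config.clientUrl)
def inputs = []
// '**' walks every node in the document depth-first
htmlParser.'**'.findAll {
    it.name() == 'input' || it.name() == 'select'
}.each {
    // keep only the fields that have an id attribute ('id' must be quoted)
    if (it.attributes().get('id')) {
        inputs.add(it)
    }
}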
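To try the same idea without a Grails app or a live URL, here is a small standalone sketch. It is not from the original example: the sample markup, the variable names, and the use of @Grab and parseText are my own assumptions for illustration. It parses a deliberately sloppy HTML string and pulls the tag name, id and name out of each matching field.

```groovy
@Grab('org.ccil.cowan.tagsoup:tagsoup:1.2.1')
import org.ccil.cowan.tagsoup.Parser
// On Groovy 4 and later XmlSlurper is no longer imported by default,
// so you would also need: import groovy.xml.XmlSlurper

// Hypothetical sample page: unclosed tags and an unquoted attribute on purpose
def html = '''
<html><body>
<form action="/save">
  <input id="firstName" name="first" type="text" value="Ada">
  <input name="noId" type=hidden value="1">
  <select id="country" name="country">
    <option value="de">Germany
    <option value="us" selected>USA
  </select>
</form>
</body></html>
'''

// Same setup as above, but reading from a string instead of a URL
def slurper = new XmlSlurper(new Parser())
def page = slurper.parseText(html)

// Keep input/select elements that carry an id, then reduce each to a plain map
def fields = page.'**'.findAll {
    it.name() in ['input', 'select'] && it.attributes().get('id')
}.collect {
    [tag: it.name(), id: it.@id.text(), name: it.@name.text()]
}

fields.each { println it }
```

Turning each node into a plain map right away keeps the rest of the code independent of the slurper's node types, which is handy if the results are going to be stored or rendered later.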