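ASP.NET's __doPostBack is not as scary as people think. It takes two arguments, an event target and an event argument, copies them into the hidden __EVENTTARGET and __EVENTARGUMENT form fields, and submits the form. I'll make it simple by monkey patching a postback method into Mechanize::Form: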
    require 'mechanize'

    class Mechanize::Form
      def postback target, argument
        self['__EVENTTARGET'], self['__EVENTARGUMENT'] = target, argument
        submit
      end
    end
The rest is simple. Find the 'Next' link, parse the two quoted values out of its href and send them to Form#postback. Put it in a while loop and you've got paging:
    agent = Mechanize.new
    page = agent.get 'http://data.fingal.ie/ViewDataSets/'

    while next_link = page.at('a#lnkNext[href]')
      puts 'I found another page!'
      target, argument = next_link[:href].scan(/'([^']*)'/).flatten
      page = page.form.postback target, argument
    end
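If the scan line looks like magic, here is a small sketch of what it does on its own, with no network or Mechanize involved. The href below is made up (I'm assuming the usual shape ASP.NET generates for link buttons), and postback_args is just a hypothetical helper name for illustration, not part of Mechanize.

```ruby
# Hypothetical helper: pull the two quoted arguments out of a
# javascript:__doPostBack('target','argument') href.
def postback_args(href)
  # scan returns one single-element array per pair of quotes;
  # flatten turns that into [target, argument].
  href.scan(/'([^']*)'/).flatten
end

# Made-up href in the shape ASP.NET WebForms typically emits.
sample_href = "javascript:__doPostBack('ctl00$lnkNext','')"

target, argument = postback_args(sample_href)
puts "target:   #{target.inspect}"
puts "argument: #{argument.inspect}"
```

The event argument is often an empty string, which is fine: the regex allows zero characters between the quotes, so you still get two values back.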
The result is much cleaner than what I've seen from the Python side. Ruby's Mechanize is sophisticated enough to avoid many of the pitfalls of its Python counterpart. No wonder I like it so much!