
Phantomjs Download Using A Javascript Link

I am attempting to scrape the below website: http://www.fangraphs.com/leaders.aspx?pos=all&stats=bat&lg=all&qual=0&type=8&season=2011&month=0&season1=20

Solution 1:

What has worked very well for me is simulating a mouse click on the desired element.

page.evaluate(function () {
  // Find the CSV export link and dispatch a synthetic click on it
  var btn = document.getElementById('LB_cmdCSV');
  var ev = document.createEvent('MouseEvent');
  ev.initEvent('click', true, true); // type, bubbles, cancelable
  btn.dispatchEvent(ev);
});

Solution 2:

Couldn't you just run the code, __doPostBack('LeaderBoard1$cmdCSV','');, within the context of the webpage?

Something like this:

page.evaluate(function() {
  __doPostBack('LeaderBoard1$cmdCSV','');
});

I haven't tested this code within PhantomJS, but in theory it should work, since running the __doPostBack method from Google Chrome's developer console worked. If in doubt about running JavaScript code in PhantomJS, Chrome's developer console is a good place to try the code out first, as it runs on WebKit like PhantomJS. I hope this helps.
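For context, here is a rough sketch of what the __doPostBack helper on an ASP.NET WebForms page typically does: it writes the event target and argument into two hidden fields and submits the form. The makeDoPostBack wrapper and the fakeForm object are hypothetical stand-ins so the snippet runs outside a browser; the real page defines __doPostBack itself against its own form, and its exact code may differ.

```javascript
// Simplified sketch of the helper that ASP.NET WebForms pages usually emit.
// makeDoPostBack is a hypothetical wrapper so the form can be injected.
function makeDoPostBack(theForm) {
  return function (eventTarget, eventArgument) {
    // Respect an onsubmit handler that cancels the submission
    if (!theForm.onsubmit || theForm.onsubmit() !== false) {
      theForm.__EVENTTARGET.value = eventTarget;
      theForm.__EVENTARGUMENT.value = eventArgument;
      theForm.submit();
    }
  };
}

// Hypothetical stand-in for the page's <form> element
var fakeForm = {
  __EVENTTARGET: { value: '' },
  __EVENTARGUMENT: { value: '' },
  submitted: false,
  submit: function () { this.submitted = true; }
};

var doPostBack = makeDoPostBack(fakeForm);
doPostBack('LeaderBoard1$cmdCSV', '');
```

This is why calling __doPostBack from page.evaluate is equivalent to clicking the link: both end up submitting the same form with the same event target.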

Solution 3:

It's an ASP.NET-powered website, so this is going to be a tad trickier than most: you would have to use cURL commands to mimic POSTing the entire form, including the viewstate and eventvalidation strings, back to the server. It would probably be easier to lift the data straight out of the page you already have.
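To give an idea of what mimicking that POST involves, here is a minimal sketch that pulls the hidden __VIEWSTATE and __EVENTVALIDATION values out of the page HTML and builds a URL-encoded request body. The helper names (hiddenFieldValue, buildPostBackBody) and the sample HTML are made up for illustration, the regex assumes the name attribute comes before value, and the real page may require additional form fields.

```javascript
// Extract the value of a hidden <input> by name.
// Assumes name="..." appears before value="..." inside the tag.
function hiddenFieldValue(html, name) {
  var re = new RegExp('<input[^>]*name="' + name + '"[^>]*value="([^"]*)"', 'i');
  var m = re.exec(html);
  return m ? m[1] : '';
}

// Build the application/x-www-form-urlencoded body for a postback.
function buildPostBackBody(html, eventTarget) {
  var fields = {
    __EVENTTARGET: eventTarget,
    __EVENTARGUMENT: '',
    __VIEWSTATE: hiddenFieldValue(html, '__VIEWSTATE'),
    __EVENTVALIDATION: hiddenFieldValue(html, '__EVENTVALIDATION')
  };
  return Object.keys(fields).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(fields[key]);
  }).join('&');
}

// Made-up sample markup; real viewstate strings are much longer
var sampleHtml =
  '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc/123=" />' +
  '<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz+9" />';

var body = buildPostBackBody(sampleHtml, 'LeaderBoard1$cmdCSV');
```

The resulting string could then be sent as the POST data, for example with cURL's --data option, to the same URL the page was loaded from.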

Solution 4:

I'm using Ruby on Rails and Watir Webdriver (https://github.com/watir/watir-webdriver).

I found that ASP.NET, when it handles "doPostBack", identifies the browser from the User-Agent sent by the client. When using PhantomJS, the user agent is reported as something like "Mozilla/5.0 (Unknown; Linux i686) AppleWebKit/534.34 (KHTML, like Gecko) Safari/534.34 PhantomJS/1.9.1".

Therefore it is necessary to change the client's user agent before accessing the page. In Rails I did something like:

HTTP_USER_AGENT = "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:22.0) Gecko/20100101 Firefox/22.0"
HTTP_DRIVER     = Selenium::WebDriver.for :phantomjs, :desired_capabilities => Selenium::WebDriver::Remote::Capabilities.phantomjs(
  "phantomjs.page.settings.userAgent" => HTTP_USER_AGENT
)
...
browser = Watir::Browser.new HTTP_DRIVER, :http_client => client
