Basic scraping (with Python): the source of the page doesn't have the info
I see in the browser
This is my first web scraping attempt, so please bear with me. (I put
Python in the title because that's the language I'm familiar with, but my
question is more general than that, so please feel free to answer in
whatever language you'd like.)
I'm trying to collect a few flight prices and travel times from Kayak
(http://www.kayak.com/flights). Say I'd like to collect prices for flights
from LA's LAX airport to New York's LGA airport.
The Kayak website seems very convenient in that the search itself is
written into the address. For example:
http://www.kayak.com/flights#/LAX-LGA/2013-12-13/2014-01-05
gives prices for trips departing on 2013-12-13 and returning on
2014-01-05.
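A minimal sketch of how such an address could be assembled for other routes and dates. The pattern (ORIGIN-DESTINATION/DEPART/RETURN after the `#/`) is inferred only from the single example above, and `build_search_url` is a hypothetical helper name, not anything Kayak provides. Note that it is written for Python 3, unlike the Python 2 snippet in this question.

```python
from datetime import date

# Base address taken from the example URL in the question.
BASE_URL = "http://www.kayak.com/flights"


def build_search_url(origin: str, destination: str, depart: date, ret: date) -> str:
    """Build a results address following the pattern seen in the example.

    Hypothetical helper: the pattern is an assumption based on one URL.
    """
    route = f"{origin.upper()}-{destination.upper()}"
    # date.isoformat() yields YYYY-MM-DD, matching the dates in the example.
    return f"{BASE_URL}#/{route}/{depart.isoformat()}/{ret.isoformat()}"


url = build_search_url("LAX", "LGA", date(2013, 12, 13), date(2014, 1, 5))
print(url)
```

Using `date` objects rather than raw strings keeps the zero-padded YYYY-MM-DD format consistent without manual formatting.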
I load it like this:
# Python 2, with BeautifulSoup 3
from urllib2 import urlopen
import BeautifulSoup as bs3
# Fetch the results page and parse the HTML that comes back
soup = bs3.BeautifulSoup(urlopen(
    "http://www.kayak.com/flights#/LAX-LGA/2013-12-13/2014-01-05").read())
But when I do that, I don't see any price information! I also checked the
page source in Google Chrome and couldn't figure out where the prices and
travel times are.
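A minimal sketch of one way to check whether a chunk of HTML contains any prices at all, assuming prices show up as dollar amounts such as `$312`. The helper name `find_prices` and both HTML snippets are made up for illustration and are not real Kayak markup. It is written for Python 3.

```python
import re
from typing import List

# Assumption: prices are rendered as dollar amounts such as "$312" or "$1,024".
PRICE_PATTERN = re.compile(r"\$\d[\d,]*")


def find_prices(html: str) -> List[str]:
    """Return every dollar amount found in the given HTML text (hypothetical helper)."""
    return PRICE_PATTERN.findall(html)


# Made-up snippets for illustration only.
rendered = '<div class="price">$312</div><div class="price">$1,024</div>'
fetched = '<div id="results"></div><script src="app.js"></script>'

print(find_prices(rendered))  # what the browser view might contain
print(find_prices(fetched))   # what a bare download might contain
```

Running the same check on the downloaded HTML and on the text copied from the browser makes the difference between the two easy to see.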