http://environment.yale.edu/envirocenter/
This blog on the Yale Center for Environmental Law & Policy site discusses issues arising from our recent study of linkages between trade and the environment.
I will try to blog about things that interest me, including data sources and data analysis, travel tips (hot restaurants are good data, right?), and more.
Wednesday, August 17, 2011
Tuesday, August 16, 2011
Fantasy Football 2011
It's that time of year again! Yesterday I scraped some ranking and points projection data from http://fftoolbox.com.
I was interested in how the projected points declined with rank across the player positions. The plot below helps explain why running backs are selected ahead of wide receivers, for example: projected production declines much more slowly with rank for wide receivers than for running backs. You get hurt less (in expectation) by taking lower-ranked wide receivers than you do by taking lower-ranked running backs. What I'd really like to do is integrate weekly variation into the analysis... but this requires a more substantial data scrape than I had time for.
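To make the "decline with rank" comparison concrete, here is a minimal Python sketch that fits a least-squares slope of projected points on rank for each position. The numbers are made up for illustration and are not the scraped fftoolbox.com projections; the names `projections`, `slope`, and `decline_per_rank` are hypothetical as well.

```python
from statistics import mean

# Hypothetical (rank, projected season points) pairs: invented numbers
# for illustration only, NOT the data scraped from fftoolbox.com.
projections = {
    "RB": [(1, 300.0), (2, 280.0), (3, 262.0), (4, 245.0), (5, 230.0)],
    "WR": [(1, 250.0), (2, 243.0), (3, 237.0), (4, 232.0), (5, 228.0)],
}


def slope(points):
    """Ordinary least-squares slope of projected points on rank."""
    ranks = [r for r, _ in points]
    values = [v for _, v in points]
    r_bar, v_bar = mean(ranks), mean(values)
    numerator = sum((r - r_bar) * (v - v_bar) for r, v in points)
    denominator = sum((r - r_bar) ** 2 for r in ranks)
    return numerator / denominator


# A more negative slope means a steeper drop in projected points per rank.
decline_per_rank = {pos: slope(pts) for pos, pts in projections.items()}
for pos, s in decline_per_rank.items():
    print(f"{pos}: {s:.1f} projected points per rank")
```

With these invented numbers the running backs lose about 17.5 projected points per rank and the wide receivers about 5.5, which is the shape of the argument above. A straight line is only a rough summary; the real curves may well flatten out at lower ranks.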
Monday, August 15, 2011
Using "Google Docs" to scrape HTML tables from web pages
One of my students suggested I try this... so I did. In Google Docs, create a new spreadsheet. In the first cell, type something of the form:
=ImportHtml("http://the-url-goes-here", "table", 0)
My first attempt was scraping some fantasy football points projections:
=ImportHtml("http://www.fftoolbox.com/football/2011/cheatsheets.cfm?player_pos=QB", "table", 0)
Bingo. At least, it worked for me on the 8 pages I tried. The third argument picks which table on the page to import; I used 0 because some web page recommended it.
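Typing the formula once per page gets tedious, so here is a small Python sketch that prints one formula per position, ready to paste into a spreadsheet. Only the QB page appears above; the other position codes ("RB", "WR", "TE") are my guesses at what the site uses, and `import_html_formula` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.fftoolbox.com/football/2011/cheatsheets.cfm"


def import_html_formula(position, table_index=0):
    """Build an ImportHtml formula string for one position page."""
    url = f"{BASE_URL}?{urlencode({'player_pos': position})}"
    return f'=ImportHtml("{url}", "table", {table_index})'


# Only "QB" is confirmed by the example above; the rest are assumptions.
for position in ["QB", "RB", "WR", "TE"]:
    print(import_html_formula(position))
```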
I could see using this for data scrapes when a small number of pages are involved, but for more advanced scrapes that require automation I'll continue to use R.
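For a sense of what the automated route looks like, here is a rough sketch. It is in Python rather than R, uses only the standard library, and is not the code I actually run. It pulls the cell text out of every table in an HTML string; the sample page is made up, and `TableParser` and `extract_tables` are hypothetical names.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <table> in a page, row by row."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows
        self._row = None   # row being built, or None outside a <tr>
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables found in an HTML string as lists of rows."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up sample page. In a real scrape the HTML would come from
# urllib.request.urlopen(url).read().decode(), looped over many URLs.
sample_html = """
<table>
  <tr><th>Rank</th><th>Player</th><th>Points</th></tr>
  <tr><td>1</td><td>Player A</td><td>300</td></tr>
  <tr><td>2</td><td>Player B</td><td>280</td></tr>
</table>
"""

tables = extract_tables(sample_html)
for row in tables[0]:
    print(row)
```

This sketch does not handle nested tables or messy real-world markup, so treat it as a starting point only.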