Sarah Booker

29/11/2010

Creating something visually stimulating from data #hhhrbi

Filed under: journalism,technical — Sarah Booker Lewis @ 12:56 pm

We were quite a large group to start with, so we’ve ended up splitting in two. One group is working on scraping details of registered care homes, and I’m in a group working with information that has already been gathered, trying to create an interesting and informative visual from it.

Our first battle was making sure Scraperwiki could read our data so we could work with it.

First of all I uploaded it to Google Docs, but the comma separated values (CSV) scraper didn’t like it. Then, when the spreadsheet was published as a web page, as had been suggested, it still wasn’t happy because it wanted to be signed into Google.

Matt suggested putting the CSV onto his server, so I exported it and sent it over to him.

Francis Irving also suggested scraping What Do They Know, because it was Freedom of Information data.

After much fiddling, Matt managed to pull out the raw data with a Python scraper, popping rows (pulling them from the top of the list).
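For anyone curious what popping looks like, here is a minimal sketch in Python. It is not Matt’s actual scraper: the sample text, the column names and the `rows_from_csv` helper are all made up for illustration, and the real CSV was fetched from his server rather than written into the code.

```python
import csv
import io

# Hypothetical sample standing in for the exported CSV.
# The column names and values are invented for illustration.
SAMPLE_CSV = """authority,request,status
Council A,Request one,answered
Council B,Request two,refused
"""


def rows_from_csv(text):
    """Parse CSV text and pop the header row off the top of the list."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows.pop(0)  # pop(0) removes and returns the first row
    # Pair each remaining row with the header to get one dict per record.
    records = [dict(zip(header, row)) for row in rows]
    return header, records


header, records = rows_from_csv(SAMPLE_CSV)
print(header)
print(records[0])
```

Popping the header first means every remaining row can be treated as data, which only helps if the rows share the same columns.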

It turned out the data we had was so unstructured it wasn’t possible to work with it.

After lunch we’re working on a different project.

24/11/2010

Programming for the public (@frabcus) #hhldn

Francis is talking about two different stories on the internet.

It used to be the case you had to check the division list to find out how MPs voted.

Created a web scraper pulling out the information and created The Public Whip, showing how MPs voted.

Have to be a parliament nerd to understand, even when it’s broken down.

They Work for You simplifies the information even more. It tells you something about your MP.

It brings the division information together: take a list from The Public Whip and create a summary of how they voted.
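As a rough sketch of that idea, here is how a list of divisions could be boiled down to a per-MP summary in Python. This is not how The Public Whip or They Work for You actually do it; the records, the MP names and the `summarise_votes` helper are invented for illustration.

```python
from collections import Counter

# Hypothetical division records: (division, MP, vote).
# Real data would come from a scraped division list.
DIVISIONS = [
    ("Division 1", "MP A", "aye"),
    ("Division 1", "MP B", "no"),
    ("Division 2", "MP A", "aye"),
    ("Division 2", "MP B", "absent"),
    ("Division 3", "MP A", "no"),
]


def summarise_votes(divisions, mp):
    """Count how often one MP voted each way across the divisions."""
    return Counter(vote for _, name, vote in divisions if name == mp)


summary = summarise_votes(DIVISIONS, "MP A")
print(dict(summary))
```

A bare count like this throws away which way the majority went, so a label such as “voted moderately” needs more context than the tally alone.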

Checking how one MP voted on the Iraq War. Voted with the majority in favour of the war on three votes and abstained from the first and then the final three. It’s almost a deal with the electorate.

One MP asked to have “voted moderately” removed because they found it misleading. A number of MPs have complained, but the votes were checked.

 

Richard Pope, founder of Scraperwiki, created the Planning Alerts.com website after the demolition of his local pub (a fine-looking establishment called The Queen).

It helps people access information from outside the immediate catchment area. He wrote lots of web scrapers. Example of different councils’ planning application systems.

Scraperwiki is like Wikipedia but for data. It’s a technical product for use when you’re not technical. Can look at different data scrapers and copy what others are doing without learning Perl or Python.

Planning Alerts is being moved over to Scraperwiki. Can tag it on Scraperwiki and find information. Can find stories and in-depth information.

Can request a dataset and have something built for you.

Francis was asked: is it legal? In the UK, if it’s public data and not for sale, you can reuse it. Would take things down if asked, but it’s open stuff.

Could it be stopped? Would be ill-advised to stop people, and journalists, reading public information.

The Public Whip and They Work for You look at numerous votes.

Looking at ways to fund it such as private scrapers, or scrapers in a cocoon. Looking at white label for intranet use. There’s a market for data and developers who want to give data. Want to match developers with data. Currently funded by Channel 4. Want to remain free for the public.

Does it make people lazy? No, it’s already published but it makes it easier. Movement of people trying to get publishers of data to change. Always a need to pull out in a variety of formats.

Running Hacks and Hackers days working together finding stories and hunting around.

Have had data scraped from the What Do They Know site.

 

 
