Sarah Booker

29/11/2010

Data projects presented at #hhhrbi

Filed under: journalism,technical — Sarah Booker Lewis @ 6:45 pm

Transparency

How your money is really being spent.

Wanted to look at local government spending in various areas. Looked at government account figures published on the Number 10 website.

Government temp spending is triple its own staff budget.

Found page with 190,000 individual data entries.

Had someone writing in Java, one in Ruby, and found the data was a bit rubbish.

Date columns were not filled, or had the number of days since 1900.
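A minimal sketch of how those date cells could be cleaned up, assuming the "days since 1900" values are spreadsheet-style serial numbers (the notes do not say which tool produced them). The function name `parse_date_cell` is made up for illustration.

```python
from datetime import date, timedelta

# Assumption: the numbers follow the spreadsheet "1900 date system".
# That system wrongly treats 1900 as a leap year, so the usual trick for
# dates after February 1900 is to count days from 1899-12-30.
SPREADSHEET_EPOCH = date(1899, 12, 30)


def parse_date_cell(cell):
    """Return a date for a serial-number cell, or None if blank or unusable."""
    text = "" if cell is None else str(cell).strip()
    if not text:
        return None  # the column was not filled in
    try:
        serial = int(float(text))
    except (ValueError, OverflowError):
        return None  # not a number, so flag it for manual checking
    return SPREADSHEET_EPOCH + timedelta(days=serial)


print(parse_date_cell("40511"))
print(parse_date_cell(""))
```

Returning None for blank or odd cells keeps the untrustworthy rows visible, so they can be counted and queried rather than silently dropped.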

Cannot trust the data, have to ask if it’s correct and can be validated.

Had a massive amount of data, tried to break it down by agency and temp staff, cutting back a massive spreadsheet.

Used Zoho(?) where you can see things pretty quickly.

Visuals created once separated the costs. Need to dig deep into the data to find the quirks.

Take-home lessons: check the accuracy of the data, use a structured database, look at other axes of investigation, get the data clean, and update it automatically.

Is it worth it?
Took extensive salary data.

Put in location and job and then the function shows if it’s worth living there.

A Welsh teacher earning £45,000, not competitive.

Someone in London working as an accountant at £45,000, data showed 16 applicants per job making it a 50/50.

From the initial data service a map was created where you can choose a function, a job title and a region to find out visually whether a job is worth it. It pushes down per region.

Can also zoom in to regions using a slider system.

A splendid and complex visualisation. (The winner)

Truck stops

Started with the idea of truck stops and which ones were safe.

Started looking for data on the Highways Agency site and found it wanting.

Found a map with decent truck stop sites.

Had the xml source and started to develop a scraper on Scraperwiki and got a view on Google maps.

Plotted all the points. A letter on each point shows how safe it is, based on which stops had CCTV and various security measures.

Further on wanted to find out more about truck crime. Looked at the TruckPol website and took the data from PDFs and put in a spreadsheet.
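A rough sketch of how a letter per stop could be derived from an XML source. Everything specific here is an assumption: the XML layout, the attribute names (`cctv`, `fenced`, `lit`) and the A/B/C thresholds are invented for illustration, since the notes do not describe the real feed or the team's scoring.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real feed's element and attribute names are unknown.
SAMPLE_XML = """
<stops>
  <stop name="Stop A" lat="51.5" lng="-0.1" cctv="yes" fenced="yes" lit="yes"/>
  <stop name="Stop B" lat="52.4" lng="-1.9" cctv="yes" fenced="no" lit="no"/>
  <stop name="Stop C" lat="53.8" lng="-1.5" cctv="no" fenced="no" lit="no"/>
</stops>
"""

# Assumed list of security measures to count.
SECURITY_FEATURES = ("cctv", "fenced", "lit")


def safety_letter(stop):
    """Grade a stop: A if it has every measure, B if some, C if none."""
    count = sum(1 for feature in SECURITY_FEATURES if stop.get(feature) == "yes")
    if count == len(SECURITY_FEATURES):
        return "A"
    if count > 0:
        return "B"
    return "C"


def grade_stops(xml_text):
    """Return (name, lat, lng, letter) tuples ready to plot as map markers."""
    root = ET.fromstring(xml_text.strip())
    return [
        (s.get("name"), float(s.get("lat")), float(s.get("lng")), safety_letter(s))
        for s in root.iter("stop")
    ]


for name, lat, lng, letter in grade_stops(SAMPLE_XML):
    print(name, lat, lng, letter)
```

Each tuple carries the coordinates and the letter together, which is the shape a map view needs to label its points.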

Updated the view with the information about crimes. Red ones not so great, blue are good and a purple is okay.

(Winner of the best scraper award from Scraperwiki and third place overall).

Take over watch

The UK Takeover Panel was the prime source of information, showing all takeovers in play. The aim was to create something to provide details about companies.

Had scraped data but needed to add sector and revenue to create context.

Also used Investegate.co.uk.

Had a live table showing activity from the last two days.
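A table like this comes down to a date filter over the scraped rows. The sketch below assumes each row is a dict with hypothetical `company`, `sector` and `announced` fields; the team's actual schema is not given in the notes.

```python
from datetime import datetime, timedelta


def recent_activity(rows, now, days=2):
    """Keep rows announced within the last `days` days, newest first."""
    cutoff = now - timedelta(days=days)
    recent = [row for row in rows if row["announced"] >= cutoff]
    return sorted(recent, key=lambda row: row["announced"], reverse=True)


# Made-up example rows, purely for illustration.
rows = [
    {"company": "Example A plc", "sector": "Retail", "announced": datetime(2010, 11, 29, 9, 0)},
    {"company": "Example B plc", "sector": "Energy", "announced": datetime(2010, 11, 20, 9, 0)},
    {"company": "Example C plc", "sector": "Retail", "announced": datetime(2010, 11, 28, 15, 0)},
]

for row in recent_activity(rows, now=datetime(2010, 11, 29, 18, 45)):
    print(row["announced"], row["company"], row["sector"])
```

Passing `now` in as an argument, rather than reading the clock inside the function, makes the filter easy to test against fixed data.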

Have different sectors and can pull information out to see what’s happening in different areas.

Snow Holes

Created a map showing areas affected by snow, so you can see where the nearest snow hole is. (See snow hole blog)

Plantlife

How people move around the chemical world

Used Google Refine to play with the data. Pulled out the geocode to map where the companies were.

Google Fusion also used.

Top 100 chemical companies. Merged Google Finance information with Isis.

Created a visual showing how sales had gone down, with chemical industry sales halving from 2007-08.

(Second place)



