Sarah Booker

28/09/2011

Following Hacks/Hackers London

Filed under: journalism,Storify — Sarah Booker Lewis @ 8:15 pm

As I couldn’t make it tonight I created a Storify from Hacks/Hackers London.

Brilliant #HHLdn opening from @timesjoanna about why the event helps newsroom journalists get digital training they might not get elsewhere
currybet
September 28, 2011
Impressed that ten people in #HHLdn just put their hands up to say they were online in 1993
currybet
September 28, 2011
Here is the original article about Scientology that Wendy M. Grossman is talking about at #HHLdn *cough* with link http://t.co/GgbLWUaw
currybet
September 28, 2011
At #HHLdn, @wendyg is talking about “astroturfing” – 1995 style – http://t.co/1uJFZLh5
currybet
September 28, 2011
Great to hear a story about a journalist trying to get to grips with computers, anonymity & the interwebs in the early 90s #HHLdn – @wendyg
currybet
September 28, 2011
Fabulous trip down memory lane at the hacks/ hackers event. Early days of internet hacking – and I was there :-)
joannejacobs
September 28, 2011
Chris Sumner starts by saying that Maltego is great if you’re in to stalking :-) #hacks/hackers
joannejacobs
September 28, 2011
At #HHLdn @TheSuggmeister is showing Maltego – http://t.co/AzSySuK7 – a way of visualising Twitter networks
currybet
September 28, 2011
Another great #HHLdn phrase from @TheSuggmeister – “I’m willing to share the code for free”
currybet
September 28, 2011
At #HHLdn, @TheSuggmeister is suggesting that it has become a lot harder to search the archives of Twitter. He also just said Perl & STDOUT
currybet
September 28, 2011
Also pitched at #HHLdn, some great looking courses from the centre for investigative journalism – http://t.co/DbGThENi
currybet
September 28, 2011

29/11/2010

Introduction to ScraperWiki #hhhrbi

Filed under: journalism,technical — Sarah Booker Lewis @ 10:44 am

Francis Irving of ScraperWiki explains how it works.

Take the Gulf oil spill. You can find a list of oil fields around the UK, but it’s all in a strange lump.

He shows a piece of Python code that reads the oil field pages and turns them into data.
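As a rough illustration of that step, here is a minimal sketch using only Python's standard library. It is not the code Francis showed: the sample markup, the field names and the coordinates are made-up placeholders, and `TableParser` and `parse_fields` are hypothetical helper names.

```python
from html.parser import HTMLParser

# Made-up sample markup standing in for an oil field listing page.
SAMPLE_HTML = """
<table>
  <tr><th>Field</th><th>Latitude</th><th>Longitude</th></tr>
  <tr><td>Example Field A</td><td>58.1</td><td>1.2</td></tr>
  <tr><td>Example Field B</td><td>60.5</td><td>-2.3</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._in_cell and self._row is not None:
            self._row[-1] += data.strip()


def parse_fields(html):
    """Turn the table into a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    keys = [h.lower() for h in header]
    return [dict(zip(keys, row)) for row in body]


fields = parse_fields(SAMPLE_HTML)
for field in fields:
    print(field)
```

Each row of the "strange lump" becomes a dictionary, which is the kind of structured record a map view could then be built from.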

It’s quite simple to make a map view, and you can also write code to make more complicated views.

ScraperWiki is automatic data conversion.

Scrape internet pages, parse them, organise the data, collect it and model it into a view. The scraper keeps running, so the dataset is constantly updated.
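The scrape, parse, collect and view steps could be sketched roughly as below. This is an assumption-heavy illustration, not ScraperWiki's own API: the URLs and page contents are canned placeholders, the function names are hypothetical, and an in-memory SQLite database stands in for the data store.

```python
import re
import sqlite3

# Canned pages standing in for live fetches (hypothetical URLs and content).
PAGES = {
    "http://example.org/page1": "<li>alpha: 3</li><li>beta: 5</li>",
    "http://example.org/page2": "<li>gamma: 2</li>",
}


def scrape(url):
    """Scrape step: a real scraper would download the page here."""
    return PAGES[url]


def parse(html):
    """Parse step: pull (name, value) pairs out of the list items."""
    return [(n, int(v)) for n, v in re.findall(r"<li>(\w+): (\d+)</li>", html)]


def collect(db, url, records):
    """Collect step: store records, replacing earlier rows with the same name."""
    db.executemany(
        "INSERT OR REPLACE INTO items (name, value, source) VALUES (?, ?, ?)",
        [(n, v, url) for n, v in records],
    )


def view(db):
    """View step: one simple model of the data, largest value first."""
    return db.execute(
        "SELECT name, value FROM items ORDER BY value DESC"
    ).fetchall()


def run(db):
    """One full pass over every page."""
    for url in PAGES:
        collect(db, url, parse(scrape(url)))
    db.commit()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (name TEXT PRIMARY KEY, value INTEGER, source TEXT)")
run(db)
run(db)  # running again refreshes rows instead of duplicating them
rows = view(db)
print(rows)
```

Keying the table on `name` is one design choice for a scraper that runs repeatedly: each pass overwrites what it saw before, so the dataset stays current without growing duplicates.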

 

There are two kinds of journalism you can do with the data: you can make specific tools, or you can find a story.

A project in Belfast took a list of historic houses in the UK. The data scraper looked through a host of websites using Python, though you can also use Ruby.
There is a multitude of visualisations available. The Belfast project showed a spike in 1979, which was explained by a political sectarian issue.

Answering a question, Francis confirms you can scrape more than one website at a time.

Francis would like to see more linked data and merging datasets together.
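To give a sense of what merging datasets can mean in practice, here is a minimal sketch that joins two lists of records on a shared key. The council names and figures are made-up placeholders, and `merge_on` is a hypothetical helper, not something from the talk.

```python
# Two small made-up datasets sharing a "council" key.
budgets = [
    {"council": "Example North", "budget": 120},
    {"council": "Example South", "budget": 95},
]
salaries = [
    {"council": "Example North", "top_salary": 150},
    {"council": "Example East", "top_salary": 140},
]


def merge_on(key, left, right):
    """Inner-join two lists of dicts on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


merged = merge_on("council", budgets, salaries)
print(merged)
```

Only councils present in both datasets survive the join, which is why consistent identifiers across sources matter so much for linked data.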

Asked about licensing for commercial use, Francis says it’s mainly used for public data. ScraperWiki blocks scraping Facebook because it’s private data, but the code can be adjusted.

Areas of interest for projects today are: farming, local government budgets, public sector salaries, mapping chemical companies and distributors, environment, transport, road transport crime, a truckstops map, energy data, country profiles linked to carbon emissions, e-waste, airline data, plastics data, empty shops, infotainment to get users interested in the data, another visualisation ranking companies by customer reviews, using the crowd to share information with data and create interesting information, annotating and enriching content with data, health data… and anything else we’re doing.

 
