Posted in Uncategorized by johnsnavely on November 15, 2012

Oh boy, it’s been a long time since I’ve sat down to write anything in this blog. But moving on…

I was reading this post on Marginal Revolution, tipped off by the always interesting T on QofW.

I’ll try and summarize:

Under fire about the accuracy of his predictions, Nate Silver offered his critic a bet, with the stakes a $1000 donation to the Red Cross. A NYTimes editor, Margaret Sullivan, called the bet “inappropriate” for a journalist. Alex Tabarrok on Marginal Revolution supported the bet and, moreover, offered a way to make similar wagers blind and thus non-partisan. He calls these sorts of bets a “tax on bullshit”, which is an awesome phrase and one I wish I could use more often. Overall, I like the idea of responding to critics with a wager. It’s something a former poker player like Nate would find friendly and easy to do. It’s a lovely gesture of confidence.

But I was puzzling about something else when I read the article(s). Clearly, Margaret Sullivan has assumed that there’s some sort of shared ethics for journalists and reporters of the news. But she’s also assumed that what Nate Silver does is journalism or reporting.

T says (quite rightly):

“Also, do you really think that Nate Silver is a “reporter?” I’m not sure what that label really means in this context — most reporters, I assume, assemble sources, get quotes, filter facts and assertions, and then craft a coherent story to be printed in limited space.


Silver’s running MCMC-simulations of election outcomes using probability distributions inferred from the confidence intervals (Bayesian posterior distributions, natch) he gets from executing a panel of linear regressions. That seems different from what most reporters are doing, at any time, ever–”

I’m not very good at answering questions like this. Usually I work by assuming, hypothetically, that an answer is true and then guessing what might follow.

So let’s say I’m a reporter. I’m given an assignment to write about an event (maybe a natural disaster hitting a major city), only I don’t have the budget to actually go to the city. And I need to write the report within hours of the event. All my “sources” (data) are going to come from hundreds of thousands of Twitter posts that are being produced by people who are there.

I really could use some computational tools to help me sort through which Twitter posts I should read: which ones contain relevant information, how the events might be organized or ordered chronologically, and where the “most interesting” quotes are. I need to do this very quickly so I can write a canonical version of what happened with the poignant bits culled out, maybe even styled for the particular readership of the publication I’m working for.
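Just to make that wish list concrete, here is a rough Python sketch of the triage step. Everything in it is my own assumption for illustration: the `Post` record, the keyword set, and the idea that keyword overlap plus repost count is a fair stand-in for “relevant” and “interesting”. A real tool would need something much smarter.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    # Hypothetical record for one scraped post; field names are assumptions.
    author: str
    text: str
    posted_at: datetime
    reposts: int = 0


def relevance(post, keywords):
    """Count how many distinct (lowercase) keywords appear in the post text."""
    words = {w.strip(".,!?#@\"'").lower() for w in post.text.split()}
    return len(words & keywords)


def triage(posts, keywords, min_hits=1):
    """Keep posts that mention at least `min_hits` keywords, oldest first."""
    relevant = [p for p in posts if relevance(p, keywords) >= min_hits]
    return sorted(relevant, key=lambda p: p.posted_at)


def candidate_quotes(posts, keywords, top_n=3):
    """Rank posts by keyword hits, then by reposts, as a crude 'interesting' score."""
    ranked = sorted(
        posts,
        key=lambda p: (relevance(p, keywords), p.reposts),
        reverse=True,
    )
    return ranked[:top_n]
```

Calling `triage(posts, {"flooded", "power"})` would hand me a reading list in chronological order, and `candidate_quotes` a short list of posts worth quoting. The scoring is deliberately naive; the point is only that each item on the wish list maps to a small, fast function.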

A reporter’s software helper can do a lot of scraping and gathering; that’s not hard. What it really needs to do is generate a networked map of all the conversations condensing, like a cloud, around the event. And the representation needs to be structured in such a way that it can be built extremely quickly and help answer some basic but ambiguous questions, like “What happened when?”

What if more reporters were armed with better software? I wonder what the news would be like.