Data is meant to be either private or public. If your data is private, you want very much to keep it that way, and you’ll go to great lengths to protect it. Just ask the leadership team at Twitter how they feel about their strategic notes being leaked to TechCrunch and laid bare for all the world to see. The flip side is that if your data is public, you want as many people as possible to see it.
We all know that viruses can “go viral,” as can chain letters, YouTube videos and other memes. But what about data – can it go viral?
Yes, it can. It’s a function of how useful, valuable or interesting the data is. At Socrata, we call this the social data effect.
The social data effect holds that the farther data is replicated and propagated from its initial source, the more useful, valuable or interesting that data is.
Let me show you the social data effect by example.
On July 1st, the White House New Media team wrote a blog post in which they publicly shared the salaries of all White House staffers:

A blogger from Talking Points Memo (TPM) read the White House blog and wrote a short blog post of her own, adding some editorial opinion and embedding the salary data into her own story:

The next day, a writer for GOOD / blogs saw the story in Talking Points Memo and wrote his own story, adding some editorial opinion of his own. The writer attributed the TPM story via a hyperlink. He also embedded the salary data into his post:

Within a week, the salary data had been propagated to more than 20 different websites and seen by 100,000 people, half of whom saw it on some site other than its official and initial source, the White House blog. That’s the social data effect in action. The fact that this dataset was propagated so widely is a sign that it’s intrinsically useful, valuable or interesting.
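If you wanted to put a number on this, one rough way is to look at what share of a dataset's views happened away from its original source. The sketch below is only an illustration: the function name `off_source_share` is made up, and the per-site view counts are hypothetical figures chosen to match the rough totals in the story (about 100,000 views, half of them off-source). It is not how Socrata measures anything.

```python
def off_source_share(views_by_site, source):
    """Return the fraction of all views that happened away from the original source."""
    total = sum(views_by_site.values())
    if total == 0:
        return 0.0
    return (total - views_by_site.get(source, 0)) / total


# Hypothetical view counts (assumed for illustration, not real traffic data).
views = {
    "White House blog": 50_000,
    "Talking Points Memo": 30_000,
    "GOOD": 20_000,
}

share = off_source_share(views, "White House blog")
print(f"{len(views) - 1} republishing sites, {share:.0%} of views off-source")
```

A share near zero means the data never left home; the closer it gets to one, the more the audience is finding the data somewhere other than where it was first published.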
2 Responses to The Social Data Effect
Very interesting. It is fantastic to see this level of transparency. I just wonder what is the best approach to capture and associate the metadata (editorial, comments, etc.) that is generated from the datasets?
An interesting feature would be a metadata panel for the player that would aggregate comments and surrounding text (like an abstract) from sites where the dataset has been republished. This would be sort of a social data graph. It would provide a way to visualize just how viral the dataset has become. Just a thought…
Kevin
I could see a real viral effect occur with interesting data, but the conclusions from the data will probably need to be either surprising or humorous.