## Testing BuzzData

on Thursday, October 20th, 2011 2:44 | by Julien Colomb

While looking for a platform to share our trajectory data, I got interested in BuzzData. They started a little contest, and since it seemed a nice way to test the functionality of the website, I participated.

I downloaded the dataset on water consumption in Canada over recent years. The data are split by year and ward. I focused on total water consumption and ran a small analysis.

A simple ANOVA shows that total consumption depends on the year, the ward, and the interaction of the two (meaning that the year-to-year change in consumption is not the same in every ward). In the visualization, one can see that mean consumption decreased over the last three years (black dots are the mean consumption for each year). In the traces for the individual wards, we can pick out wards with particular results. For instance, ward 11 had a huge increase in consumption in 2002, and ward 2 has decreased its consumption (which is very high relative to the other wards) every year since 2002.

ANOVA results

| Term | Df | Sum Sq | Mean Sq | F value | Pr(>F) | Signif. |
|---|---|---|---|---|---|---|
| `data$Year` | 1 | 6.2445e+13 | 6.2445e+13 | 110.0587 | < 2.2e-16 | `***` |
| `data$Ward` | 43 | 5.1300e+15 | 1.1930e+14 | 210.2682 | < 2.2e-16 | `***` |
| `data$Year:data$Ward` | 43 | 1.0535e+14 | 2.4500e+12 | 4.3181 | 2.399e-15 | `***` |
| Residuals | 396 | 2.2468e+14 | 5.6738e+11 | | | |

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
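For readers who want to see where the numbers in such a table come from, here is a minimal sketch in plain Python. It is not the code used for the analysis above. It assumes the year entered the model as a numeric covariate (which the single degree of freedom for `data$Year` suggests) and the ward as a categorical factor, and it computes sequential sums of squares by comparing three nested models. The function name `sequential_anova` and the demo values are hypothetical; p-values are left out because the standard library has no F distribution.

```python
from collections import defaultdict


def centered_sums(pairs):
    """Return (Sxx, Sxy, Syy) for a list of (x, y) pairs, centered on their means."""
    m = len(pairs)
    mean_x = sum(x for x, _ in pairs) / m
    mean_y = sum(y for _, y in pairs) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxx, sxy, syy


def sequential_anova(rows):
    """rows: (year, ward, total) tuples; each ward needs at least two distinct years."""
    rows = [(float(year), ward, float(total)) for year, ward, total in rows]
    by_ward = defaultdict(list)
    for year, ward, total in rows:
        by_ward[ward].append((year, total))
    n, k = len(rows), len(by_ward)

    # Model 0 (intercept only) and model 1 (total ~ year).
    sxx, sxy, syy = centered_sums([(year, total) for year, _, total in rows])
    rss0 = syy
    rss1 = syy - sxy ** 2 / sxx

    # Model 2 (total ~ year + ward): one common slope, one intercept per ward.
    per_ward = [centered_sums(pairs) for pairs in by_ward.values()]
    wxx = sum(s[0] for s in per_ward)
    wxy = sum(s[1] for s in per_ward)
    wyy = sum(s[2] for s in per_ward)
    rss2 = wyy - wxy ** 2 / wxx

    # Model 3 (total ~ year * ward): a separate regression line per ward.
    rss3 = sum(s[2] - s[1] ** 2 / s[0] for s in per_ward)

    df_res = n - 2 * k
    ms_res = rss3 / df_res
    table = {}
    terms = [("year", rss0 - rss1, 1),
             ("ward", rss1 - rss2, k - 1),
             ("year:ward", rss2 - rss3, k - 1)]
    for name, ss, df in terms:
        table[name] = {"df": df, "ss": ss, "ms": ss / df, "F": (ss / df) / ms_res}
    table["residuals"] = {"df": df_res, "ss": rss3, "ms": ms_res}
    return table


# Hypothetical demo values, not the real water consumption data.
demo_rows = [
    (2000, "A", 10), (2001, "A", 12), (2002, "A", 11), (2003, "A", 15),
    (2000, "B", 20), (2001, "B", 19), (2002, "B", 23), (2003, "B", 24),
    (2000, "C", 5), (2001, "C", 7), (2002, "C", 6), (2003, "C", 6),
]
demo_table = sequential_anova(demo_rows)
for term, values in demo_table.items():
    print(term, values)
```

With 44 wards observed over 11 years (484 rows), this layout gives 1, 43, 43 and 396 degrees of freedom, matching the table.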

The test shows that it is quite easy to take the data and perform one's own analysis. BuzzData is not (yet?) very good at updating data, since there is no way to add data directly (in this case, once the 2011 data arrive, one will need to download the dataset, add the new data and re-upload the whole dataset). Not very convenient.
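The local "add the new data" step of that workflow could look roughly like the sketch below. It only covers merging new rows into the downloaded CSV; downloading and re-uploading would still be done by hand on the website. The helper name `append_rows` and the column names `Year`, `Ward` and `Total` are assumptions for illustration, not the actual layout of the dataset.

```python
import csv
import io


def append_rows(existing_csv, new_rows):
    """Return the CSV text with new_rows appended; refuse duplicate (Year, Ward) pairs."""
    reader = csv.DictReader(io.StringIO(existing_csv))
    fieldnames = reader.fieldnames
    rows = list(reader)

    # Track which (year, ward) combinations are already present.
    seen = {(row["Year"], row["Ward"]) for row in rows}
    for new_row in new_rows:
        row = {key: str(value) for key, value in new_row.items()}
        key = (row["Year"], row["Ward"])
        if key in seen:
            raise ValueError("row already present for year/ward %s/%s" % key)
        seen.add(key)
        rows.append(row)

    # Write everything back out with the original header order.
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical example: two existing rows plus one new 2011 row.
downloaded = "Year,Ward,Total\n2010,1,100\n2010,2,200\n"
updated = append_rows(downloaded, [{"Year": 2011, "Ward": 1, "Total": 90}])
print(updated)
```

The duplicate check is there because a manual download, edit and re-upload cycle makes it easy to add the same year twice.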

Hi Julien,

Thanks for writing about BuzzData, even if what we’re building is solving a different set of problems than what you’re hoping for.

For us it is still early days as we finalize what we consider to be our initial offering. Our strategy from the beginning was to not attempt to re-invent what people can already do with Excel or Google Spreadsheets. I appreciate that you wish to be able to update data right on the site, and we are working on that.

However, I’m even more excited about being able to provide even more convenient ways to keep your data synchronized. Ideally, you shouldn’t have to go to a website to update your data, and we’re working hard to make it so that changes you make locally will be updated on the web, and anyone following your dataset will be notified of the changes as well.

That is when things start to get really interesting. In the meantime, we appreciate your patience as we continue to build the product. Keep the suggestions coming!

Pete

How about FigShare? I’ve already deposited data and a figure there…