Archive for category playground

Open Data Eindhoven – Apps for Eindhoven

I recently submitted my ‘app’ (hate the word) called ‘Buurtvergelijker’ for the Apps for Eindhoven Challenge. For those who don’t know, Eindhoven is situated in the south of the Netherlands and it’s the place where Philips, PSV and Arie Ribbens have their roots. The Challenge is one of the many ‘Apps for <some city>’ challenges that are popping up all over the globe these days. I worked on it for about a week and would have worked on it much longer if the closing date hadn’t prevented me from doing so.

More details can be found in the application itself (in Dutch only, unfortunately).

Edit on July 3, 2011:

I won the competition. See here (in Dutch).


Mozilla Open Data Visualization competition


You can read about the competition as an introduction.

Or go straight to the visualization.

I started by looking at the questions the Mozilla team wanted answered, but at the same time I tried to find out what kind of information I would be interested in myself. One of the questions that the Mozilla team wants answered is:

Do people who use more tabs use more bookmarks or fewer bookmarks?

Inspired by that question, I wondered whether a lot of people are like me: hoarding bookmarks but never using them. Or perhaps there are people who diligently structure their bookmarks and, by doing so, can satisfy their surfing needs just by selecting bookmarks. And can we identify different bookmarking habits for each age group?

Preprocessing the data

Data per user session

To get a grip on the data, I decided to extract the following parameters for each user session (I define a user session as the user actions between browser startup and browser close or crash):

  • duration (in milliseconds)
  • inactive time (in milliseconds)
  • number of stored bookmarks at browser start up
  • number of bookmark folders at browser start up
  • maximum folder depth at browser start up
  • number of selected bookmarks
  • number of tabs at startup
  • number of windows at startup
  • the event type of the last event (usually ‘browser shutdown’, but this is different if the browser crashes (?))
  • whether the browser was restarted after a crash

Data per user

Then, for each user, I aggregated all sessions into a single record describing:

  • average session time
  • average number of selected bookmarks per hour
  • average number of stored bookmarks at browser start up
  • average number of folders at browser startup
  • average bookmark depth
  • average number of tabs at start up
  • the age group to which the user belongs
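The aggregation step could be sketched roughly as follows. This is only an illustration, not the original Flex/SQLite code: the `Session` field names and the `aggregate_user` helper are hypothetical, and the per-hour rate is assumed to be total selections divided by total session hours.

```python
from dataclasses import dataclass
from statistics import mean

MS_PER_HOUR = 3_600_000


@dataclass
class Session:
    # Hypothetical field names mirroring the per-session parameters above.
    duration_ms: int
    stored_bookmarks: int
    bookmark_folders: int
    max_folder_depth: int
    selected_bookmarks: int
    tabs_at_startup: int


def aggregate_user(sessions, age_group):
    """Collapse one user's sessions into a single record of averages."""
    if not sessions:
        raise ValueError("a user needs at least one session")
    total_hours = sum(s.duration_ms for s in sessions) / MS_PER_HOUR
    total_selected = sum(s.selected_bookmarks for s in sessions)
    return {
        "avg_session_ms": mean(s.duration_ms for s in sessions),
        # Assumption: rate = all selections / all hours,
        # not the mean of per-session rates.
        "selected_per_hour": total_selected / total_hours if total_hours else 0.0,
        "avg_stored_bookmarks": mean(s.stored_bookmarks for s in sessions),
        "avg_folders": mean(s.bookmark_folders for s in sessions),
        "avg_depth": mean(s.max_folder_depth for s in sessions),
        "avg_tabs_at_startup": mean(s.tabs_at_startup for s in sessions),
        "age_group": age_group,
    }
```

Dividing total selections by total hours keeps very short sessions from dominating the rate, which is one reason to prefer it over averaging per-session rates.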

Data per number of bookmarks interval

Finally, based on the individual user statistics, I grouped all users into categories based on their average number of bookmarks. The categories are power-of-two intervals, i.e.:

  • 1 bookmark
  • 2 up to 4 bookmarks
  • 4 up to 8 bookmarks
  • 8 up to 16 bookmarks
  • 16 up to 32 bookmarks
  • and so on
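A minimal sketch of this binning is shown below. It assumes that each interval includes its lower bound and excludes its upper bound (so an average of exactly 4 falls in ‘4 up to 8’), and that averages below 1 get no category; `bookmark_bin` is a hypothetical helper name.

```python
def bookmark_bin(avg_bookmarks):
    """Return the (lower, upper) power-of-two interval, upper bound exclusive.

    Assumption: averages below 1 are not covered by the categories
    and yield None.
    """
    if avg_bookmarks < 1:
        return None
    lower = 1
    # Double the lower bound until the next power of two would overshoot.
    while lower * 2 <= avg_bookmarks:
        lower *= 2
    return (lower, lower * 2)
```

Doubling an integer in a loop avoids the rounding surprises that a floating-point logarithm can produce exactly at interval boundaries.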


People with more bookmarks use more bookmarks

Sounds pretty obvious, but if I look at myself: I do a lot of bookmarking, but (I think) I almost never use them.

Fig. 1

On the horizontal axis we see the number of stored bookmarks. On the vertical axis we see the number of selected bookmarks per hour. Clearly, the line is ascending. There is a big orange bubble in the category 4096 up to 8192 bookmarks. Using the tooltip, we can see that the bubble relates to 3 users. The application gives you the possibility to ignore small groups. By using one of the controls, we can hide all bubbles that correspond to 6 users or fewer. We then get a different picture. See Fig. 2.

People with more bookmarks have more tabs open at browser start up

The big orange bubble of Fig. 1 is filtered out and the remaining bubbles scale to fit between the minimum and maximum bubble radius. We see (by examining the bubble size) that people with more bookmarks have more tabs open at browser startup.

Fig. 2
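The filtering and rescaling described for Fig. 2 could look something like the sketch below. It assumes the radius is scaled linearly with the plotted value; the actual application may scale differently, and the function name, tuple layout and default radii are made up for illustration.

```python
def scale_bubbles(groups, min_users=7, min_radius=5.0, max_radius=40.0):
    """Hide small groups, then rescale the rest between two radii.

    groups: list of (label, user_count, value) tuples (hypothetical layout).
    Returns a list of (label, radius) for the visible groups.
    """
    # Hide every group with fewer than min_users users (6 or fewer by default).
    visible = [g for g in groups if g[1] >= min_users]
    if not visible:
        return []
    values = [g[2] for g in visible]
    lo, hi = min(values), max(values)
    span = hi - lo
    result = []
    for label, _count, value in visible:
        # Linear interpolation; a single distinct value gets the maximum radius.
        t = (value - lo) / span if span else 1.0
        result.append((label, min_radius + t * (max_radius - min_radius)))
    return result
```

Because the minimum and maximum are recomputed after filtering, removing one outlier (such as the 3-user bubble) stretches the remaining bubbles over the full radius range again.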

We see the same pattern if we show the number of open tabs on the vertical axis instead of using the bubble size (see Fig. 3). In Fig. 3 the bubble size corresponds to the number of users, so we can see that most users (in the selected age groups) have between 8 and 256 bookmarks.

Fig. 3

Younger people use bookmarks more often

Using the controls at the lower right, we can set the transparency of each series. If we select the series that correspond to users of age:

  1. 18 to 25 years old
  2. over 55 years old

We clearly see that the younger age group uses more bookmarks (green bubbles are positioned higher, albeit with the top purple bubble as an exception).

Fig. 4

People with more bookmarks surf longer

Differences are small, but you can see a trend of longer session times (the time between starting up a browser and closing it down) for people with more bookmarks. See Fig. 5.

Fig. 5

Most users in age group 18-25, increasing bookmark usage with increasing number of stored bookmarks

A difficult (and incomplete) title. But that’s why we need visualizations: explain it with images! This is one of my favorite images of this series. You can clearly see the different age group trends:

  • Most users in age group 18-25
  • Number of users decreases with each step to the next (older) age group
  • Bookmark selection increases as the number of stored bookmarks increases


During development/analysis, I used Adobe Flash Builder 4, Flex SDK 4.1, Adobe AIR and SQLite to do all parsing and visualization.

The final product is a web application and only uses Flex SDK 4.1 (with an xml-file as input). Right-click above the application to view/download the source code.


General meteorological data viewer

Here is another result of my experiments with weather data. This visualization is not very useful yet, but maybe you will find it interesting. I made a simple parsing algorithm to filter out the geographic locations of the weather stations. I have to improve that parsing, since it makes some mistakes, e.g. ‘ZEE’ is actually ‘WIJK AAN ZEE’. Hopefully the KNMI can make the data files a bit more parsing friendly in the future.

I’m planning on making year comparisons. So you can see, for example, the temperature change over the years.


Wind speed and wind direction visualization

Recently I found out that the KNMI publishes their data on the internet. I browsed around a little and downloaded some data.

As the title says, this visualization shows wind speed and wind direction. The country is The Netherlands and each arrow corresponds to a weather station.

The direction of an arrow represents the wind direction (DD). I decided to let the arrows point 180 degrees from the wind direction, i.e. if the wind direction is north, the arrow points towards the south. The size of each arrow represents the wind speed (FF).

These are the official variable descriptions:

DD = Windrichting (in graden) gemiddeld over de laatste 10 minuten van het afgelopen uur / Mean wind direction (in degrees) during the 10-minute period preceding the time of observation (360=noord/north, 90=oost/east, 180=zuid/south, 270=west, 0=windstil/calm 990=veranderlijk/variable)
FF = Windsnelheid (in 0.1 m/s) gemiddeld over de laatste 10 minuten van het afgelopen uur / Mean wind speed (in 0.1 m/s) during the 10-minute period preceding the time of observation

If DD equals 990, the chart shows a red circle with the radius in proportion to the wind speed (although this always seems to be equal to 10).
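As a rough sketch of the mapping described above (not the original Flex code), the function below turns a DD/FF pair into something drawable. The dictionary layout and the name `wind_glyph` are hypothetical, and treating DD = 0 as "draw nothing" is an assumption; `dx` points east and `dy` points north, so `dy` must be flipped for screen coordinates where y grows downwards.

```python
import math


def wind_glyph(dd, ff):
    """Map KNMI DD (degrees) and FF (0.1 m/s) to a drawable description."""
    speed = ff / 10.0  # FF is given in tenths of m/s
    if dd == 0:
        # Calm; assumption: nothing is drawn for this station.
        return {"kind": "calm"}
    if dd == 990:
        # Variable direction: a circle whose radius follows the wind speed.
        return {"kind": "circle", "radius": speed}
    # DD is where the wind comes from; the arrow points the opposite way.
    bearing = math.radians((dd + 180) % 360)
    return {
        "kind": "arrow",
        "dx": speed * math.sin(bearing),  # east component
        "dy": speed * math.cos(bearing),  # north component
    }
```

For example, `wind_glyph(360, 50)` describes a northerly wind of 5 m/s and yields an arrow pointing south, with `dx` approximately 0 and `dy` approximately -5.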
