Friday, May 10, 2013

Tweeting about local weather conditions

I am quite proud of my personal weather station and the fact that it automatically makes reports available through a dedicated page on the Weather Underground site. However, until recently I was embarrassed to admit that the weather station didn't have its own Twitter feed. Yesterday I rectified this situation, and I can now boast to my fellow geeks that the station is tweeting hourly weather reports as @LLweather, a new Twitter account I created for the weather station. In fact it is one of a large network of private weather stations in Ireland tweeting weather reports with the #iwn hashtag. This should not be surprising, because most Irish people are obsessed with the weather.

Initially I tried to write a short Python script to do the updates, but then I discovered that this feature was already supported by the pywws package that I was using to upload the data to Weather Underground. There were even clear instructions on the project's web site on how to configure the software (I had to upgrade to v12.10 to get it to work).

After I set it up I was surprised to notice that the weather station was apparently giving weather reports about 20-25 minutes into the future. When I investigated I found that the clock on my Tonido plug was running alarmingly fast. I tried to set up the NTP daemon to keep the clock synchronised, but this proved complicated because the plug is running a very old version of Ubuntu. I corrected the time manually and will keep an eye on it until I get around to configuring NTP properly.
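
Until NTP is configured properly, the drift can at least be measured from a script. The following is only a minimal sketch, assuming Python 3 is available on the machine and that it can reach a public NTP server. The server name pool.ntp.org is just an example, and the names REQUEST, parse_transmit_time and clock_offset are made up for illustration. It sends one simple client request and compares the server's transmit timestamp with the local clock. Network delay is ignored, so it is only suitable for spotting errors of seconds or minutes.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800

# 48-byte client request: first byte 0x1b means LI=0, version=3, mode=3 (client).
REQUEST = b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet):
    """Return the transmit timestamp of an NTP reply as Unix time (seconds)."""
    # The transmit timestamp occupies bytes 40-47: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server="pool.ntp.org", timeout=5.0):
    """Rough offset in seconds (server time minus local time).

    A negative result means the local clock is running ahead of the server.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(REQUEST, (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_time(reply) - time.time()
```

Calling clock_offset() from a cron job and logging the result would show how quickly the clock drifts between manual corrections. For a clock that is 20 minutes fast the result would be somewhere around -1200.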


Saturday, May 4, 2013

How much do you need to pay for a bike computer?


I like cycling and I like gadgets, so I suppose it is not surprising that I consider it essential to have some form of bike computer to keep track of my speed and distance. Currently I am using simple bike computers on each of my bikes, which I bought for €5.99 each in Aldi.

Some of my cycling friends recommend that I should consider splashing out on a Garmin 500 series model. This would give me certain neat features not present on my current model:

  • It has built in GPS so as well as tracking my speed it can also record exactly where I was cycling.
  • It allows uploading of data directly to the internet so I can analyse my performance with neat visualization tools.
  • It connects with additional sensors to track things like my heart rate and cadence so I can judge how effective my training is.
However, these features come at a price. The Garmin model costs about €250 as compared to only €6 for the Aldi model. Therefore I think I will stick with my Aldi computer for the moment. In any case I can use some neat free apps for my phone to track my training in addition to the bike computer.

Wednesday, May 1, 2013

The demise of Google Reader and Google Listen


Like many other people I was surprised to hear about Google cancelling two of my favourite tools - the Google Reader service for managing your RSS feeds and the Google Listen podcast player. Apparently the reason given is that they could not see any way to make revenue from these products, but I would have thought that the users of Google Reader would have been a gold mine of data about which feeds are most popular, through the number of reads/favourites/shares etc.

I like the simplicity of Google Listen and you can still install it, but only if you have saved a link to it (it no longer shows up in search results). I occasionally find that Listen will crash at random times, and since Google have cancelled the project they are unlikely to fix it (I wonder why they didn't open source the project like they did with Google Wave). As a result I have switched to BeyondPod as a replacement. Overall I find it reasonably easy to use - I have paid for the Pro version, but I don't think I really use any of the Pro features so the free version is equally good from my point of view.

Apparently the Google Reader site will be shutting down in July, so while there is no mad rush to get off the platform, it is still a good idea to find some other tool to replace it. The consensus on the internet seems to be that Feedly is the best alternative - my initial experiences seem to back up that claim.

Monday, April 29, 2013

How big does your city have to be in order to make Sentiment Analysis worthwhile?

I wrote earlier about a solution I helped develop which allows city leaders to monitor the sentiment being expressed online about their city. As we present this solution to the leaders of various cities, one of the questions that is always asked is whether their city is well known enough to generate enough mentions for the sentiment charts to be statistically significant.

The general rule of thumb we have been using is that a city must have a population of at least 0.25 million in order to make the tool feasible. The thing that really matters is the number of mentions of the city online (we would hope for at least 5k per week), and often (but not always) the population of a city is a rough guide to how many mentions are likely to be made. Therefore I decided to run a quick test to see how many mentions I would find in the first week of April this year for a pseudo-random selection of cities with both large and small populations (some of the smaller places were not technically cities).

This table summarises the results:

City             Population   Mentions/week
Loughrea              5,057             102
Birr                  5,818             291
Nazareth             14,123           3,873
Clemmons             18,627             224
Bethlehem            25,266           3,661
Navan                28,158             775
Dundalk              31,149             898
Lorient              58,135           1,971
Galway               75,529           4,758
Cergy-Pontoise      183,430             235
Bordeaux            235,891           9,555
Montpellier         255,080           6,539
Toulouse            449,328          11,933
Dublin              527,612          29,859
Boston              625,087         111,035
Jerusalem           801,000          20,767
Paris             2,234,105         178,406
Sydney            4,627,345          52,791
London            8,173,194         257,094
Bangalore         8,474,970          16,455
Moscow           11,503,501          49,534
Tokyo            13,185,502         216,606
New York         19,570,261         440,535
Beijing          20,693,000         274,062

This can be visualised by the following chart:


I think you can see that there is a correlation between city size and the number of mentions (correlation coefficient = 0.83). You can also see that Galway is getting roughly enough mentions to make sentiment analysis useful despite only having a population of 75k, while Cergy-Pontoise has more than double the population but is not getting enough internet mentions to make sentiment monitoring useful.
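
For anyone who wants to experiment with these numbers, here is a minimal sketch of how a correlation coefficient like this could be computed in plain Python. It is not necessarily the method behind the 0.83 figure: the post does not say whether that was calculated on the raw values or on a log scale, so the sketch prints a Pearson coefficient for both. The names CITIES and pearson are invented for this example; the data is copied from the table.

```python
import math

# (city, population, mentions in the first week of April 2013), from the table.
CITIES = [
    ("Loughrea", 5057, 102), ("Birr", 5818, 291), ("Nazareth", 14123, 3873),
    ("Clemmons", 18627, 224), ("Bethlehem", 25266, 3661), ("Navan", 28158, 775),
    ("Dundalk", 31149, 898), ("Lorient", 58135, 1971), ("Galway", 75529, 4758),
    ("Cergy-Pontoise", 183430, 235), ("Bordeaux", 235891, 9555),
    ("Montpellier", 255080, 6539), ("Toulouse", 449328, 11933),
    ("Dublin", 527612, 29859), ("Boston", 625087, 111035),
    ("Jerusalem", 801000, 20767), ("Paris", 2234105, 178406),
    ("Sydney", 4627345, 52791), ("London", 8173194, 257094),
    ("Bangalore", 8474970, 16455), ("Moscow", 11503501, 49534),
    ("Tokyo", 13185502, 216606), ("New York", 19570261, 440535),
    ("Beijing", 20693000, 274062),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


populations = [population for _, population, _ in CITIES]
mentions = [count for _, _, count in CITIES]

# Correlation on the raw numbers, which the very large cities dominate.
r_raw = pearson(populations, mentions)
# Correlation on a log10 scale, which gives the small towns more weight.
r_log = pearson([math.log10(p) for p in populations],
                [math.log10(m) for m in mentions])

print(f"Pearson r (raw values): {r_raw:.2f}")
print(f"Pearson r (log scale):  {r_log:.2f}")
```

With only 24 cities, a handful of outliers such as Bangalore or Boston can move either figure noticeably, so the coefficient is best read as a rough indication rather than a precise measurement.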

A few examples of cities whose number of mentions is very different from what their population would predict:
  • Both Bethlehem and Nazareth get many more mentions than would be predicted by their population (e.g. they are both mentioned significantly more than Navan and Dundalk, which have larger populations). This is probably due to the biblical significance of the towns - in fact this is why I chose them for inclusion in the test, as I don't know the names of any other towns in the Middle East with such small populations.
  • Where the name of the city in the local language was different from the name in English I searched on both versions of the name. In general the local language version received more hits (e.g. there were 6.5 times as many mentions of 北京 as there were for Beijing). However, for the Israeli cities it was the other way around. For example the word "Jerusalem" got 20,711 mentions while the Arabic and Hebrew versions of the city name only had 17 and 39 mentions respectively. Perhaps this is an indication that people in other parts of the world are talking about the city much more than the locals.
  • Cergy-Pontoise only gets 235 mentions, while Lorient gets 1,971 mentions despite having a smaller population. I am not sure why this should be the case, but perhaps it is due to the fact that Cergy-Pontoise is so close to Paris that local residents consider themselves to be Parisians. Lorient has no similar large city nearby to overshadow it.
  • The statistics will vary over time. For example, if I had run my test for the third week in April rather than the first, the number of mentions for Boston would have been 1,893,159 rather than 111,035 - probably due to coverage of the marathon bombing.
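
The arithmetic behind two of these observations can be checked with a few lines of Python. This is only an illustrative sketch using the figures quoted in the list above; the variable names are made up. It shows that the three spellings of Jerusalem add up to the 20,767 in the table, and how large the jump for Boston was.

```python
# Mentions of Jerusalem in the first week of April, split by spelling.
jerusalem_mentions = {"English": 20711, "Arabic": 17, "Hebrew": 39}
jerusalem_total = sum(jerusalem_mentions.values())
english_share = jerusalem_mentions["English"] / jerusalem_total

# Mentions of Boston in the first and third weeks of April.
boston_first_week = 111035
boston_third_week = 1893159
boston_ratio = boston_third_week / boston_first_week

print(f"Jerusalem, all spellings: {jerusalem_total:,}")      # 20,767
print(f"Share using the English name: {english_share:.1%}")  # 99.7%
print(f"Boston, third week vs first: about {boston_ratio:.0f} times")  # about 17 times
```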
Notes:
  • In the case of some of the cities I chose, there are multiple cities with the same name - for example, the Wikipedia disambiguation page for Boston lists several cities with this name, but I only counted the population of the capital of Massachusetts (the population of the other cities would probably not be very large).
  • Wikipedia sometimes has several different estimates of population because of ambiguity about how large an area to include. I only considered the first number listed (which is typically the smallest). In some cases the difference might be minor, but in others it would be very significant. For example the estimates of the population of Boston vary by an order of magnitude, from 625 thousand to 7.6 million.

Sunday, April 28, 2013

Cycling in the Wicklow Mountains

Earlier this year I was persuaded to sign up for the Wicklow 100/200 cycle event. This event takes place in June and offers a choice of two routes, one 100km long and another 200km long. A 200km cycle would be challenging enough, but this route has the added challenge of passing over several steep climbs. Luckily you don't need to commit to either distance when you enter, and you are allowed to change your mind at any stage until you come to the fork in the road where the two routes diverge.

I don't have much experience of cycling in the mountains so I am unsure how I would get on. Yesterday I rode over the Sally Gap for the first time. I found that I was not a strong climber and was constantly being dropped from the group as we went uphill. Luckily I had no problem catching up again when we came to a flat section, but it is looking very much like I will be opting for the 100km route in June. I will also need several training cycles in the meantime to ensure I complete it in a decent time.



Thursday, April 25, 2013

[xpost] Smasher now works with Juniper firewalls

As many people know, I was the original developer of the smasher Sametime plugin for automatic BSO authentication. However, I have not been actively maintaining it in the last few years. The last update I did was in 2011 when I partially fixed a problem which stopped SUT and smasher working together. Whenever people ask for new features or bug fixes, I typically point them at the location of the source code and then politely suggest that if they really want their issue solved they should fix it themselves.
Recently the Böblingen lab announced that they were planning to replace all of their Cisco BSO devices with Juniper ones. This caused a flurry of emails from German employees, since neither smasher nor any of the alternative tools works with the Juniper firewalls. I was not in a position to help because I don't have access to any of the new firewalls to test. Luckily Thomas Immel was kind enough to help out and he developed a new version 1.3.5 which apparently works with the new firewalls.
The new version of smasher is available from the same update site URL as before http://dubgsa.ibm.com/~bodonova/public/smasher/latest/ - I didn't get a chance to do any testing with this new version (I no longer use smasher myself), so just in case it causes problems for anyone the old version is still available at http://dubgsa.ibm.com/~bodonova/public/smasher/smasher-1.3.4/
I hope you enjoy (and send any praise or complaints to Thomas rather than me).

Sunday, April 21, 2013

To tri-bar or not to tri-bar? - that is the question

When I bought my racing bike through the bike to work scheme, I had 50 euro left over. The bike shop offered to give me a voucher for the unused money, but I was keen to spend it on some accessory. I asked the shop what I could get for 50 euro and I finally decided on getting tri-bars.

Tri-bars are extensions to the handlebars on a bike which allow the cyclist to take on a more aerodynamic position. They are called tri-bars because they are normally only used by participants in either a triathlon or an individual time trial.

The advantages of the tri-bar are:

  • The position of the cyclist is more aerodynamic so it is possible to cycle faster and expend less effort.
  • While using the tri-bars the cyclist will normally rest their elbows on soft pads, which eliminates all strain on the arms or back.
  • You look really cool when using your tri-bars (this was probably the main motivation for me to purchase the tri-bars).
However, the tri-bars also have some disadvantages:
  • When your hands are on the tri-bars they are quite some distance from the brakes, so sudden braking is not possible. Hence they cannot be used in traffic or when cycling in a group.
  • You have minimal steering control while using the tri-bars so they can only be used on a straight road. In fact it is not even feasible to swerve to avoid potholes while using the bars, so they can't be used on poor road surfaces.
  • While it is more efficient to cycle with tri-bars, it takes some practice to get used to the different cycling position.
  • The tri-bars use up some space on the handlebars which reduces the space for attaching other accessories. 
When I initially started cycling on my new bike, I found that I hardly ever used the tri-bars and so I decided to remove them. However, when I started training for a triathlon last year I re-attached them and decided to make a concerted effort to learn how to use them. I still find that I don't use the bars very often, but I think that it is still worth having them because they don't get in the way very much when not being used.

Wednesday, February 13, 2013

Mysterious growth in visitor numbers from China

Readership statistics according to Blogspot
It is now 3 years since I started this blog and so I decided to review the visitor tracking statistics. I was pleasantly surprised to see that there was an apparent dramatic increase in the number of page views near the end of 2012.

This surprised me since I hadn't changed my blogging habits in any significant way. Therefore I decided to dig a little deeper. When I started the blog, I enabled Google Analytics tracking because the statistics provided by the Blogspot platform were much more limited. It turns out that the statistics provided by the two tracking platforms are quite different, and Google Analytics doesn't see any similar growth in visitors.

To understand why there might be a difference you need to understand that Google Analytics works by executing a snippet of JavaScript in the visitor's browser. Almost all browsers today support JavaScript, so I would not expect many normal visits to fail to be registered. However, visits to the blog that come from automated programs won't register. It seems that there has recently been a huge leap in the number of visits to my blog that come from automated bots.
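
The effect can be illustrated with a toy model. This is only a sketch: it assumes that the Blogspot figures count every request received by the server, which is not confirmed anywhere in this post, and the Visit class, the function names and the sample data are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    source: str            # hypothetical label for who made the request
    runs_javascript: bool  # False for a simple bot that only downloads the HTML


def server_side_page_views(visits):
    """Every request for the page is counted, whoever made it."""
    return len(visits)


def javascript_page_views(visits):
    """Only visits where the tracking script actually ran are counted."""
    return sum(1 for visit in visits if visit.runs_javascript)


# Invented sample: three people using browsers and five requests from a crawler.
sample = [Visit("browser", True)] * 3 + [Visit("crawler", False)] * 5

print("Server-side count:", server_side_page_views(sample))  # 8
print("JavaScript count: ", javascript_page_views(sample))   # 3
```

In this model a surge in crawler traffic inflates the first number while leaving the second untouched, which is the same pattern as the gap between the two sets of statistics.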

The other thing that has changed is the location that the visitors come from. When I looked at the location of my visitors from 2011, most of them came from Ireland (which is not a surprise) and I had no readers at all from China (I assume that this was because the great firewall of China was blocking access to the Blogger platform). However, I got my first visitor from China in 2012 and they now represent a very significant proportion of the visitors (according to Blogger statistics, but not according to Google Analytics).

There is definitely something dodgy going on, because I don't think that I am suddenly popular in China. Perhaps this article from the BBC gives some clue.

Saturday, February 9, 2013

[xpost] No longer a reluctant blogger after 7 years

I recently got a note about the plans to finally shut down the old BlogCentral blogging site. This was the first internal blogging platform deployed in IBM and although it is technically still active, in recent years employees have been encouraged to use the blogging service that is bundled in IBM Connections instead.

The old service currently gets very little traffic and so it is no surprise that the service will be turned off - however, since blogs can be a useful historical record there is a plan to migrate content from the old service to the new one. In order to reduce the load on the migration tool, the administrators asked all blog owners to review their blogs and to delete anything that they didn't think should be migrated.

This is the first social networking site that I ever used, so I was a little bit nervous to see what embarrassing rubbish I had been writing back then. I saw that my old blog was called "Brian's Braindump" to reflect that I wasn't sure what I wanted to blog about (not much has changed there), but I was surprised to see that my first blog post was entitled "A Reluctant Blogger", and it was published almost exactly 7 years ago. In this initial post I explained that I didn't really see the point in all of these social networking tools and that I was only establishing the blog due to peer pressure of colleagues telling me that this would be the next big thing and that I ought to try it out.

I suppose a lot has changed over the last seven years. While I might still struggle to explain my own motivation for using social networking tools, I definitely could no longer be described as a reluctant blogger. In fact I suppose that I have taken to the concept with the zeal of a convert. I wonder what I will be writing about in 2020 and what tools will I be using to write it?

Thursday, February 7, 2013

FixMyStreet.ie really works!

I am a big fan of the idea of applications that allow citizens to communicate directly with their local council. As a result I installed the FixMyStreet Ireland app on my phone, which is a convenient interface to the FixMyStreet.ie web site that can be used to report issues to your local council. In common with similar services in other countries, this is a deceptively simple application that automatically directs your problem reports to the appropriate council. Whenever citizens spot a problem, the most common reason given for not reporting it is that the person doesn't know whom to report it to. The beauty of FixMyStreet is that it knows where to forward your report based upon what type of problem it is and where you are (it can use the location from your GPS to figure out which council is responsible for solving the issue).

My inner geek was keen to try out the app. However, I was reluctant to divert valuable council resources into fixing minor issues just so I could see whether or not the app works properly. For the last few months I have been searching for a real problem that I could report (there is never a pothole available when you need one).

There is a bridge that I need to cycle across on my route to work each morning. A few weeks ago I noticed a minor blemish in the road where somebody had dug up the road to lay a cable and had not repaired the road properly. Initially the problem was very minor, but the recent cold weather seemed to cause the material used to repair the road to crack and come loose. Gradually the problem became worse until there was a sizable hole in the road which made one entire lane impassable on a bike.

For a few days I got off my bike and carried it past the hole, until I suddenly remembered that this was an ideal opportunity to try out the app. Yesterday morning, I filed my first problem report describing the issue. Unfortunately I didn't properly save the picture. I suspected that the council officials would think I was exaggerating about the size of the hole so I decided to stop again this morning and update my problem report with a picture. To my amazement I saw that the problem was already fixed.

I must publicly applaud Fingal County Council - even in this tough economic climate they are doing a great job of responding to complaints.