Monday, January 25, 2010

Can MT help you have Twitter interactions across language barriers?

IBM has been involved with the Centre for Next Generation Localisation (CNGL), a research group funded by Science Foundation Ireland. However, this is not an ivory-tower research group. They are tackling very practical problems: how to improve translation and localisation technologies so that they can be applied to the new challenges emerging in the modern world.

For example, many machine translation systems perform better on long texts, where the words appear in context, whereas tweets posted to Twitter are very short and have no obvious context. Furthermore, they may contain abbreviations, misspellings and specialised terminology. As a result, the CNGL team have launched a special project to tackle the issues that arise in this specific context.

They are now looking for volunteers to test their system and, even better, to provide feedback. Ideally, they would like testers who follow people who tweet in different languages drawn from the target set (English, French, German, Spanish and Italian).

Here are the instructions directly from the author:
My name is Declan Dagger and I work on DCM3 in Trinity College with Vincent Wade. We have built an application called twanslator on top of the Twitter micro-blogging network that allows you to translate tweets into different languages. Initially we are looking to collect user-centric data on the capabilities and limitations of text analytics and MT in limited character/context environments such as Twitter.
 
I would appreciate your help in conducting this research. You can get involved by doing the following:
  1. Go to http://www.myisle.org/twanslate and log in using your Twitter account details (if you don’t have an account you can sign up at http://twitter.com).
  2. When your tweets arrive, there is a simple drop-down menu of the languages into which the tweet can be translated. When you translate a tweet, a rating system is then available (thumbs up / thumbs down) to indicate whether the translation was accurate or not. You can also add comments to the translation using the comment feature. Please rate all the translations you invoke as this is critical to our research.
  3. We appreciate any feedback, comments and/or suggestions you may have on how to improve the twanslator application. To leave general feedback, please click on the “app feedback” icon on the left of the page. You can also make suggestions by tweeting them to “@myisle #twanslator”.
  4. We will be adding new features to the application over time and ask that you follow @myisle so we can keep you informed of any updates.
  5. As we need a relatively large user base to conduct our research, we would be grateful if you could suggest to your followers and colleagues to also use twanslator.
  6. We would encourage you to suggest tweeters to @myisle that you recommend following for CNGL regardless of the language they tweet in. For example “@myisle you should follow @joe_bloggs [a French expert in localisation]”.
 
Part of the MyISLE goal is to build an open research platform for CNGL members within the social networking space. As such, twanslator has been developed using web services, workflow and customisable interfaces. So for those in CNGL interested in using Twitter as part of a research study, please contact me at Declan.Dagger@cs.tcd.ie and I would be more than happy to make these services available to you.


The initial user interface is somewhat rudimentary, but I am sure it will improve over time. I have sent some feedback to Declan and I would encourage all of you to do the same (the more people who ask for the same feature, the more likely it is to be implemented).

