Wednesday, May 11, 2011

Nullius in verba

During the question session after a very interesting keynote speech on terminology at a recent conference, someone asked about term bases in machine translation. I made the point that machine translation has been around for a very long time, and that its vogue today is mainly due to Google Translate, a tool that does not rely on linguistic resources such as grammatical rules and term bases. At the networking event later that day, someone who considers himself a great expert on translation technology came up to tell me that I was quite wrong. A Google executive had supposedly told this expert at a conference in Australia that Google Translate uses terminology. Of course, this statement could have several interpretations:
1. you can obviously feed glossaries to Google Translate as two files, one in the source language and the other in the target language, and tell Google Translate that they're translations of each other;
2. you can compare the statistical analysis of parallel texts to what statistical terminology extraction programs do.
What you can't do is assume that, like some other machine translation tools, Google's purely statistical algorithm has a terminological loop that checks the texts it is translating against richer terminological information (such as field, genre, grammar, or usage) or against grammatical information.

In fact, the person who contradicted me, and who apparently held this exact view, failed to provide any context. Can we assume that the reported statement was a marketing ploy, i.e. it's better to be all things to all people, if you can? I suspect so. In any case, my answer at the time was that we would just have to agree to disagree. My answer today is the motto of the Royal Society: nullius in verba, i.e. take no man's word for it. And by the way, Google is so unconcerned by debates between the statistically oriented and the linguistically oriented schools of machine translation that they don't even attend the professional conferences in this field (see "Why the Machine Translation Crowd hates Google"). Why should they? What they have realized is that their huge resources of data (from websites to books), along with their computing power (thousands of servers around the world), put them in a unique position to do statistical machine translation. In other words, machine translation for them is a way to generate another revenue stream from their data and their infrastructure. So, take no man's word for it.

Actually, to return to the subject of marketing, taking someone's word for something when that person works for a software company is highly risky, even within the same company. Years ago, I worked for a CAD/CAM software publisher. They set out to develop and market a state-of-the-art design tool to try to take the lead in the industry. At a meeting where the development team presented the features of the product, they used animation to show what Designer, the new product, could do. The marketing team was impressed. "So that's Designer," they said. "Yes," answered the development team. The trouble is that the two teams did not mean the same thing: the marketing team thought that they had seen the finished product in action; the development team actually meant that it was their as yet unreached goal. This misunderstanding led the marketers to oversell a product that did not yet exist, a serious mistake for which they paid a very high price in the end. So, really, in technical areas, take no man's word for it.
