Trados TagEditor is a tool that has been widely used for website translation. I have owned it since 2003 and have upgraded through the different versions (I hardly noticed any difference between the last two or three upgrades), so by now I am quite familiar with it. TagEditor disappeared with the new SDL Trados 2009, but it is still widely used by translators. There is a lot of information about it on the web, but little has been said about whether it is really adequate for website translation.
Yes, TagEditor can handle part of the job
The first question is whether it can handle HTML properly, and the answer is a big “YES, BUT”… Let’s be clear: it does a great job of identifying the HTML tags and making sure that you do not touch them and can concentrate on translating the text. Provided you do not touch the tags and leave them in the correct places, you will certainly maintain the correct page format.
TagEditor stores the bilingual file in a proprietary format, and you can also view the source and translated texts side by side, properly formatted. Mind you, this is a *very* important feature, which explains why TagEditor has been widely used for the translation of web pages. It is really impressive to see the original format and the translated result and identify immediately where you have made a translation error or skipped a tag. It even allows you to verify (manually, by clicking on them) whether a link is the same in both the source page and the translated one. For this, kudos to this software, old as it is.
Yes, there is a “but”: The “Save target as…” function, which restores the translated text into an HTML file, does all kinds of crazy things with it.
Problems with the character sets
For starters, it often changes the character set. In one translation (though not always; I don’t know whether it simply doesn’t like some character sets), I had the following meta tag:
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
But TagEditor changed it arbitrarily to:
<meta http-equiv="content-type" content="text/html; charset=windows-1252">
Now, I don’t like it when a piece of software tries to be smarter than I am, especially if it makes the stupid assumption that because I ran TagEditor on Windows the target HTML file will also run on Windows, which obviously is not necessarily true on the Internet.
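If you only need to put the original declaration back after “Save target as…”, a few lines of script will do. What follows is a minimal sketch in Python, not anything TagEditor offers: the function name `restore_charset` is my own, and it assumes the page has a single meta declaration of the kind shown above. It only rewrites the label, so the file itself still has to be saved in the matching encoding.

```python
import re

# Matches everything up to and including "charset=" inside a <meta> tag,
# followed by the charset name itself.
CHARSET_RE = re.compile(r'(<meta\b[^>]*\bcharset=["\']?)([\w-]+)', re.IGNORECASE)


def restore_charset(html_text: str, charset: str) -> str:
    """Replace the charset named in the first meta declaration (hypothetical helper)."""
    return CHARSET_RE.sub(lambda m: m.group(1) + charset, html_text, count=1)


changed = '<meta http-equiv="content-type" content="text/html; charset=windows-1252">'
fixed = restore_charset(changed, "ISO-8859-1")
print(fixed)
```

Bear in mind that windows-1252 has printable characters (curly quotes, the euro sign) in positions where ISO-8859-1 has none, so if the translation uses those, changing the label alone is not enough.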
Interference with CMS codes
It also started replacing “>” with “&gt;” and “<” with “&lt;”. Yes, that is the correct representation of these characters in HTML, but it is plainly WRONG when there is code embedded in the HTML. One of the texts I translated had tags such as “<#=customername#>” that would let a CMS preprocessor insert the customer name at the location of the tag. But as these tags were happily converted into “&lt;#=customername#&gt;”, the web page ultimately showed some very curious things.
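This kind of damage can be undone after the fact with a search and replace that touches only the placeholders. The sketch below is one way to do it in Python. The pattern is an assumption based on the single “<#=customername#>” example; your CMS may use a different placeholder syntax, and `restore_placeholders` is a name I made up.

```python
import re

# Assumed placeholder shape: an escaped "<#=", a name, then an escaped "#>".
ESCAPED_PLACEHOLDER_RE = re.compile(r"&lt;#=([\w.]+)#&gt;")


def restore_placeholders(html_text: str) -> str:
    """Turn escaped CMS placeholders back into their raw form (hypothetical helper)."""
    return ESCAPED_PLACEHOLDER_RE.sub(r"<#=\1#>", html_text)


sample = "<p>Dear &lt;#=customername#&gt;, remember that 2 &lt; 3.</p>"
restored = restore_placeholders(sample)
print(restored)
```

The point of matching the whole placeholder, rather than every “&lt;” and “&gt;”, is that legitimately escaped characters in the running text stay escaped.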
PHP scripts cannot be translated in TagEditor
A complete PHP script (meaning everything between “<?php” and “?>”) was identified as a “PHP” tag and was not translatable. Unfortunately, the script contained text that was printed to the page on the server side, and, given that you could not edit it in TagEditor, this text would be printed on the page in the original language. Now, in theory you can edit tags in TagEditor, but only if the tag is in a translation segment. A PHP tag, however, is NEVER part of a translation segment, so it is not possible to edit it. This stupid detail would usually go undetected by most translators, unless they happened to have the tags expanded, which they usually do not because of the visual clutter it causes.
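Since TagEditor will not show you this text, it helps to list it yourself before delivering. Here is a rough Python sketch that pulls the quoted strings out of PHP blocks so you can check them by hand. It assumes the standard “<?php … ?>” delimiters and does not handle escaped quotes inside strings; `php_strings` is a hypothetical helper, and not every string it finds will be text meant for the reader.

```python
import re

# A PHP block: everything from "<?php" to the next "?>".
PHP_BLOCK_RE = re.compile(r"<\?php.*?\?>", re.DOTALL | re.IGNORECASE)
# A simple quoted string (double or single quotes, no escaped quotes).
STRING_RE = re.compile(r'"([^"]*)"|\'([^\']*)\'')


def php_strings(source: str) -> list:
    """Collect quoted strings found inside PHP blocks (hypothetical helper)."""
    found = []
    for block in PHP_BLOCK_RE.finditer(source):
        for match in STRING_RE.finditer(block.group(0)):
            text = match.group(1) if match.group(1) is not None else match.group(2)
            found.append(text)
    return found


page = '<p class="intro">Hello</p><?php echo "Welcome back"; ?>'
candidates = php_strings(page)
print(candidates)
```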
META Tags cannot be edited
Another thing I really hated was that, though it did recognize META tags, it treated them exactly like PHP: TagEditor marks the whole thing as a tag, and there is no way to edit it. This is especially annoying with important meta tags such as the description or the keywords, where the only way to translate them was AFTER the HTML in the target language had been generated.
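To make sure none of these slip through, you can list the description and keywords of each generated page and translate them in a text editor. A small Python sketch using the standard library’s `html.parser` might look like this; the class name `MetaCollector` and the sample page are mine, for illustration only.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the content of description and keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.found[name] = attributes.get("content") or ""


sample_page = (
    '<head><meta name="description" content="Cheap flights">'
    '<meta name="keywords" content="flights, travel"></head>'
)
collector = MetaCollector()
collector.feed(sample_page)
print(collector.found)
```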
Special characters will clash with CMS software
Finally, special characters such as “á” (which appear in many languages) were converted into their HTML equivalents (in the case of “ó”, “&oacute;”). In theory this should be OK for plain HTML pages, except that in those cases where you actually had to paste or import the translated text back into a CMS, the translation of a word like “adiós” (goodbye in Spanish) eventually became “adi&amp;oacute;s” in the HTML and showed up in the browser as “adi&oacute;s”.
To be fair, I must point out that TagEditor DID recognize “alt” and “title” attributes in pictures, and simply hid the remaining information such as height, width, etc. as tags, so it is reasonably smart and lets you edit the “alt” attributes and titles of pictures as plain text. It would also be unfair to blame it for other post-processing that a CMS might do, but it should at least have an option to disable the automatic conversion of special characters into their HTML codes.
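Lacking that option, the special-character conversion can be reversed before the text goes back into the CMS. The Python sketch below turns named entities for letters back into the characters themselves, while leaving the entities that HTML really needs alone. The `KEEP` list and the function name `decode_letters` are my own choices, not a standard.

```python
import re
from html.entities import name2codepoint

# Entities that should stay escaped because they matter to the markup
# (or, in the case of nbsp, would become invisible).
KEEP = {"lt", "gt", "amp", "quot", "apos", "nbsp"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_letters(text: str) -> str:
    """Replace named entities such as &oacute; with the literal character."""

    def replace(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave untouched
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(replace, text)


decoded = decode_letters("adi&oacute;s &amp; 2 &lt; 3")
print(decoded)
```

Of course, once the file contains the literal characters again, it has to be saved in the encoding that its META tag declares.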
Is there a way around this? Well, there is, in the sense that you can create a special DTD (Document Type Definition) that treats these elements not as tags but as plain text. But writing such a DTD is not for the faint-hearted, takes quite a while and does not solve all issues. Another way is to edit the file as if it were simple text, disabling the tag recognition, but then you lose a significant advantage of the tool.
So ultimately yes, TagEditor is “somewhat” adequate for website translation, provided you are aware of the traps that you might fall into. And you should note that, though it does have some good points, it will not suffice: you will need to perform some post-editing (say, with a text editor or HTML editor) to clean up things like the character set, and to translate the META tags that TagEditor refused to let you edit.
Though Trados TagEditor is still widely used, it has disappeared from the latest version of the Trados suite. Is the latest Trados version better suited for the translation of websites? Well, we will look into that in another post…