TAUS Data Association

Wednesday
Mar 17th

The more domain specific data the better

Pilot conducted by Cross Language

Quality of output is highlighted as the main barrier to greater adoption of machine translation. This study illustrates how training statistical machine translation (SMT) engines on domain-specific data improves output quality. The data set used for this test is less than 15% of the size of that used in the Microsoft pilot project.



TDA members are seeking quality improvements to better meet the demand for real-time translation as well as the need to increase productivity rates in the industry.

The first stage of this pilot shows that the engine built with joint data from four TDA members produces higher-quality SMT output than the engine built with data from only one TDA member. The quality increase in this experiment was 7%. The pilot also demonstrates the impact of training on domain data, with a 37% quality increase from baseline output to customized output.

The second stage of this pilot shows the results of SMT quality evaluation based on text from TDA members in the same domain whose data was not used for training the SMT engine.


The tests were performed with the following materials:
Language direction: English-French

Datasets:

  • Merged translation memories from four TDA member companies (A, B, C, D), all from the computer software industry: 10,980,019 words
  • Translation memory from TDA member company A: 1,515,214 words
  • English source language documents:
    • Document A: same domain, similar content type, from TDA member company A: 7,777 words. This document was used for the quality evaluation.
    • Document E: same domain, similar content type, from TDA member company E, which did not provide any data for training the SMT engine: 7,047 words. This document was used for the quality evaluation.
    • Document F: same domain, similar content type, from TDA member company F, which did not provide any data for training the SMT engine: 5,846 words. This document was used for the quality evaluation.

Approach taken

Two separate customized SMT engines were built for this exercise. Engine 1 was trained with the translation memory of one TDA member company (company A), whereas engine 2 was trained with the merged translation memories from four TDA member companies (companies A, B, C, D).

A source document from company A was translated with each of the two customized engines as well as with the baseline version of the SMT system. All full matches were removed from the source document.
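Removing full matches can be pictured as filtering out every source segment that already appears verbatim on the source side of the translation memory. The following is a minimal sketch under that assumption. The function name, the whitespace and case normalization, and the sample segments are illustrative and are not taken from the pilot.

```python
def remove_full_matches(segments, tm_sources):
    """Return the segments that have no exact match in the translation memory.

    Both arguments are iterables of source-language strings. Matching is
    done after collapsing whitespace and lowercasing, which is an assumption
    made for this sketch; a real TM tool may define a full match differently.
    """
    def normalize(text):
        return " ".join(text.split()).lower()

    known = {normalize(source) for source in tm_sources}
    return [seg for seg in segments if normalize(seg) not in known]


# Hypothetical data, for illustration only.
tm = ["Click OK to continue.", "Save the file."]
doc = ["Click OK to continue.", "Restart the server.", "save  the file."]
print(remove_full_matches(doc, tm))
```

Filtering this way means the evaluation measures how the engines handle text they have not already seen in the training data.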

The quality of the various MT outputs was evaluated by two independent native informants. The assessment took place in a web-based environment specifically designed for human MT quality evaluation, using the DARPA metrics for evaluating MT adequacy. Each segment was rated on a 1-5 scale indicating how much of the meaning in the source text appears in the translation.
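An average quality score can be derived from such segment ratings in several ways. The sketch below assumes the simplest one: average the evaluators' scores per segment, then average over segments. How the pilot actually combined the two evaluators' ratings is not stated, and the evaluator names and ratings shown are hypothetical.

```python
from statistics import mean


def average_adequacy(ratings_by_evaluator):
    """Average 1-5 adequacy ratings over all segments and evaluators.

    ratings_by_evaluator maps an evaluator name to a list of per-segment
    scores. Every evaluator is assumed to have rated the same segments
    in the same order.
    """
    rating_lists = list(ratings_by_evaluator.values())
    if not rating_lists or len({len(r) for r in rating_lists}) != 1:
        raise ValueError("every evaluator must rate the same segments")
    if any(not 1 <= score <= 5 for r in rating_lists for score in r):
        raise ValueError("adequacy scores must be between 1 and 5")

    # Mean across evaluators for each segment, then mean across segments.
    segment_means = [mean(scores) for scores in zip(*rating_lists)]
    return mean(segment_means)


# Hypothetical ratings for four segments, for illustration only.
ratings = {
    "evaluator_1": [5, 4, 3, 5],
    "evaluator_2": [4, 4, 3, 5],
}
print(average_adequacy(ratings))
```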

Two more source documents, from TDA member companies E and F, which did not provide any data for training the SMT engine, were translated with the baseline version of the SMT engine and with customized engine 2. Companies E and F both belong to the same domain.

The quality of the MT output of both engines was evaluated by one native informant. The assessment took place in the same web-based MT evaluation environment, using the DARPA metrics for evaluating MT adequacy.

Results
Part one: quality evaluation based on source document from company A

[Table: Cross Language results for the company A document]

The results show that training the baseline SMT engine on domain data clearly has a positive impact on output quality. The average quality score of the engine trained with company A data is 1.20 points, or 37%, higher than that of the baseline engine.

Adding data from other companies in the same domain to the training set also seems to have a positive effect: it adds another 0.25 points, a 7% increase over the engine trained with company A data only. A quality evaluation of 4.71 (out of 5) is without doubt a very high score.
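The percentages quoted here are relative gains: the point increase divided by the score it is compared against. The short check below assumes that 4.71 is the score of the engine trained on merged data and that the reported point gains are exact. The baseline and company A scores are derived by subtraction and are not stated in the text.

```python
def relative_gain(old_score, new_score):
    """Return the gain from old_score to new_score as a percentage of old_score."""
    return (new_score - old_score) / old_score * 100


merged = 4.71                 # reported score, assumed to belong to the merged-data engine
company_a = merged - 0.25     # derived, not stated in the text
baseline = company_a - 1.20   # derived, not stated in the text

# Gain of the company A engine over the baseline engine.
print(f"{relative_gain(baseline, company_a):.1f}%")
```

Small differences against the published percentages are to be expected, since the point gains are rounded and the scores behind them are not reproduced here.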

Part two: quality evaluation based on source document from companies E and F

[Tables: Cross Language results for the company E and company F documents]

The results show that in both cases the output of the customized engine, trained with data from other companies operating in the same domain, is of higher quality than the baseline engine output. In both cases quality increases by 0.3 points, which represents gains of 9% and 8% respectively over the baseline engine output.


Considerations

  • These results indicate that the TDA platform creates an opportunity for building domain-specific SMT engines, lowering the barriers to the use of MT, especially for companies that may not have sufficient data of their own to take advantage of SMT technology.
  • The improved quality scores do not necessarily 'translate' into higher productivity rates. To what extent the increased quality shown in this test leads to increased productivity and cost savings needs further in-context analysis.