Pangeanic creates revenue stream with TDA data
PangeaMT is the first example of a language service provider creating a new revenue stream with data from TDA. The new division provides industry-specific statistical machine translation (SMT) engines for the automotive, consumer electronics, and industrial sectors. The service was launched at the recent TAUS User Conference with an offer to train engines for free for companies seriously looking into deploying open source MT with a TMX workflow.
“I am delighted that we’ve been able to make use of Pangeanic's SMT developments and the continued improvements we’re experiencing by adding data from TDA.”
Two years ago, Pangeanic began partnering with the computational linguistics department at Valencia's Polytechnic, aiming to take the open source SMT engine Moses to market and to increase internal efficiencies.
They identified the handling of in-line tags as a major pain point for corporate clients and devoted significant attention to developing a tag parser for handling third-party XML-type code coming from design packages, DTP, and authoring tools.
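One common way to keep in-line tags intact through an MT engine is to swap them for placeholders before translation and restore them afterwards. The sketch below illustrates that general idea only; it is not Pangeanic's tag parser, and the names mask_tags, unmask_tags, and the __TAGn__ placeholder format are assumptions made up for this example.

```python
import re

# Matches simple XML-style inline tags such as <b>, </b>, or <ph id="1"/>.
# Assumption: tags contain no nested angle brackets.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")
PLACEHOLDER_RE = re.compile(r"__TAG(\d+)__")


def mask_tags(segment):
    """Replace each inline tag with a numbered placeholder.

    Returns the masked text and the list of original tags, in order.
    """
    tags = []

    def _swap(match):
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_RE.sub(_swap, segment), tags


def unmask_tags(segment, tags):
    """Put the original tags back where their placeholders ended up."""
    return PLACEHOLDER_RE.sub(lambda m: tags[int(m.group(1))], segment)


source = "Press <b>Start</b> to begin<br/>."
masked, saved_tags = mask_tags(source)
# The masked text is what would be sent to the engine (hypothetical step).
restored = unmask_tags(masked, saved_tags)
```

Because the placeholders are numbered, the tags can still be restored correctly if the translated sentence reorders them.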
The engines “plug in” to existing, commercially available translation memory-based software, using the industry-standard TMX file format for both input and output. Feedback loops ensure quality improvement through retraining.
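To make the TMX workflow concrete, here is a minimal sketch of how source/target segment pairs could be pulled out of a TMX file for engine training. It uses only Python's standard library; the function name read_tmx_pairs and the sample content are hypothetical, and the sample is deliberately stripped down rather than a fully valid TMX document.

```python
import xml.etree.ElementTree as ET

# TMX 1.4 marks the language with xml:lang; older files may use plain "lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs for one language pair."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() flattens inline elements, keeping their text.
                segs[lang] = "".join(seg.itertext())
        if src_lang.lower() in segs and tgt_lang.lower() in segs:
            pairs.append((segs[src_lang.lower()], segs[tgt_lang.lower()]))
    return pairs


# Minimal illustrative sample (not schema-complete).
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Check the oil level.</seg></tuv>
      <tuv xml:lang="es"><seg>Compruebe el nivel de aceite.</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = read_tmx_pairs(SAMPLE_TMX, "en", "es")
```

The same pairs, once post-edited, could be written back out as TMX and added to the training data, which is the kind of feedback loop described above.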
In April 2009, Pangeanic began to use PangeaMT for its existing clients in the automotive and consumer electronics sectors. Initial productivity increases averaged 30% for Spanish, French, and Italian, and approached 20% for German.
By September, data from TDA had been added and productivity had increased by a further 33% to 100% over the initial April improvements. New domains were added and several tests were conducted to prove ROI.
At the TAUS User Conference, Manuel Herranz, senior strategy officer at Pangeanic, announced the public launch of PangeaMT with the offer of a free training period until the end of January 2010 in a choice of 8 European languages and no commitment to buy.
Tips from the Pangeanic team:
1. Expect initial resistance from established linguists. You are dealing with new technology and will probably be a pioneer yourself.
- Train on “fit for publishing” post-editing versus “too much translation”.
- Final quality may require a second proof, detached from the original, and further QA terminology checks.
- Integrate terminology tools in the process (as you would with existing workflows!)
- Turning a translator into a post editor is a challenge. Younger linguists are more flexible.
2. Invest in time. Data gathering for an SMT implementation can take 2-6 months (TM “gains” seem immediate!)
3. Establish a quality feedback loop (feed post-editing back to the engine). Remember that engines get better over time with re-training and your critical mass will grow. Initial “statistical” issues will be resolved with updates.
4. Do not forget the human component in post-editing. SMT will speed time-to-market and lower your costs, but the final result is only as good as human skills.
See more detailed information on PangeaMT’s quality scores using data from TDA