CafeTran not preserving source formatting
Thread poster: Philip Lees
Philip Lees
Greece
Greek to English
Jun 8, 2022

CafeTran Espresso 10.8.1 Cornetto.

I've noticed recently that CafeTran doesn't always preserve the source formatting in the target document.

For example, I've just done the draft translation of a short text in which the heading is Calibri 20 pt and the body text is Calibri 14 pt. In the target text, everything, heading and body alike, is set to Calibri 11 pt. The line spacing and breaks are the same in both texts.

This seems to be a recent development, though I'm not aware of having updated CafeTran recently. Is there a setting I've accidentally changed that could cause this discrepancy?


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Didn't notice any loss of formatting Jun 8, 2022

Can you determine whether the formatting was applied directly or via styles? Perhaps that gives you a clue?

On a side note: Is the glitch a big problem for you? If so, we could perhaps try to come up with a workaround.
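
For anyone who wants to check the direct-versus-styles question without opening Word: a .docx file is a zip archive whose main text lives in `word/document.xml`. The sketch below is only an illustration. The helper names `read_document_xml` and `direct_font_sizes` and the sample XML are assumptions of this example, not anything CafeTran provides. It lists each run together with any font size set directly on it; in WordprocessingML the run-level `w:sz` value is stored in half-points, and a run without one inherits its size from a style.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def read_document_xml(docx_path):
    """Return the raw word/document.xml bytes from a .docx file."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml")


def direct_font_sizes(document_xml):
    """Return (text, size_in_points or None) for every run in the XML.

    None means no size is set directly on the run, so the size
    comes from a style instead.
    """
    root = ET.fromstring(document_xml)
    result = []
    for run in root.iter(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        sz = run.find("w:rPr/w:sz", NS)
        # w:sz holds half-points, so 40 means 20 pt
        size = int(sz.get(f"{{{W}}}val")) / 2 if sz is not None else None
        result.append((text, size))
    return result


# Hand-written sample: one run with a direct size, one without
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:r><w:rPr><w:sz w:val="40"/></w:rPr><w:t>Heading</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>Body</w:t></w:r></w:p>'
    '</w:body></w:document>'
)

for text, size in direct_font_sizes(SAMPLE):
    print(text, "direct size:", size)
```

On a real file you would call `direct_font_sizes(read_document_xml("source.docx"))`, where the file name is a placeholder. Running it on both the source and the exported target shows where directly applied sizes were dropped.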


 
Philip Lees
Greece
Greek to English
TOPIC STARTER
No styles Jun 8, 2022

Hans Lenting wrote:

Can you determine whether the formatting was applied directly or via styles? Perhaps that gives you a clue?

On a side note: Is the glitch a big problem for you? If so, we could perhaps try to come up with a workaround.


All the text is tagged with the "Normal" style.

It's not a big problem in the current job, but last week I was sent an OCR'd slide presentation saved as a Word file that had lots of different text styles, much of it in text boxes, and the output of CafeTran reduced everything to the same, small font, which meant it was a real pain to restore the original appearance.

That started me wondering whether I'd messed something up in the settings, as I don't remember having this problem before. So I looked at a simpler text as an example and found the same thing.


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
OCR'd documents can be a real pain Jun 8, 2022

Philip Lees wrote:

It's not a big problem in the current job, but last week I was sent an OCR'd slide presentation saved as a Word file that had lots of different text styles, much of it in text boxes, and the output of CafeTran reduced everything to the same, small font, which meant it was a real pain to restore the original appearance.


Did you, at that time, use the OCR filter? It can ignore font changes.

If you have access to MS Word on Windows, you could try cleaning the document with CodeZapper or TransTools.


 
Philip Lees
Greece
Greek to English
TOPIC STARTER
You found the culprit! Jun 8, 2022

Hans Lenting wrote:

Did you, at that time, use the OCR filter? It can ignore font changes.



That's it! Since I learned about the OCR filter I've just left it on that setting for everything, since a lot of the texts I translate come from that process.

I just tried running the same text through CafeTran again, this time as a plain .docx, and the font sizes were preserved.

On the other hand, the loss of font information isn't a problem for simple texts. I suppose that for more complex jobs like those slides, I will just have to decide which is the bigger pain.


 
Tom in London
United Kingdom
Member (2008)
Italian to English
Interesting Jun 8, 2022

Philip Lees wrote:

That's it! Since I learned about the OCR filter I've just left it on that setting for everything, since a lot of the texts I translate come from that process.



Interesting. I didn't know about that. What does the OCR filter do, exactly?


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
See here Jun 8, 2022

Tom in London wrote:

What does the OCR filter do, exactly?


https://cafetran.freshdesk.com/support/solutions/articles/6000112413-word-documents-after-optical-character-recognition

[Edited at 2022-06-08 08:51 GMT]


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Use highlighting / colours Jun 8, 2022

Philip Lees wrote:

I suppose that for more complex jobs like those slides, I will just have to decide which is the bigger pain.


I'm not sure whether MS PowerPoint lets you use Find and Replace with fonts in the replacement, as MS Word does.

If it does, you could use highlighting or colours to mark the font formatting that you want to keep.

In the exported file, you would then replace the highlighting or colours with the original font.

https://support.apu.edu/hc/en-us/articles/221902328-Replacing-text-in-Word-document-based-on-Font-Windows-Mac-

EDIT: I don't think this approach is possible in MS PowerPoint. You can replace one font with another, but the Advanced Find and Replace dialogue box is far less capable than the one in MS Word:


Screen Shot 2022-06-08 at 10.57.05

As a workaround you could use your own markup to indicate font changes and run a macro on the exported file to replace your markup with real fonts. But then things start to get complicated.
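
A rough sketch of the markup idea, assuming an invented `{{20}}...{{/}}` convention for "set this text at 20 pt". The convention and the `split_by_markup` helper are purely hypothetical, not a CafeTran or Word feature. The code only splits exported text into segments and their intended sizes; applying those sizes to the actual file would still need a Word macro or a document library, which is where it gets complicated.

```python
import re

# Hypothetical markup: {{20}}text{{/}} marks text that should be 20 pt
TOKEN = re.compile(r"\{\{(\d+)\}\}(.*?)\{\{/\}\}", re.DOTALL)


def split_by_markup(text):
    """Split text into (segment, size_in_points or None) pairs.

    Segments outside any markup get None, meaning "leave the font alone".
    """
    segments = []
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() > pos:
            # Unmarked text before this marked span
            segments.append((text[pos:match.start()], None))
        segments.append((match.group(2), int(match.group(1))))
        pos = match.end()
    if pos < len(text):
        # Unmarked text after the last marked span
        segments.append((text[pos:], None))
    return segments


print(split_by_markup("{{20}}Title{{/}} body text"))
```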



[Edited at 2022-06-08 08:59 GMT]


 
Tom in London
United Kingdom
Member (2008)
Italian to English
Thanks Jun 8, 2022

Hans Lenting wrote:

Tom in London wrote:

What does the OCR filter do, exactly?


https://cafetran.freshdesk.com/support/solutions/articles/6000112413-word-documents-after-optical-character-recognition

[Edited at 2022-06-08 08:51 GMT]


Wow- CafeTran is like a blushing maiden who keeps all her secret capabilities hidden away. I didn't even know this was possible! Thanks, Hans. I'll try using it the next time I get an OCR'd document to translate (which happens quite often).


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Onbekend maakt onbemind (unknown, unloved) Jun 8, 2022

Tom in London wrote:

Wow- CafeTran is like a blushing maiden who keeps all her secret capabilities hidden away.


Indeed, CafeTran Espresso is the least known and most underrated tool.


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Alt app Jun 8, 2022

An alternative approach:

Use this feature:

[Screenshot 1]

And CafeTran Espresso's own bold, underlined etc. markup:

Screen Shot 2022-06-08 at 11.26.10

(Of course, this cannot be used for font changes.)


 

