Total mismatch between the segmentation settings and the actual segmentation (Wordfast support)

Thread poster: Laurent Pechamat

Laurent Pechamat

France
Local time: 10:50
English to French

May 9, 2018

Hello,

I have a real problem with Wordfast Pro 3: even though I check "CR" and "LF" in the End of Segmentation markers, the text is NOT split at these characters!

I have an HTML file (which I converted to HTML myself from a text file) like this:

Back
Microphone
Where are you going?
open
closed
Toggle side drawer
Search
Search here
Logging in…
Creating account…
An error occurred communicating with the server
Login failed
Unknown user
Invalid password
Unknown error occurred during login

At the end of each, there is a CR and LF (Carriage return AND line feed).

BUT, the segmentation is like this (in 2 strings):

Back Microphone Where are you going?

open closed Toggle side drawer Search Search here Logging in… Creating account… An error occurred communicating with the server Login failed Unknown user Invalid password Unknown error occurred during login

It's as if, for these lines, the CR and LF DO NOT EXIST even though they are there! In fact, it splits correctly after . : ! ? but not after an LF and/or CR alone. I don't get it at all.

Thanks in advance for your help.

Best Regards,
Laurent

Samuel Murray

Netherlands
Local time: 10:50
Member (2006)
English to Afrikaans

One possible explanation

May 10, 2018

Anthocharis88 wrote:
At the end of each, there is a CR and LF (Carriage return AND line feed).
BUT, the segmentation is like this (in 2 strings):

I think...

1. It breaks after the question mark because that is a usual end-of-segment mark.

2. It won't break after a natural CR or LF in HTML because in HTML a line break is represented by <br> or <p>. In HTML, a natural CR or LF is equal to a space, and so when WFP parses the HTML file as HTML, it considers those characters to be spaces, not actual CRs or LFs.

3. Besides, WFP3's HTML filter is not very good. It should recognise CRs and LFs within <pre> tags, but it doesn't -- it still treats those CRs and LFs as spaces.
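
To illustrate points 1 and 2, here is a minimal Python sketch. It is not Wordfast's actual code: the two helper functions are hypothetical and only mimic the behaviour described above, namely that an HTML parser collapses CR/LF into spaces and that the segmenter then breaks only after terminal punctuation.

```python
import re

# A few of the sample lines from the first post, joined with CR+LF
# as they are in the source file.
raw = "Back\r\nMicrophone\r\nWhere are you going?\r\nopen\r\nclosed\r\n"

def collapse_html_whitespace(text):
    """Hypothetical helper: collapse runs of whitespace (including CR and LF)
    into a single space, the way HTML treats them outside <pre>."""
    return re.sub(r"[ \t\r\n\f]+", " ", text).strip()

def split_on_terminal_punctuation(text):
    """Hypothetical segmenter: break after . : ! ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.:!?])\s+", text) if s]

segments = split_on_terminal_punctuation(collapse_html_whitespace(raw))
for segment in segments:
    print(segment)
```

Once the line breaks have become spaces, the only remaining break point in this sample is the question mark, which gives the same two-segment pattern as in the first post.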

[Edited at 2018-05-10 10:09 GMT]

Laurent Pechamat

France
Local time: 10:50
English to French

TOPIC STARTER

You are right

May 10, 2018

Hello Samuel, thanks for your help.

In fact, yes, you are right, it was my mistake! I checked on Google: CR and LF are just whitespace in HTML.

But I had also made another mistake: I had merged 3 files into 1 to translate everything at once, but only 2 of the 3 are HTML. The other one (the one I showed here) is NOT HTML. So translating it as plain text works much better, and of course I get the correct segmentation.
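
For comparison, this is roughly what line-based segmentation of a plain-text file amounts to. It is only a sketch in Python, assuming the file is read as text and every non-empty line becomes one segment; it does not show how Wordfast implements its text filter.

```python
# A few of the sample lines from the first post, with CR+LF line endings.
raw = "Login failed\r\nUnknown user\r\nInvalid password\r\n"

# str.splitlines() treats CR+LF, CR alone and LF alone as line boundaries.
segments = [line for line in raw.splitlines() if line.strip()]

for segment in segments:
    print(segment)
```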

Have a good day!

[Edited at 2018-05-10 12:26 GMT]
