Samuel Murray wrote:
Anthocharis88 wrote:
At the end of each, there is a CR and LF (Carriage return AND line feed).
BUT, the segmentation is like this (in 2 strings):
I think...
1. It breaks after the question mark because that is a usual end-of-segment mark.
2. It won't break after a natural CR or LF in HTML because in HTML a line break is represented by
a tag such as <br> or <p>. In HTML, a natural CR or LF is equal to a space, and so when WFP parses the HTML file
as HTML, it considers those characters to be spaces, not actual CRs or LFs.
3. Besides, WFP3's HTML filter is not very good. It should recognise CRs and LFs within tags, but it doesn't -- it still treats those CRs and LFs as spaces.
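To illustrate the difference described in points 1 and 2, here is a minimal Python sketch. It is not WFP's actual segmentation algorithm, only an approximation under simple assumptions: the sample text and the function names `segments_as_text` and `segments_as_html` are made up for illustration, and the "HTML" view simply collapses all whitespace to single spaces before splitting at sentence-final punctuation.

```python
import re

# Hypothetical sample: three lines, each ending in CR+LF, the first ending in "?".
SAMPLE = "First line?\r\nSecond line\r\nThird line"

def segments_as_text(raw):
    """Plain-text view: every CR/LF line break ends a segment."""
    return [line for line in raw.splitlines() if line.strip()]

def segments_as_html(raw):
    """HTML-like view: CR/LF collapse to a space, so only
    sentence-final marks (. ? !) end a segment."""
    collapsed = re.sub(r"\s+", " ", raw).strip()
    return re.split(r"(?<=[.?!])\s+", collapsed)

print(segments_as_text(SAMPLE))  # three segments, one per line
print(segments_as_html(SAMPLE))  # two segments, split only after the "?"
```

Parsed as text, the sample gives three segments. Parsed the HTML way, it gives two, because the only break left is the one after the question mark.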
[Edited at 2018-05-10 10:09 GMT]
Hello Samuel, thanks for your help.
Yes, you are right, it was my mistake! I checked on Google: CR and LF are treated as spaces in HTML.
The real problem was that I had combined 3 files into 1 to translate them all at once, but only 2 of the 3 are HTML. The other one (the one I showed here) is NOT HTML. So translating it as text works much better, and of course I get the correct segmentation.
Have a good day!
[Edited at 2018-05-10 12:26 GMT]