ApSIC Xbench Forum

Regex for target equals source in brackets


#1

Hi everyone!

I swear I have read all the questions in this forum and the guide in the documentation folder, but I am unable to write a simple regex.

The source has expressions inside square brackets that should not be translated, like this:
Bill by Default [Label][Consultants].

The target must be:
Cobrar por padrão [Label][Consultants]

Some segments have 2 sets of brackets (with or without a space between the brackets), others have just 1, and others have only the expression inside brackets and nothing more.

I am proofreading a file that has over 55,000 words. The translator did not follow the rules and translated some of the expressions. For now I am checking them one by one, but it is driving me crazy. I have spent several hours reading this forum and the documentation and running a lot of tests, and I have not found a regex that works the way I need.

It must be a regex that flags the target segments whose expressions inside brackets differ from those in the source. I think it must be a very simple one, but I cannot manage to write it.

Would you help me, please? Thanks in advance!

Best wishes,
Silvia


#2

Hi Silvia,

The following regex should match one or more expressions inside square brackets (with or without spaces between the brackets):

Source: "((\[[[:alphanum:][:space:]]+[:space:]*\])+)=1"
Target: -@1
Search mode: Regular Expressions.
Powersearch: On.
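
For anyone who wants to try the logic outside Xbench, here is a rough Python sketch of what this search appears to do: capture the run of bracketed terms in the source, then report the segment if the target does not contain that exact text. It is only an approximation under some assumptions: Python's `re` module has no POSIX classes, so `[\w\s]` stands in for them, and the function name `run_missing_in_target` and the second target string are made up for illustration.

```python
import re

# One or more consecutive bracketed terms, e.g. "[Label][Consultants]".
# [\w\s] is a stand-in for the POSIX classes in the Xbench pattern.
BRACKET_RUN = re.compile(r"((?:\[[\w\s]+\])+)")


def run_missing_in_target(source: str, target: str) -> bool:
    """Return True when the first bracket run in source is absent from target."""
    match = BRACKET_RUN.search(source)
    if match is None:
        return False  # no bracketed terms in the source, nothing to check
    return match.group(1) not in target


source = "Bill by Default [Label][Consultants]."
print(run_missing_in_target(source, "Cobrar por padrão [Label][Consultants]"))     # False
print(run_missing_in_target(source, "Cobrar por padrão [Etiqueta][Consultores]"))  # True
```

Because the whole run is captured as one string, the target has to reproduce the brackets in the same order and with the same spacing to count as a match.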

Best regards,
Oscar.


#3

Dear Oscar,

I knew I could count on you, thanks a lot! But unfortunately I need more help. The regex works very well when there is only one expression in brackets in the source (see lines 1-4 in the image below). But when there are 2 expressions in brackets together in the source and they are in a different order in the target, Xbench reports these segments as mismatches (lines from 5 to…). These should not appear in the report, because the expressions are correct.

image

Well, I now have a report showing 502 entries as mismatches, but most of them are false positives. Can you help me once more?

Thank you very, very much!

Best wishes,
Silvia


#4

If the expressions in brackets are not together in the target, you should use a different regular expression. This one checks segments with one term in brackets:
Source: "(\[[[:alphanum:][:space:]]+[:space:]*\])=1"
Target: -@1
Search mode: Regular Expressions.
Powersearch: On.

In order to check segments with two terms in brackets that have been modified in target, you should use this one:

Source: "(\[[[:alphanum:][:space:]]+[:space:]*\])=1[^\[]*(\[[[:alphanum:][:space:]]+[:space:]*\])=2"
Target: -@1 OR -@2
Search mode: Regular Expressions.
Powersearch: On.
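
To see why capturing each term separately removes the false positives, the two searches can be approximated in Python as below. This is a sketch under assumptions, not Xbench's actual engine: `[\w\s]` replaces the POSIX classes, the function names are invented, and the reordered target sentence is a hypothetical example.

```python
import re

# A single bracketed term such as "[Label]".
TERM = r"(\[[\w\s]+\])"
ONE_TERM = re.compile(TERM)
# Two bracketed terms, separated by any text that contains no "[".
TWO_TERMS = re.compile(TERM + r"[^\[]*" + TERM)


def one_term_mismatch(source: str, target: str) -> bool:
    """True when the first bracketed term in source is missing from target."""
    m = ONE_TERM.search(source)
    return m is not None and m.group(1) not in target


def two_term_mismatch(source: str, target: str) -> bool:
    """True when either of the first two bracketed terms is missing from target."""
    m = TWO_TERMS.search(source)
    if m is None:
        return False  # source has fewer than two terms; use the one-term check
    return m.group(1) not in target or m.group(2) not in target


source = "Bill by Default [Label][Consultants]."
# Terms kept but moved apart and reordered: no longer reported.
print(two_term_mismatch(source, "[Consultants] Cobrar por padrão [Label]"))    # False
# One term translated: still reported.
print(two_term_mismatch(source, "Cobrar por padrão [Etiqueta][Consultants]"))  # True
```

Each term is looked up in the target on its own, so the order and position of the brackets in the target no longer matter.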


#5

Hi, Òscar!

Now everything is working as it should. Of those 502 entries in the last report, I now have only 33. That is marvelous!

I have no words to thank you.

Best wishes,
Sílvia