RegEx Testing Platform

XbenchUser · May 10, 2017, 9:44am

I have a question regarding RegEx. We know Xbench uses a different RegEx pattern, the POSIX pattern (instead of ASCII), and we know this pattern is not generally used by similar translation tools such as Trados Studio.

Whenever I have to create a RegEx in Trados Studio, I test it on an online platform before implementing the rules in the software.

However, I wasn’t able to find an online platform to test RegEx rules for Xbench. I’d like to make sure my rules in Checklist Manager are going to work fine, but to do that I have to create a dummy file (in .doc format), convert it to XLIFF format, and then open that file in Xbench just to see whether my rules are working or not.

Can you guys help me out there? Do you guys have a way to test if rules are working before implementing them on the Checklist Manager in Xbench?

pcondal · May 10, 2017, 9:48am

We basically use these three workflows (or slight variants of them) when creating new rules.

To compose a rule:

If we want to compose a new rule, we create a .TXT document on the Desktop and put two or three sample instances in it as source<TAB>target. We then save the .TXT file, right-click it, and choose Run QA in Xbench to launch an Xbench project with just that file.
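The test file described above could also be generated with a short script. This is only a sketch: it assumes the .TXT file is tab-delimited with one source/target pair per line and that UTF-8 is an acceptable encoding. The file name, the helper `write_test_file`, and the sample pairs are hypothetical.

```python
from pathlib import Path

# Hypothetical sample pairs: each tuple is (source, target).
SAMPLE_PAIRS = [
    ("Press the 3rd button", "Pulse el 3er botón"),
    ("Version 2 is available", "La versión 2 está disponible"),
    ("No digits here", "Sin dígitos aquí"),
]


def write_test_file(path, pairs):
    """Write one source<TAB>target pair per line and return the path."""
    lines = [f"{source}\t{target}" for source, target in pairs]
    path = Path(path)
    # Assumption: UTF-8 is acceptable; change the encoding if needed.
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    out = write_test_file("xbench_regex_test.txt", SAMPLE_PAIRS)
    print(f"Wrote {len(SAMPLE_PAIRS)} pairs to {out}")
```

The resulting file can then be opened with Run QA in Xbench as described above.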

Then we use the Search tab to compose the search. Once the search does what we want, we click Add last search to checklist.

To test the effectiveness of a rule:

We have a large existing Xbench project with many items (in the range of hundreds of thousands of segments). Once we have added the checklist item to the checklist, we go to the Checklist Manager, select the entry, right-click and then choose Test. It allows us to see how the rule behaves with a larger data set.

To test the performance of a rule:

With the large existing Xbench project, we run a QA with the full checklist where the new entry is found. Then we go back to Checklist Manager and look at the Run time (ms) column to see how the new entry compares to existing entries.

XbenchUser · May 13, 2017, 11:23am

I just have to thank you a lot.

The method you provided has already saved me a lot of time when creating new regular expressions.

The only thing I didn’t get was the “Run time (ms)” thing. What does it mean?

pcondal · May 13, 2017, 11:40am

The Run Time (ms) column in Checklist Manager shows the execution time in milliseconds for each checklist item (larger is worse). To populate this column for a checklist, you just have to run a QA pass with Check Ongoing Translation that uses that checklist. For more accurate results, it is better to use a large corpus.

When regular expressions are involved, it allows you to find where your performance bottlenecks are and see if there is something that can be done about them.

For example, let’s consider a search for segments that contain the text arbitrary_text and at least one digit.

With the Run Time (ms) column you could see that

Source: "arbitrary_text" "<[0-9]+[a-z]+>"
Search Mode: Regular Expressions
PowerSearch: ON

is faster than

Source: "<[0-9]+[a-z]+>" "arbitrary_text"
Search Mode: Regular Expressions
PowerSearch: ON

because in the first case non-matching segments are discarded at a much faster rate than in the second.
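The ordering effect can be illustrated outside Xbench with a rough Python analogy. This sketch does not reproduce how Xbench evaluates PowerSearch terms internally. It assumes that `<` and `>` in the pattern act as start-of-word and end-of-word markers (written as `\b` here) and that the terms are checked left to right. The function names and the synthetic corpus are hypothetical.

```python
import re
import timeit

# Assumed Python equivalent of the pattern <[0-9]+[a-z]+>,
# treating < and > as word-boundary markers.
DIGIT_WORD = re.compile(r"\b[0-9]+[a-z]+\b")
LITERAL = "arbitrary_text"


def literal_first(segment):
    # Cheap substring test first; the regex runs only on survivors.
    return LITERAL in segment and DIGIT_WORD.search(segment) is not None


def regex_first(segment):
    # Regex runs on every segment before the substring test.
    return DIGIT_WORD.search(segment) is not None and LITERAL in segment


def make_corpus(n=20000):
    """Build a synthetic corpus where 1 in 100 segments has the literal."""
    corpus = []
    for i in range(n):
        if i % 100 == 0:
            corpus.append(f"some arbitrary_text with token {i}px inside")
        else:
            corpus.append(f"plain segment number {i}, size {i}px")
    return corpus


if __name__ == "__main__":
    corpus = make_corpus()
    for check in (literal_first, regex_first):
        seconds = timeit.timeit(
            lambda: [check(s) for s in corpus], number=20
        )
        print(f"{check.__name__}: {seconds:.3f} s")
```

Both functions return the same matches. Only the amount of work spent on non-matching segments differs, which is the kind of difference the Run Time (ms) column would reveal.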
