Good day, I am new on this forum but I have been using XBench for two years now for QA. I need help with RegEx to fix my TM.
Here’s the story…
In 2016, a long-standing client introduced a style guide for numbers/measures used in their manuals. Before then, they used the AmE number format, i.e. 34,000 for thirty-four thousand and 12.15 for twelve point one five. In 2016 they decided to follow a technical standard under which the thousands separator is a non-breaking space (i.e. 34 000) and the decimal separator remains the point (12.15). They also decided that the translated manuals should follow this rule, regardless of the local custom – for example, in my country we use the comma as the decimal separator, but I should stick to the point (no pun intended) for this client.
After 5 years, my TM has a mix of bad sources and bad targets due to these style changes. I would like to fix my TM so that anything I pre-translate follows the new style guide.
I am working with SDL Trados Studio and thus I have exported my TM from *.sdltm to *.tmx.
I have loaded the *.tmx in XBench.
I need to check and edit those TUs that do not match the current style guide, i.e. I need to run a series of checks on the target, according to these rules:
If the number has 5 or more digits, the thousands separator has to be a non-breaking space (i.e. 34 000); if it has just 4 digits, no separator should be used (i.e. 4000).
If the number has decimals, the decimal separator should be the point and not the comma (i.e. 12.15 is okay, 12,15 is not)
If the source contains in. the target should contain in. (with the point) as well.
If the source contains cu.ft. the target should contain cu.ft. (with the points) as well.
If the source contains L the target should contain L (single letter, capitalized) as well.
Ranges should be indicated using an en dash, so 10–20 is okay, while 10-20 is not okay (the en dash looks slightly longer than the regular hyphen).
If the source contains red rose the target should contain rosa rossa.
If the target contains km, it should be preceded by a non-breaking space.
If the target contains ASTM, it should be followed by a non-breaking space.
Can you please help me translate this into RegEx? Thank you in advance!
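For readers following along, the target-side rules above can be sketched as regular expressions. The sketch below uses Python's `re` module rather than Xbench's own search syntax, so treat the patterns as an illustration that may need adapting; the rule descriptions and the `find_violations` helper are made up for this example.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Each entry: (description, pattern that flags a likely style violation).
# These are heuristics and can produce false positives on real segments.
TARGET_CHECKS = [
    # Digit groups separated by a comma or a plain space, e.g. 34,000 or 4,000.
    ("grouping uses comma or plain space",
     re.compile(r"(?<![\d.,])\d{1,3}(?:[, ]\d{3})+(?!\d)")),
    # Five or more digits with no separator at all, e.g. 34000.
    ("5+ digits without NBSP grouping",
     re.compile(r"(?<![\d.,])\d{5,}(?!\d)")),
    # Comma followed by digits, unless it is exactly three digits (a thousands group).
    ("decimal comma",
     re.compile(r"\d,(?!\d{3}(?!\d))\d+")),
    # A hyphen between two digits where an en dash is expected, e.g. 10-20.
    ("hyphen in numeric range",
     re.compile(r"(?<=\d)-(?=\d)")),
    # "km" not preceded by a letter (part of a word) or by an NBSP.
    ("km not preceded by NBSP",
     re.compile("(?<![A-Za-z" + NBSP + r"])km\b")),
    # "ASTM" not followed by an NBSP.
    ("ASTM not followed by NBSP",
     re.compile(r"\bASTM\b(?!" + NBSP + ")")),
]


def find_violations(target):
    """Return the descriptions of all checks that fire on a target segment."""
    return [name for name, pattern in TARGET_CHECKS if pattern.search(target)]


if __name__ == "__main__":
    for text in ["34,000 units", "34" + NBSP + "000 units", "12,15", "10 km"]:
        print(repr(text), find_violations(text))
```

The sketch does not cover every case in the list (for example, a 4-digit number wrongly grouped with a non-breaking space), and a string such as 12,150 is inherently ambiguous between a decimal and a thousands group, so flagged segments still need a human look.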
Thank you so much for your help. You are a life saver!
I will reference the Xbench RegEx grammar in the future. RegEx is not so straightforward and it looks a little intimidating, though well worth learning!
As per your suggestion, I have been searching the *.sdltm file directly, though to make edits, SDL Studio has to be opened and used. Correct? I thought it was possible to batch/bulk edit in Xbench directly, but maybe that is just for other file formats.
Anyway, most of the strings worked; however, these were not working:
This one is not working (the search fields turn red), however the previous one searching for the abbreviation of inches, works.
This one is not working (the search fields turn red).
Xbench is a browser, not really an editor. We always try to find ways to call the home application of the format to ensure data integrity (for example, a slight change in a home application update could easily lead to corrupting the application's proprietary data).
However, I agree that it would be great if it were possible from Xbench to open the TM in Studio right at the segment.
I created this idea in the SDL Community site. If you vote for it (and get other interested users to vote for it), and SDL eventually decides to implement it, segment positioning will eventually become available in Xbench.
For the cu.ft. search, I notice you are missing the backslashes in front of the dots (a dot has a special meaning in RegEx: it matches any character). In any case, could you provide specific source/target examples where it does not work?
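To illustrate why the backslashes matter, here is a small sketch using Python's `re` module (not Xbench itself, whose regex flavor may differ in other details). An unescaped dot matches any character, so the unescaped pattern also matches strings that are not the abbreviation at all.

```python
import re

unescaped = re.compile(r"cu.ft.")    # each dot matches ANY character
escaped = re.compile(r"cu\.ft\.")    # each dot matches a literal "."

samples = ["12 cu.ft.", "12 cu ft ", "12 cuXftY"]
for text in samples:
    print(repr(text),
          "unescaped:", bool(unescaped.search(text)),
          "escaped:", bool(escaped.search(text)))

# re.escape builds the escaped form of a literal string automatically.
print(re.escape("cu.ft."))
```

Only the first sample is matched by the escaped pattern, while the unescaped one matches all three.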
"However, I agree that it would be great that it was possible from Xbench to open the TM in Studio right at the segment.
I created this idea in the SDL Community site. If you vote it (and manage that other interested users vote it), and SDL eventually decides to implement it, the functionality of segment positioning will be eventually available in Xbench."
As an alternative, while this is still not possible with sdltm memories, it should be possible to access the “offending segment” of a TM for editing by first exporting the TM to TMX format, then loading it in Xbench as “ongoing translation”.
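Since the TMX export is plain XML, the source/target rules from the first post (in., cu.ft., L, red rose) can also be spot-checked outside Xbench. What follows is a minimal sketch in Python, not an Xbench feature: the helper names, the `PAIRED_RULES` patterns, the sample TMX, and the `en`/`it` language prefixes are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# (pattern in the source, pattern the target must then contain); illustrative only.
PAIRED_RULES = [
    (re.compile(r"\d\s*in\."), re.compile(r"\d\s*in\.")),
    (re.compile(r"cu\.ft\."), re.compile(r"cu\.ft\.")),
    (re.compile(r"\d\s*L\b"), re.compile(r"\d\s*L\b")),
    (re.compile(r"red rose", re.IGNORECASE), re.compile(r"rosa rossa", re.IGNORECASE)),
]


def segment_text(tuv):
    """Plain text of a <tuv>'s <seg>, including text inside inline tags."""
    seg = tuv.find("seg")
    return "".join(seg.itertext()) if seg is not None else ""


def iter_pairs(tmx_xml, src_prefix, tgt_prefix):
    """Yield (source, target) text for each <tu> that has both languages."""
    root = ET.fromstring(tmx_xml)
    for tu in root.iter("tu"):
        texts = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            texts[lang] = segment_text(tuv)
        src = next((t for l, t in texts.items() if l.startswith(src_prefix)), None)
        tgt = next((t for l, t in texts.items() if l.startswith(tgt_prefix)), None)
        if src is not None and tgt is not None:
            yield src, tgt


def mismatches(tmx_xml, src_prefix="en", tgt_prefix="it"):
    """Return (source, target, rule) for every paired rule the target fails."""
    found = []
    for src, tgt in iter_pairs(tmx_xml, src_prefix, tgt_prefix):
        for src_pat, tgt_pat in PAIRED_RULES:
            if src_pat.search(src) and not tgt_pat.search(tgt):
                found.append((src, tgt, src_pat.pattern))
    return found


# Hypothetical two-unit TMX, just to exercise the functions.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en-US"><seg>Pipe diameter: 3 in.</seg></tuv>
<tuv xml:lang="it-IT"><seg>Diametro del tubo: 3 in</seg></tuv></tu>
<tu><tuv xml:lang="en-US"><seg>Capacity: 5 L</seg></tuv>
<tuv xml:lang="it-IT"><seg>Capacità: 5 L</seg></tuv></tu>
</body></tmx>"""

if __name__ == "__main__":
    for src, tgt, rule in mismatches(SAMPLE):
        print(rule, "|", src, "|", tgt)
```

For a real export, the XML would come from the file on disk (for example via `ET.parse(path).getroot()`), and the source patterns would likely need tuning, since something as short as "in." or "L" is easy to over-match.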