Impossible? Word by word check

If we have a sentence that lists a number of things separated by punctuation, can Xbench verify it word by word, perhaps via RegEx or some other way?

For example,

A long sentence, that keeps going, extending, and going.

The idea here is to be able to catch any missing words in the target. I realize that languages behave differently, so the target may have two or more words per source word; however, I am wondering if the check could be done via punctuation instead.

I understand that you wish to ensure that the number of delimiter-separated items neither increases nor decreases between source and target.
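To make that requirement concrete, here is a minimal Python sketch of the comparison itself, outside Xbench. The function names `count_items` and `same_item_count` are hypothetical, and it assumes that source and target use the same delimiter characters, which may not hold for every language pair.

```python
import re

def count_items(segment: str, delimiters: str = ",") -> int:
    """Count the pieces of a segment separated by any of the delimiter characters."""
    # Build a character class such as [,;] from the delimiter string.
    pattern = "[" + re.escape(delimiters) + "]"
    return len(re.split(pattern, segment))

def same_item_count(source: str, target: str, delimiters: str = ",") -> bool:
    """Return True when source and target contain the same number of delimited items."""
    return count_items(source, delimiters) == count_items(target, delimiters)

source = "A long sentence, that keeps going, extending, and going."
target = "Una frase larga, que sigue, extendiéndose y continuando."
print(same_item_count(source, target))  # the target has one comma fewer
```

A check like this only compares counts, so it would not notice an item that was dropped and another that was split in two within the same segment.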

I cannot come up with a RegEx search that does that, so the options with Xbench are indirect:

Option A. Write an Xbench QA plugin (documentation and sample code on GitHub) that specifically tests that. This is the high-difficulty option because it requires programming skills in your team.

Option B. Make a copy of the input files and do a global replacement, in both source and target, of the delimiter with something tag-shaped, for example <comma>. The Xbench tag mismatch check would then catch extra or missing commas in the sequence. You can also search for occurrences of two <comma> tags in a row. This is a much simpler approach; it only requires basic scripting skills in your team.
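As a rough illustration of the scripting involved in Option B, the Python sketch below works on the text of a single segment. The helper names `tag_delimiter` and `has_adjacent_tags` are hypothetical, and the sketch assumes you have already extracted the translatable text: running a blind replace over a whole bilingual file would also change commas inside the file's own markup.

```python
import re

def tag_delimiter(text: str, delimiter: str = ",", tag: str = "<comma>") -> str:
    """Replace every occurrence of the delimiter with a tag-shaped placeholder."""
    return text.replace(delimiter, tag)

def has_adjacent_tags(text: str, tag: str = "<comma>") -> bool:
    """Detect two placeholder tags in a row, optionally separated by whitespace."""
    pattern = re.escape(tag) + r"\s*" + re.escape(tag)
    return re.search(pattern, text) is not None

segment = "A long sentence, that keeps going, extending, and going."
print(tag_delimiter(segment))
```

Apply the same replacement to a copy of both source and target before loading the copies into Xbench, and keep the original files untouched.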
