ApSIC Xbench Forum

Regex for detecting missing Chinese Period in the middle of string

Source: You have unchecked the Participate. The selection is about to close.
Target: 你取消勾選「參加」此次選拔即將結束。

There are actually two sentences in the source for the same segment.
How can I use regex to find the missing space in the middle of the string?
I can’t use either ^ or $, because they anchor to the start and end of the string.
Please kindly help.

Do you mean that you want to find a dot that is not followed by a space in the source when there is more than one sentence?

If so, this regex should suit your needs:

Source: "[^[:space:]]\.[^[:space:]\.].{2,}"

Search mode: Regular Expressions.
PowerSearch: On.
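
For anyone who wants to try the pattern above outside Xbench, here is a minimal Python sketch. It is only an approximation: Python's re module does not support POSIX classes such as [:space:], so \s is swapped in, and the function name has_dot_without_space is made up for illustration. It assumes Xbench's regex engine treats the pattern in a comparable way.

```python
import re

# Approximate Python translation of: [^[:space:]]\.[^[:space:]\.].{2,}
# \S stands in for [^[:space:]] and [^\s.] for [^[:space:]\.]
# (an assumption; Xbench's own engine may differ in details).
DOT_WITHOUT_SPACE = re.compile(r"\S\.[^\s.].{2,}")

def has_dot_without_space(text: str) -> bool:
    """Return True if a dot is directly followed by a non-space, non-dot
    character and at least two more characters."""
    return DOT_WITHOUT_SPACE.search(text) is not None

# Hypothetical sample strings based on the segment in this thread.
glued = "You have unchecked the Participate.The selection is about to close."
spaced = "You have unchecked the Participate. The selection is about to close."

print(has_dot_without_space(glued))
print(has_dot_without_space(spaced))
```

The first string has no space after the first dot, so it is flagged; the second is the source from this thread as written, so it is not.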

Hi, thanks for your help.

As you may see in the source sentence, there are two periods.
However, there is only one period in target.
It means the target sentence has one missing period.
Is there any way to use regex to find the missing period?

The correct target sentence should be:
你取消勾選「參加」。此次選拔即將結束。
One “。” is missing.

Try this one:

Source: "^.+\.[[:space:]].+\."
Target: -"^[^。]+。[^。]+。"
Regex and PowerSearch: On.
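
The logic of this search can be sketched in Python for checking outside Xbench. This is a rough illustration under a few assumptions: the Source pattern is read as "some text, a dot, a whitespace character, more text, a dot" (with \s standing in for the POSIX space class, which Python's re does not support), the leading - on the Target field is read as negation, and the function name missing_target_period is invented for this example.

```python
import re

# Source side: at least two sentences, i.e. text, a dot, whitespace, text, a dot.
# (Assumed reading of the Source pattern; \s replaces the POSIX space class.)
SOURCE_TWO_SENTENCES = re.compile(r"^.+\.\s.+\.")

# Target side: two runs of non-period text, each followed by a Chinese period.
TARGET_TWO_PERIODS = re.compile(r"^[^。]+。[^。]+。")

def missing_target_period(source: str, target: str) -> bool:
    """Flag a segment whose source has two sentences but whose target
    does NOT contain two Chinese periods (the negated Target search)."""
    source_hit = SOURCE_TWO_SENTENCES.search(source) is not None
    target_hit = TARGET_TWO_PERIODS.search(target) is not None
    return source_hit and not target_hit

source = "You have unchecked the Participate. The selection is about to close."
wrong_target = "你取消勾選「參加」此次選拔即將結束。"
right_target = "你取消勾選「參加」。此次選拔即將結束。"

print(missing_target_period(source, wrong_target))
print(missing_target_period(source, right_target))
```

With the segment from this thread, the target that lacks the first 。 is flagged, and the corrected target is not.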

Hi, thank you so much! It works.