How to detect the Korean character right before "을 " or "를 " and check its final consonant from its Unicode value?


#1

What I need to do is:

  1. detect the character right before "을 " or "를 ",
  2. get the Unicode value (code point) of that character,
  3. calculate “(Unicode value - 44032) % 28”,
  4. and check whether the result is zero (0) or not.
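Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something Xbench runs; the function name and constants are made up for the example, and it assumes the input is a single precomposed Hangul syllable (U+AC00 "가" through U+D7A3 "힣").

```python
HANGUL_BASE = 0xAC00   # 44032, the code point of "가" (first precomposed syllable)
HANGUL_LAST = 0xD7A3   # the code point of "힣" (last precomposed syllable)
FINAL_SLOTS = 28       # 27 possible final consonants plus "no final consonant"


def final_consonant_index(ch: str) -> int:
    """Return (code point - 44032) % 28 for one Hangul syllable.

    0 means the syllable has no final consonant; 1..27 means it has one.
    (Hypothetical helper name, used only for this sketch.)
    """
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"{ch!r} is not a precomposed Hangul syllable")
    return (code - HANGUL_BASE) % FINAL_SLOTS


if __name__ == "__main__":
    for syllable in ("람", "나"):
        index = final_consonant_index(syllable)
        particle = "을" if index != 0 else "를"
        print(syllable, index, particle)
```

A non-zero index means the syllable ends in a consonant and takes “을”; zero means it ends in a vowel and takes “를”.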

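The result (zero or non-zero) decides whether “을” or “를” should follow the character. 44032 is the code point of “가”, the first precomposed Hangul syllable, and each syllable has 28 possible final-consonant slots, so a result of zero means the character has no final consonant and takes “를”, while a non-zero result means it has one and takes “을”. This would be very helpful for detecting a common error in Korean.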

For example, in “사람를 만나다”, I need to detect “람” right before "를 ", get the Unicode value of “람” (46988), calculate “(46988 - 44032) % 28”, and check whether the result is zero.
In this case the result is 16, which is non-zero, so "를 " is an error and should be changed to "을 ".

Thank you.


#2

Xbench regular expressions in checklists do not support arithmetic. To detect segments with this pattern, you would need to write a QA plugin (which requires programming skills on your team).
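For whoever ends up writing such a check, the core logic might look like the sketch below. It is plain standalone Python, not the Xbench plugin API, and the names `PARTICLE_RE` and `find_particle_errors` are hypothetical. It assumes the particle is followed by a single space, as in the question, and that the preceding character is a precomposed Hangul syllable.

```python
import re

# One Hangul syllable, then "을" or "를", then a space (the space is not consumed).
PARTICLE_RE = re.compile(r"([가-힣])([을를])(?= )")


def find_particle_errors(text: str):
    """Return (position, preceding syllable, found particle, expected particle)
    for every "을 "/"를 " that disagrees with the preceding syllable."""
    errors = []
    for match in PARTICLE_RE.finditer(text):
        prev, particle = match.group(1), match.group(2)
        # Non-zero remainder: the syllable has a final consonant.
        has_final = (ord(prev) - 44032) % 28 != 0
        expected = "을" if has_final else "를"
        if particle != expected:
            errors.append((match.start(2), prev, particle, expected))
    return errors


if __name__ == "__main__":
    for sample in ("사람를 만나다", "사람을 만나다", "친구을 만나다"):
        print(sample, find_particle_errors(sample))
```

Note that a purely character-based check like this will also flag nouns that happen to end in “을”, such as “마을” or “가을”, so the results would still need manual review or an exception list.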