Using a Korean Input Method Editor (IME), it's possible to type 버리 + 어 and it will automatically become 버려.
Is there a way to programmatically do that in Python?
>>> x, y = '버리', '어'
>>> z = '버려'
>>> ord(z[-1])
47140
>>> ord(x[-1]), ord(y)
(47532, 50612)
Is there a way to compute that 47532 + 50612 -> 47140?
Here are some more examples:
가보 + 아 -> 가봐
끝나 + ㄹ -> 끝날
I'm Korean. First, if you type 버리 + 어, it becomes 버리어, not 버려. 버려 is a contraction of 버리어, and it is not generated automatically. For the same reason, 가보아 does not automatically become 가봐 during typing.

Second, by contrast, 끝나 + ㄹ becomes 끝날 because 나 has no jongseong (종성). Note that one Hangul character is made of a choseong (초성), a jungseong (중성), and optionally a jongseong. The choseong and jongseong are consonants; the jungseong is a vowel. See more at Wikipedia. So only when the last character typed has no jongseong (like 끝나) can it take one (ㄹ).

If you want to turn 버리 + 어 into 버려, you have to implement some Korean grammar, in this case the contraction of jungseong, for example ㅣ + ㅓ = ㅕ and ㅗ + ㅏ = ㅘ, as in the examples you provided. 한글 맞춤법 (the Korean orthography rules), chapter 4, section 5 (I can't find English pages right now) defines contractions like this. It's possible, but it is not an easy job, especially for non-Koreans.

Next, if all you want is to turn 끝나 + ㄹ into 끝날, that is relatively easy, since there are libraries that handle composition and decomposition of choseong, jungseong, and jongseong. For Python, I found hgtk. You can try something like this (nonpractical code):

Still, without proper knowledge of Hangul, it will be very hard to get it done.
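To make the decomposition idea concrete, here is a minimal sketch that uses only the standard library instead of hgtk. It relies on the Unicode layout of precomposed Hangul syllables, where code point = 0xAC00 + (choseong index × 21 + jungseong index) × 28 + jongseong index. That is why 47532 and 50612 cannot simply be added: each syllable is split into indices first, the vowels are merged, and the result is recomposed. The names `decompose`, `compose`, `join`, and the `CONTRACTIONS` table are made up for this illustration, and the table only covers the two vowel pairs mentioned above, so this is nowhere near a full implementation of the contraction rules.

```python
S_BASE = 0xAC00            # code point of 가, the first precomposed syllable
N_JUNG, N_JONG = 21, 28    # number of jungseong / jongseong slots (slot 0 = none)
N_SYLLABLES = 19 * N_JUNG * N_JONG

JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
# Jongseong in Unicode order; index 0 of this string is jongseong slot 1.
JONG_JAMO = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"
CHO_IEUNG = 11             # index of the silent initial ㅇ

# Illustrative only: just the two contractions from the question.
CONTRACTIONS = {("ㅣ", "ㅓ"): "ㅕ", ("ㅗ", "ㅏ"): "ㅘ"}


def is_syllable(ch):
    return 0 <= ord(ch) - S_BASE < N_SYLLABLES


def decompose(syllable):
    """Return (choseong, jungseong, jongseong) indices of one syllable."""
    cho, rest = divmod(ord(syllable) - S_BASE, N_JUNG * N_JONG)
    jung, jong = divmod(rest, N_JONG)
    return cho, jung, jong


def compose(cho, jung, jong=0):
    return chr(S_BASE + (cho * N_JUNG + jung) * N_JONG + jong)


def join(stem, ending):
    """Attach `ending` to `stem`, merging the boundary syllables if possible."""
    if not stem or not ending or not is_syllable(stem[-1]):
        return stem + ending
    cho, jung, jong = decompose(stem[-1])
    if jong != 0:                       # last syllable already has a jongseong
        return stem + ending
    # Case 1: the ending is a bare consonant such as ㄹ -> use it as jongseong.
    if len(ending) == 1 and ending in JONG_JAMO:
        return stem[:-1] + compose(cho, jung, JONG_JAMO.index(ending) + 1)
    # Case 2: the ending starts with ㅇ + vowel -> try a vowel contraction.
    if is_syllable(ending[0]):
        e_cho, e_jung, e_jong = decompose(ending[0])
        merged = CONTRACTIONS.get((JUNG[jung], JUNG[e_jung]))
        if e_cho == CHO_IEUNG and merged is not None:
            new = compose(cho, JUNG.index(merged), e_jong)
            return stem[:-1] + new + ending[1:]
    return stem + ending                # no rule applies, plain concatenation


print(join("버리", "어"))   # 버려
print(join("가보", "아"))   # 가봐
print(join("끝나", "ㄹ"))   # 끝날
```

Any pair that is not in the table, or any stem whose last syllable already has a jongseong, falls back to plain concatenation. Real contraction rules depend on the word and on grammar, which is exactly the part that needs knowledge of Korean.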