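How to extract characters from a Korean string in VBA
I need to extract the initial character from a Korean word in MS-Excel and MS-Access. When I use Left("한글",1), it returns the first syllable, i.e. 한, but what I need is the initial character, i.e. ㅎ. Is there a function that does this, or at least an idiom?
If you know how to get the Unicode value from a string, I could work it out from there, but I'm sure I'd be reinventing the wheel (yet again).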
Disclaimer: I know little about Access or VBA, but what you're facing is a generic Unicode problem that is not specific to those tools. I retagged your question to add tags related to this issue.
Access is doing the right thing by returning 한; it is indeed the first character of that two-character string. What you want here is the canonical decomposition of this hangul syllable into its constituent jamo, also known as Normalization Form D (NFD), for "decomposed". The NFD form is ᄒ ᅡ ᆫ, of which the first character is what you want.
Note also that, as per your example, you seem to want a function that returns the standalone letter (ㅎ, U+314E) rather than the conjoining jamo (ᄒ, U+1112). These really are two different code points because they represent different semantic units: a letter that stands on its own, versus a part of a hangul syllable. There is no pre-defined mapping from the conjoining jamo to the standalone letter, but you could write a small function to that effect, as the number of jamo is limited to a few dozen (the real work is done in the first function, NFD).
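As a rough illustration of the "small function" mentioned above, here is a minimal VBA sketch that maps a conjoining initial consonant (U+1100 to U+1112) to its counterpart in the Hangul Compatibility Jamo block. The function name ChoseongToCompat is hypothetical and not part of VBA; the sketch only covers the 19 modern initial consonants and returns any other input unchanged.

```vba
Option Explicit

' Hypothetical helper (not a built-in): maps a conjoining initial consonant
' (U+1100..U+1112) to its Hangul Compatibility Jamo letter, e.g. U+1112 -> U+314E.
' Only the first character of the argument is examined.
Public Function ChoseongToCompat(ByVal jamo As String) As String
    ' Compatibility code points of the 19 modern initial consonants,
    ' listed in the same order as U+1100..U+1112.
    Const COMPAT_HEX As String = _
        "3131 3132 3134 3137 3138 3139 3141 3142 3143 3145 " & _
        "3146 3147 3148 3149 314A 314B 314C 314D 314E"
    Dim parts() As String
    Dim idx As Long

    ChoseongToCompat = jamo                  ' default: return the input unchanged
    If Len(jamo) = 0 Then Exit Function

    ' AscW returns a signed Integer, so mask it into the range 0..65535.
    idx = (AscW(jamo) And &HFFFF&) - &H1100
    If idx < 0 Or idx > 18 Then Exit Function

    parts = Split(COMPAT_HEX, " ")           ' Split always returns a 0-based array
    ChoseongToCompat = ChrW(CLng("&H" & parts(idx)))
End Function
```

For example, Hex(AscW(ChoseongToCompat(ChrW(&H1112)))) evaluates to 314E, the code point of ㅎ. Printing code points rather than the characters themselves sidesteps the fact that the VBA editor and Immediate window may not display Korean text correctly.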
Adding to Arthur's excellent answer, I want to point out that extracting jamo from hangul syllables is very straightforward using the standard. While the solution isn't specific to Excel or Access (it's a Python module), it only involves arithmetic expressions, so it should be easy to translate to other languages. The formulas, as can be seen, are identical to those on page 109 of the standard. The decomposition is returned as a tuple of encoded strings, which can easily be verified to correspond to the Hangul Jamo Code Chart. This is the output in my console:
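The console output is not reproduced here. Separately from that Python module, the same arithmetic can be sketched directly in VBA. The constants below (syllable base U+AC00, 588 syllables per initial consonant, 28 final positions per vowel) are the ones used by the Unicode Standard's Hangul decomposition algorithm; the function name DecomposeSyllable is hypothetical, and the sketch only looks at the first character of its argument.

```vba
Option Explicit

' Hypothetical helper: decomposes the first character of the argument into
' conjoining jamo (initial, vowel, optional final) if it is a precomposed
' Hangul syllable (U+AC00..U+D7A3); otherwise returns that character unchanged.
Public Function DecomposeSyllable(ByVal syllable As String) As String
    Const S_BASE As Long = &HAC00&   ' trailing & forces Long; plain &HAC00 is a negative Integer
    Const L_BASE As Long = &H1100    ' first initial consonant
    Const V_BASE As Long = &H1161    ' first vowel
    Const T_BASE As Long = &H11A7    ' one before the first final consonant
    Const T_COUNT As Long = 28       ' final positions, including "no final"
    Const N_COUNT As Long = 588      ' 21 vowels * 28 finals
    Const S_COUNT As Long = 11172    ' 19 initials * 588
    Dim sIndex As Long
    Dim tIndex As Long
    Dim result As String

    result = Left$(syllable, 1)
    DecomposeSyllable = result
    If Len(syllable) = 0 Then Exit Function

    ' AscW returns a signed Integer, so mask it into the range 0..65535.
    sIndex = (AscW(syllable) And &HFFFF&) - S_BASE
    If sIndex < 0 Or sIndex >= S_COUNT Then Exit Function

    result = ChrW(L_BASE + sIndex \ N_COUNT) & _
             ChrW(V_BASE + (sIndex Mod N_COUNT) \ T_COUNT)
    tIndex = sIndex Mod T_COUNT
    If tIndex > 0 Then result = result & ChrW(T_BASE + tIndex)

    DecomposeSyllable = result
End Function
```

For 한 (U+D55C), the syllable index is 10588, which gives initial index 18, vowel index 0, and final index 4, so the result is the three code points U+1112, U+1161, U+11AB. Left$(DecomposeSyllable(...), 1) is then the conjoining initial consonant the question asks about.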
I think what you are looking for is a Byte array:
Dim aByte() As Byte
aByte = "한글"
This should give you two bytes for each character in the string, which together form that character's Unicode (UTF-16) value.
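To make the byte layout concrete, here is a minimal sketch that assigns the string to a Byte array and recombines each pair of bytes into a code unit. The procedure name ShowUtf16Bytes is hypothetical. The string is built with ChrW rather than typed as a literal, on the assumption that the VBA editor may not preserve Korean text on every system.

```vba
Option Explicit

' Hypothetical demo: inspect the raw bytes behind a VBA string.
Public Sub ShowUtf16Bytes()
    Dim s As String
    Dim b() As Byte
    Dim i As Long
    Dim codeUnit As Long

    ' Build the two-character string from code points (U+D55C, U+AE00).
    ' The trailing & makes each literal a Long instead of a negative Integer.
    s = ChrW(&HD55C&) & ChrW(&HAE00&)

    b = s                                ' two bytes per character, low byte first

    For i = LBound(b) To UBound(b) Step 2
        ' CLng avoids an Integer overflow when the high byte is multiplied by 256.
        codeUnit = CLng(b(i + 1)) * 256 + b(i)
        Debug.Print "U+" & Hex(codeUnit)
    Next i
End Sub
```

Running it prints U+D55C and U+AE00 in the Immediate window. Note that these are still the code points of whole syllables, so a decomposition step is needed afterwards to reach the initial consonant.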
I assume you got what you needed, but it seems rather convoluted. I don't know much about this, but I recently did some investigating into Unicode handling and looked into all the byte-oriented string functions, such as LeftB(), RightB(), InputB(), InStrB(), LenB(), AscB(), ChrB() and MidB(). There's also StrConv(), which accepts a vbUnicode argument. These are all functions that I'd expect to be useful in any double-byte context, but I don't work in that environment, so I might be missing something very important.
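For comparison, a short sketch of how a few of these byte-oriented functions behave next to their character-oriented counterparts on the string from the question. The procedure name is hypothetical, and the string is again built with ChrW on the assumption that a Korean literal may not survive in the VBA editor.

```vba
Option Explicit

' Hypothetical demo: byte-oriented versus character-oriented string functions.
Public Sub CompareByteAndWideFunctions()
    Dim s As String
    s = ChrW(&HD55C&) & ChrW(&HAE00&)        ' the two syllables from the question

    Debug.Print Len(s)                       ' 2  (characters)
    Debug.Print LenB(s)                      ' 4  (bytes)
    Debug.Print Hex(AscB(s))                 ' 5C (first byte only: low half of U+D55C)
    Debug.Print Hex(AscW(s) And &HFFFF&)     ' D55C (the whole first code unit)
End Sub
```

The masking with &HFFFF& is there because AscW returns a signed Integer, so code units from U+8000 upward come back negative. For the purpose of the question, AscW and ChrW already give whole code units, so the byte functions are not strictly required.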