Replace a string only when all characters match (Thai)

Posted 2025-02-02 02:56:41, 277 characters, 1 view, 0 comments

The problem is that มาก is technically contained in มาก็, because มาก็ is มาก + ็.

So when I do

"แชมพูมาก็เยอะ".replace("มาก", " X ")

I end up with

แชมพู X  ็เยอะ

But what I want is

แชมพู X เยอะ

What I really want is to force the last character ก็ to count as a single character, so that มาก no longer matches มาก็.
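To make the cause concrete, here is a small sketch that inspects the character immediately after the match. It uses only the standard `unicodedata` module; the variable names are illustrative and not part of the original question. `str.replace` works on code points, and the mark ็ is its own code point (a nonspacing combining mark), so the three code points of มาก are found inside มาก็.

```python
import unicodedata

text = "แชมพูมาก็เยอะ"
word = "มาก"

# str.find / str.replace compare code points, not user-perceived characters.
idx = text.find(word)

# The code point right after the match is the combining mark that
# visually belongs to the preceding consonant ก.
following = text[idx + len(word)]

print(idx)
print(f"U+{ord(following):04X}", unicodedata.name(following))
print(unicodedata.category(following))  # "Mn" means nonspacing mark
```

The category check is what a cluster-aware comparison has to take into account: a match that is directly followed by a combining mark ends in the middle of a user-perceived character.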

◇流星雨 2025-02-09 02:56:41

While I haven't found a proper solution, I did find one that works. I split each string into separate (combined) characters, i.e. grapheme clusters, using the \X pattern of the regex module. Then I check whether one list occurs inside the other.

import regex  # third-party module (pip install regex)

# Check whether list s occurs as a contiguous slice of list l
def is_slice_in_list(s, l):
    len_s = len(s)  # so we don't recompute the length of s on every iteration
    return any(s == l[i:len_s + i] for i in range(len(l) - len_s + 1))

def is_word_in_string(w, s):
    # \X matches one grapheme cluster (a base character plus its combining marks)
    a = regex.findall(r'\X', w)
    b = regex.findall(r'\X', s)
    return is_slice_in_list(a, b)

assert is_word_in_string("มาก็", "พูมาก็เยอะ")
assert not is_word_in_string("มาก", "พูมาก็เยอะ")

The regex will split like this:

พู ม า ก็ เ ย อ ะ
ม า ก

Because it compares ก็ with ก, the function concludes that the words do not match.

I will mark this as answered, but if there is a nicer or more "proper" solution I will choose that one.
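The answer above only checks for containment, while the question asks for a replacement. A minimal sketch of a replace that respects combining marks could look like the following. It is not the answerer's method: the function name `replace_whole_clusters` is hypothetical, and instead of the regex module's \X it uses the standard `unicodedata` module and simply skips any match that is directly followed by a combining mark (Unicode category starting with "M"). That is only an approximation of full grapheme-cluster segmentation, and it assumes the search word itself starts with a base character.

```python
import unicodedata

def replace_whole_clusters(text, word, replacement):
    """Replace `word` in `text`, skipping matches that end mid-cluster.

    A match is skipped when the next code point is a combining mark
    (category Mn, Mc or Me), because that mark belongs to the last
    character of the match.
    """
    if not word:
        return text  # nothing to search for
    out = []
    i = 0
    n = len(word)
    while i < len(text):
        if text.startswith(word, i):
            nxt = text[i + n:i + n + 1]  # empty string at end of text
            if not (nxt and unicodedata.category(nxt).startswith("M")):
                out.append(replacement)
                i += n
                continue
        out.append(text[i])
        i += 1
    return "".join(out)

# มาก is not replaced, because the match would cut ก็ in half.
print(replace_whole_clusters("แชมพูมาก็เยอะ", "มาก", " X "))
# มาก็ is replaced as a whole.
print(replace_whole_clusters("แชมพูมาก็เยอะ", "มาก็", " X "))
```

For scripts where cluster boundaries depend on more than a trailing combining mark, splitting with \X as in the answer above is the more general approach.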

About the author: 萌︼了一个春
