Length of Hindi words

Posted on 2025-01-12 15:36:12

I am trying to find the length of Hindi words in Python. For example, as far as I know, 'प्रवीण' has a length of 3.

w1 = 'प्रवीण'
print(len(w1))

I tried this code, but it did not give the result I expected.

Comments (3)

雨巷深深 2025-01-19 15:36:12

As @betelgeuse has said, Hindi text does not work the way you think it does: Python's len() counts Unicode code points, not the letters you see on screen. Here is some working code that does what you expect, though:

w1 = 'प्रवीण'

def hindi_len(word):
    hindi_letts = 'कखगघङचछजझञटठडढणतथदधनपफबभमक़ख़ग़ज़ड़ढ़फ़यरलळवहशषसऱऴअआइईउऊऋॠऌॡएऐओऔॐऍऑऎऒ'
    # Hindi letters that are not half-letters or matras (vowel signs)
    count = 0
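    for idx, ch in enumerate(word):
        # Count a base letter unless the previous character is a virama,
        # which means this letter is joined to a half-letter.
        # Using the position (not word.index) keeps repeated letters correct.
        if ch in hindi_letts and (idx == 0 or word[idx - 1] != '्'):
            count += 1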
    return count

print(hindi_len(w1))

This outputs 3. You can customize it as needed.

Edit: Make sure you use Python 3.x, or prefix Hindi strings with u in Python 2.x. I have seen errors with non-Unicode strings in Python 2.x before.
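
A variant of the same idea can be sketched without the hard-coded letter list. This is only a sketch under an assumption that is not in the answer above: it treats every code point whose Unicode general category is "Lo" (other letter) as a base letter, which covers Devanagari consonants and independent vowels but also letters of many other scripts. The name letter_count is hypothetical.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant), marks a half-letter


def letter_count(word):
    """Count base letters, skipping any letter that directly follows a virama.

    Assumption: a "base letter" is any code point with general category "Lo".
    """
    count = 0
    previous = ""
    for ch in word:
        if unicodedata.category(ch) == "Lo" and previous != VIRAMA:
            count += 1
        previous = ch
    return count


print(letter_count("प्रवीण"))
```

Vowel signs and the virama have the categories "Mc" or "Mn", so they are never counted, and a consonant that follows a virama is treated as part of the same conjunct.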

酒与心事 2025-01-19 15:36:12

In Hindi, a letter as it appears on screen need not be a single character in the string, as it is in English. For example, वी is not one character but two (व and ी) combined into one.

So in your case, the word प्रवीण is not of length 3 but rather 6.

w1 = "प्रवीण"
for w in w1:
    print(w)

The output is:

प
्
र
व
ी
ण
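
To see why these six pieces are separate, it can help to print what each one is. The sketch below uses Python's standard unicodedata module; the helper name describe_code_points is made up for illustration.

```python
import unicodedata


def describe_code_points(word):
    """Return (character, Unicode name, general category) for each code point."""
    return [(ch, unicodedata.name(ch), unicodedata.category(ch)) for ch in word]


w1 = "प्रवीण"
for ch, name, category in describe_code_points(w1):
    # Print the code point in U+XXXX form with its category and official name
    print(f"U+{ord(ch):04X} {category} {name}")
```

The letters come out with category "Lo", while the virama (्) and the vowel sign (ी) come out as combining marks ("Mn" and "Mc"). len() counts all of them equally.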

浅暮の光 2025-01-19 15:36:12

Here is working Kotlin code corresponding to the pseudocode provided by Codeman. It gives you two things:

  1. The length of the string in terms of base characters
  2. The string split into parts on the basis of base characters

const val HINDI_LETTERS = "कखगघङचछजझञटठडढणतथदधनपफबभमक़ख़ग़ज़ड़ढ़फ़यरलळवहशषसऱऴअआइईउऊऋॠऌॡएऐओऔॐऍऑऎऒ"

fun getHindiWordLength(word: String): Int {
    var count = 0
    for (i in word.indices) {
        println(word[i])    // Debug output: show each character of the string
        // Count a base letter unless it follows a virama (joined to a half-letter)
        if (word[i] in HINDI_LETTERS && (i == 0 || word[i - 1] != '्'))
            count++
    }
    return count
}

fun splitHindiWordOnBaseLetter(word: String): MutableList<String> {
    var curWord = ""
    val splitWords: MutableList<String> = mutableListOf()
    for (i in word.indices) {
        // Start a new part at each base letter that does not follow a virama
        if (word[i] in HINDI_LETTERS && i > 0 && word[i - 1] != '्') {
            splitWords.add(curWord)
            curWord = ""
        }
        curWord += word[i]
    }
    splitWords.add(curWord)         // Add the last part
    return splitWords
}

I tested this code on these inputs:

    println(getHindiWordLength("प्रवीण"))
    println(splitHindiWordOnBaseLetter("प्रवीण"))
    
    println(getHindiWordLength("आम"))
    println(splitHindiWordOnBaseLetter("आम"))
    
    println(getHindiWordLength("पेड़"))
    println(splitHindiWordOnBaseLetter("पेड़"))
    
    println(getHindiWordLength("अक्षर"))
    println(splitHindiWordOnBaseLetter("अक्षर"))
    
    println(getHindiWordLength("दिल"))
    println(splitHindiWordOnBaseLetter("दिल"))

This is the output I get:

प
्
र
व
ी
ण
3
[प्र, वी, ण]
आ
म
2
[आ, म]
प
े
ड
़
2
[पे, ड़]
अ
क
्
ष
र
3
[अ, क्ष, र]
द
ि
ल
2
[दि, ल]
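
Since the question was about Python, the same split can be sketched there too. This is an illustrative port, not the code from the answer: the name split_on_base_letters is hypothetical, and instead of the HINDI_LETTERS list it assumes that any code point with Unicode general category "Lo" is a base letter.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant)


def split_on_base_letters(word):
    """Split a word into parts, each starting at a base letter.

    A letter that directly follows a virama stays in the current part,
    so conjuncts are kept together.
    """
    parts = []
    current = ""
    for i, ch in enumerate(word):
        is_letter = unicodedata.category(ch) == "Lo"
        if is_letter and i > 0 and word[i - 1] != VIRAMA:
            parts.append(current)
            current = ""
        current += ch
    if current:
        parts.append(current)  # last part
    return parts


print(split_on_base_letters("प्रवीण"))
print(len(split_on_base_letters("प्रवीण")))
```

The length of the returned list is the letter count, so one function covers both tasks. Unlike the Kotlin version, this sketch returns an empty list for an empty string.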