Is there a way to convert number words to integers?

Posted on 2024-07-12 13:37:46

I need to convert one into 1, two into 2 and so on.

Is there a way to do this with a library or a class or anything?

陌伤ぢ 2024-07-19 13:37:47

I was looking for a library that supports all of the above plus more edge cases, such as ordinal numbers (first, second), bigger numbers, and operators, and I found numwords-to-nums.

You can install via

pip install numwords_to_nums

Here's a basic example

from numwords_to_nums.numwords_to_nums import NumWordsToNum
num = NumWordsToNum()
   
result = num.numerical_words_to_numbers("twenty ten and twenty one")
print(result)  # Output: 2010 and 21
   
eval_result = num.evaluate('Hey calculate 2+5')
print(eval_result) # Output: 7

result = num.numerical_words_to_numbers('first')
print(result) # Output: 1st
怪我闹别瞎闹 2024-07-19 13:37:47

I made a change so that text2int(scale) returns the correct conversion, e.g. text2int("hundred") => 100.

import re

numwords = {}


def text2int(textnum):

    if not numwords:

        units = [ "zero", "one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine", "ten", "eleven", "twelve",
                "thirteen", "fourteen", "fifteen", "sixteen", "seventeen",
                "eighteen", "nineteen"]

        tens = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", 
                "seventy", "eighty", "ninety"]

        scales = ["hundred", "thousand", "million", "billion", "trillion",
                'quadrillion', 'quintillion', 'sextillion', 'septillion',
                'octillion', 'nonillion', 'decillion' ]

        numwords["and"] = (1, 0)
        for idx, word in enumerate(units): numwords[word] = (1, idx)
        for idx, word in enumerate(tens): numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)

    ordinal_words = {'first':1, 'second':2, 'third':3, 'fifth':5, 
            'eighth':8, 'ninth':9, 'twelfth':12}
    ordinal_endings = [('ieth', 'y'), ('th', '')]
    current = result = 0
    tokens = re.split(r"[\s-]+", textnum)
    for word in tokens:
        if word in ordinal_words:
            scale, increment = (1, ordinal_words[word])
        else:
            for ending, replacement in ordinal_endings:
                if word.endswith(ending):
                    word = "%s%s" % (word[:-len(ending)], replacement)

            if word not in numwords:
                raise Exception("Illegal word: " + word)

            scale, increment = numwords[word]

        if scale > 1:
            current = max(1, current)

        current = current * scale + increment
        if scale > 100:
            result += current
            current = 0

    return result + current
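
The ordinal handling above can be looked at in isolation. The following is a minimal sketch of that normalization step, assuming a hypothetical helper named ordinal_to_cardinal (it is not part of the answer's code). It maps an ordinal word to the cardinal word that a parser like text2int already understands, and, unlike the loop above, it stops at the first matching ending.

```python
# Irregular ordinals that the suffix rules below would mangle
# (e.g. "eighth" minus "th" would give "eigh").
IRREGULAR_ORDINALS = {
    "first": "one", "second": "two", "third": "three", "fifth": "five",
    "eighth": "eight", "ninth": "nine", "twelfth": "twelve",
}

# (ending, replacement) pairs, checked in order: "twentieth" -> "twenty",
# "fourth" -> "four".
ORDINAL_ENDINGS = [("ieth", "y"), ("th", "")]


def ordinal_to_cardinal(word):
    """Return the cardinal form of an ordinal word, or the word unchanged."""
    if word in IRREGULAR_ORDINALS:
        return IRREGULAR_ORDINALS[word]
    for ending, replacement in ORDINAL_ENDINGS:
        if word.endswith(ending):
            return word[:-len(ending)] + replacement
    return word


print(ordinal_to_cardinal("twentieth"))  # twenty
print(ordinal_to_cardinal("fourth"))     # four
print(ordinal_to_cardinal("seven"))      # seven
```

Note that a suffix rule like this will also rewrite ordinary words that happen to end in "th", so it is only safe on tokens already expected to be number words.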
七度光 2024-07-19 13:37:47

A quick solution is to use inflect.py to generate a dictionary for translation.

inflect.py has a number_to_words() function that turns a number (e.g. 2) into its word form (e.g. 'two'). Unfortunately, the reverse (which would let you avoid the translation dictionary route) isn't offered. All the same, you can use that function to build the translation dictionary:

>>> import inflect
>>> p = inflect.engine()
>>> word_to_number_mapping = {}
>>>
>>> for i in range(1, 100):
...     word_form = p.number_to_words(i)  # 1 -> 'one'
...     word_to_number_mapping[word_form] = i
...
>>> print(word_to_number_mapping['one'])
1
>>> print(word_to_number_mapping['eleven'])
11
>>> print(word_to_number_mapping['forty-three'])
43

If you're willing to commit some time, it might be possible to examine the inner workings of inflect.py's number_to_words() function and build your own code to do this dynamically (I haven't tried to do this).
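
If installing inflect is not an option, roughly the same table can be built from two short word lists. The sketch below is an alternative under stated assumptions, not inflect's own method: UNITS, TENS, build_mapping and lookup are illustrative names, and it covers 0 to 99 only. It also turns spaces into hyphens before the lookup, since keys such as 'forty-three' are hyphenated.

```python
UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = {
    20: "twenty", 30: "thirty", 40: "forty", 50: "fifty",
    60: "sixty", 70: "seventy", 80: "eighty", 90: "ninety",
}


def build_mapping():
    """Build a word -> number dictionary for 0..99 with hyphenated keys."""
    mapping = {word: value for value, word in enumerate(UNITS)}
    for tens_value, tens_word in TENS.items():
        mapping[tens_word] = tens_value
        for unit in range(1, 10):
            mapping[f"{tens_word}-{UNITS[unit]}"] = tens_value + unit
    return mapping


def lookup(mapping, text):
    """Look up text after lowercasing it and joining its words with hyphens."""
    key = "-".join(text.lower().split())  # "Forty three" -> "forty-three"
    return mapping[key]


mapping = build_mapping()
print(lookup(mapping, "forty-three"))  # 43
print(lookup(mapping, "Forty three"))  # 43
```

Anything outside the table raises KeyError, which makes unsupported input easy to spot.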

假面具 2024-07-19 13:37:47

There's a Ruby gem by Marc Burns that does it. I recently forked it to add support for years. You can call Ruby code from Python.

  require 'numbers_in_words'
  require 'numbers_in_words/duck_punch'

  nums = ["fifteen sixteen", "eighty five sixteen",  "nineteen ninety six",
          "one hundred and seventy nine", "thirteen hundred", "nine thousand two hundred and ninety seven"]
  nums.each {|n| p n; p n.in_numbers}

results:
"fifteen sixteen"
1516
"eighty five sixteen"
8516
"nineteen ninety six"
1996
"one hundred and seventy nine"
179
"thirteen hundred"
1300
"nine thousand two hundred and ninety seven"
9297

心碎无痕… 2024-07-19 13:37:47

I took @recursive's logic and converted it to Ruby. I've also hardcoded the lookup table, so it's not as cool, but it might help a newbie understand what is going on.

WORDNUMS = {"zero"=> [1,0], "one"=> [1,1], "two"=> [1,2], "three"=> [1,3],
            "four"=> [1,4], "five"=> [1,5], "six"=> [1,6], "seven"=> [1,7], 
            "eight"=> [1,8], "nine"=> [1,9], "ten"=> [1,10], 
            "eleven"=> [1,11], "twelve"=> [1,12], "thirteen"=> [1,13], 
            "fourteen"=> [1,14], "fifteen"=> [1,15], "sixteen"=> [1,16], 
            "seventeen"=> [1,17], "eighteen"=> [1,18], "nineteen"=> [1,19], 
            "twenty"=> [1,20], "thirty" => [1,30], "forty" => [1,40], 
            "fifty" => [1,50], "sixty" => [1,60], "seventy" => [1,70], 
            "eighty" => [1,80], "ninety" => [1,90],
            "hundred" => [100,0], "thousand" => [1000,0], 
            "million" => [1000000, 0]}

def text_2_int(string)
  numberWords = string.gsub('-', ' ').split(/ /) - %w{and}
  current = result = 0
  numberWords.each do |word|
    scale, increment = WORDNUMS[word]
    current = current * scale + increment
    if scale > 100
      result += current
      current = 0
    end
  end
  return result + current
end

I was looking to handle strings like "two thousand one hundred and forty-six".

想念有你 2024-07-19 13:37:47

This handles numbers written in the Indian style (lakh, crore), some fractions, combinations of digits and words, and also addition.

def words_to_number(words):
    numbers = {"zero":0, "a":1, "half":0.5, "quarter":0.25, "one":1,"two":2,
               "three":3, "four":4,"five":5,"six":6,"seven":7,"eight":8,
               "nine":9, "ten":10,"eleven":11,"twelve":12, "thirteen":13,
               "fourteen":14, "fifteen":15,"sixteen":16,"seventeen":17,
               "eighteen":18,"nineteen":19, "twenty":20,"thirty":30, "forty":40,
               "fifty":50,"sixty":60,"seventy":70, "eighty":80,"ninety":90}

    groups = {"hundred":100, "thousand":1_000, 
              "lac":1_00_000, "lakh":1_00_000, 
              "million":1_000_000, "crore":10**7, 
              "billion":10**9, "trillion":10**12}
    
    split_at = ["and", "plus"]
    
    n = 0
    skip = False
    words_array = words.split(" ")
    for i, word in enumerate(words_array):
        if not skip:
            if word in groups:
                n*= groups[word]
            elif word in numbers:
                n += numbers[word]
            elif word in split_at:
                skip = True
                remaining = ' '.join(words_array[i+1:])
                n+=words_to_number(remaining)
            else:
                try:
                    n += float(word)
                except ValueError as e:
                    raise ValueError(f"Invalid word {word}") from e
    return n

TEST:

print(words_to_number("a million and one"))
>> 1000001

print(words_to_number("one crore and one"))
>> 10000001

print(words_to_number("0.5 million one"))
>> 500001.0

print(words_to_number("half million and one hundred"))
>> 500100.0

print(words_to_number("quarter"))
>> 0.25

print(words_to_number("one hundred plus one"))
>> 101
£噩梦荏苒 2024-07-19 13:37:47

I found a faster way:

Da_Unità_a_Cifre = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10, 'eleven': 11,
 'twelve': 12, 'thirteen': 13, 'fourteen': 14, 'fifteen': 15, 'sixteen': 16, 'seventeen': 17, 'eighteen': 18, 'nineteen': 19}

Da_Lettere_a_Decine = {"tw": 20, "th": 30, "fo": 40, "fi": 50, "si": 60, "se": 70, "ei": 80, "ni": 90, }

elemento = input("insert the word:")
Val_Num = 0
try:
    elemento = elemento.lower().strip()  # str methods return new strings, so reassign
    Unità = elemento[elemento.find("ty")+2:]  # text after "ty", e.g. "five" in "twentyfive"

    if elemento[-1] == "y":
        Val_Num = int(Da_Lettere_a_Decine[elemento[0] + elemento[1]])
        print(Val_Num)
    elif elemento == "onehundred":
        Val_Num = 100
        print(Val_Num)
    else:
        Cifre_Unità = int(Da_Unità_a_Cifre[Unità])
        Cifre_Decine = int(Da_Lettere_a_Decine[elemento[0] + elemento[1]])
        Val_Num = int(Cifre_Decine + Cifre_Unità)
        print(Val_Num)
except:
    print("invalid input")
谜泪 2024-07-19 13:37:47

It's a cool solution, so I took @recursive's Python code from their answer and with help of ChatGPT I converted it to C# and also simplified it, formatted it, and made it a bit more compact.

Yes, I had to give a ton of instructions to ChatGPT. It took me a while, but here it is.

I believe this version makes the code, and how the algorithm works, clearer and easier to understand:

using System.Collections.Generic;

public class Parser
{
    public static int ParseInt(string s)
    {
        Dictionary<string, (int scale, int increment)> numwords = new Dictionary<string, (int, int)>
        {
            {"and", (1, 0)}, {"zero", (1, 0)}, {"one", (1, 1)}, {"two", (1, 2)}, {"three", (1, 3)},
            {"four", (1, 4)}, {"five", (1, 5)}, {"six", (1, 6)}, {"seven", (1, 7)}, {"eight", (1, 8)},
            {"nine", (1, 9)}, {"ten", (1, 10)}, {"eleven", (1, 11)}, {"twelve", (1, 12)}, {"thirteen", (1, 13)},
            {"fourteen", (1, 14)}, {"fifteen", (1, 15)}, {"sixteen", (1, 16)}, {"seventeen", (1, 17)}, {"eighteen", (1, 18)},
            {"nineteen", (1, 19)}, {"twenty", (1, 20)}, {"thirty", (1, 30)}, {"forty", (1, 40)}, {"fifty", (1, 50)},
            {"sixty", (1, 60)}, {"seventy", (1, 70)}, {"eighty", (1, 80)}, {"ninety", (1, 90)}, {"hundred", (100, 0)},
            {"thousand", (1000, 0)}, {"million", (1000000, 0)}, {"billion", (1000000000, 0)}
        };

        int current = 0;
        int result = 0;

        foreach (string word in s.Replace("-", " ").Split())
        {
            var (scale, increment) = numwords[word];

            current = current * scale + increment;

            if (scale > 100)
            {
                result += current;
                current = 0;
            }
        }

        return result + current;
    }
}
提笔书几行 2024-07-19 13:37:47

This code works for a pandas Series:

import pandas as pd
from word2number import w2n  # provides word_to_num(), used below
mylist = pd.Series(['one','two','three'])
mylist1 = []
for x in range(len(mylist)):
    mylist1.append(w2n.word_to_num(mylist[x]))
print(mylist1)
川水往事 2024-07-19 13:37:47

This code works only for numbers up to 99, converting both word to int and int to word. Handling larger numbers would take another 10-20 lines of simple logic; this is just simple code for beginners:

num = input("Enter the number you want to convert : ")
mydict = {'1': 'One', '2': 'Two', '3': 'Three', '4': 'Four', '5': 'Five','6': 'Six', '7': 'Seven', '8': 'Eight', '9': 'Nine', '10': 'Ten','11': 'Eleven', '12': 'Twelve', '13': 'Thirteen', '14': 'Fourteen', '15': 'Fifteen', '16': 'Sixteen', '17': 'Seventeen', '18': 'Eighteen', '19': 'Nineteen'}
mydict2 = ['', '', 'Twenty', 'Thirty', 'Forty', 'Fifty', 'Sixty', 'Seventy', 'Eighty', 'Ninety']

if num.isdigit():
    if(int(num) < 20):
        print(" :---> " + mydict[num])
    else:
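        var1 = int(num) % 10
        var2 = int(num) // 10  # integer division keeps the tens digit an int
        # Skip the units word for round tens such as 20 or 30
        print(" :---> " + mydict2[var2] + (mydict[str(var1)] if var1 else ""))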
else:
    num = num.lower()
    dict_w = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10, 'eleven': 11, 'twelve': 12, 'thirteen': 13, 'fourteen': 14, 'fifteen': 15, 'sixteen': 16, 'seventeen': 17, 'eighteen': 18, 'nineteen': 19}
    mydict2 = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']
    divide = num[num.find("ty")+2:]
    if num:
        if(num in dict_w.keys()):
            print(" :---> " + str(dict_w[num]))
        elif divide == '' :
            for i in range(len(mydict2)):  # include the last entry of the list
                if mydict2[i] == num:
                    print(" :---> " + str(i * 10))
        else :
            str3 = 0
            str1 = num[num.find("ty")+2:]
            str2 = num[:-len(str1)]
            for i in range(0, len(mydict2)):
                if mydict2[i] == str2:
                    str3 = i
            if str2 not in mydict2:
                print("----->Invalid Input<-----")                
            else:
                try:
                    print(" :---> " + str((str3*10) + dict_w[str1]))
                except:
                    print("----->Invalid Input<-----")
    else:
        print("----->Please Enter Input<-----")
我的奇迹 2024-07-19 13:37:46

The majority of this code is to set up the numwords dict, which is only done on the first call.

def text2int(textnum, numwords={}):
    if not numwords:
      units = [
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen",
      ]

      tens = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

      scales = ["hundred", "thousand", "million", "billion", "trillion"]

      numwords["and"] = (1, 0)
      for idx, word in enumerate(units):    numwords[word] = (1, idx)
      for idx, word in enumerate(tens):     numwords[word] = (1, idx * 10)
      for idx, word in enumerate(scales):   numwords[word] = (10 ** (idx * 3 or 2), 0)

    current = result = 0
    for word in textnum.split():
        if word not in numwords:
          raise Exception("Illegal word: " + word)

        scale, increment = numwords[word]
        current = current * scale + increment
        if scale > 100:
            result += current
            current = 0

    return result + current

print(text2int("seven billion one hundred million thirty one thousand three hundred thirty seven"))
# 7100031337
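
To see why the scale/increment pairs work, it may help to watch the two accumulators change word by word. The sketch below is only an illustration: NUMWORDS is a cut-down, hardcoded table and trace is a hypothetical helper, neither of which is part of the answer above. It records current and result after each word is consumed.

```python
# Cut-down table: word -> (scale, increment), mirroring the numwords dict above.
NUMWORDS = {
    "one": (1, 1), "two": (1, 2), "six": (1, 6), "forty": (1, 40),
    "hundred": (100, 0), "thousand": (1000, 0),
}


def trace(text):
    """Return the per-word (word, current, result) states and the final total."""
    current = result = 0
    steps = []
    for word in text.split():
        scale, increment = NUMWORDS[word]
        current = current * scale + increment
        if scale > 100:
            # Thousand and above close off a group: bank it and start over.
            result += current
            current = 0
        steps.append((word, current, result))
    return steps, result + current


steps, total = trace("two thousand one hundred forty six")
for word, current, result in steps:
    print(f"{word:>9}  current={current:<5} result={result}")
print(total)  # 2146
```

"hundred" multiplies the running group without banking it, which is why "one hundred forty six" stays in current until the end.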
短叹 2024-07-19 13:37:46

I have just released a Python module to PyPI called word2number for exactly this purpose. https://github.com/akshaynagpal/w2n

Install it using:

pip install word2number

Make sure your pip is updated to the latest version.

Usage:

from word2number import w2n

print(w2n.word_to_num("two million three thousand nine hundred and eighty four"))
2003984
单身狗的梦 2024-07-19 13:37:46

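I needed something a bit different, since my input comes from speech-to-text conversion and the right answer is not always to sum the numbers. For example, "my zipcode is one two three four five" should not be converted to "my zipcode is 15".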
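
The zip code case can be sketched on its own. This is a simplified illustration under my own assumptions, not the approach used in the code further down: DIGITS and spoken_digits_to_string are hypothetical names, and only single-digit words are handled. Runs of digit words are concatenated instead of summed, and every other word passes through unchanged.

```python
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def spoken_digits_to_string(text):
    """Replace each run of single-digit words with the concatenated digits."""
    out = []
    run = ""  # digits collected from the current run of digit words
    for word in text.split():
        digit = DIGITS.get(word.lower())
        if digit is not None:
            run += digit
            continue
        if run:
            out.append(run)
            run = ""
        out.append(word)
    if run:
        out.append(run)
    return " ".join(out)


print(spoken_digits_to_string("my zipcode is one two three four five"))
# my zipcode is 12345
```

A lone "one" in ordinary prose would also become "1" here, so a real implementation needs more context than this sketch uses.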

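I took Andrew's answer and tweaked it to handle a few other cases that people had flagged as errors, and also added support for examples like the zip code one I mentioned above. Some basic test cases are shown below, but I'm sure there is still room for improvement.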

def is_number(x):
    if type(x) == str:
        x = x.replace(',', '')
    try:
        float(x)
    except:
        return False
    return True

def text2int (textnum, numwords={}):
    units = [
        'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
        'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen',
        'sixteen', 'seventeen', 'eighteen', 'nineteen',
    ]
    tens = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']
    scales = ['hundred', 'thousand', 'million', 'billion', 'trillion']
    ordinal_words = {'first':1, 'second':2, 'third':3, 'fifth':5, 'eighth':8, 'ninth':9, 'twelfth':12}
    ordinal_endings = [('ieth', 'y'), ('th', '')]

    if not numwords:
        numwords['and'] = (1, 0)
        for idx, word in enumerate(units): numwords[word] = (1, idx)
        for idx, word in enumerate(tens): numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)

    textnum = textnum.replace('-', ' ')

    current = result = 0
    curstring = ''
    onnumber = False
    lastunit = False
    lastscale = False

    def is_numword(x):
        # Check the argument itself rather than the enclosing loop variable
        if is_number(x):
            return True
        if x in numwords:
            return True
        return False

    def from_numword(x):
        if is_number(x):
            scale = 0
            increment = int(x.replace(',', ''))
            return scale, increment
        return numwords[x]

    for word in textnum.split():
        if word in ordinal_words:
            scale, increment = (1, ordinal_words[word])
            current = current * scale + increment
            if scale > 100:
                result += current
                current = 0
            onnumber = True
            lastunit = False
            lastscale = False
        else:
            for ending, replacement in ordinal_endings:
                if word.endswith(ending):
                    word = "%s%s" % (word[:-len(ending)], replacement)

            if (not is_numword(word)) or (word == 'and' and not lastscale):
                if onnumber:
                    # Flush the current number we are building
                    curstring += repr(result + current) + " "
                curstring += word + " "
                result = current = 0
                onnumber = False
                lastunit = False
                lastscale = False
            else:
                scale, increment = from_numword(word)
                onnumber = True

                if lastunit and (word not in scales):
                    # Assume this is part of a string of individual numbers to
                    # be flushed, such as a zipcode "one two three four five"
                    curstring += repr(result + current)
                    result = current = 0

                if scale > 1:
                    current = max(1, current)

                current = current * scale + increment
                if scale > 100:
                    result += current
                    current = 0

                lastscale = False
                lastunit = False
                if word in scales:
                    lastscale = True
                elif word in units:
                    lastunit = True

    if onnumber:
        curstring += repr(result + current)

    return curstring

Some tests...

one two three -> 123
three forty five -> 345
three and forty five -> 3 and 45
three hundred and forty five -> 345
three hundred -> 300
twenty five hundred -> 2500
three thousand and six -> 3006
three thousand six -> 3006
nineteenth -> 19
twentieth -> 20
first -> 1
my zip is one two three four five -> my zip is 12345
nineteen ninety six -> 1996
fifty-seventh -> 57
one million -> 1000000
first hundred -> 100
I will buy the first thousand -> I will buy the 1000  # probably should leave ordinal in the string
thousand -> 1000
hundred and six -> 106
1 million -> 1000000

I needed something a bit different since my input is from a speech-to-text conversion and the solution is not always to sum the numbers. For example, "my zipcode is one two three four five" should not convert to "my zipcode is 15".

I took Andrew's answer and tweaked it to handle a few other cases people highlighted as errors, and also added support for examples like the zipcode one I mentioned above. Some basic test cases are shown below, but I'm sure there is still room for improvement.

def is_number(x):
    # Accept numeric strings such as "1,000" as well as plain numbers.
    if isinstance(x, str):
        x = x.replace(',', '')
    try:
        float(x)
    except (TypeError, ValueError):
        return False
    return True

def text2int (textnum, numwords={}):
    units = [
        'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
        'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen',
        'sixteen', 'seventeen', 'eighteen', 'nineteen',
    ]
    tens = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']
    scales = ['hundred', 'thousand', 'million', 'billion', 'trillion']
    ordinal_words = {'first':1, 'second':2, 'third':3, 'fifth':5, 'eighth':8, 'ninth':9, 'twelfth':12}
    ordinal_endings = [('ieth', 'y'), ('th', '')]

    if not numwords:
        numwords['and'] = (1, 0)
        for idx, word in enumerate(units): numwords[word] = (1, idx)
        for idx, word in enumerate(tens): numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)

    textnum = textnum.replace('-', ' ')

    current = result = 0
    curstring = ''
    onnumber = False
    lastunit = False
    lastscale = False

    def is_numword(x):
        if is_number(x):
            return True
        # Check the argument itself, not the enclosing loop variable `word`.
        if x in numwords:
            return True
        return False

    def from_numword(x):
        if is_number(x):
            scale = 0
            increment = int(x.replace(',', ''))
            return scale, increment
        return numwords[x]

    for word in textnum.split():
        if word in ordinal_words:
            scale, increment = (1, ordinal_words[word])
            current = current * scale + increment
            if scale > 100:
                result += current
                current = 0
            onnumber = True
            lastunit = False
            lastscale = False
        else:
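            for ending, replacement in ordinal_endings:
                if word.endswith(ending):
                    # Strip the suffix only if what remains is a number, so
                    # ordinary words such as "month" or "with" stay intact.
                    candidate = "%s%s" % (word[:-len(ending)], replacement)
                    if candidate in numwords or is_number(candidate):
                        word = candidate
                        break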

            if (not is_numword(word)) or (word == 'and' and not lastscale):
                if onnumber:
                    # Flush the current number we are building
                    curstring += repr(result + current) + " "
                curstring += word + " "
                result = current = 0
                onnumber = False
                lastunit = False
                lastscale = False
            else:
                scale, increment = from_numword(word)
                onnumber = True

                if lastunit and (word not in scales):
                    # Assume this is part of a string of individual numbers to
                    # be flushed, such as a zipcode "one two three four five"
                    curstring += repr(result + current)
                    result = current = 0

                if scale > 1:
                    current = max(1, current)

                current = current * scale + increment
                if scale > 100:
                    result += current
                    current = 0

                lastscale = False
                lastunit = False
                if word in scales:
                    lastscale = True
                elif word in units:
                    lastunit = True

    if onnumber:
        curstring += repr(result + current)

    return curstring

Some tests...

one two three -> 123
three forty five -> 345
three and forty five -> 3 and 45
three hundred and forty five -> 345
three hundred -> 300
twenty five hundred -> 2500
three thousand and six -> 3006
three thousand six -> 3006
nineteenth -> 19
twentieth -> 20
first -> 1
my zip is one two three four five -> my zip is 12345
nineteen ninety six -> 1996
fifty-seventh -> 57
one million -> 1000000
first hundred -> 100
I will buy the first thousand -> I will buy the 1000  # probably should leave ordinal in the string
thousand -> 1000
hundred and six -> 106
1 million -> 1000000
爱情眠于流年 2024-07-19 13:37:46

If anyone is interested, I hacked up a version that maintains the rest of the string (though it may have bugs, haven't tested it too much).

def text2int (textnum, numwords={}):
    if not numwords:
        units = [
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen",
        ]

        tens = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

        scales = ["hundred", "thousand", "million", "billion", "trillion"]

        numwords["and"] = (1, 0)
        for idx, word in enumerate(units):  numwords[word] = (1, idx)
        for idx, word in enumerate(tens):       numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)

    ordinal_words = {'first':1, 'second':2, 'third':3, 'fifth':5, 'eighth':8, 'ninth':9, 'twelfth':12}
    ordinal_endings = [('ieth', 'y'), ('th', '')]

    textnum = textnum.replace('-', ' ')

    current = result = 0
    curstring = ""
    onnumber = False
    for word in textnum.split():
        if word in ordinal_words:
            scale, increment = (1, ordinal_words[word])
            current = current * scale + increment
            if scale > 100:
                result += current
                current = 0
            onnumber = True
        else:
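            for ending, replacement in ordinal_endings:
                if word.endswith(ending):
                    # Strip the suffix only if what remains is a number word,
                    # so ordinary words such as "month" or "with" stay intact.
                    candidate = "%s%s" % (word[:-len(ending)], replacement)
                    if candidate in numwords:
                        word = candidate
                        break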

            if word not in numwords:
                if onnumber:
                    curstring += repr(result + current) + " "
                curstring += word + " "
                result = current = 0
                onnumber = False
            else:
                scale, increment = numwords[word]

                current = current * scale + increment
                if scale > 100:
                    result += current
                    current = 0
                onnumber = True

    if onnumber:
        curstring += repr(result + current)

    return curstring

Example:

 >>> text2int("I want fifty five hot dogs for two hundred dollars.")
 I want 55 hot dogs for 200 dollars.

There could be issues if you have, say, "$200". But, this was really rough.
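
One way to soften the "$200" problem is to tokenize the input before looking words up, so that symbols and trailing punctuation become separate tokens instead of sticking to a word. The following is only a rough sketch under that assumption; `TOKEN_RE` and `tokenize` are hypothetical names that do not appear in the answer above, and the pattern covers only ASCII letters, digits with optional thousands separators, and single leftover characters.

```python
import re

# Hypothetical helper (not part of the answer above): split text into
# hyphenated words, digit runs like "1,000" or "3.5", and single symbols.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*|\d+(?:,\d{3})*(?:\.\d+)?|\S")


def tokenize(text):
    """Return the tokens of `text` in order, dropping whitespace."""
    return TOKEN_RE.findall(text)


print(tokenize("I want fifty-five hot dogs for $200."))
# ['I', 'want', 'fifty-five', 'hot', 'dogs', 'for', '$', '200', '.']
```

With tokens like these, "dollars." no longer hides the word "dollars" from the lookup, and "$" and "200" can be handled separately. Rejoining the tokens with the original spacing is left out of this sketch.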

假装不在乎 2024-07-19 13:37:46

I needed to handle a couple of extra parsing cases, such as ordinal words ("first", "second"), hyphenated words ("one-hundred"), and hyphenated ordinal words ("fifty-seventh"), so I added a couple of lines:

def text2int(textnum, numwords={}):
    if not numwords:
        units = [
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen",
        ]

        tens = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

        scales = ["hundred", "thousand", "million", "billion", "trillion"]

        numwords["and"] = (1, 0)
        for idx, word in enumerate(units):  numwords[word] = (1, idx)
        for idx, word in enumerate(tens):       numwords[word] = (1, idx * 10)
        for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)

    ordinal_words = {'first':1, 'second':2, 'third':3, 'fifth':5, 'eighth':8, 'ninth':9, 'twelfth':12}
    ordinal_endings = [('ieth', 'y'), ('th', '')]

    textnum = textnum.replace('-', ' ')

    current = result = 0
    for word in textnum.split():
        if word in ordinal_words:
            scale, increment = (1, ordinal_words[word])
        else:
            for ending, replacement in ordinal_endings:
                if word.endswith(ending):
                    word = "%s%s" % (word[:-len(ending)], replacement)

            if word not in numwords:
                raise Exception("Illegal word: " + word)

            scale, increment = numwords[word]

        current = current * scale + increment
        if scale > 100:
            result += current
            current = 0

    return result + current
芸娘子的小脾气 2024-07-19 13:37:46

Here's the trivial case approach:

>>> number = {'one':1,
...           'two':2,
...           'three':3,}
>>> 
>>> number['two']
2

Or are you looking for something that can handle "twelve thousand, one hundred seventy-two"?

§普罗旺斯的薰衣草 2024-07-19 13:37:46
def parse_int(string):
    ONES = {'zero': 0,
            'one': 1,
            'two': 2,
            'three': 3,
            'four': 4,
            'five': 5,
            'six': 6,
            'seven': 7,
            'eight': 8,
            'nine': 9,
            'ten': 10,
            'eleven': 11,
            'twelve': 12,
            'thirteen': 13,
            'fourteen': 14,
            'fifteen': 15,
            'sixteen': 16,
            'seventeen': 17,
            'eighteen': 18,
            'nineteen': 19,
            'twenty': 20,
            'thirty': 30,
            'forty': 40,
            'fifty': 50,
            'sixty': 60,
            'seventy': 70,
            'eighty': 80,
            'ninety': 90,
              }

    numbers = []
    for token in string.replace('-', ' ').split(' '):
        if token in ONES:
            numbers.append(ONES[token])
        elif token == 'hundred':
            numbers[-1] *= 100
        elif token == 'thousand':
            numbers = [x * 1000 for x in numbers]
        elif token == 'million':
            numbers = [x * 1000000 for x in numbers]
    return sum(numbers)

Tested with 700 random numbers in the range 1 to one million; it works well.
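
Above one million the list-based scaling seems to run into trouble: 'thousand' and 'million' multiply every number collected so far, so for "one million two thousand" the million is scaled a second time. A possible alternative, sketched here with hypothetical names (`SMALL`, `BIG`, `parse_int_grouped`) that are not from the answer above, keeps a running group of up to three digits and adds it to a total whenever a scale word closes the group.

```python
# Hypothetical lookup tables for this sketch.
SMALL = {word: value for value, word in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
SMALL.update({word: (i + 2) * 10 for i, word in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())})
BIG = {"thousand": 10 ** 3, "million": 10 ** 6, "billion": 10 ** 9}


def parse_int_grouped(text):
    total = 0  # sum of the groups already closed by a scale word
    group = 0  # the group currently being built, e.g. "one hundred seventy two"
    for token in text.lower().replace("-", " ").replace(",", " ").split():
        if token == "and":
            continue
        if token in SMALL:
            group += SMALL[token]
        elif token == "hundred":
            group = max(group, 1) * 100  # a bare "hundred" counts as 100
        elif token in BIG:
            total += max(group, 1) * BIG[token]  # close the group
            group = 0
        else:
            raise ValueError("not a number word: " + token)
    return total + group


print(parse_int_grouped("one million two thousand"))                  # 1002000
print(parse_int_grouped("twelve thousand, one hundred seventy-two"))  # 12172
```

Unlike `parse_int`, this sketch raises on unknown words instead of skipping them; whether that is preferable depends on the input.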

乱了心跳 2024-07-19 13:37:46

Make use of the Python package: WordToDigits

pip install wordtodigits

It can find numbers present in word form in a sentence and then convert them to the proper numeric format. Also takes care of the decimal part, if present. The word representation of numbers could be anywhere in the passage.

纵情客 2024-07-19 13:37:46

This could easily be hardcoded into a dictionary if there's a limited set of numbers you'd like to parse.

For slightly more complex cases, you'll probably want to generate this dictionary automatically, based on the relatively simple numbers grammar. Something along the lines of this (of course, generalized...)

for i in range(10):
   myDict[30 + i] = "thirty-" + singleDigitsDict[i]
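
Generalized, that idea could look like the sketch below. It builds the word-to-number direction, which is what the question needs, for 0 through 99. The names `UNITS`, `TENS`, `build_word_to_number` and `WORD_TO_NUMBER` are assumptions made for this example; `myDict` and `singleDigitsDict` are not defined in the snippet above.

```python
# Hypothetical word lists for this sketch.
UNITS = ("zero one two three four five six seven eight nine ten eleven twelve "
         "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def build_word_to_number():
    """Map every number word from "zero" to "ninety-nine" to its value."""
    table = {word: value for value, word in enumerate(UNITS)}
    for i, tens_word in enumerate(TENS):
        tens_value = (i + 2) * 10
        table[tens_word] = tens_value
        # Start at 1 so that no "thirty-zero" style entries are generated.
        for digit in range(1, 10):
            table[tens_word + "-" + UNITS[digit]] = tens_value + digit
    return table


WORD_TO_NUMBER = build_word_to_number()
print(WORD_TO_NUMBER["thirty-seven"])  # 37
print(len(WORD_TO_NUMBER))             # 100
```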

If you need something more extensive, then it looks like you'll need natural language processing tools. This article might be a good starting point.
