Is there a way to convert number words to integers?
I need to convert "one" into 1, "two" into 2, and so on.
Is there a way to do this with a library, a class, or anything else?
I was looking for a library that would support all of the above plus more edge cases, such as ordinal numbers (first, second), bigger numbers, operators, etc., and I found numwords-to-nums.
You can install it via:
Here's a basic example:
I made a change so that text2int(scale) returns the correct conversion, e.g. text2int("hundred") => 100.
A quick solution is to use inflect.py to generate a dictionary for translation.
inflect.py has a number_to_words() function that turns a number (e.g. 2) into its word form (e.g. 'two'). Unfortunately, its reverse (which would let you avoid the translation dictionary route) isn't offered. All the same, you can use that function to build the translation dictionary:
If you're willing to commit some time, it might be possible to examine the inner workings of inflect.py's number_to_words() function and build your own code to do this dynamically (I haven't tried to do this).
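The dictionary-building step mentioned in this answer might look like the sketch below. It relies on the number_to_words() function the answer names; build_word_map and spell_small are hypothetical names introduced here, and spell_small is only a stand-in so the sketch runs without inflect installed.

```python
def build_word_map(to_words, upper=1000):
    """Invert a number-to-words function into a words-to-number dict.

    to_words is any callable mapping an int to its word form, for example
    inflect.engine().number_to_words (assumption: inflect is installed).
    """
    return {to_words(n): n for n in range(upper)}


def spell_small(n):
    """Hypothetical stand-in for number_to_words(), valid for 0..20 only."""
    names = ("zero one two three four five six seven eight nine ten eleven "
             "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
             "nineteen twenty").split()
    return names[n]


word_map = build_word_map(spell_small, upper=21)
print(word_map["two"])       # 2
print(word_map["nineteen"])  # 19

# With inflect installed, the same helper would be driven like this:
#   import inflect
#   word_map = build_word_map(inflect.engine().number_to_words)
```

The lookup only knows the exact spellings the generator produces, so input would need to be normalized to the same form (hyphens, "and", lowercase) before the lookup.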
There's a Ruby gem by Marc Burns that does it. I recently forked it to add support for years. You can call Ruby code from Python.
Results:
"fifteen sixteen" => 1516
"eighty five sixteen" => 8516
"nineteen ninety six" => 1996
"one hundred and seventy nine" => 179
"thirteen hundred" => 1300
"nine thousand two hundred and ninety seven" => 9297
I took @recursive's logic and converted it to Ruby. I've also hardcoded the lookup table, so it's not as cool, but it might help a newbie understand what is going on.
I was looking to handle strings like "two thousand one hundred and forty-six".
This handles numbers written in Indian-style words, some fractions, combinations of digits and words, and also addition.
Test:
I found a faster way:
It's a cool solution, so I took @recursive's Python code from their answer and, with the help of ChatGPT, converted it to C#. I also simplified it, formatted it, and made it a bit more compact.
Yes, I had to give ChatGPT a ton of instructions. It took me a while, but here it is.
I believe this version makes the code, and how the algorithm works, clearer and easier to understand:
This code works for a data series:
This code only works for numbers below 99, in both directions (word to int and int to word). Supporting larger numbers would take another 10-20 lines of code and some simple logic. This is just simple code for beginners:
Most of this code sets up the numwords dict, which is only done on the first call.
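The set-up-once pattern described here can be sketched as follows. This is an illustrative reconstruction under assumptions, not the answer's original code: the names _NUMWORDS and _build_numwords are hypothetical, and the lookup table is cached at module level and filled on the first call to text2int.

```python
_NUMWORDS = {}  # word -> (scale, increment); filled lazily on the first call


def _build_numwords():
    units = ("zero one two three four five six seven eight nine ten eleven "
             "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
             "nineteen").split()
    tens = "twenty thirty forty fifty sixty seventy eighty ninety".split()
    scales = ["hundred", "thousand", "million", "billion", "trillion"]

    _NUMWORDS["and"] = (1, 0)  # filler word: leaves the running value unchanged
    for value, word in enumerate(units):
        _NUMWORDS[word] = (1, value)
    for idx, word in enumerate(tens, start=2):
        _NUMWORDS[word] = (1, idx * 10)
    for idx, word in enumerate(scales):
        # hundred -> 10**2, thousand -> 10**3, million -> 10**6, ...
        _NUMWORDS[word] = (10 ** (idx * 3 or 2), 0)


def text2int(text):
    """Convert a phrase such as 'one hundred and five' to an int."""
    if not _NUMWORDS:
        _build_numwords()

    current = result = 0
    for word in text.lower().replace("-", " ").split():
        if word not in _NUMWORDS:
            raise ValueError("Illegal word: " + word)
        scale, increment = _NUMWORDS[word]
        current = current * scale + increment
        if scale > 100:
            # thousand and above close the current group
            result += current
            current = 0
    return result + current


print(text2int("two thousand one hundred and forty-six"))  # 2146
```

Each word carries a multiplier and an addend, so the loop needs no special cases beyond flushing the running group whenever a scale above one hundred is seen.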
I have just released a Python module to PyPI called word2number for exactly this purpose. https://github.com/akshaynagpal/w2n
Install it using:
Make sure your pip is updated to the latest version.
Usage:
I needed something a bit different, since my input comes from a speech-to-text conversion and the right answer is not always to sum the numbers. For example, "my zipcode is one two three four five" should not convert to "my zipcode is 15".
I took Andrew's answer and tweaked it to handle a few other cases people highlighted as errors, and also added support for examples like the zipcode one I mentioned above. Some basic test cases are shown below, but I'm sure there is still room for improvement.
Some tests...
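The zipcode case from this answer can be illustrated with a much smaller sketch than the answer's full solution. It assumes that a run of single-digit words should be concatenated rather than summed; the names DIGIT_WORDS and join_spoken_digits are hypothetical, and larger number words are deliberately left untouched.

```python
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def join_spoken_digits(sentence):
    """Replace each run of single-digit words with the concatenated digits."""
    out = []
    run = []  # digits collected from the current run of digit words
    for word in sentence.split():
        digit = DIGIT_WORDS.get(word.lower())
        if digit is not None:
            run.append(digit)
            continue
        if run:  # a non-digit word ends the run
            out.append("".join(run))
            run = []
        out.append(word)
    if run:  # flush a run that reaches the end of the sentence
        out.append("".join(run))
    return " ".join(out)


print(join_spoken_digits("my zipcode is one two three four five"))
# my zipcode is 12345
```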
If anyone is interested, I hacked up a version that preserves the rest of the string (though it may have bugs; I haven't tested it much).
Example:
There could be issues if you have, say, "$200". But this was really rough.
I needed to handle a couple of extra parsing cases, such as ordinal words ("first", "second"), hyphenated words ("one-hundred"), and hyphenated ordinal words (like "fifty-seventh"), so I added a couple of lines:
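A minimal sketch of that kind of preprocessing, not the answer's actual lines, might look like this. It assumes the input is already known to be a number phrase, because stripping a trailing "th" would also mangle ordinary words such as "with". The names ORDINAL_WORDS, ORDINAL_ENDINGS and to_cardinal_words are hypothetical.

```python
# Irregular ordinals that cannot be fixed by trimming a suffix.
ORDINAL_WORDS = {
    "first": "one", "second": "two", "third": "three", "fifth": "five",
    "eighth": "eight", "ninth": "nine", "twelfth": "twelve",
}
# Regular suffixes, checked in order: "twentieth" -> "twenty", "fourth" -> "four".
ORDINAL_ENDINGS = [("ieth", "y"), ("th", "")]


def to_cardinal_words(phrase):
    """Split a number phrase into plain cardinal words."""
    words = []
    # Treat hyphens as spaces so "fifty-seventh" becomes two words.
    for word in phrase.lower().replace("-", " ").split():
        if word in ORDINAL_WORDS:
            word = ORDINAL_WORDS[word]
        else:
            for ending, replacement in ORDINAL_ENDINGS:
                if word.endswith(ending):
                    word = word[:-len(ending)] + replacement
                    break
        words.append(word)
    return words


print(to_cardinal_words("fifty-seventh"))  # ['fifty', 'seven']
print(to_cardinal_words("one-hundred"))    # ['one', 'hundred']
```

The resulting word list can then be passed to whatever cardinal-word parser is already in use.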
Here's the trivial case approach:
Or are you looking for something that can handle "twelve thousand, one hundred seventy-two"?
Tested with 700 random numbers in the range 1 to one million; it works well.
Make use of the Python package WordToDigits.
It can find numbers written in word form in a sentence and convert them to the proper numeric format. It also takes care of the decimal part, if present. The word representation of a number can appear anywhere in the passage.
This could easily be hardcoded into a dictionary if there's a limited set of numbers you'd like to parse.
For slightly more complex cases, you'll probably want to generate this dictionary automatically, based on the relatively simple grammar of number words. Something along the lines of this (of course, generalized...)
If you need something more extensive, then it looks like you'll need natural language processing tools. This article might be a good starting point.
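The automatically generated dictionary suggested for the slightly more complex case could be sketched as follows. It is only an illustration covering 0 to 99 and assumes hyphenated spellings such as "forty-two"; the names UNITS, TENS and generate_number_words are hypothetical.

```python
UNITS = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()


def generate_number_words():
    """Build {'zero': 0, ..., 'ninety-nine': 99} from the units/tens grammar."""
    table = {word: value for value, word in enumerate(UNITS)}
    for idx, ten in enumerate(TENS, start=2):
        table[ten] = idx * 10
        # Compound forms: a tens word, a hyphen, and a unit from one to nine.
        for unit_value in range(1, 10):
            table[f"{ten}-{UNITS[unit_value]}"] = idx * 10 + unit_value
    return table


number_words = generate_number_words()
print(number_words["forty-two"])  # 42
print(len(number_words))          # 100
```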