Floating Point Number parsing: Is there a Catch All algorithm?
One of the fun parts of multi-cultural programming is number formats.
- Americans use 10,000.50
- Germans use 10.000,50
- French use 10 000,50
My first approach would be to take the string, parse it backwards until I encounter a separator and use this as my decimal separator. There is an obvious flaw with that: 10.000 would be interpreted as 10.
Another approach: if the string contains 2 different non-numeric characters, use the last one as the decimal separator and discard the others. If there is only one, check whether it occurs more than once and discard it if it does. If it appears only once, check whether exactly 3 digits follow it. If so, discard it; otherwise, use it as the decimal separator.
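The second approach can be sketched in C# roughly as follows. This is only an illustration of the heuristic as described: the `SeparatorGuess` class and its `Parse` method are hypothetical names, and the sketch assumes unsigned input containing ASCII digits plus separator characters.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SeparatorGuess
{
    private static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Hypothetical helper implementing the heuristic from the paragraph above.
    // Assumes unsigned input: a leading '-' would be counted as a separator.
    public static double Parse(string input)
    {
        string s = input.Trim();

        // Find the last non-digit and note whether a second, different one exists.
        int lastIndex = -1;
        bool twoKinds = false;
        for (int i = 0; i < s.Length; i++)
        {
            if (IsAsciiDigit(s[i])) continue;
            if (lastIndex >= 0 && s[lastIndex] != s[i]) twoKinds = true;
            lastIndex = i;
        }

        int decimalIndex = -1;
        if (twoKinds)
        {
            // Two different separators: the last one is the decimal separator.
            decimalIndex = lastIndex;
        }
        else if (lastIndex >= 0)
        {
            bool occursOnce = s.IndexOf(s[lastIndex]) == lastIndex;
            bool threeDigitsFollow = s.Length - lastIndex - 1 == 3;
            // One separator, seen once, not followed by exactly three digits:
            // treat it as the decimal separator. Otherwise it is dropped.
            if (occursOnce && !threeDigitsFollow) decimalIndex = lastIndex;
        }

        // Keep the digits, emit '.' at the chosen position, drop everything else.
        var normalized = new StringBuilder();
        for (int i = 0; i < s.Length; i++)
        {
            if (IsAsciiDigit(s[i])) normalized.Append(s[i]);
            else if (i == decimalIndex) normalized.Append('.');
        }

        // Throws FormatException if no digits remain.
        return double.Parse(normalized.ToString(),
            NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture);
    }
}

public static class Program
{
    public static void Main()
    {
        string[] samples = { "10,000.50", "10.000,50", "10 000,50", "10.000", "10,5" };
        foreach (string sample in samples)
        {
            double value = SeparatorGuess.Parse(sample);
            Console.WriteLine(sample + " -> " + value.ToString(CultureInfo.InvariantCulture));
        }
    }
}
```

With this sketch the first three samples all come out as 10000.5, "10,5" becomes 10.5, and "10.000" becomes 10000. That last result is still a guess: someone who meant ten with three decimal places gets the wrong value.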
The obvious "best solution" would be to detect the User's culture or Browser, but that does not work if you have a Frenchman using an en-US Windows/Browser.
Does the .NET Framework contain some mythical black magic floating point parser that is better than Double.(Try)Parse() at trying to auto-detect the number format?
Comments (4)
You can't please everyone. If I enter ten as 10.000, and someone enters ten thousand as 10.000, you cannot handle that without some knowledge of the culture of the input. Detect the culture somehow (browser, system setting - what is the use case? ASP? Internal app, or open to the world?), or provide an example of the expected formatting, and use the most lenient parser you can. Probably something like:
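As a hedged sketch of that idea, the following parses with `NumberStyles.Any` under an explicitly chosen culture. The `LenientParse` helper and the culture names passed to it are illustrative assumptions, not taken from the answer, and the results depend on the culture data available on the machine.

```csharp
using System;
using System.Globalization;

public static class LenientParse
{
    // Hypothetical helper: parse with the most permissive NumberStyles value
    // under a culture that the caller has detected or asked the user for.
    public static bool TryParse(string input, string cultureName, out double value)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return double.TryParse(input, NumberStyles.Any, culture, out value);
    }
}

public static class Program
{
    public static void Main()
    {
        double value;

        // The same text is read differently depending on the culture supplied.
        if (LenientParse.TryParse("10.000", "en-US", out value))
            Console.WriteLine("en-US: " + value.ToString(CultureInfo.InvariantCulture));

        if (LenientParse.TryParse("10.000", "de-DE", out value))
            Console.WriteLine("de-DE: " + value.ToString(CultureInfo.InvariantCulture));
    }
}
```

On a typical installation, "10.000" is read as 10 under en-US and as 10000 under de-DE, which is exactly why the parser needs to be told the culture rather than left to guess it.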
The difference between 12.345 in French and English is a factor of 1000. If you supply an expected range where max < 1000*min, you can easily guess.
Take for example the height of a person (including babies and children) in mm.
By using a range of 200-3000, an input of 1.800 or 1,800 can unambiguously be interpreted as 1 meter and 80 centimeters, whereas an input of 912.300 or 912,300 can unambiguously be interpreted as 91 centimeters and 2.3 millimeters.
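The range idea could look something like the sketch below. `RangeGuess.Parse` is a hypothetical helper, it assumes only ',' and '.' occur as separators, and it adds one assumption of its own: the thousands reading is only considered when exactly three digits follow the last separator.

```csharp
using System;
using System.Globalization;

public static class RangeGuess
{
    // Hypothetical helper: returns the reading of `input` that lies in [min, max],
    // or null when neither reading fits or two different readings both fit.
    // Assumes only ',' and '.' appear as separators.
    public static double? Parse(string input, double min, double max)
    {
        string s = input.Trim();
        double asDecimal;
        double asGrouping = 0;

        // Reading 1: the separator is a decimal separator.
        bool okDecimal = double.TryParse(s.Replace(',', '.'),
            NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture, out asDecimal);

        // Reading 2: the separator groups thousands, which is only plausible
        // when exactly three digits follow the last separator.
        int last = s.LastIndexOfAny(new[] { ',', '.' });
        bool okGrouping = (last < 0 || s.Length - last - 1 == 3)
            && double.TryParse(s.Replace(",", "").Replace(".", ""),
                NumberStyles.None, CultureInfo.InvariantCulture, out asGrouping);

        bool fitsDecimal = okDecimal && asDecimal >= min && asDecimal <= max;
        bool fitsGrouping = okGrouping && asGrouping >= min && asGrouping <= max;

        if (fitsDecimal && fitsGrouping)
        {
            // Both fit: unambiguous only if they are the same number (no separator).
            if (asDecimal == asGrouping) return asDecimal;
            return null;
        }
        if (fitsDecimal) return asDecimal;
        if (fitsGrouping) return asGrouping;
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        string[] samples = { "1.800", "1,800", "912.300", "912,300", "12.345" };
        foreach (string sample in samples)
        {
            // Height in mm, using the 200-3000 range from the example above.
            double? height = RangeGuess.Parse(sample, 200, 3000);
            Console.WriteLine(sample + " -> " + (height.HasValue
                ? height.Value.ToString(CultureInfo.InvariantCulture)
                : "no reading in range"));
        }
    }
}
```

Here "1.800" and "1,800" both resolve to 1800, "912.300" and "912,300" both resolve to 912.3, and "12.345" is rejected because neither 12.345 nor 12345 is a plausible height in millimetres.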
I don't know the ASP.NET side of the problem but .NET has a pretty powerful class: System.Globalization.CultureInfo. You can use the following code to parse a string containing a double value:
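A minimal sketch of such a call, assuming the input should be interpreted under the culture currently active on the thread. The `CultureParse` wrapper is a hypothetical name introduced here, and what the program prints depends on the machine's settings.

```csharp
using System;
using System.Globalization;

public static class CultureParse
{
    // Hypothetical wrapper around double.TryParse with an explicit culture.
    public static bool TryParse(string input, CultureInfo culture, out double value)
    {
        return double.TryParse(input, NumberStyles.Number, culture, out value);
    }
}

public static class Program
{
    public static void Main()
    {
        string input = "100.20";
        double value;

        // Interpret the input using the current thread's formatting culture.
        if (CultureParse.TryParse(input, CultureInfo.CurrentCulture, out value))
            Console.WriteLine("CurrentCulture: " + value.ToString(CultureInfo.InvariantCulture));

        // Or interpret it using the current UI culture.
        if (CultureParse.TryParse(input, CultureInfo.CurrentUICulture, out value))
            Console.WriteLine("CurrentUICulture: " + value.ToString(CultureInfo.InvariantCulture));
    }
}
```

Note that `CurrentCulture` is the one .NET itself uses by default for formatting and parsing numbers, while `CurrentUICulture` drives resource lookup, so the two can differ on the same machine.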
If ASP.NET somehow (i.e. using HTTP Request headers) passes current user's CultureInfo to either CultureInfo.CurrentCulture or CultureInfo.CurrentUICulture, these will work fine.
I think the best you can do in this case is to take their input and then show them what you think they meant. If they disagree, show them the format you're expecting and get them to enter it again.