Parsing scientific notation sensibly?
I want to be able to write a function which receives a number in scientific notation as a string and splits out of it the coefficient and the exponent as separate items. I could just use a regular expression, but the incoming number may not be normalised and I'd prefer to be able to normalise and then break the parts out.
A colleague has got part of the way to a solution using VB6, but it's not quite there, as the transcript below shows.
cliVe> a = 1e6
cliVe> ? "coeff: " & o.spt(a) & " exponent: " & o.ept(a)
coeff: 10 exponent: 5
should have been 1 and 6
cliVe> a = 1.1e6
cliVe> ? "coeff: " & o.spt(a) & " exponent: " & o.ept(a)
coeff: 1.1 exponent: 6
correct
cliVe> a = 123345.6e-7
cliVe> ? "coeff: " & o.spt(a) & " exponent: " & o.ept(a)
coeff: 1.233456 exponent: -2
correct
cliVe> a = -123345.6e-7
cliVe> ? "coeff: " & o.spt(a) & " exponent: " & o.ept(a)
coeff: 1.233456 exponent: -2
should be -1.233456 and -2
cliVe> a = -123345.6e+7
cliVe> ? "coeff: " & o.spt(a) & " exponent: " & o.ept(a)
coeff: 1.233456 exponent: 12
correct
Any ideas? By the way, Clive is a CLI based on VBScript and can be found on my weblog.
Googling "scientific notation regexp" turns up a number of matches, including this one (http://www.regular-expressions.info/floatingpoint.html; don't use it!), which uses
which includes cases such as -.5e7 and +00000e33 (both of which you may not want to allow).
Instead, I would highly recommend you use the syntax on Doug Crockford's JSON website which explicitly documents what constitutes a number in JSON. Here's the corresponding syntax diagram taken from that page:
(source: json.org)
If you look at line 456 of his json2.js script (safe conversion to/from JSON in JavaScript), you'll see this portion of a regexp:
which, ironically, doesn't match his syntax diagram (looks like I should file a bug). I believe a regexp that does implement that syntax diagram is this one:
and if you want to allow an initial + as well, you get:
Add capturing parentheses to your liking.
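As a rough illustration of what "capturing parentheses" could look like for the original question, here is a minimal sketch. It assumes the JSON number grammar plus an optional leading +, and it normalises on the digit string rather than on a floating-point value. The names SCI_NUMBER and splitScientific are made up for this example and do not come from json2.js.

```javascript
// Sketch only: JSON-style number grammar with an optional leading sign.
// Groups: 1 = sign, 2 = integer part, 3 = fraction digits, 4 = exponent.
const SCI_NUMBER = /^([+\-]?)(0|[1-9]\d*)(?:\.(\d+))?(?:[eE]([+\-]?\d+))?$/;

// Hypothetical helper: returns { coefficient, exponent } or null.
function splitScientific(text) {
  const m = SCI_NUMBER.exec(text);
  if (m === null) return null; // not a number under this grammar

  const sign = m[1] === "-" ? "-" : "";
  const intPart = m[2];
  const fracPart = m[3] || "";
  let exponent = parseInt(m[4] || "0", 10);

  // Work on the digit string so no floating-point rounding creeps in.
  const digits = intPart + fracPart;
  const firstNonZero = digits.search(/[1-9]/);
  if (firstNonZero === -1) return { coefficient: "0", exponent: 0 };

  // Moving the decimal point to just after the first non-zero digit
  // shifts the exponent by (intPart.length - 1 - firstNonZero).
  exponent += intPart.length - 1 - firstNonZero;

  // Trailing zeros are dropped here; keep them if significant figures matter.
  const trimmed = digits.slice(firstNonZero).replace(/0+$/, "");
  const tail = trimmed.slice(1);
  const coefficient = sign + trimmed[0] + (tail ? "." + tail : "");
  return { coefficient, exponent };
}

console.log(splitScientific("1e6"));
console.log(splitScientific("-123345.6e-7"));
```

Working on the string keeps the sign and avoids the rounding that a numeric round trip could introduce. Inputs such as -.5e7 or +00000e33 return null under this grammar.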
I would also highly recommend you flesh out a bunch of test cases, to ensure you include those possibilities you want to include (or not include), such as:
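The list of cases from the original answer is not reproduced here. As a stand-in, the following is a small table-driven check. The inputs and expected results are my own assumptions, written against a JSON-style pattern with an optional leading sign, and should be adjusted to whatever you decide to accept.

```javascript
// Assumed pattern: JSON number grammar plus an optional leading "+".
const JSON_STYLE_NUMBER = /^[+\-]?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+\-]?\d+)?$/;

// [input, shouldMatch] pairs; illustrative choices, not an authoritative list.
const cases = [
  ["1e6", true],
  ["1.1e6", true],
  ["123345.6e-7", true],
  ["-123345.6e+7", true],
  ["0.5E-3", true],
  ["-.5e7", false],     // no digit before the decimal point
  ["+00000e33", false], // left-padded zeros
  ["3.", false],        // no digit after the decimal point
  [".9", false],
  ["e9", false],
  ["1e", false],
  ["", false],
];

// Collect every case whose actual result differs from the expectation.
const failures = cases.filter(
  ([text, expected]) => JSON_STYLE_NUMBER.test(text) !== expected
);

console.log(failures.length === 0 ? "all cases behave as expected" : failures);
```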
Good luck!
Building off of the highest rated answer, I modified the regex slightly to be
/^[+\-]?(?=.)(?:0|[1-9]\d*)?(?:\.\d*)?(?:\d[eE][+\-]?\d+)?$/
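The benefits this provides are:
- it matches numbers like .9 (I made the (?:0|[1-9]\d*) optional with ?)
- the look-ahead (?=.) requires at least one character after the optional sign, so an empty string or a bare sign does not match
- it does not match e9, because it requires the \d before the scientific notation
My goal in this is to use it for capturing significant figures and doing significant math. So I'm also going to slice it up with capturing groups like so: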
/^[+\-]?(?=.)(0|[1-9]\d*)?(\.\d*)?(?:(\d)[eE][+\-]?\d+)?$/
An explanation of how to get significant figures from this:
- the entire match is the number that can be handed to parseFloat()
- joining the capture groups (replacing undefined's with '') should give the original number from which significant figures can be extracted
This regex also prevents matching left-padded zeros, which JavaScript sometimes accepts but which I have seen cause issues and which add nothing to significant figures, so I see preventing left-padded zeros as a benefit (especially in forms). However, I'm sure the regex could be modified to gobble up left-padded zeros.
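To make the capture-group idea concrete, here is a minimal sketch that uses the capturing regex quoted above. The helper name sliceNumber and the shape of its return value are assumptions made for illustration.

```javascript
// The capturing regex from this answer, unchanged.
const SIG_FIG_PATTERN = /^[+\-]?(?=.)(0|[1-9]\d*)?(\.\d*)?(?:(\d)[eE][+\-]?\d+)?$/;

// Hypothetical helper: returns the parsed value and the significand text.
function sliceNumber(text) {
  const m = SIG_FIG_PATTERN.exec(text);
  if (m === null) return null;

  // Groups that did not take part in the match come back as undefined,
  // so treat them as empty strings before joining.
  const significand = [m[1], m[2], m[3]]
    .map((group) => (group === undefined ? "" : group))
    .join("");

  // m[0] is the entire match, which parseFloat() accepts directly.
  return { value: parseFloat(m[0]), significand };
}

console.log(sliceNumber("1.1e6"));
console.log(sliceNumber("123345.6e-7"));
console.log(sliceNumber("90.e9"));
```

Note that the sign sits outside the capture groups, so the joined text is unsigned. Backtracking can also leave the digit before the e in the third group rather than the first or second, which is why all three are joined.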
Another problem I see with this regex is that it won't match 90.e9 or other such numbers. However, I find this or similar input highly unlikely, as the convention in scientific notation is to avoid such numbers. Though you can enter it in JavaScript, you can just as easily enter 9.0e10 and achieve the same significant figures.
UPDATE
In my testing, I also caught the error that it could match '.'. So the look-ahead should be modified to (?=\.\d|\d), which leads to the final regex:
/^[+\-]?(?=\.\d|\d)(?:0|[1-9]\d*)?(?:\.\d*)?(?:\d[eE][+\-]?\d+)?$/
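The effect of changing the look-ahead from (?=.) to (?=\.\d|\d) can be checked with a short sketch. The two patterns below are the original regex and the same regex with only the look-ahead swapped; the constant names and the sample inputs are my own.

```javascript
// Same pattern twice; only the look-ahead differs.
const WITH_ANY_CHAR = /^[+\-]?(?=.)(?:0|[1-9]\d*)?(?:\.\d*)?(?:\d[eE][+\-]?\d+)?$/;
const WITH_DIGIT = /^[+\-]?(?=\.\d|\d)(?:0|[1-9]\d*)?(?:\.\d*)?(?:\d[eE][+\-]?\d+)?$/;

// Print how each sample fares under both patterns.
for (const text of [".", "-.", ".9", "1e6", "3."]) {
  console.log(
    JSON.stringify(text),
    "before:", WITH_ANY_CHAR.test(text),
    "after:", WITH_DIGIT.test(text)
  );
}
```

With the stricter look-ahead, a lone "." (with or without a sign) no longer matches, while .9 and 1e6 still do. A trailing decimal point as in 3. is still accepted by both patterns.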
Building on @Troy Weber, I would suggest
to avoid matching 3. (per @Jason S's rules).
Here is some Perl code I just hacked together quickly.