Any elegant ideas on how to parse this dataset?
I'm using PHP 5.3 to receive a dataset from a web service call that brings back information on one or more transactions. Each transaction's return values are delimited by a pipe (|), and the beginning/end of each transaction is delimited by a space.
2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z 2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z 2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z (etc)
Doing a simple split on space doesn't work because of the space in the datetime stamp. I know regex well enough to know that there are always different ways to break this down, so I thought getting a few expert opinions would help me come up with the most airtight regex.
Comments (4)
If each timestamp is going to have a Z at the end, you can use a positive lookbehind assertion to split on a space only when it is preceded by a Z. Once you have the transactions, you can split each one on | to get the individual parts.
Note that if your data has a Z followed by a space anywhere other than in the timestamp, the above logic will fail. To overcome that, you can split on a space only when it is preceded by a timestamp pattern.
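
The code that went with this answer is not shown here, so what follows is only a minimal sketch of the two splits it describes, not necessarily the answerer's exact code. The variable names ($input, $simple, $strict, $rows) are chosen for illustration, the sample string is the one from the question, and array() is used instead of short array syntax so the sketch stays compatible with PHP 5.3.

```php
<?php
// Sample input copied from the question (three transactions).
$input = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
       . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z '
       . '2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z';

// Simple variant: split on a space only when a Z comes right before it.
$simple = preg_split('/(?<=Z) /', $input);

// Stricter variant: split on a space only when a full
// "YYYY-MM-DD hh:mm:ssZ" timestamp comes right before it.
// The lookbehind has a fixed length, which PCRE requires.
$strict = preg_split('/(?<=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}Z) /', $input);

// Break each transaction into its pipe-delimited fields.
$rows = array();
foreach ($strict as $transaction) {
    $rows[] = explode('|', $transaction);
}

print_r($rows);
```

On the sample data both variants yield the same three transactions; they only differ once a Z followed by a space shows up outside a timestamp.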

As others have said, if you know for sure that there will be no Z characters anywhere other than in the date, you could just split after each Z. But if Z can appear elsewhere, you'll need to do something a bit fancier.
Basically, the pattern looks for the time portion (00:00:00) followed by a Z. Then it splits on the whitespace character that follows.
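
To make the difference concrete, here is a rough sketch under stated assumptions. The "XYZ CORP" field is invented purely to show a Z followed by a space outside the timestamp and does not come from the question's data. The variable names are also illustrative, and the pattern is one plausible reading of "time portion followed by a Z", not necessarily the exact regex from this answer.

```php
<?php
// Hypothetical input: "XYZ CORP" is an invented field value containing
// a Z followed by a space; it is not from the question's data.
$data = '2109695|49658|25446|4|XYZ CORP|2010-11-24 13:34:00Z '
      . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z';

// Splitting after any Z also cuts the first record in the middle of "XYZ CORP".
$naive = preg_split('/(?<=Z)\s/', $data);

// Requiring hh:mm:ssZ before the whitespace splits only at record boundaries.
$records = preg_split('/(?<=\d{2}:\d{2}:\d{2}Z)\s/', $data);

echo count($naive), "\n";   // prints 3 (one record was cut in two)
echo count($records), "\n"; // prints 2
```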

Each timestamp is going to have a Z at the end, so explode the string on 'Z ' (a Z followed by a space). You don't need a regular expression. The Z only ever follows the time, never the date.
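
A minimal sketch of this approach, using the sample string from the question and illustrative variable names. One thing to watch for: explode() discards the delimiter, so every piece except the last loses its trailing Z. The loop below puts it back, which is an addition of this sketch and not part of the answer.

```php
<?php
// Sample input copied from the question (three transactions).
$input = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
       . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z '
       . '2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z';

// Split on the two-character delimiter "Z" + space.
$parts = explode('Z ', $input);

// explode() drops the delimiter, so restore the Z on all but the last piece.
$last = count($parts) - 1;
foreach ($parts as $i => $part) {
    if ($i < $last) {
        $parts[$i] = $part . 'Z';
    }
}

print_r($parts);
```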

Use the explode('|', $data) function.
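
As a small sketch of what that call returns (variable names are illustrative, data is from the question): applied to a single transaction it gives the six fields, but applied to the whole dataset it fuses the timestamp of one transaction with the first field of the next, so the transactions still need to be separated first.

```php
<?php
// One transaction, copied from the question's sample data.
$transaction = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z';
$fields = explode('|', $transaction);
print_r($fields); // six fields, the last one being the timestamp

// Two transactions in one string, as the web service returns them.
$data = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
      . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z';
$all = explode('|', $data);

// The record boundary is not a pipe, so it ends up inside one element.
echo $all[5], "\n"; // prints "2010-11-24 13:34:00Z 2110314"
```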