Any elegant ideas on how to parse this dataset?

Posted 2024-10-05 09:20:17

I'm using PHP 5.3 to receive a Dataset from a web service call that brings back information on one or many transactions. Each transaction's return values are delimited by a pipe (|), and beginning/ending of a transaction is delimited by a space.

2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z 2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z 2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z (etc)

Doing a simple split on space doesn't work because of the space in the datetime stamp. I know regex well enough to know that there are always different ways to break this down, so I thought getting a few expert opinions would help me come up with the most airtight regex.

Comments (4)

风苍溪 2024-10-12 09:20:17

If each timestamp is going to have a Z at the end, you can use a positive lookbehind assertion to split on a space only if it's preceded by a Z:

$transaction = preg_split('/(?<=Z) /',$input);

Once you get the transactions, you can split them on | to get the individual parts.
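
To make the two steps concrete, here is a minimal sketch that runs the split above on the sample data from the question and then breaks each transaction into its fields. The variable name $rows is only illustrative, and array() is used instead of [] so the code stays valid on PHP 5.3.

```php
<?php
// Sample input copied from the question.
$input = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
       . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z '
       . '2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z';

// Step 1: split on a space only when it is preceded by "Z".
$transactions = preg_split('/(?<=Z) /', $input);

// Step 2: split each transaction on the pipe to get its fields.
$rows = array();
foreach ($transactions as $transaction) {
    $rows[] = explode('|', $transaction);
}

// Show the six fields of the first transaction.
print_r($rows[0]);
```

Each entry of $rows should then hold six strings, with the timestamp (including its internal space) intact in the last one.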

Codepad link

Note that if your data has a Z followed by a space anywhere other than at the end of the timestamp, the above logic will fail. To overcome that, you can split on a space only if it's preceded by a timestamp pattern:

$transaction = preg_split('/(?<=\d\d:\d\d:\d\dZ) /',$input);
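
The difference between the two patterns only shows up when a field happens to contain "Z " itself. The sketch below assumes a hypothetical value, "BIZ TRAVEL", in the fifth field (the question's data only ever shows NSF there) and compares how many pieces each pattern produces.

```php
<?php
// Hypothetical input: the fifth field of the first record contains "Z "
// ("BIZ TRAVEL"). This value is invented for illustration only.
$input = '2109695|49658|25446|4|BIZ TRAVEL|2010-11-24 13:34:00Z '
       . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z';

// Loose pattern: any space preceded by "Z" is a record boundary.
$loose = preg_split('/(?<=Z) /', $input);

// Strict pattern: the space must follow hh:mm:ssZ.
$strict = preg_split('/(?<=\d\d:\d\d:\d\dZ) /', $input);

echo count($loose), "\n";   // 3: the first record is cut inside "BIZ TRAVEL"
echo count($strict), "\n";  // 2: one piece per record
```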

过期以后 2024-10-12 09:20:17

As others have said, if you know for sure that there will be no Z characters anywhere other than in the date, you could just do:

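// Split on "Z " (the Z plus the separating space); this drops the trailing Z from every record but the last.
$records = explode('Z ', $data);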

But if you have them elsewhere, you'll need to do something a bit fancier.

$regex = '#(?<=\d{2}:\d{2}:\d{2}Z)\s#i';
$records = preg_split($regex, $data, -1, PREG_SPLIT_NO_EMPTY);

Basically, that regex looks for the time portion (00:00:00) followed by a Z, and then splits on the whitespace character that follows it.
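
One place where \s and PREG_SPLIT_NO_EMPTY make a visible difference is trailing whitespace. The sketch below assumes the response ends with a newline (the question does not say whether it does) and compares the result with and without the flag.

```php
<?php
// Assumed input: two records from the question plus a trailing newline.
$data = "2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z "
      . "2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z\n";

$regex = '#(?<=\d{2}:\d{2}:\d{2}Z)\s#i';

// Without the flag, the split at the final newline leaves an empty string.
$withEmpty = preg_split($regex, $data);

// With the flag, empty pieces are dropped.
$withoutEmpty = preg_split($regex, $data, -1, PREG_SPLIT_NO_EMPTY);

echo count($withEmpty), "\n";     // 3, the last element is ''
echo count($withoutEmpty), "\n";  // 2
```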

少钕鈤記 2024-10-12 09:20:17

Each timestamp is going to have a Z at the end, so explode it by 'Z '. You don't need a regular expression. There's no chance of a Z appearing after the date, only after the time.

example
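
A minimal sketch of this approach on the question's sample data follows. One side effect worth knowing about: explode() consumes the delimiter, so every record except the last loses its trailing Z. The loop that puts it back is an addition for illustration, not part of the answer above.

```php
<?php
// Sample input copied from the question.
$data = '2109695|49658|25446|4|NSF|2010-11-24 13:34:00Z '
      . '2110314|45276|26311|4|NSF|2010-11-24 13:34:00Z '
      . '2110311|52117|26308|4|NSF|2010-11-24 13:34:00Z';

// Split on the literal two-character sequence "Z ".
$records = explode('Z ', $data);

// Every record except the last has lost its trailing "Z"; put it back.
$last = count($records) - 1;
foreach ($records as $i => $record) {
    if ($i < $last) {
        $records[$i] = $record . 'Z';
    }
}

print_r($records);
```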

宁愿没拥抱 2024-10-12 09:20:17

Use the explode('|', $data) function.
