Using a regular expression to split() values in a CSV file
I have a CSV file which I am parsing.
I am using split() to split the columns up by their commas.
The problem is that it is splitting columns that contain commas within the field.
The solution I have in mind is to use a regular expression in the split that ignores commas with a space after them (e.g. ", ") and splits only on commas with no trailing space (e.g. ",").
Right now my split looks like this:
$div = ',';
split('$div',$line);
How would I modify my split() call?
To parse a complete and valid CSV file with PHP you just need:
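A minimal sketch of what such a call could look like, assuming the data is well-formed CSV with quoted fields. The in-memory stream and the sample rows are made up for illustration; a real script would fopen() its own file.

```php
<?php
// Hypothetical sample data; a real script would fopen() the CSV file instead.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "id,name,city\n1,\"Smith, John\",Boston\n");
rewind($handle);

$rows = [];
// fgetcsv() reads one line and returns its fields as an array,
// keeping commas that sit inside a quoted field.
while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $rows[] = $fields;
}
fclose($handle);

print_r($rows);
```

The quoted field "Smith, John" comes back as a single value, so no custom splitting is needed when the file is valid CSV.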
But if your file format is really not consistent, then you would indeed need the manual split method and a more specific regex.
What you would need is a regex that matches commas that are not followed by a space. Note that you need to use preg_split from the PCRE library, and not the older split call.
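As a sketch of the manual approach, assuming every comma inside a field is followed by a space and no real separator is: the pattern `/,(?! )/` uses a negative lookahead to match only commas that are not followed by a space. The pattern and the sample line are illustrative and not taken from the original answer.

```php
<?php
// Hypothetical input line: commas inside fields are always followed by a space.
$line = 'Smith, John,42,Boston, MA';

// ,(?! ) matches a comma only when the next character is not a space.
$fields = preg_split('/,(?! )/', $line);

print_r($fields);
```

Note that split() was deprecated in PHP 5.3 and removed in PHP 7.0, so preg_split is the only option of the two on current versions.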
The CSV file's fields (especially if fields have commas in them) should be encapsulated in quotes:
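For illustration, a quoted line might look like the made-up sample below. str_getcsv() parses a single string using the same rules fgetcsv() applies to a file, which makes it handy for checking one line.

```php
<?php
// Hypothetical line: the field containing a comma is wrapped in double quotes.
$line = 'widget,"red, large",9.99';

// The comma inside the quotes is treated as data, not as a separator.
$fields = str_getcsv($line, ',', '"', '\\');

print_r($fields);
```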
If they are not, then that ambiguity is your first problem:
has five fields, and there's nothing you can do about it [1].
When you have your source data sorted out, use fgetcsv to parse it.
[1] If this is really true, that all your "internal" commas have spaces after them, then you could run a pre-processing step, replacing all ,<space> with \, . Escaping the commas within CSV resolves the ambiguity.
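One way such a pre-processing step might be sketched is shown below. Note that it swaps the backslash escaping described above for quoting, since quoted fields are what fgetcsv() understands directly. It assumes, as the question does, that only "internal" commas are followed by a space; the sample line is hypothetical.

```php
<?php
// Hypothetical raw line: commas inside fields are always followed by a space.
$raw = 'Smith, John,42,Boston, MA';

// Split only on commas that are not followed by a space.
$fields = preg_split('/,(?! )/', $raw);

// Re-emit the fields as standard CSV; fputcsv() adds quotes where needed.
$out = fopen('php://memory', 'w+');
fputcsv($out, $fields, ',', '"', '\\');
rewind($out);
$clean = stream_get_contents($out);
fclose($out);

echo $clean;
```

The cleaned line can then be read back with fgetcsv() or str_getcsv() without any ambiguity.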
You're reinventing the wheel: PHP has fine methods of accomplishing this by itself, namely fgetcsv:
Always use them as a string, like this: