Using a regular expression to split() values in a CSV file

Posted on 2024-11-10 08:24:59

I have a CSV file which I am parsing.

I am using split() to split the columns up by their commas.

The problem is that it is splitting columns that contain commas within the field.

The solution is to use a regular expression in the split to disregard commas with a space after them (EG: ", ") and only split commas with no trailing space (EG: ",").

Right now my split looks like this:

$div = ',';
split('$div',$line);

How would I modify my split() call?

Comments (4)

只是我以为 2024-11-17 08:24:59

To parse a complete and valid CSV file with PHP you just need:

$data = array_map("str_getcsv", file($fn));

But if your file format is really not consistent, then you would indeed need the manual split method and a more specific regex.

preg_split('/,(?!\s)/', $line)

would be the regex you can use to match commas that are not followed by a space. Note that you need to use preg_split from the PCRE library, and not the older split call.
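As a minimal sketch of how that lookahead behaves, assume a hypothetical line in which every "internal" comma is followed by a space. The sample data and variable names below are illustrative and do not come from the question:

```php
<?php
// Hypothetical sample line: the second field contains ", " internally.
$line = 'A,B1, B2,C,D';

// Split only on commas that are NOT followed by a whitespace character.
$fields = preg_split('/,(?!\s)/', $line);

print_r($fields);
```

Here the comma inside `B1, B2` is left alone, so the line yields four fields. One thing to watch: `\s` also matches a newline, so a line read with file() that still ends in ",\n" would not be split at that last comma. Trimming the line ending first (for example with rtrim) sidesteps this.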

允世 2024-11-17 08:24:59

The CSV file's fields (especially if fields have commas in them) should be encapsulated in quotes:

 "A","B1,B2","C","D"

If they are not, then that ambiguity is your first problem:

 A,B1,B2,C,D

has five fields, and there's nothing you can do about it [1].

When you have your source data sorted out, use fgetcsv to parse it.
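To make the ambiguity concrete, here is a small sketch that parses both sample lines from above. It uses str_getcsv on strings rather than fgetcsv on a file handle, purely so the example is self-contained; the variable names are illustrative:

```php
<?php
// Quoted fields: the comma inside "B1,B2" is protected by the enclosure.
$quoted = '"A","B1,B2","C","D"';
// Unquoted: nothing tells the parser that B1 and B2 belong together.
$unquoted = 'A,B1,B2,C,D';

// Delimiter, enclosure and escape are passed explicitly; these are the
// traditional defaults, spelled out so nothing relies on them implicitly.
$fromQuoted = str_getcsv($quoted, ',', '"', '\\');
$fromUnquoted = str_getcsv($unquoted, ',', '"', '\\');

echo count($fromQuoted), " fields\n";   // 4
echo count($fromUnquoted), " fields\n"; // 5
```

The quoted line comes back as four fields with `B1,B2` intact, while the unquoted one is split into five.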


[1] If this is really true:

The solution is to use a regular expression in the split to disregard commas with a space after them (EG: ", ") and only split commas with no trailing space (EG: ",").

that all your "internal" commas have spaces after them, then you could run a pre-processing step, replacing all ,<space> with \,. Escaping the commas within CSV resolves the ambiguity:

A,B1\,B2,C,D
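A rough sketch of that pre-processing idea follows, assuming (as the footnote does) that every internal comma is followed by a space. It splits by hand on unescaped commas rather than assuming any particular CSV parser understands the `\,` convention. The sample line and variable names are hypothetical, and the arrow function needs PHP 7.4 or later:

```php
<?php
// Hypothetical input where every "internal" comma is followed by a space.
$line = 'A,B1, B2,C,D';

// Pre-processing step: turn each ", " into an escaped comma "\,".
$escaped = str_replace(', ', '\\,', $line);   // A,B1\,B2,C,D

// Split on commas that are not preceded by a backslash...
$fields = preg_split('/(?<!\\\\),/', $escaped);

// ...then drop the escape character again.
$fields = array_map(
    fn(string $f): string => str_replace('\\,', ',', $f),
    $fields
);

print_r($fields);
```

Note that the space after the internal comma is lost in this variant, because the replacement follows the footnote literally. Replacing with `\, ` instead would preserve it.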

无边思念无边月 2024-11-17 08:24:59

I have a CSV file which I am parsing.

You're reinventing the wheel: PHP has fine methods of accomplishing this by itself, namely fgetcsv:

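$row = 1; // line counter; must be initialised before the loop
if (($handle = fopen("test.csv", "r")) !== FALSE) {
    // Read one record at a time; fgetcsv handles quoted fields itself.
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $num = count($data);
        echo "<p> $num fields in line $row: <br /></p>\n";
        $row++;
        // Print each field of the current record on its own line.
        for ($c=0; $c < $num; $c++) {
            echo $data[$c] . "<br />\n";
        }
    }
    fclose($handle);
}

乖乖哒 2024-11-17 08:24:59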

When you write the file, always output each field as a quoted string, like this:

$outstr .='"'.$line->linename.'",';
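If you control how the file is written, an alternative to concatenating quotes by hand is to let PHP's fputcsv add the enclosures. This is a swapped-in technique, not what the answer above uses. The sketch below writes to an in-memory stream, and the row contents are hypothetical:

```php
<?php
// Hypothetical row; the second field contains a comma.
$row = ['A', 'B1, B2', 'C'];

// Write to an in-memory stream so the result can be inspected as a string.
$handle = fopen('php://memory', 'w+');
fputcsv($handle, $row, ',', '"', '\\');
rewind($handle);
$csvLine = stream_get_contents($handle);
fclose($handle);

echo $csvLine;

// Round trip: parsing the written line gives the original fields back.
$parsed = str_getcsv(rtrim($csvLine, "\n"), ',', '"', '\\');
```

Unlike the hand-built string, fputcsv also deals with fields that themselves contain a double quote, which plain concatenation would leave unescaped.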
