How do I specify the encoding when processing a CSV file in PHP?

Posted 2024-08-17 17:45:33

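<?php
$row = 1;
$handle = fopen("test.csv", "r");
if ($handle === false) {
    exit("Could not open test.csv\n");
}
// fgetcsv() returns false at end of file or on error.
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    $num = count($data);
    print "<p> $num fields in line $row: <br>\n";
    $row++;
    for ($c = 0; $c < $num; $c++) {
        print $data[$c] . "<br>\n";
    }
}
fclose($handle);
?>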

The code above comes from the PHP manual, but I don't see where to specify the encoding (UTF-8, for example).



写给空气的情书 2024-08-24 17:45:33

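Try changing the locale.

As the note below the example in the manual page you cited says:

Note: Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8,
files in one-byte encoding are read wrong by this function.

An approach suggested in the comments on the same page:

setlocale(LC_ALL, 'ja_JP.UTF8'); // for japanese locale

From the setlocale() documentation:

Locale names can be found in RFC 1766 and ISO 639. Different systems have
different naming schemes for locales. [...] On Windows, setlocale(LC_ALL, '') sets the
locale names from the system's regional/language settings (accessible via Control Panel).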
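
As a hedged sketch of the suggestion above: setlocale() returns false when none of the requested locales is installed, so it is worth checking the result before calling fgetcsv(). The candidate locale names below are assumptions, since the available spellings depend on the system. LC_CTYPE is used instead of LC_ALL only to limit the change to character handling, on the assumption that this is the part of the locale that matters for fgetcsv().

```php
<?php
// Remember the current LC_CTYPE setting so it can be restored afterwards.
// Passing '0' only queries the locale; it does not change it.
$previous = setlocale(LC_CTYPE, '0');

// Candidate names are assumptions: spellings differ between systems.
$candidates = ['en_US.UTF-8', 'en_US.utf8', 'C.UTF-8'];

// setlocale() tries each name in order and returns the one it applied,
// or false if none of them is available.
$applied = setlocale(LC_CTYPE, $candidates);

if ($applied === false) {
    echo "No UTF-8 locale available; still using: $previous\n";
} else {
    echo "Locale for character handling set to: $applied\n";
}

// ... open the file and loop over fgetcsv() here ...

// Restore the original setting when done.
if ($previous !== false) {
    setlocale(LC_CTYPE, $previous);
}
```

If the check reports that no UTF-8 locale is available, the locale has to be installed on the system first; changing the string passed to setlocale() will not help by itself.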


街道布景 2024-08-24 17:45:33

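One thing to watch for is the UTF byte order mark (BOM). The BOM is the code point U+FEFF, which UTF-8 encodes as three bytes (0xef, 0xbb and 0xbf) at the very beginning of the text file. In UTF-16 it indicates the byte order. In UTF-8 it is not really necessary.

So you need to detect those three bytes and remove the BOM. Below is a simplified example of how to do that.

$str = file_get_contents('file.utf8.csv');
if ($str === false) {
    exit("Could not read file.utf8.csv\n");
}
$bom = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF
if (strncmp($str, $bom, 3) === 0) {
    echo "BOM detected - file is UTF-8\n";
    $str = substr($str, 3);
}

That's all.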
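
To connect this with the fgetcsv() loop from the question, the same check can be done on an open file handle rather than on a string. The sketch below is only an illustration: skipUtf8Bom() is a hypothetical helper name, and php://memory with made-up sample data stands in for a real CSV file.

```php
<?php
// Hypothetical helper: consume a leading UTF-8 BOM from a seekable stream.
// Returns true when a BOM was found and skipped.
function skipUtf8Bom($handle): bool
{
    $start = ftell($handle);
    if (fread($handle, 3) === "\xEF\xBB\xBF") {
        return true;            // pointer now sits just after the BOM
    }
    fseek($handle, $start);     // no BOM: go back to where we started
    return false;
}

// Sample data (an assumption for the demo): a BOM followed by two CSV lines.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "\xEF\xBB\xBFname,city\nZoë,Kraków\n");
rewind($handle);

$hadBom = skipUtf8Bom($handle);

$rows = [];
// Enclosure and escape characters are passed explicitly.
while (($data = fgetcsv($handle, 1000, ",", '"', "\\")) !== false) {
    $rows[] = $data;
}
fclose($handle);

// Without the skip, the first field would start with the three BOM bytes.
var_dump($hadBom, $rows[0][0] === 'name');
```

Working on the handle avoids loading the whole file into memory, which is the main reason to prefer it over file_get_contents() for large CSV files.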


小耗子 2024-08-24 17:45:33

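Try this. utf8_encode() converts ISO-8859-1 (Latin-1) text to UTF-8, so it only helps if the file is in that encoding:

<?php
$handle = fopen("specialchars.csv", "r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr>';
while (($data = fgetcsv($handle, 1000, ";")) !== false) {
    // Convert every field from ISO-8859-1 to UTF-8.
    // utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding() (mbstring) is an alternative.
    $data = array_map("utf8_encode", $data); //added
    echo "<tr>";
    foreach ($data as $field) {
        // output data
        echo "<td>$field</td>";
    }
    echo "</tr>";
}
echo "</table>";
fclose($handle);
?>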
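
One caveat with converting every field unconditionally is that text which is already UTF-8 gets encoded a second time and comes out garbled. A possible guard, sketched below, converts a field only when it is not already valid UTF-8. toUtf8() is a hypothetical helper, the ISO-8859-1 default is an assumption about the source file, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper: return $field as UTF-8, converting from $sourceEncoding
// only when the bytes are not already valid UTF-8.
function toUtf8(string $field, string $sourceEncoding = 'ISO-8859-1'): string
{
    if (mb_check_encoding($field, 'UTF-8')) {
        return $field;          // already valid UTF-8 (plain ASCII included)
    }
    return mb_convert_encoding($field, 'UTF-8', $sourceEncoding);
}

// "\xE9" is "é" in ISO-8859-1; "\xC3\xA9" is the same letter in UTF-8.
$latin1Row = ["Ren\xE9", "Dupont"];
$utf8Row   = ["Ren\xC3\xA9", "Dupont"];

$a = array_map('toUtf8', $latin1Row);
$b = array_map('toUtf8', $utf8Row);

// Both rows should end up as the same UTF-8 bytes.
var_dump($a === $b);
```

This is a heuristic, not a real encoding detector: a Latin-1 byte sequence that happens to be valid UTF-8 is left untouched, so knowing the file's actual encoding is still the more reliable route.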
