Character encoding lost when processing a .txt file of country names with PHP
I am processing a text file of country names from ISO-3166 to extract just the country names into an array. The problem is that when I output the array, the special characters in some country names are lost or changed:
$country_1 = fopen("./country_names_iso_3166.txt", "r");
while( !feof($country_1) ) { // go through every line of file
    $country = fgets( $country_1 );
    if( strstr($country, ",") ) // if country name contains a comma
        $i = strpos( $country, "," ); // index position of comma
    else
        $i = strpos( $country, ";" ); // index position of semicolon
    $country = substr( $country, 0, $i ); // extract just the country name
    $countries[] = $country;
}
For example, when I output the array, the second country name should be ÅLAND ISLANDS, but it is printed as LAND ISLANDS. Please advise on how to fix this.
Comments (2)
Try using the multibyte-aware string functions instead: mb_strstr(), mb_strpos(), mb_substr() (basically just prefix the function names with mb_).
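As a rough illustration of the multibyte-aware approach, here is a minimal sketch of the extraction step rewritten with mb_strpos() and mb_substr(). It assumes the input lines are UTF-8 encoded and that the mbstring extension is loaded; the helper name extract_country_name() and the sample line are hypothetical and not from the original question. Note that these functions only change how offsets are counted, so they would not by themselves fix a mismatch between the file's encoding and the output encoding.

```php
<?php
// Assumption: input lines are UTF-8; adjust $encoding if the file differs.
// extract_country_name() is a hypothetical helper for illustration only.
function extract_country_name(string $line): string
{
    $encoding = 'UTF-8';

    // Prefer the comma as delimiter, as in the original loop.
    $i = mb_strpos($line, ',', 0, $encoding);
    if ($i === false) {
        // Otherwise fall back to the semicolon.
        $i = mb_strpos($line, ';', 0, $encoding);
    }
    if ($i === false) {
        // No delimiter at all: return the line without surrounding whitespace.
        return trim($line);
    }

    // Offsets are counted in characters here, not bytes.
    return mb_substr($line, 0, $i, $encoding);
}

// Hypothetical sample line in "NAME;CODE" form.
echo extract_country_name("ÅLAND ISLANDS;AX"), "\n";
```

Passing the encoding explicitly avoids relying on the mbstring internal encoding setting, which can vary between installations.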
Make sure the stream you are writing the data to uses the same character set as the input file.
(Removed mistake of saying that ISO-3166 is a charset)
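One way to line up the two character sets is to convert each line to the output encoding and declare that encoding when sending the page. The sketch below assumes, purely for illustration, that the file is ISO-8859-1 and the output should be UTF-8; the actual encoding of the file is not stated in the question and should be checked first. The helper name to_output_encoding() is hypothetical.

```php
<?php
// Assumptions (not confirmed by the question): the file is ISO-8859-1
// and the page should be served as UTF-8.
$sourceEncoding = 'ISO-8859-1';
$outputEncoding = 'UTF-8';

// Hypothetical helper: convert text from the file's encoding to the output encoding.
function to_output_encoding(string $text, string $from, string $to): string
{
    // mb_convert_encoding() takes the target encoding before the source encoding.
    return mb_convert_encoding($text, $to, $from);
}

// Declare the output charset before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . $outputEncoding);
}

// "\xC5" is the single byte for "Å" in ISO-8859-1.
$raw = "\xC5LAND ISLANDS";
echo to_output_encoding($raw, $sourceEncoding, $outputEncoding), "\n";
```

If the file turns out to be UTF-8 already, the conversion step can be dropped and only the charset declaration is needed.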