Input is not proper UTF-8, indicate encoding! Bytes: 0xA0 0x20 0x42 0x72 in - Google Geocoder
Can you please help me out on this? I have a big list with addresses to geocode and it keeps giving this error:
Warning: simplexml_load_file() [function.simplexml-load-file]:
http://maps.google.com/maps/geo?output=xml&key=KEY&q=928+Broadway%A0+Brooklyn%2C+11206+%2C+:3:
parser error : Input is not proper UTF-8, indicate encoding ! Bytes:
0xA0 0x20 0x42 0x72 in
Is there a way to solve this?
Comments (3)
Your input is not a UTF-8 document. 0xA0 would be a continuation byte of a 2- to 4-byte sequence (0xA0 is 10100000; continuation bytes start with 10, all initial bytes of multi-byte sequences start with 11, and all one-byte characters start with a zero), but here it appears as the leading byte. This likely means that your document is either corrupted (according to the XML definitions, it is not well formed) or was created using a codepage (or, very unlikely, UTF-16).
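To make the bit patterns concrete, here is a minimal PHP sketch that classifies a byte by its leading bits. The function name utf8ByteRole is made up for illustration, and the check looks only at the bit pattern (it does not reject overlong or out-of-range lead bytes such as 0xC0 or 0xF5).

```php
<?php
// Hypothetical helper: report the role a byte value would play in UTF-8,
// judged purely by its leading bits.
function utf8ByteRole(int $byte): string
{
    if (($byte & 0x80) === 0x00) return 'single';       // 0xxxxxxx
    if (($byte & 0xC0) === 0x80) return 'continuation'; // 10xxxxxx
    if (($byte & 0xE0) === 0xC0) return 'lead-2';       // 110xxxxx
    if (($byte & 0xF0) === 0xE0) return 'lead-3';       // 1110xxxx
    if (($byte & 0xF8) === 0xF0) return 'lead-4';       // 11110xxx
    return 'invalid';
}

// The four bytes quoted in the parser error.
foreach ([0xA0, 0x20, 0x42, 0x72] as $b) {
    printf("0x%02X -> %s\n", $b, utf8ByteRole($b));
}
```

For the bytes in the error message, 0xA0 is reported as a continuation byte with no lead byte before it, which is exactly what the parser rejects; 0x20, 0x42 and 0x72 are ordinary single-byte characters.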
You will have to inform your XML parser how to translate characters outside the 0-127 ASCII range, or remove the errant byte sequences as you see fit.
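One way to do the translation yourself before parsing is sketched below. It assumes the mbstring and SimpleXML extensions are available and, more importantly, that any input that is not valid UTF-8 is really ISO-8859-1; that guess is an assumption, not something the error message proves. The helper name toUtf8 and the sample XML are invented for the example.

```php
<?php
// Hypothetical helper: return $raw as valid UTF-8.
// If it is not already valid UTF-8, ASSUME it is in $assumedSource
// (ISO-8859-1 by default) and convert it.
function toUtf8(string $raw, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already fine, leave untouched
    }
    return mb_convert_encoding($raw, 'UTF-8', $assumedSource);
}

// Made-up response body containing a raw Latin-1 non-breaking space (0xA0).
$xmlBytes = "<address>928 Broadway\xA0 Brooklyn</address>";

// Parse the cleaned string instead of handing the raw bytes to the parser.
$doc = simplexml_load_string(toUtf8($xmlBytes));
```

With a remote document you would fetch the body first (for example with file_get_contents), pass it through the helper, and then call simplexml_load_string instead of simplexml_load_file.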
An alternative is to use a more tolerant parser such as Beautiful Soup.
You should be very glad you got the error message - the only other thing that could happen is silent corruption.
The error is caused by %A0, which is the Latin-1 encoding of a non-breaking space. For English addresses it would probably suffice to replace it with a regular space (encoded as +); here it could simply be deleted. You can also use utf8_encode($city).
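A rough sketch of the "replace it before building the URL" approach follows. The function name cleanAddressForQuery is hypothetical, and the code assumes that input which is not valid UTF-8 is Latin-1; it requires the mbstring extension.

```php
<?php
// Hypothetical helper: strip non-breaking spaces from an address and
// URL-encode it for use in the q= parameter.
function cleanAddressForQuery(string $address): string
{
    if (mb_check_encoding($address, 'UTF-8')) {
        // Valid UTF-8: a non-breaking space is the two bytes 0xC2 0xA0.
        $address = str_replace("\xC2\xA0", ' ', $address);
    } else {
        // ASSUMPTION: anything else is Latin-1, where NBSP is the single byte 0xA0.
        $address = str_replace("\xA0", ' ', $address);
        $address = mb_convert_encoding($address, 'UTF-8', 'ISO-8859-1');
    }
    // Collapse runs of spaces left behind by the replacement, then trim.
    $address = trim(preg_replace('/ {2,}/', ' ', $address));
    return urlencode($address);
}

// The address from the question, with the stray Latin-1 NBSP after "Broadway".
echo cleanAddressForQuery("928 Broadway\xA0 Brooklyn, 11206 , "), "\n";
// prints: 928+Broadway+Brooklyn%2C+11206+%2C
```

Checking for valid UTF-8 first matters: blindly replacing a lone 0xA0 byte in a UTF-8 string would damage characters such as "à" (0xC3 0xA0), whose second byte is also 0xA0.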
You should switch to the Google Maps API Geocoding Web Service. Your request would look something like this: