Hpricot: invalid byte sequence in UTF-8
I have already done some searching, but nothing I found solves this peculiar, unexpected problem.
Just look at the code below:
require 'open-uri'
require 'hpricot'
doc = Hpricot(open("http://www.baidu.com/")) #this web page's encoding is GB2312
I don't know what's going on here. You can try this in your irb to see whether you get the same problem.
It just raises "ArgumentError: invalid byte sequence in UTF-8".
I have tried converting the original HTML to UTF-8 with Iconv, but it still doesn't work.
I really don't know what to do now. Please help me.
Comments (2)
Hpricot - UTF-8 issues
invalid byte sequence in UTF-8 (ArgumentError)
I know how it could work with Net::HTTP (Ruby 1.9.2):
Does that help?
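The snippet for this answer is not shown here, so the following is only a rough sketch of how a Net::HTTP approach could look on Ruby 1.9 or later, not the answerer's actual code. It assumes the page really is GB2312, as the question states. The helper names to_utf8 and fetch_utf8 are made up for illustration. The idea is to fetch the raw bytes, tag them with their real encoding, and transcode to UTF-8 before handing the string to Hpricot.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: tag the raw bytes with the encoding they are
# actually in (assumed GB2312 here), then transcode to UTF-8.
# Bytes that cannot be converted are replaced with "?" instead of raising.
def to_utf8(raw, source_encoding = 'GB2312')
  raw.dup
     .force_encoding(source_encoding)
     .encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')
end

# Hypothetical helper: download a page body with Net::HTTP and
# return it as a UTF-8 string.
def fetch_utf8(url, source_encoding = 'GB2312')
  body = Net::HTTP.get(URI.parse(url))
  to_utf8(body, source_encoding)
end

# Possible usage (needs network access and the hpricot gem):
#   require 'hpricot'
#   doc = Hpricot(fetch_utf8('http://www.baidu.com/'))
```

If the page turns out to use characters outside GB2312, passing 'GBK' or 'GB18030' as the source encoding may work better, since both are supersets of GB2312.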