'ascii' codec error in BeautifulSoup
I am using BeautifulSoup to scrape data from an HTML page. Until yesterday everything worked fine, but now I am getting this error:
'ascii' codec can't encode character u'\xa9' in position 86700: ordinal not in range(128)
I am using this code:
import urllib2
from BeautifulSoup import BeautifulSoup
page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page)
This code raises the error.
Comments (2)
A wild guess:
Try specifying the encoding of the page?
This could also be a problem with your Python installation. If you print non-ASCII characters without BeautifulSoup, do you hit the same error? If so, you need to set the encoding explicitly.
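As a rough illustration of the suggestion above, here is a minimal sketch of decoding and encoding explicitly instead of relying on the default ASCII codec. The sample bytes and the assumption that the page is UTF-8 are purely illustrative and do not come from the original post. If the page's encoding is known, BeautifulSoup 3 also accepts a fromEncoding argument in its constructor.

```python
# Raw bytes standing in for the result of urlopen(url).read().
# Assumption: the page is UTF-8 and contains a copyright sign (U+00A9).
page = b"<html><body><p>\xc2\xa9 2010 Example</p></body></html>"

# Decode explicitly rather than letting a default codec be chosen.
text = page.decode("utf-8")

# Encoding to ASCII fails in the same way as the error in the question.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly to UTF-8 succeeds and round-trips the bytes.
encoded = text.encode("utf-8")
```

In the question's code, the same idea would apply wherever the parsed text is printed or written out: encode it to a codec that can represent the character, rather than leaving the conversion to the ASCII default.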
A wild stab in the dark: you're reading a page that doesn't explicitly declare an encoding and yet is not 7-bit ASCII?
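One way to check this guess is sketched below, under stated assumptions: the helper names (declared_charset, is_seven_bit_ascii, needs_explicit_encoding) are hypothetical, and the sketch only looks at a Content-Type header value, not at any meta tag inside the HTML itself.

```python
import re


def declared_charset(content_type):
    """Return the charset named in a Content-Type value, or None."""
    match = re.search(r'charset="?([\w.:-]+)', content_type or "", re.IGNORECASE)
    return match.group(1).lower() if match else None


def is_seven_bit_ascii(data):
    """Return True if every byte in data is below 128."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


def needs_explicit_encoding(content_type, data):
    """True when no charset is declared and the body is not plain ASCII."""
    return declared_charset(content_type) is None and not is_seven_bit_ascii(data)
```

With urllib2, the header value could be read from the response's headers before calling read(); a True result would suggest that the encoding has to be supplied by hand.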