PHP cURL function returns wrong characters
I'm attempting to retrieve a remote HTML page with cURL. However, when I analyze the text that gets returned, I notice a lot of odd characters like ▀Ã, which makes me think that something went wrong with the text encoding somewhere along the line.
How can I ensure that the text I get back from cURL is properly encoded, and how can I normalize it so I can safely store results in a database without any encoding issues?
Comments (2)
I hope you have set CURLOPT_ENCODING to "" and that the page itself is not full of the gibberish you are seeing. The second thing I can suggest is to run the string through something like htmlentities() to sanitise it. cURL simply gets/posts the data and, IMHO, doesn't change the encoding.
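To make that concrete, here is a minimal sketch of one way to fetch a page and normalise it to UTF-8 before storing it. The helper names `to_utf8()` and `fetch_html()` are made up for illustration, and the Windows-1252 fallback is only an assumption about what an undeclared legacy page is likely to use. Note that CURLOPT_ENCODING controls compression (gzip/deflate), not the character set, so the charset still has to be read from the Content-Type header or the page's meta tag.

```php
<?php
// Hypothetical helper: convert a response body to UTF-8.
// $contentType is the Content-Type response header value, if known.
function to_utf8(string $body, ?string $contentType = null): string
{
    $charset = null;
    $pattern = '/charset=["\']?([\w.:-]+)/i';

    if ($contentType !== null && preg_match($pattern, $contentType, $m)) {
        // 1. Prefer the charset declared in the HTTP header.
        $charset = $m[1];
    } elseif (preg_match('/<meta[^>]+charset=["\']?([\w.:-]+)/i', substr($body, 0, 4096), $m)) {
        // 2. Otherwise look for a <meta ... charset=...> near the top of the HTML.
        $charset = $m[1];
    }

    if ($charset === null) {
        // 3. Nothing declared: keep valid UTF-8, else ASSUME Windows-1252.
        $charset = mb_check_encoding($body, 'UTF-8') ? 'UTF-8' : 'Windows-1252';
    }

    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $body;
    }

    // iconv() returns false on an unknown charset or bad input;
    // in that case the body is returned unchanged.
    $converted = @iconv($charset, 'UTF-8', $body);
    return $converted !== false ? $converted : $body;
}

// Hypothetical helper: fetch a URL and return its body as UTF-8.
function fetch_html(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_ENCODING       => '',   // accept and decode any supported compression
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }

    $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    return to_utf8($body, is_string($contentType) ? $contentType : null);
}
```

You would call it as `$html = fetch_html('http://example.com/');` (placeholder URL). For the database side, the connection and the column also need to be set to a UTF-8 charset, otherwise the text can still get mangled on insert.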
You need to include the following on the top of your page: