What should the default encoding be for an API that reads from a URL with the file: protocol?
I'm designing an API which takes a URL as input and reads the content at that URL. When the URL uses the "file:" protocol, what would make a better default for the character encoding?
- the system's native encoding
- UTF-8
The API allows the encoding to be set explicitly. There are also a few heuristics we can use to determine the character encoding, such as the BOM if one is present, but when all of these fail, what should the default be?
As far as I can tell, the standards are silent on this issue. All else being equal, I want the right thing to happen most often for someone who doesn't even know there is such a thing as character encoding.
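To make the fallback order concrete, here is a minimal Python sketch of what I have in mind: an explicit encoding wins, then a BOM if one is present, then the default. The names `sniff_bom`, `decode_content` and `read_url_text` are hypothetical and only for illustration, and UTF-8 as the value of `default` is just a placeholder for whichever answer turns out to be right.

```python
import codecs
from urllib.request import urlopen

# Check longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data):
    """Return a codec name if the bytes start with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def decode_content(data, encoding=None, default="utf-8"):
    """Decode bytes using: explicit encoding, then BOM, then the default."""
    chosen = encoding or sniff_bom(data) or default
    # The "utf-8-sig", "utf-16" and "utf-32" codecs consume the BOM themselves.
    return data.decode(chosen)


def read_url_text(url, encoding=None, default="utf-8"):
    """Read a URL (including file: URLs) and decode its content to text."""
    with urlopen(url) as response:
        return decode_content(response.read(), encoding, default)
```

Swapping the default for the system's native encoding would only mean passing something like `locale.getpreferredencoding(False)` as `default`, so the question is really about which value belongs there.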
Comments (1)
Always use UTF-8 if possible, and document this in your API documentation. UTF-8 is a rock-solid encoding standard and very future-proof. I would avoid creating extra work for yourself by supporting other encodings. UTF-8 will also be easy to work with if you later migrate the API so that it can be accessed via a web service.