What is the difference between the ways of specifying the character encoding?
I have seen several ways of specifying the string encoding as follows:
# -*- coding: utf-8 -*-
# coding: utf-8
# encoding: utf-8
#!/usr/bin/env ruby -Ku
#!/usr/bin/env ruby -Eutf-8
Encoding.default_external = "utf-8"
Are there any others? Can someone tell me how they differ, if at all, and where each one comes from? Are there old ones and new ones; minor ones and popular ones; deprecated ones and preferred ones?
Comments (2)
The 2nd and 3rd are basically the same: in both you are specifying the encoding on a file-by-file basis. You only need "coding", but because "encoding" contains the word "coding", that works too. I can't remember the others offhand, but Peter Cooper's Ruby 1.9 walkthrough goes over some of the differences.
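To illustrate why both spellings are accepted, here is a minimal sketch of the matching idea. It is a simplified approximation, not Ruby's actual parser code: the constant MAGIC_COMMENT and the method magic_encoding are hypothetical names introduced only for this example, and the pattern assumes the comment contains "coding", then ":" or "=", then the encoding name.

```ruby
# Simplified approximation of how an encoding magic comment is recognised:
# the text "coding", then ":" or "=", then the encoding name.
# MAGIC_COMMENT and magic_encoding are hypothetical names for this sketch.
MAGIC_COMMENT = /coding[:=]\s*([\w.-]+)/

# Returns the encoding name found in a comment line, or nil if there is none.
def magic_encoding(line)
  return nil unless line.start_with?("#")
  match = MAGIC_COMMENT.match(line)
  match && match[1]
end

samples = [
  "# -*- coding: utf-8 -*-",
  "# coding: utf-8",
  "# encoding: utf-8",
  "# foobarcoding: utf-8",
  "#!/usr/bin/env ruby -Ku",
]

samples.each do |line|
  puts "#{line.inspect} => #{magic_encoding(line).inspect}"
end
```

Under this approximation the first four lines all yield "utf-8", because each contains "coding:" somewhere in the comment, while the shebang line with -Ku yields nil because it is a command-line switch rather than a magic comment.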
TL;DR version: use # coding: utf-8 or # encoding: utf-8; they are modern and there is no difference between them.
According to this most enlightening article, in Ruby 1.9 the rule is that the magic comment must be:
So that covers 1, 2 and 3, and would probably include things like # foobarcoding: utf-8 as well. This is the preferred method for Ruby 1.9.
For compatibility reasons, the -K* hash bang switches from Ruby 1.8 are retained, which covers 4.
Numbers 5 and 6 cover a slightly different thing. I recommend reading through the article linked above to see exactly how external and internal encodings work. The gist, however, is that when you read data via an IO object, it matters how that data is encoded if it is to be read correctly. The external encoding expresses just that: when you set the external encoding to UTF-8, you are saying that the file you are reading is encoded in UTF-8. The internal encoding is the encoding to which Ruby should automatically transcode the string produced by that read.
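The external/internal distinction can be sketched with an explicit per-file mode string. This is only an illustration under stated assumptions: the temporary file, its name prefix and the sample word are made up for the example, and Encoding.default_internal is assumed to be unset, as it is by default.

```ruby
require "tempfile"

# Write the word "café" to a temporary file as ISO-8859-1 bytes
# (the file and its contents are invented for this sketch).
file = Tempfile.new("encoding-demo")
file.binmode
file.write("caf\xE9".b)
file.close

# External encoding only: the bytes are left unchanged and the resulting
# string is simply tagged as ISO-8859-1.
external_only = File.open(file.path, "r:ISO-8859-1", &:read)

# External and internal encoding: the data is read as ISO-8859-1 and
# transcoded to UTF-8 on the way in.
transcoded = File.open(file.path, "r:ISO-8859-1:UTF-8", &:read)

puts external_only.encoding
puts external_only.bytesize
puts transcoded.encoding
puts transcoded.bytesize

file.unlink
```

The first read keeps the single byte for "é", while the second turns it into the two-byte UTF-8 sequence, so the two strings differ in both encoding tag and byte length.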
The default you are setting is used when an external encoding is not set explicitly. These defaults can be changed via the -E flag in the hash bang (number 5; thus 5 and 6 would work identically). Passing -U will set the internal encoding to UTF-8 (meaning that strings will be transcoded automatically to UTF-8 when read).
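A rough sketch of how the process-wide defaults come into play when a file is opened without an explicit encoding. The temporary file and sample data are assumptions made for the example, and the defaults are restored afterwards because changing them affects the whole process (Ruby may also print a warning about it in verbose mode).

```ruby
require "tempfile"

# Remember the process-wide defaults so they can be restored afterwards.
saved_external = Encoding.default_external
saved_internal = Encoding.default_internal

# "café" stored as ISO-8859-1 bytes (sample data invented for this sketch).
file = Tempfile.new("defaults-demo")
file.binmode
file.write("caf\xE9".b)
file.close

begin
  # Setting the defaults in code; the -E switch sets the same defaults
  # from the command line or the hash bang line.
  Encoding.default_external = "ISO-8859-1"
  Encoding.default_internal = "UTF-8"

  # No encoding is given here, so the defaults above apply: the bytes are
  # interpreted as ISO-8859-1 and transcoded to UTF-8.
  text = File.read(file.path)
  puts text.encoding
ensure
  Encoding.default_external = saved_external
  Encoding.default_internal = saved_internal
  file.unlink
end
```

An explicit encoding passed when opening a file, as in a mode string like "r:ISO-8859-1", takes precedence over these defaults.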