How do I encode a string to UTF-8?
I have a String with a "ñ" character and I am having some problems with it. I need to encode this String as UTF-8. I have tried it this way, but it doesn't work:
byte ptext[] = myString.getBytes();
String value = new String(ptext, "UTF-8");
How do I encode that string to UTF-8?
How about using
String objects in Java use the UTF-16 encoding, which can't be changed*. The only thing that can have a different encoding is a byte[]. So if you need UTF-8 data, then you need a byte[]. If you have a String that contains unexpected data, then the problem is at some earlier place that incorrectly converted some binary data to a String (i.e. it was using the wrong encoding).
* As a matter of implementation, String can internally use an ISO-8859-1 encoded byte[] when the range of characters fits it, but that is an implementation-specific optimization that isn't visible to users of String (i.e. you'll never notice unless you dig into the source code or use reflection to dig into a String object).
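To make the String versus byte[] distinction concrete, here is a minimal sketch that encodes the same one-character String with three charsets and prints the resulting bytes as hex. The class name EncodingBytesDemo and the helper hex are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

class EncodingBytesDemo {
    // Formats bytes as space-separated uppercase hex, e.g. "C3 B1".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00F1 is the letter n with tilde; the escape keeps this
        // independent of the source file's own encoding.
        String s = "\u00F1";
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));      // C3 B1
        System.out.println(hex(s.getBytes(StandardCharsets.ISO_8859_1))); // F1
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE)));   // 00 F1
    }
}
```

The String itself never changes; only the byte[] produced from it depends on the charset.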
In Java 7 you can use:
This has the advantage over getBytes(String) that it does not declare throws UnsupportedEncodingException.
If you're using an older Java version you can declare the charset constants yourself:
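The snippets for this answer did not survive on this page. A minimal sketch of what it describes, assuming it refers to java.nio.charset.StandardCharsets (added in Java 7) and to a hand-declared Charset constant for older versions, might look like this. The class and method names are invented for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8Encode {
    // For Java versions before 7: declare the constant yourself.
    static final Charset UTF_8 = Charset.forName("UTF-8");

    // Java 7+: getBytes(Charset) declares no checked exception.
    static byte[] encodeJava7(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Same call, using the hand-declared constant.
    static byte[] encodeLegacy(String s) {
        return s.getBytes(UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encodeJava7("\u00F1");
        System.out.println(bytes.length); // 2
    }
}
```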
Use
byte[] ptext = myString.getBytes("UTF-8");
instead of getBytes(). getBytes() uses the so-called "default encoding", which may not be UTF-8.
A Java String is internally always encoded in UTF-16, but you really should think about it like this: an encoding is a way to translate between Strings and bytes.
So if you have an encoding problem, by the time you have a String, it's too late to fix it. You need to fix the place where you create that String from a file, DB or network connection.
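As a rough illustration of fixing the problem where the String is created, the sketch below decodes the same two bytes (the UTF-8 form of "ñ") once with the wrong charset and once with the right one. The byte array stands in for data read from a file, DB or network connection; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeAtSource {
    // Decodes raw bytes with the given charset; the charset must match
    // the one the bytes were originally written in.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xC3, (byte) 0xB1}; // UTF-8 bytes for U+00F1

        String wrong = decode(raw, StandardCharsets.ISO_8859_1);
        String right = decode(raw, StandardCharsets.UTF_8);

        System.out.println(wrong.length()); // 2 (two garbage characters)
        System.out.println(right.length()); // 1 (the intended character)
    }
}
```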
You can try this way.
I ran into this problem a while ago and managed to solve it in the following way.
First I needed to import
Then I had to declare constants for UTF-8 and ISO-8859-1
Then I could use them in the following way:
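The snippets for these three steps are missing from this page. One plausible reading, offered only as an assumption, is that the text had been mis-decoded as ISO-8859-1 and was repaired by re-encoding it to the original bytes and decoding those as UTF-8. The names MojibakeRepair, ISO and repair are hypothetical.

```java
import java.nio.charset.Charset;

class MojibakeRepair {
    // Constants declared by hand, as the steps above describe.
    static final Charset UTF_8 = Charset.forName("UTF-8");
    static final Charset ISO = Charset.forName("ISO-8859-1");

    // Assumes the input was produced by decoding UTF-8 bytes as ISO-8859-1.
    // Re-encoding with ISO-8859-1 recovers the original bytes, which are
    // then decoded with the charset they were really written in.
    static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(ISO), UTF_8);
    }

    public static void main(String[] args) {
        String broken = "\u00C3\u00B1"; // what "ñ" looks like after the wrong decode
        String fixed = repair(broken);
        System.out.println(fixed.length()); // 1
    }
}
```

This is a workaround for that one specific mistake; decoding with the right charset in the first place avoids the round trip.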
And, if you want to read from a text file encoded as "ISO-8859-1":
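The original snippet is not shown here. A minimal sketch, assuming Java 7 or later and the java.nio.file API, reads the first line of an ISO-8859-1 file and returns it as UTF-8 bytes. The class and method names are illustrative only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadLatin1File {
    // Reads the first line of a file known to be ISO-8859-1 encoded
    // and returns that line encoded as UTF-8 bytes.
    static byte[] firstLineAsUtf8(Path file) {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            String line = reader.readLine();
            if (line == null) {
                return new byte[0]; // empty file
            }
            return line.getBytes(StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("latin1-", ".txt");
        Files.write(tmp, new byte[] {(byte) 0xF1}); // U+00F1 in ISO-8859-1
        System.out.println(firstLineAsUtf8(tmp).length); // 2
        Files.delete(tmp);
    }
}
```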
I have used the code below to encode the special character by specifying the encoding format.
A quick step-by-step guide on how to configure UTF-8 as the NetBeans default encoding. As a result, NetBeans will create all new files in UTF-8 encoding.
NetBeans default encoding UTF-8 step-by-step guide
Go to the etc folder in the NetBeans installation directory.
Edit the netbeans.conf file.
Find the netbeans_default_options line.
Add -J-Dfile.encoding=UTF-8 inside the quotation marks on that line
(example: netbeans_default_options="-J-Dfile.encoding=UTF-8").
Restart NetBeans.
You have now set the NetBeans default encoding to UTF-8.
Your netbeans_default_options may contain additional parameters inside the quotation marks. In that case, add -J-Dfile.encoding=UTF-8 at the end of the string, separated from the other parameters by a space.
Example:
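The original example line is missing from this page. As an illustrative sketch only, a netbeans.conf entry that already has other options might end up looking like this, where -J-Xms256m is a hypothetical placeholder for whatever options your file already contains:

```shell
# netbeans.conf (sketch): keep existing options, append the encoding flag last
netbeans_default_options="-J-Xms256m -J-Dfile.encoding=UTF-8"
```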
This solved my problem
The correct solution is also: