How do I encode a string to UTF-8?

Posted on 2024-11-02 11:18:24

I have a String with a "ñ" character, and I am having some problems with it. I need to encode this String as UTF-8. I tried it this way, but it doesn't work:

byte ptext[] = myString.getBytes();
String value = new String(ptext, "UTF-8");

How do I encode that string to UTF-8?

Comments (12)

瞄了个咪的 2024-11-09 11:18:24

How about using

ByteBuffer byteBuffer = StandardCharsets.UTF_8.encode(myString);
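
If a byte[] is needed rather than a ByteBuffer, the encoded bytes can be copied out of the buffer. The following is only a minimal sketch; the class name Utf8Buffer and the method toUtf8Bytes are made up for illustration and are not part of any library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named here only for illustration.
final class Utf8Buffer {

    // Encodes the text as UTF-8 and copies the encoded bytes out of the buffer.
    static byte[] toUtf8Bytes(String text) {
        ByteBuffer buffer = StandardCharsets.UTF_8.encode(text);
        // The backing array may be larger than the encoded data,
        // so copy only the bytes between position and limit.
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // "\u00F1" is the character ñ, which UTF-8 encodes as the two bytes C3 B1.
        for (byte b : toUtf8Bytes("\u00F1")) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```

Copying via remaining() and get() avoids relying on buffer.array(), whose length is not guaranteed to match the encoded data.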
谁人与我共长歌 2024-11-09 11:18:24

String objects in Java use the UTF-16 encoding, and that can't be changed*.

The only thing that can have a different encoding is a byte[]. So if you need UTF-8 data, then you need a byte[]. If you have a String that contains unexpected data, then the problem is at some earlier place that incorrectly converted some binary data to a String (i.e. it was using the wrong encoding).

* As a matter of implementation, String can internally use an ISO-8859-1 encoded byte[] when the range of characters fits it, but that is an implementation-specific optimization that isn't visible to users of String (i.e. you'll never notice unless you dig into the source code or use reflection to dig into a String object).
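
To make the point concrete, here is a small sketch showing that the same String yields different byte[] contents depending on the charset passed to getBytes. The class name EncodingSizes and the helper byteCount are assumptions made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, named here only for illustration.
final class EncodingSizes {

    // Number of bytes the text occupies when encoded with the given charset.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "\u00F1"; // the character ñ

        System.out.println(s.length());                                // 1 char in the String
        System.out.println(byteCount(s, StandardCharsets.UTF_8));      // 2 bytes (C3 B1)
        System.out.println(byteCount(s, StandardCharsets.ISO_8859_1)); // 1 byte  (F1)
        System.out.println(byteCount(s, StandardCharsets.UTF_16BE));   // 2 bytes (00 F1)

        // Decoding with the same charset that produced the bytes restores the text.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(s.equals(new String(utf8, StandardCharsets.UTF_8))); // true
    }
}
```

The String itself never changes; only the byte[] representation differs from one charset to the next.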

浅笑轻吟梦一曲 2024-11-09 11:18:24

In Java 7 you can use:

import static java.nio.charset.StandardCharsets.*;

byte[] ptext = myString.getBytes(ISO_8859_1); 
String value = new String(ptext, UTF_8); 

This has the advantage over getBytes(String) that it does not declare throws UnsupportedEncodingException.

If you're using an older Java version, you can declare the charset constants yourself:

import java.nio.charset.Charset;

public class StandardCharsets {
    public static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");
    public static final Charset UTF_8 = Charset.forName("UTF-8");
    //....
}
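
It may help to see when the ISO-8859-1 to UTF-8 round trip shown above actually helps. The sketch below assumes the String was produced earlier by decoding UTF-8 bytes as ISO-8859-1; the class name MojibakeRepair and the method repair are hypothetical. Under that assumption the round trip restores the text, while applying it to an already-correct String damages it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, named here only for illustration.
final class MojibakeRepair {

    // Re-interprets a String that was wrongly decoded as ISO-8859-1
    // from bytes that were really UTF-8.
    static String repair(String misdecoded) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String correct = "\u00F1"; // the character ñ

        // Simulate the earlier bug: UTF-8 bytes decoded with the wrong charset.
        String misdecoded = new String(
                correct.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(repair(misdecoded).equals(correct)); // true
        // The same round trip applied to an already-correct String breaks it.
        System.out.println(repair(correct).equals(correct));    // false
    }
}
```

In the second case the single byte F1 is not valid UTF-8 on its own, so the decoder substitutes a replacement character.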
情归归情 2024-11-09 11:18:24

Use byte[] ptext = myString.getBytes("UTF-8"); instead of getBytes(). The no-argument getBytes() uses the so-called "default encoding", which may not be UTF-8.
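
To see which default encoding a given JVM is using, something like the following sketch can be run. The class name DefaultCharsetCheck and the method defaultMatchesUtf8 are made up for this example; the output depends on the platform and JVM settings (as far as I know, JDK 18 and later default to UTF-8 unless configured otherwise).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, named here only for illustration.
final class DefaultCharsetCheck {

    // True when the no-argument getBytes() yields the same bytes
    // as an explicit UTF-8 encoding for this text.
    static boolean defaultMatchesUtf8(String text) {
        return Arrays.equals(text.getBytes(), text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // The result of both lines is platform-dependent.
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("getBytes() matches UTF-8: " + defaultMatchesUtf8("\u00F1"));
    }
}
```

Passing the charset explicitly, as in getBytes(StandardCharsets.UTF_8), removes this dependency on the environment.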

辞慾 2024-11-09 11:18:24

A Java String is internally always encoded in UTF-16 - but you really should think about it like this: an encoding is a way to translate between Strings and bytes.

So if you have an encoding problem, by the time you have a String, it's too late to fix it. You need to fix the place where you create that String from a file, database or network connection.
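
As an illustration of fixing the problem where the String is created, the sketch below decodes the same bytes once with the right charset and once with the wrong one. The ByteArrayInputStream is only a stand-in for a real file or socket, and the names DecodeAtSource and readFirstLine are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, named here only for illustration.
final class DecodeAtSource {

    // Reads the first line of the stream, decoding its bytes with the given charset.
    static String readFirstLine(InputStream in, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file or socket that delivers the UTF-8 bytes of ñ.
        byte[] utf8Data = {(byte) 0xC3, (byte) 0xB1};

        String right = readFirstLine(new ByteArrayInputStream(utf8Data), StandardCharsets.UTF_8);
        String wrong = readFirstLine(new ByteArrayInputStream(utf8Data), StandardCharsets.ISO_8859_1);

        System.out.println(right.length()); // 1: the single character ñ
        System.out.println(wrong.length()); // 2: two unrelated characters
    }
}
```

Choosing the charset at the reader is what determines whether the resulting String is correct; nothing done to the String afterwards is needed.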

最丧也最甜 2024-11-09 11:18:24

You can try it this way:

byte ptext[] = myString.getBytes("ISO-8859-1"); 
String value = new String(ptext, "UTF-8"); 
原野 2024-11-09 11:18:24

I ran into this problem a while ago and managed to solve it in the following way.

First I needed to import

import java.nio.charset.Charset;

Then I had to declare constants for UTF-8 and ISO-8859-1:

private static final Charset UTF_8 = Charset.forName("UTF-8");
private static final Charset ISO = Charset.forName("ISO-8859-1");

Then I could use them in the following way:

String textwithaccent = "Thís ís a text with accent";
String textwithletter = "Ñandú";

String text1 = new String(textwithaccent.getBytes(ISO), UTF_8);
String text2 = new String(textwithletter.getBytes(ISO), UTF_8);
囚我心虐我身 2024-11-09 11:18:24
String value = new String(myString.getBytes("UTF-8"));

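And if you want to read from a text file encoded as "ISO-8859-1":

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

String f = "C:\\MyPath\\MyFile.txt";
// try-with-resources closes the reader even if reading fails
try (BufferedReader br = Files.newBufferedReader(Paths.get(f), Charset.forName("ISO-8859-1"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // note: new String(byte[]) decodes with the platform default charset
        System.out.println(new String(line.getBytes("UTF-8")));
    }
} catch (IOException ex) {
    //...
}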
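
If the goal is to end up with a UTF-8 file rather than console output, one option is to decode while reading and encode while writing. This is only a sketch under that assumption; the class name Latin1ToUtf8, the method convert, and the temporary files in main are all made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, named here only for illustration.
final class Latin1ToUtf8 {

    // Copies a text file line by line, decoding ISO-8859-1 and encoding UTF-8.
    static void convert(Path source, Path target) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.ISO_8859_1);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine(); // writes the platform line separator
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for real input and output paths.
        Path source = Files.createTempFile("latin1", ".txt");
        Path target = Files.createTempFile("utf8", ".txt");
        Files.write(source, new byte[] {(byte) 0xF1}); // ñ in ISO-8859-1

        convert(source, target);
        String result = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        System.out.println(result.startsWith("\u00F1")); // true
    }
}
```

Note that copying line by line normalizes line endings to the platform separator, which may or may not be wanted.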
死开点丶别碍眼 2024-11-09 11:18:24

I use the code below to encode special characters by specifying the encoding.

String text = "This is an example é";
byte[] byteText = text.getBytes(Charset.forName("UTF-8"));
// To get the original string back from the bytes.
String originalString = new String(byteText, "UTF-8");
筑梦 2024-11-09 11:18:24

A quick step-by-step guide to configuring UTF-8 as the NetBeans default encoding. As a result, NetBeans will create all new files in UTF-8.

NetBeans default encoding UTF-8: step-by-step guide

  • Go to the etc folder in the NetBeans installation directory

  • Edit the netbeans.conf file

  • Find the netbeans_default_options line

  • Add -J-Dfile.encoding=UTF-8 inside the quotation marks on that line

    (example: netbeans_default_options="-J-Dfile.encoding=UTF-8")

  • Restart NetBeans

The NetBeans default encoding is now set to UTF-8.

Your netbeans_default_options may contain additional parameters inside the quotation marks. In that case, add -J-Dfile.encoding=UTF-8 at the end of the string, separated from the other parameters by a space.

Example:

netbeans_default_options="-J-client -J-Xss128m -J-Xms256m
-J-XX:PermSize=32m -J-Dapple.laf.useScreenMenuBar=true -J-Dapple.awt.graphics.UseQuartz=true -J-Dsun.java2d.noddraw=true -J-Dsun.java2d.dpiaware=true -J-Dsun.zip.disableMemoryMapping=true -J-Dfile.encoding=UTF-8"

岛徒 2024-11-09 11:18:24

This solved my problem:

    String inputText = "some text with escaped chars";
    InputStream is = new ByteArrayInputStream(inputText.getBytes("UTF-8"));
一紙繁鸢 2024-11-09 11:18:24

The correct solution is also:

String myUTF8String = new String(sourceISOString.getBytes(Charsets.ISO_8859_1), Charsets.UTF_8);