Creating an ISO-8859-1 string from a HEX string in Java, shifting bits

Posted on 2024-08-09 06:10:22

I am trying to convert a HEX sequence to a String encoded in either ISO-8859-1, UTF-8 or UTF-16BE. That is, I have a String that looks like "0422043504410442", which represents the characters "Тест" in UTF-16BE.

The code I used to convert between the two formats was:

private static String hex2String(String hex, String encoding) throws UnsupportedEncodingException {
    char[] hexArray = hex.toCharArray();

    int length = hex.length() / 2;
    byte[] rawData = new byte[length];
    for(int i=0; i<length; i++){
        int high = Character.digit(hexArray[i*2], 16);
        int low = Character.digit(hexArray[i*2+1], 16);
        int value = (high << 4) | low;
        if (value > 127)
            value -= 256;
        rawData[i] = (byte) value;
    }
    return new String(rawData, encoding);
}

This seems to work fine for me, but I still have two questions regarding this:

  1. Is there any simpler way (preferably without bit-handling) to do this conversion?
  2. How am I to interpret the line: int value = (high << 4) | low;?

I am familiar with the basics of bit-handling, though not at all with the Java syntax. I believe the first part shifts all bits to the left by 4 steps. I don't understand the rest, though, or why it would be helpful in this particular situation.

I apologize for any confusion in my question, please let me know if I should clarify anything.
Thank you.
//Abeansits

指尖上得阳光 2024-08-16 06:10:22

Is there any simpler way (preferably without bit-handling) to do this conversion?

None that I know of. The only simplification seems to be parsing the whole byte at once rather than digit by digit (e.g. using int value = Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);):

public static byte[] hexToBytes(final String hex) {
  final byte[] bytes = new byte[hex.length() / 2];
  for (int i = 0; i < bytes.length; i++) {
    bytes[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}
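
To tie this back to the question, here is a minimal, self-contained sketch that feeds the parsed bytes into a String with an explicit charset. The class name HexDecodeSketch and the helper hexToString are made up for illustration, and the even-length check is an added assumption that is not part of the answer above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
final class HexDecodeSketch {

    /** Parses each pair of hex digits into one byte. */
    static byte[] hexToBytes(String hex) {
        // Added assumption: reject input that cannot be split into digit pairs.
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // parseInt yields 0..255 for two hex digits; the cast keeps the low 8 bits.
            bytes[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
        }
        return bytes;
    }

    /** Decodes the parsed bytes with the given charset. */
    static String hexToString(String hex, Charset charset) {
        return new String(hexToBytes(hex), charset);
    }

    public static void main(String[] args) {
        String decoded = hexToString("0422043504410442", StandardCharsets.UTF_16BE);
        // Compare against the Cyrillic code points U+0422 U+0435 U+0441 U+0442.
        System.out.println(decoded.equals("\u0422\u0435\u0441\u0442")); // true
    }
}
```

Passing a Charset constant instead of an encoding name also avoids the checked UnsupportedEncodingException from the question's version.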

How am I to interpret the line: int value = (high << 4) | low;?

Look at this example for your last two digits (42):

int high = 4; // binary 0100
int low = 2;  // binary 0010
int value = (high << 4) | low;

// Step by step, written in binary:
//   (0100 << 4) | 0010     shift high 4 bits to the left
//   01000000 | 00000010    bitwise OR
//   01000010
//   66                     01000010 == 0x42 == 66
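
The same walkthrough can be run directly. This is only a small sketch; the class name NibbleCombineSketch and the method combine are hypothetical names chosen for the example.

```java
// Hypothetical class name, used only for this illustration.
final class NibbleCombineSketch {

    /** Combines two hex digit characters into one value in the range 0..255. */
    static int combine(char highDigit, char lowDigit) {
        int high = Character.digit(highDigit, 16); // '4' -> 4 (binary 0100)
        int low = Character.digit(lowDigit, 16);   // '2' -> 2 (binary 0010)
        // high << 4 moves the high digit into bits 4..7;
        // the OR then fills bits 0..3 with the low digit.
        return (high << 4) | low;
    }

    public static void main(String[] args) {
        int value = combine('4', '2');
        System.out.println(value);                         // 66
        System.out.println(Integer.toHexString(value));    // 42
        System.out.println(Integer.toBinaryString(value)); // 1000010
    }
}
```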

小猫一只 2024-08-16 06:10:22

You can replace the << and | in this case with * and +, but I don't recommend it.

The expression

int value = (high << 4) | low;

is equivalent to

int value = high * 16 + low;

The subtraction of 256 to get a value between -128 and 127 is unnecessary. Simply casting, for example, 128 to a byte will produce the correct result. The lowest 8 bits of the int 128 have the same pattern as the byte -128: 0x80.

I'd write it simply as:

rawData[i] = (byte) ((high << 4) | low);
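
Both claims can be checked exhaustively, since there are only 16 x 16 digit pairs and 256 byte values. The sketch below assumes the digits are valid (0..15); Character.digit returns -1 for an invalid character, and in that case the two expressions no longer agree. The class and method names are hypothetical.

```java
// Hypothetical class name, used only for this illustration.
final class ByteCastSketch {

    /** True if shift/OR and multiply/add agree for every pair of valid hex digits. */
    static boolean shiftEqualsMultiply() {
        for (int high = 0; high < 16; high++) {
            for (int low = 0; low < 16; low++) {
                if (((high << 4) | low) != high * 16 + low) {
                    return false;
                }
            }
        }
        return true;
    }

    /** True if a plain cast matches the "subtract 256 when above 127" step for 0..255. */
    static boolean castEqualsSubtraction() {
        for (int value = 0; value <= 255; value++) {
            int adjusted = value > 127 ? value - 256 : value;
            // The cast keeps the low 8 bits, which already encode the signed result.
            if ((byte) value != adjusted) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shiftEqualsMultiply());   // true
        System.out.println(castEqualsSubtraction()); // true
        System.out.println((byte) 128);              // -128
    }
}
```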

霊感 2024-08-16 06:10:22

Is there any simpler way (preferably without bit-handling) to do this conversion?

You can use the Hex class in Apache Commons Codec, but internally it will do the same thing, perhaps with minor differences.

How am I to interpret the line: int value = (high << 4) | low;?

This combines two hex digits, each of which represents 4 bits, into one unsigned 8-bit value stored as an int. The next two lines convert this to a signed Java byte.
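
As an alternative that is not mentioned in the answer above, and assuming Java 17 or later is available, the JDK's own java.util.HexFormat can do the parsing without a third-party dependency. This is only a sketch; the class name HexFormatSketch and the method decodeUtf16Be are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HexFormat;

// Hypothetical class name, used only for this illustration.
// Assumption: Java 17+, where java.util.HexFormat was introduced.
final class HexFormatSketch {

    /** Parses the hex digits into bytes and decodes them as UTF-16BE. */
    static String decodeUtf16Be(String hex) {
        byte[] raw = HexFormat.of().parseHex(hex);
        return new String(raw, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String decoded = decodeUtf16Be("0422043504410442");
        // Compare against the Cyrillic code points U+0422 U+0435 U+0441 U+0442.
        System.out.println(decoded.equals("\u0422\u0435\u0441\u0442")); // true
    }
}
```

Like the Apache Commons class, this most likely performs the same digit-by-digit combination internally; it just hides the bit-handling from the caller.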
