Using XOR on Shift-JIS characters in Java
I'm trying to write a small decryption program, but I'm running into some trouble.
I'm XOR-ing the characters with 'FF' (to invert all the bits) by converting the string to a byte array and then applying the XOR to each byte. The characters are in Shift-JIS encoding, though, and something isn't working.
When I try the method with plain ASCII letters it seems to work, but with Japanese characters something goes wrong.
public void sampleMethod(String a)
{
    try {
        String b = "FF";
        byte[] c = a.getBytes("Shift_JIS");
        byte[] d = b.getBytes("Shift_JIS");
        byte[] e = new byte[50];
        for (int i=0; i<c.length; i++)
        {
            e[i] =(byte)(c[i]^d[i%2]);
        }
        String t = new String(e, "Shift_JIS");
        System.out.println(t);
    }
    catch (UnsupportedEncodingException e)
    {
    }
}
But when I stick in Japanese characters, it converts every single one of them into just 'yyyyyy'. I tried printing out the byte array to see the problem, and it showed that each character was being stored as '63'. How would I get the characters to be stored correctly? Actually, how would I use XOR on the Shift-JIS characters?
I'm using XOR because I basically just want to reverse the bits from say 0010 to 1101 then change it back to characters. Is that possible?
Thanks
For example, with the input "始めまして" the output is "yyyyy".
And with something like "hello there" I get ".#**)f2.#4#".
Comments (1)
You simply can't do this kind of byte-wise manipulation on multi-byte characters.
Japanese characters (and other extended characters) are typically represented by a series of bytes. Changing those bytes is likely to produce invalid sequences that can't be decoded properly (and I guess this is the result you are seeing).
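To make the multi-byte point concrete, here is a minimal sketch that counts the bytes produced when encoding the two inputs from the question. The class name ShiftJisByteCount and the helper encodedLength are made up for illustration, and the sketch assumes the JDK in use ships the Shift_JIS charset. The Japanese text is written with Unicode escapes so the source-file encoding does not matter.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
final class ShiftJisByteCount {
    // Assumes the running JDK provides the Shift_JIS charset.
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Number of bytes the string occupies once encoded as Shift_JIS.
    static int encodedLength(String s) {
        return s.getBytes(SJIS).length;
    }

    public static void main(String[] args) {
        // Five ASCII letters: one byte each.
        System.out.println(encodedLength("hello"));                           // prints 5
        // The five characters of the Japanese input from the question: two bytes each.
        System.out.println(encodedLength("\u59cb\u3081\u307e\u3057\u3066")); // prints 10
    }
}
```

Each Japanese character here is a lead byte plus a trail byte, so changing bytes individually changes which pairs (if any) the decoder sees.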
From the Wikipedia article, Shift JIS
I would imagine by XOR'ing you are breaking this guarantee.
If you want to invert the bits and later restore them, work with a byte[] internally, and only turn it back into a String once you're sure the array again holds validly structured Shift JIS bytes.
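A rough sketch of that byte[] approach follows. The names ByteInvertSketch, invert, scramble and unscramble are invented for this example, and it assumes that "XOR with FF" means XOR-ing every byte with the value 0xFF. Note that the string "FF" in the question encodes to two bytes of 0x46 (the letter F), not 0xFF. Also, 63 is the code of '?', and 0x3F ^ 0x46 is 0x79 ('y'), which suggests, though the question does not confirm it, that the Japanese text had already been replaced by '?' before the XOR was applied.

```java
import java.nio.charset.Charset;

// Hypothetical sketch, not the asker's or the answerer's actual code.
final class ByteInvertSketch {
    // Assumes the running JDK provides the Shift_JIS charset.
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // XOR every byte with 0xFF, i.e. flip all eight bits.
    // Applying this twice returns the original bytes.
    static byte[] invert(byte[] in) {
        byte[] out = new byte[in.length]; // sized to the input, not a fixed 50
        for (int i = 0; i < in.length; i++) {
            out[i] = (byte) (in[i] ^ 0xFF);
        }
        return out;
    }

    // Encode to Shift_JIS, then invert. The result stays a byte[]:
    // it is generally not valid Shift_JIS and should not be decoded.
    static byte[] scramble(String plain) {
        return invert(plain.getBytes(SJIS));
    }

    // Invert back first, and only then decode as Shift_JIS.
    static String unscramble(byte[] scrambled) {
        return new String(invert(scrambled), SJIS);
    }

    public static void main(String[] args) {
        String input = "\u59cb\u3081\u307e\u3057\u3066"; // the Japanese input from the question
        byte[] hidden = scramble(input);
        System.out.println(unscramble(hidden).equals(input)); // prints true
    }
}
```

The scrambled data is stored or transmitted as raw bytes (for example in a file or as hex), and a String is only created after the second inversion.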