在 Scala 中处理 BZIP 字符串/文件
我通过在 Scala 中完成 Python 挑战系列来惩罚自己。
现在,挑战之一是读取使用 bzip 算法压缩的字符串并输出结果。
BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084
现在,经过一番挖掘,似乎没有用于 bzip 处理的标准 java 库,但 apache ant 项目中有一些东西, 这个人好心地拿出来用作单独的库。
问题是,我似乎无法让它与下面的代码一起工作,它只是挂在 scala REPL 中,并且 JVM 的 CPU 使用率达到 100%
这是我正在尝试的代码...
import java.io.{ByteArrayInputStream}
import org.apache.tools.bzip2.{CBZip2InputStream}
import org.apache.commons.io.{IOUtils}
object ChallengeEight extends Application {
val inputString = """BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084"""
val inputStream = new ByteArrayInputStream( inputString.getBytes("UTF-8") ) //convert string to inputstream
inputStream.skip(2) //skip the 'BZ' part at the start
val bzipInputStream = new CBZip2InputStream(inputStream) //hangs here....
val result = IOUtils.toString(bzipInputStream, "UTF-8");
println(result)
}
任何人都有想法?或者,CBZip2InputStream
类是否需要一些额外的字节,您可能会在使用 bzip2
压缩的文件中找到这些字节?
任何帮助将不胜感激
编辑 根据记录,这是 python 解决方案
import bz2
un = "BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!" \
"\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084"
print [bz2.decompress(elt) for elt in (un)]
I'm punishing myself a bit by doing the python challenges series in Scala.
Now, one of the challenges is to read in a string that's been compressed using the bzip algorithm and output the result.
BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084
Now, after some digging it appears as if there isn't a standard java library for bzip processing, but there is something in the apache ant project, that this guy has kindly taken out for use as a separate library.
The thing is, I can't seem to get it to work with the following code, it just hangs in the scala REPL and the JVM maxes out at 100% CPU usage
This is the code I'm trying...
import java.io.{ByteArrayInputStream}
import org.apache.tools.bzip2.{CBZip2InputStream}
import org.apache.commons.io.{IOUtils}
object ChallengeEight extends Application {
val inputString = """BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084"""
val inputStream = new ByteArrayInputStream( inputString.getBytes("UTF-8") ) //convert string to inputstream
inputStream.skip(2) //skip the 'BZ' part at the start
val bzipInputStream = new CBZip2InputStream(inputStream) //hangs here....
val result = IOUtils.toString(bzipInputStream, "UTF-8");
println(result)
}
Anyone got any ideas? Or is the CBZip2InputStream
class expecting some extra bytes that you might find in a file that has been zipped with bzip2
?
Any help would be appreciated
EDIT For the record this is the python solution
import bz2
un = "BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!" \
"\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084"
print [bz2.decompress(elt) for elt in (un)]
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
要转义字符,请使用 unicode 转义序列 类似于
\uXXXX
语法,其中 XXXX 是 unicode 字符的十六进制序列。To escape characters use a unicode escape sequence like
\uXXXX
syntax where XXXX is the hexadecimal sequence for the unicode character.您将字符串括在三引号中,这意味着您将把文字字符传递给算法,而不是它们代表的控制/二进制字符。
You are enclosing your string in triple quotes which means you will pass the literal characters to the algorithm rather than the control/binary characters they represent.