Converting ISO-8859-1 to UTF-8 with Groovy

Posted 2024-12-02 18:19:16

I need to convert an ISO-8859-1 file to UTF-8 encoding without losing any content.

i have a file which looks like this:

<?xml version="1.0" encoding="ISO-8859-1" ?> 
<HelloEncodingWorld>Üöäüßßß Test!!!</HelloEncodingWorld>

Now I want to encode it as UTF-8.
I tried the following:

f=new File('c:/temp/myiso88591.xml').getText('ISO-8859-1')
ts=new String(f.getBytes("UTF-8"), "UTF-8")
g=new File('c:/temp/myutf8.xml').write(ts)

That didn't work due to String incompatibilities.
Then I read something about byte stream readers/writers, StreamingMarkupBuilder and others...

Then I tried:

f=new File('c:/temp/myiso88591.xml').getText('ISO-8859-1')
mb = new groovy.xml.StreamingMarkupBuilder()
mb.encoding = "UTF-8"

new OutputStreamWriter(new FileOutputStream('c:/temp/myutf8.xml'),'utf-8') << mb.bind {
    mkp.xmlDeclaration()
    out << f
}

This was not at all what I wanted.

I just want to read the content of an XML file with an ISO-8859-1 reader and then write it to a new (or the old) file... why is this so complicated? :-/

The result should look like this, and the file should really be encoded in UTF-8:

<?xml version="1.0" encoding="UTF-8" ?> 
<HelloEncodingWorld>Üöäüßßß Test!!!</HelloEncodingWorld>

Thanks for any answers
Cheers

Comments (2)

↙厌世 2024-12-09 18:19:16
def f=new File('c:/data/myiso88591.xml').getText('ISO-8859-1')
new File('c:/data/myutf8.xml').write(f,'utf-8')

(I just gave it a try, it works :-)

Same as in Java: the libraries do the conversion for you.
As deceze said: when you specify an encoding on read, the text is converted to an internal format (UTF-16, afaik). When you specify another encoding when you write the string, it is converted to that encoding.
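
To make the decode/encode step concrete, here is a minimal sketch at the byte level. It uses `\u00DC` (the character Ü) so the result does not depend on the encoding of the script file itself. It also suggests why the first attempt in the question changes nothing: encoding to UTF-8 and decoding straight back from UTF-8 is a round trip, and the following `write(ts)` without a charset argument presumably falls back to the platform default.

```groovy
def text = '\u00DC'                           // the character Ü as a Java/Groovy String

byte[] latin1 = text.getBytes('ISO-8859-1')   // one byte in ISO-8859-1
byte[] utf8   = text.getBytes('UTF-8')        // two bytes in UTF-8

assert latin1.encodeHex().toString() == 'dc'
assert utf8.encodeHex().toString() == 'c39c'

// Decoding with the matching charset recovers the same String either way.
assert new String(latin1, 'ISO-8859-1') == text
assert new String(utf8, 'UTF-8') == text

// The round trip from the question's first attempt is a no-op on the String.
assert new String(text.getBytes('UTF-8'), 'UTF-8') == text
```

The charset only matters at the boundaries: when bytes become a String (reading) and when a String becomes bytes (writing).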

But if you work with XML, you shouldn't have to worry about the encoding anyway, because the XML parser will take care of it. It reads the first characters, <?xml, and determines the basic encoding from them. After that, it can read the encoding information from your XML header and use it.
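
One caveat worth sketching: the two-line conversion above changes the bytes but leaves the declaration saying encoding="ISO-8859-1", while the question asks for encoding="UTF-8" in the result. A minimal sketch of one way to handle that follows. The temp files stand in for the question's paths and are purely illustrative, and the simple string replacement assumes the declaration is written exactly as in the question (double quotes, upper case); a real file may need a more tolerant pattern.

```groovy
// Hypothetical temp files standing in for myiso88591.xml / myutf8.xml.
def src = File.createTempFile('myiso88591-', '.xml')
def dst = File.createTempFile('myutf8-', '.xml')

// Create a sample ISO-8859-1 input like the one in the question.
src.write('<?xml version="1.0" encoding="ISO-8859-1" ?>\n' +
          '<HelloEncodingWorld>\u00DC\u00F6\u00E4\u00FC\u00DF Test!!!</HelloEncodingWorld>\n',
          'ISO-8859-1')

// Decode with the source charset, fix the declaration, encode with the target charset.
def xml = src.getText('ISO-8859-1')
def converted = xml.replaceFirst(/encoding="ISO-8859-1"/, 'encoding="UTF-8"')
dst.write(converted, 'UTF-8')
```

Without the replacement step, a parser that trusts the declaration could decode the new UTF-8 bytes as ISO-8859-1 and show garbled umlauts.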

寻找我们的幸福 2024-12-09 18:19:16

Making it a little more Groovy, and not requiring the whole file to fit in memory, you can use the readers and writers to stream the file. This was my solution when I had files too big for plain old Unix iconv(1).

new FileOutputStream('out.txt').withWriter('UTF-8') { writer ->
    new FileInputStream('in.txt').withReader('ISO-8859-1') { reader ->
        writer << reader
    }
}
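
For anyone who prefers to see the copy spelled out, here is a rough sketch of the same streaming idea with an explicit buffer loop, using `File.withReader` and `File.withWriter` with a charset argument. The temp files and the 8192-character buffer size are assumptions for illustration, not part of the answer above.

```groovy
// Hypothetical temp files so the sketch is self-contained.
def src = File.createTempFile('latin1-', '.txt')
def dst = File.createTempFile('utf8-', '.txt')
src.bytes = '\u00DC\u00F6\u00E4\u00FC\u00DF Test!!!'.getBytes('ISO-8859-1')

src.withReader('ISO-8859-1') { reader ->
    dst.withWriter('UTF-8') { writer ->
        char[] buffer = new char[8192]        // only one buffer's worth is held in memory
        int n
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n)        // decoded chars are re-encoded as UTF-8 on write
        }
    }
}
```

Both closures close their streams when they finish, so no explicit cleanup is needed.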