Problem encoding java->xls

Posted on 2024-09-03 22:07:22

This is not purely a Java question; it may also be related to HTML.

I've written a Java servlet that queries a database table and shows the
result as an HTML table. The user can also ask to receive the result as
an Excel sheet.
I'm creating the Excel sheet by printing the same HTML table, but with
the content type "application/vnd.ms-excel". The Excel file is
created fine.
The problem is that the tables may contain non-English data, so I want
to use UTF-8 encoding.

PrintWriter out = response.getWriter();
response.setContentType("application/vnd.ms-excel:ISO-8859-1");
//response.setContentType("application/vnd.ms-excel:UTF-8");
response.setHeader("cache-control", "no-cache");
response.setHeader("Content-Disposition", "attachment; filename=file.xls");
out.print(src);
out.flush();

The non-English characters appear as garbage (áéíóú).

I also tried converting the String to bytes and back:

byte[] arrByte = src.getBytes("ISO-8859-1");
String result = new String(arrByte, "UTF-8");

But I am still getting garbage. What can I do?
Thanks

UPDATE: If I open the Excel file in Notepad++, the file encoding is shown as "UTF-8 without BOM". If I change the encoding to "UTF-8" and then open the file in Excel, the characters "áéíóú" look fine.
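
The conversion attempt in the question can be reproduced in isolation. The sketch below assumes the string already holds correct text; the class and method names are illustrative and not from the post. Encoding the text as ISO-8859-1 and then decoding those bytes as UTF-8 does not re-encode anything, it only misreads the bytes, so an accented letter comes back as the replacement character U+FFFD.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is not part of the original question.
final class RoundTripDemo {

    // Mirrors the question's conversion: encode as ISO-8859-1, decode as UTF-8.
    static String latin1ThenUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String accented = "\u00E9"; // the letter e with acute accent
        String mangled = latin1ThenUtf8(accented);
        // The single byte 0xE9 is not a valid UTF-8 sequence on its own,
        // so the decoder substitutes U+FFFD.
        System.out.println(mangled.equals("\uFFFD")); // true
        // Plain ASCII survives because both charsets agree on those bytes.
        System.out.println(latin1ThenUtf8("abc"));    // abc
    }
}
```

This suggests the string itself does not need converting; what matters is which charset is used when the characters are written to the response.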

Comments (4)

夜雨飘雪 2024-09-10 22:07:22

Excel is a binary format, not a text format, so you should not need to set any encoding, since it simply doesn't apply. Whatever system you are using to build the Excel file (e.g. Apache POI) will take care of the encoding of text within the Excel file.

You should not try to convert the received bytes to a string; just store them in a byte array or write them out to a file.

EDIT: From the comment, it sounds as if you are not using a "real" binary Excel file, but a tab-delimited text file (CSV). In that case, make sure you use one encoding consistently throughout, e.g. UTF-8.

Also, call setContentType before calling response.getWriter(), because the writer's character encoding is fixed once the writer has been obtained.

See HttpServletResponse.getWriter()

EDIT: You can try writing the BOM. It's normally not required, but file format handling in Office is far from normal...

Java's UTF-8 encoder does not write a BOM for you, so you have to write it yourself. That means using the response's OutputStream rather than its Writer, since you need to write raw bytes (the BOM). So you change your code to this:

response.setContentType("application/vnd.ms-excel; charset=UTF-8");
// set the other headers as well ("cache-control" etc.)
OutputStream outputStream = response.getOutputStream();
outputStream.write(0xEF);   // 1st byte of the UTF-8 BOM
outputStream.write(0xBB);   // 2nd byte
outputStream.write(0xBF);   // last byte of the BOM
// now wrap the stream in a PrintWriter to write the characters as UTF-8
PrintWriter out = new PrintWriter(new OutputStreamWriter(outputStream, "UTF-8"));
out.print(src);
out.flush();   // push buffered characters through to the response stream
月依秋水 2024-09-10 22:07:22

Do you get "garbage" when you print result to standard output?

Edit (code in code tags from the comment below):
response.setContentType("application/vnd.ms-excel; charset=UTF-8");
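
To see why the separator matters, here is a minimal sketch of how a charset parameter can be picked out of a Content-Type value. The helper `charsetOf` is hypothetical and deliberately simplified (it ignores quoted values); it is not how any particular servlet container parses the header. It only shows that a value written with a colon carries no charset parameter at all.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only.
final class ContentTypeCharset {

    // Returns the value of the charset parameter, or null if there is none.
    static String charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return param.substring("charset=".length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("application/vnd.ms-excel; charset=UTF-8")); // UTF-8
        System.out.println(charsetOf("application/vnd.ms-excel:UTF-8"));          // null
    }
}
```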

风透绣罗衣 2024-09-10 22:07:22

Try using the ServletResponse.setCharacterEncoding(java.lang.String charset) method.

response.setCharacterEncoding("UTF-8");
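
As a rough illustration of why the response encoding matters, the sketch below uses a plain OutputStreamWriter as a stand-in for the writer the container hands out. That substitution is an assumption made so the example runs without the servlet API, and the class and method names are hypothetical. The same character produces different bytes depending on the charset the writer was created with.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; stands in for writing through response.getWriter().
final class WriterEncodingDemo {

    // Writes text through a writer bound to the given charset and returns the bytes.
    static byte[] encodeThroughWriter(String text, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(buffer, charset);
        writer.write(text);
        writer.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String accented = "\u00E9"; // the letter e with acute accent
        // One byte (0xE9) in ISO-8859-1, two bytes (0xC3 0xA9) in UTF-8.
        System.out.println(encodeThroughWriter(accented, StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(encodeThroughWriter(accented, StandardCharsets.UTF_8).length);      // 2
    }
}
```
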
还不是爱你 2024-09-10 22:07:22

I had the same issue. I fixed it by using print() instead of write():

outputStream.print('\ufeff');
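
One plausible reading of this fix: when the character U+FEFF is printed through a writer that encodes as UTF-8, it comes out as the three bytes EF BB BF, which is the UTF-8 BOM. The sketch below checks that with a ByteArrayOutputStream and a PrintWriter standing in for the response objects; those stand-ins and the class name are assumptions, not the commenter's actual setup.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the buffer stands in for the servlet response stream.
final class BomViaWriter {

    // Prints U+FEFF followed by the body through a UTF-8 writer and returns the bytes.
    static byte[] printWithBom(String body) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(buffer, StandardCharsets.UTF_8));
        out.print('\uFEFF'); // encoded by the writer as EF BB BF
        out.print(body);
        out.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = printWithBom("<table></table>");
        // Show the first three bytes in hex.
        System.out.println(String.format("%02X %02X %02X",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF)); // EF BB BF
    }
}
```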