Java split confused by the space character

Posted on 2024-11-01 23:48:02

I am splitting a string that contains a filename from a Windows system. The string uses the ASCII FS character to separate the filename from other information,

e.g. filename.jpgFSotherInformationFSanotherPartOfInformation

Here is some example code:

String fs = new String(new byte[]{(byte)32}); 
String information ="filename (copy).jpg"+fs+"otherInformation"; 
String[] parts = information.split(fs);

Why does split confuse the space separator with the ASCII FS?

Should I use a different function than split? Pattern.quote(fs) doesn't help either... :-(

Comments (4)

计㈡愣 2024-11-08 23:48:02

Because FS is not ASCII value 32.

http://bestofthisweb.com/blogs/tag/ascii-table/

FS is character 28, but this control character should not be used in file names. It is only used by some rare binary file formats (I don't know of one which uses it anymore).

The space character is 32, which is why your delimiter looks the same as a space to split: it is a space.

For a simple field separator, I suggest you use ',' or '\t', which can easily be read as text or opened in a spreadsheet package.

I would suggest stepping through the code in a debugger so you can see what your program is doing.
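
As a lighter alternative to a debugger, the delimiter's numeric code can simply be printed. The following is a minimal sketch; the class name DelimiterInspect and its helper methods are hypothetical names chosen for illustration, and US_ASCII is passed explicitly (the question's code relies on the platform default charset instead).

```java
import java.nio.charset.StandardCharsets;

public class DelimiterInspect {
    // Build a one-character string from a single ASCII byte value
    // (hypothetical helper mirroring the question's new String(new byte[]{...})).
    static String fromAsciiByte(int code) {
        return new String(new byte[] {(byte) code}, StandardCharsets.US_ASCII);
    }

    // Numeric code of the first character in the string.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        String questionDelimiter = fromAsciiByte(32);
        String realFs = fromAsciiByte(28);

        System.out.println(firstCodePoint(questionDelimiter)); // prints 32
        System.out.println(" ".equals(questionDelimiter));     // prints true
        System.out.println(firstCodePoint(realFs));            // prints 28
    }
}
```

Seeing 32 rather than 28 confirms that the delimiter in the question is an ordinary space.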

∞觅青森が 2024-11-08 23:48:02

You've initialized fs with a space (in a rather complicated way). The following is equivalent and shows your problem:

String fs = " "; 
String information ="filename (copy).jpg"+fs+"otherInformation"; 
String[] parts = information.split(fs);

The ASCII character FS has the code 0x1C (decimal 28), so this should work properly:

String fs = "\u001C"; 
String information ="filename (copy).jpg"+fs+"otherInformation"; 
String[] parts = information.split(fs);
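
For completeness, here is the same snippet wrapped in a runnable class. It is only a sketch: the class name FsSplitDemo and the method splitOnFs are hypothetical names introduced for this example.

```java
import java.util.Arrays;

public class FsSplitDemo {
    // ASCII File Separator, decimal 28 (hex 1C).
    static final String FS = "\u001C";

    // Split a record on the FS control character.
    static String[] splitOnFs(String information) {
        return information.split(FS);
    }

    public static void main(String[] args) {
        String information = "filename (copy).jpg" + FS + "otherInformation";
        String[] parts = splitOnFs(information);

        // The space inside the filename is left alone; only FS separates fields.
        System.out.println(Arrays.toString(parts));
        // prints [filename (copy).jpg, otherInformation]
    }
}
```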

Background information

The file separator FS is an interesting control code, as it gives us insight into the way computer technology was organized in the sixties. We are now used to random-access media like RAM and magnetic disks, but when the ASCII standard was defined, most data was serial. I am not only talking about serial communications, but also about serial storage like punch cards, paper tape and magnetic tape. In such a situation it is clearly efficient to have a single control code to signal the separation of two files. The FS was defined for this purpose. (source)

The FS was invented to separate actual files, not filenames in a hierarchical file directory. Technically, yes, you can use it, but it has a different meaning.

紅太極 2024-11-08 23:48:02

Because FS is ASCII value 28.

ASCII value 32 is the space character.

云胡 2024-11-08 23:48:02

The parameter of split is actually a regular expression. Have you tried

String[] parts = information.split("\\x20");

Or even

String[] parts = information.split("\\s");
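
Because split treats its argument as a regular expression, Pattern.quote only matters when the delimiter contains regex metacharacters. The sketch below illustrates this; the class name RegexSplitDemo and the method splitLiteral are hypothetical names used for illustration. It also shows why quoting does not help in the question: a quoted space is still a space.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class RegexSplitDemo {
    // Split on the delimiter taken literally, not as a regex.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // "|" is a regex metacharacter, so quoting changes the result here.
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));
        // prints [a, b, c]

        // A space is not a metacharacter: quoted or not, it still splits
        // the filename from the question at its embedded space.
        String information = "filename (copy).jpg" + " " + "otherInformation";
        System.out.println(Arrays.toString(splitLiteral(information, " ")));
        // prints [filename, (copy).jpg, otherInformation]
    }
}
```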