NSTask output formatting
I'm using an NSTask to grab the output from /usr/bin/man. I'm getting the output but without formatting (bold, underline). Something that should appear like this:
Bold text with underline
(note the italic text is actually underlined, there's just no formatting for it here)
Instead gets returned like this:
BBoolldd text with _u_n_d_e_r_l_i_n_e
I have a minimal test project at http://cl.ly/052u2z2i2R280T3r1K3c that you can download and run; note the window does nothing; the output gets logged to the Console.
I presume I need to somehow interpret the NSData object manually but I have no idea where to start on that. I'd ideally like to translate it to an NSAttributedString but the first order of business is actually eliminating the duplicates and underscores. Any thoughts?
Comments (3)
What is your actual purpose? If you want to show a man page, one option is to convert it to HTML and render it with a Web view.
Parsing `man`'s output can be tricky because it is processed by `groff` using a terminal processor by default. This means that the output is tailored to be shown on terminal devices.
One alternative solution is to determine the actual location of the man page source file (e.g. with `man -w`), decompress it with `gunzip` if necessary, and manually invoke `groff` on it with `-a` (ASCII approximation) and `-c` (disable colour output). This will result in an ASCII file without most of the formatting. To generate HTML output, use the `-Thtml` output device instead.
You can also specify these options in a custom configuration file for `man`, e.g. parseman.conf, and tell `man` to use that configuration file with the `-C` option instead of invoking `man -w`, `gunzip`, and `groff`. The default configuration file is `/private/etc/man.conf`.
Also, you can probably tailor the output of the terminal device processor by passing appropriate options to `grotty`.
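As a rough illustration of the manual route described in this answer, the sketch below locates the page source with `man -w`, decompresses it if it is gzipped, and feeds it to `groff`. It is written in Python only to keep it short. It assumes `man` and `groff` are on the PATH and that the page uses the standard man macros (hence `-mandoc`, which the answer does not mention). All function names are made up for this example.

```python
import gzip
import subprocess

# groff options per output mode. -a (ASCII approximation) and -c (no colour)
# come from the answer above; -Thtml selects groff's HTML output device.
GROFF_MODES = {
    "ascii": ["-a", "-c"],
    "html": ["-Thtml"],
}


def groff_command(mode):
    """Build the groff argument list for one mode (hypothetical helper).

    -mandoc is an assumption: it asks groff to load the man/mdoc macros.
    """
    return ["groff", "-mandoc", *GROFF_MODES[mode]]


def man_source_path(name):
    """Ask man where the source file for a page lives (man -w)."""
    result = subprocess.run(
        ["man", "-w", name], capture_output=True, text=True, check=True
    )
    # Use the first line in case more than one location is printed.
    return result.stdout.splitlines()[0].strip()


def read_man_source(path):
    """Return the raw page source, decompressing .gz files if needed."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        return handle.read()


def render_man_page(name, mode="ascii"):
    """Run the page source through groff and return its output bytes."""
    source = read_man_source(man_source_path(name))
    result = subprocess.run(
        groff_command(mode), input=source, capture_output=True, check=True
    )
    return result.stdout
```

On a system where both tools are installed, `render_man_page("ls", "html")` should return HTML that could be handed to a Web view, and the `"ascii"` mode should return text without the backspace sequences.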
Okay, here's the start of my solution, though I would be interested in any additional (easier?) ways to do this.
The output returned from the task is UTF-8 encoded, but decoding it with NSUTF8StringEncoding doesn't yield the expected string. The reason is the way the output is formatted.
The letter N is 0x4e in UTF-8, but the corresponding bytes in the NSData are 0x4e 0x08 0x4e, where 0x08 is a backspace. So for a bold letter, the output contains letter-backspace-letter.
An italic (underlined) c is 0x63 in UTF-8, but the NSData contains 0x5f 0x08 0x63, where 0x5f is an underscore. So for italics, the output contains underscore-backspace-letter.
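The two byte patterns just described can be decoded in a single pass. What follows is a minimal sketch of that idea, not the parser mentioned in this answer. It is written in Python for brevity, and the function names are invented for illustration. It assumes the only overstrike forms are letter-backspace-letter (bold) and underscore-backspace-letter (underline), possibly stacked on the same character. An underscore struck over an underscore is ambiguous and is treated as bold here.

```python
from itertools import groupby

BACKSPACE = "\b"  # 0x08


def decode_overstrike(data):
    """Decode overstrike bytes into (char, bold, underline) triples."""
    text = data.decode("utf-8", errors="replace")
    cells = []  # each cell is [char, bold, underline]
    i = 0
    while i < len(text):
        ch = text[i]
        can_combine = (
            ch == BACKSPACE
            and cells
            and i + 1 < len(text)
            and text[i + 1] != BACKSPACE
        )
        if can_combine:
            cell, over = cells[-1], text[i + 1]
            if cell[0] == over:
                cell[1] = True  # same character twice: bold
            elif cell[0] == "_":
                cell[0], cell[2] = over, True  # underscore first: underline
            else:
                cell[0] = over  # unknown pairing: keep the later character
            i += 2
        elif ch == BACKSPACE:
            i += 1  # stray backspace with nothing to combine: drop it
        else:
            cells.append([ch, False, False])
            i += 1
    return [tuple(cell) for cell in cells]


def style_runs(data):
    """Group decoded characters into (text, bold, underline) runs."""
    cells = decode_overstrike(data)
    runs = []
    for (bold, underline), group in groupby(cells, key=lambda c: (c[1], c[2])):
        runs.append(("".join(c[0] for c in group), bold, underline))
    return runs
```

For the bytes `b"B\x08Bo\x08ol\x08ld\x08d text"`, `style_runs` returns `[("Bold", True, False), (" text", False, False)]`. In a Cocoa program each run could then be appended to an NSMutableAttributedString with the matching font and underline attributes.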
I really don't see any way around this at this point besides just scanning the raw NSData for these sequences. I'll probably post the source to my parser here once I finish it, unless anybody has any existing code. As the common programming phrase goes, never write yourself what you can copy. :)
I've got a good, fast parser together for taking man output and replacing the bold/underlined output with bold/underlined formatting in an NSMutableAttributedString. Here's the code if anybody else needs to solve the same problem:
Another method would be to convert the man page to PostScript source code, run that through the PostScript-to-PDF converter, and put that into a PDFView.
The implementation would be similar to Bavarious's answer, just with different arguments to groff (`-Tps` instead of `-Thtml`). This would be the slowest solution, but also probably the best for printing.