How to properly escape a Unicode character in a bash prompt

Posted 2024-11-30 13:35:16

I have a specific method for my bash prompt, let's say it looks like this:

CHAR="༇ "
my_function="
    prompt=\" \[\$CHAR\]\"
    echo -e \$prompt"

PS1="\$(${my_function}) \$ "

To explain the above, I'm building my bash prompt by executing a function stored in a string, a decision made as the result of an earlier question. Let's pretend it works fine, because it does, except when Unicode characters get involved.

I am trying to find the proper way to escape a Unicode character, because right now it messes with bash's idea of the line length. An easy way to test whether it's broken is to type a long command, execute it, press CTRL-R and type to find it, then press CTRL-A and CTRL-E to jump to the beginning and end of the line. If the text gets garbled, it's not working.

I have tried several things to properly escape the unicode character in the function string, but nothing seems to be working.

Special characters like this work:

COLOR_BLUE=$(tput sgr0 && tput setaf 6)

my_function="
    prompt=\"\\[\$COLOR_BLUE\\] \"
    echo -e \$prompt"

This is the main reason I made the prompt a function string. That escape sequence does NOT mess with the line length; only the Unicode character does.

Comments (3)

桃扇骨 2024-12-07 13:35:16

The \[...\] sequence tells bash to ignore this part of the string completely when computing the prompt's width, which is useful when your prompt contains a zero-length sequence, such as a control sequence that changes the text color or the title bar. But in this case you are printing a character, so its length is not zero. Perhaps you could work around this by using a no-op escape sequence to fool Bash into calculating the correct line length, but it sounds like that way lies madness.

The correct solution would be for the line length calculations in Bash to correctly grok UTF-8 (or whichever Unicode encoding it is that you are using). Uhm, have you tried without the \[...\] sequence?
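To make the point about \[...\] concrete, here is a rough sketch of which part of a PS1-style string bash counts toward the prompt width. The helper name counted_part is hypothetical, and it only strips \[...\] spans; it does not expand other prompt escapes such as \w or \$, so treat it as an illustration rather than a reimplementation of what bash does.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the part of a PS1-style string that lies
# outside \[ ... \], i.e. the text bash assumes occupies screen columns.
counted_part() {
    local rest=$1 out=""
    while [[ $rest == *'\['* ]]; do
        out+=${rest%%'\['*}        # keep the text before the next \[
        rest=${rest#*'\['}         # drop everything through that \[
        if [[ $rest == *'\]'* ]]; then
            rest=${rest#*'\]'}     # drop the ignored span and its \]
        else
            rest=""                # unterminated \[ : ignore the remainder
        fi
    done
    printf '%s' "$out$rest"
}

# A color sequence is genuinely zero-width, so ignoring it is correct.
printf 'counted: [%s]\n' "$(counted_part '\[\e[36m\]abc\[\e[0m\] > ')"

# The glyph is visible on screen but sits inside \[...\], so it is not
# counted, and the cursor arithmetic ends up short by the glyph's width.
printf 'counted: [%s]\n' "$(counted_part ' \[༇ \] > ')"
```

In the second call the glyph and the space after it are missing from the counted text, which is the mismatch that garbles line editing.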

Edit: The following implements the solution I propose in the comments below. The cursor position is saved, then two spaces are printed, outside of \[...\], then the cursor position is restored, and the Unicode character is printed on top of the two spaces. This assumes a fixed font width, with double width for the Unicode character.

PS1='\['"`tput sc`"'\]  \['"`tput rc`"'༇ \] \$ '

At least in the OSX Terminal, Bash 3.2.17(1)-release, this passes cursory [sic] testing.

In the interest of transparency and legibility, I have ignored the requirement to have the prompt's functionality inside a function, and the color coding; this just changes the prompt to the character, space, dollar prompt, space. Adapt to suit your somewhat more complex needs.
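As a sketch of how the one-liner above might be wrapped up for reuse: the function name wide_glyph_prompt and its width parameter are my own additions, and the default width of 2 is the same assumption as above (a fixed-width font that draws the glyph two columns wide). The ESC 7 / ESC 8 fallback is the common VT100-style save/restore pair and is only used if tput cannot supply one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a PS1 fragment that displays a glyph while
# bash counts only the plain spaces printed underneath it.
#   $1 = glyph to show
#   $2 = assumed on-screen width in columns (default 2)
wide_glyph_prompt() {
    local glyph=$1 width=${2:-2} pad sc rc
    # Ask terminfo for the save/restore-cursor sequences; fall back to
    # ESC 7 / ESC 8 if tput is missing or TERM is not usable.
    sc=$(tput sc 2>/dev/null) || sc=$'\e7'
    rc=$(tput rc 2>/dev/null) || rc=$'\e8'
    printf -v pad '%*s' "$width" ''
    # \[save\]   ignored by the width calculation
    # spaces     counted, and they reserve the columns
    # \[restore + glyph\]   ignored, drawn on top of the spaces
    printf '%s' "\\[${sc}\\]${pad}\\[${rc}${glyph}\\]"
}

PS1="$(wide_glyph_prompt '༇')"' \$ '
```

If the glyph renders one column wide in your terminal, passing 1 as the second argument should keep the cursor arithmetic consistent.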

帅的被狗咬 2024-12-07 13:35:16

@tripleee wins it, posting the final solution here because it's a pain to post code in comments:

CHAR="༇"
my_function="
    prompt=\" \\[`tput sc`\\]  \\[`tput rc`\\]\\[\$CHAR\\] \"
    echo -e \$prompt"

PS1="\$(${my_function}) \$ "

The trick as pointed out in @tripleee's link is the use of the commands tput sc and tput rc which save and then restore the cursor position. The code is effectively saving the cursor position, printing two spaces for width, restoring the cursor position to before the spaces, then printing the special character so that the width of the line is from the two spaces, not the character.

深居我梦 2024-12-07 13:35:16

(Not the answer to your problem, but some pointers and general experience related to your issue.)

I see the behaviour you describe about command-line editing (Ctrl-R, ... Ctrl-A Ctrl-E ...) all the time, even without Unicode characters.

At one work-site, I spent the time to figure out the difference between the terminal's interpretation of the TERM setting and the TERM definition used by the OS (well, stty I suppose).

NOW, when I have this problem, I escape out of my current attempt to edit the line, bring the line up again, and then immediately go to the 'vi' mode, which opens the vi editor. (press just the 'v' char, right?). All the ease of use of a full-fledged session of vi; why go with less ;-)?

Looking again at your problem description, when you say

my_function="
    prompt=\" \[\$CHAR\]\"
    echo -e \$prompt"

That is just a string definition, right? I'm assuming you're simplifying the problem definition by presenting this as the output of your my_function. It seems very likely that the steps of creating the function definition, calling the function, AND using the values returned offer a lot of opportunities for shell quoting to not work the way you want it to.

If you edit your question to include the my_function definition and its complete use (reducing your function to just what is causing the problem), it may be easier for others to help with this too. Finally, do you use set -vx regularly? It can help show the how/when/what of variable expansions; you may find something there.

Failing all of those, look at O'Reilly's termcap & terminfo. You may need to look at the man page for your local system's stty and related commands, AND you may do well to look for user groups specific to your Linux system (I'm assuming you use a Linux variant).
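For what it's worth, here is a minimal sketch of that kind of tracing, using a cut-down stand-in for the string from the question (the single-quoted form and the plain echo are my simplifications, not the asker's exact code). It runs the stored code with set -x inside a command substitution, so tracing is confined to that subshell and the trace can be captured along with the normal output.

```bash
#!/usr/bin/env bash
CHAR="༇"

# Same shape as the question's approach: shell code stored in a string.
my_function='
    prompt=" \[$CHAR\]"
    echo "$prompt"'

# Run the stored code with tracing on. set -x writes each command to
# stderr after expansion, so 2>&1 merges the trace with the real output.
trace=$( { set -x; eval "$my_function"; } 2>&1 )

printf '%s\n' "$trace"
```

The traced lines show each assignment and command as bash actually sees it after quoting and expansion, which is usually where a stray or missing backslash becomes visible. Adding -v as well would also echo the input lines as they are read.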

I hope this helps.
