Convert string with explicit escape sequence into relative character

Posted on 2024-10-31 13:43:02

I need a function that converts "explicit" escape sequences into the corresponding non-printable characters.
For example:

char str[] = "\\n";
cout << "Line1" << convert_esc(str) << "Line2" << endl;

would give this output:

Line1

Line2

Is there any function that does this?

Comments (5)

忆离笙 2024-11-07 13:43:02

I think you will have to write such a function yourself, since escape sequences are a compile-time feature: when you write "\n", the compiler replaces the \n sequence with the end-of-line character. The resulting string has length 1 (excluding the terminating zero character).

In your case the string "\\n" has length 2 (again excluding the terminating zero) and contains \ and n.
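
The length difference can be checked directly. This is only a minimal sketch; the names one_char and two_chars are made up for the illustration and do not come from the question.

```cpp
#include <string>

// "\n" in source code: the compiler stores a single newline character.
const std::string one_char = "\n";

// "\\n" in source code: the compiler stores two characters,
// a backslash followed by the letter n.
const std::string two_chars = "\\n";
```

Here one_char.size() is 1 and two_chars.size() is 2, which is why the second string prints as the two visible characters \n rather than as a line break.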

You need to scan your string and, when you encounter a \, check the following character. If it is one of the legal escapes, replace both characters with the corresponding character; otherwise skip them or leave them both as is.

Example ( http://ideone.com/BvcDE ):

#include <string>
using std::string;

// Replaces the escape sequences \\, \n and \t in s with the characters they denote.
string unescape(const string& s)
{
  string res;
  string::const_iterator it = s.begin();
  while (it != s.end())
  {
    char c = *it++;
    if (c == '\\' && it != s.end())
    {
      switch (*it++) {
      case '\\': c = '\\'; break;
      case 'n': c = '\n'; break;
      case 't': c = '\t'; break;
      // all other escapes
      default:
        // invalid escape sequence - skip it. alternatively you can copy it as is, throw an exception...
        continue;
      }
    }
    res += c;
  }

  return res;
}
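
The default branch above drops unknown escapes. If you would rather copy them through unchanged (the other option mentioned in the comment), something along these lines might do. It is a sketch only: the name unescape_keep_unknown and the extra cases (\r and the two quote characters) are my own additions, not part of the answer's code.

```cpp
#include <string>

// Hypothetical variant of unescape(): unrecognised escapes are copied
// through unchanged instead of being dropped.
std::string unescape_keep_unknown(const std::string& s)
{
    std::string res;
    res.reserve(s.size());
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        char c = s[i];
        if (c != '\\' || i + 1 == s.size()) {
            res += c;           // ordinary character, or a lone trailing backslash
            continue;
        }
        char next = s[++i];     // the character that follows the backslash
        switch (next) {
        case '\\': res += '\\'; break;
        case 'n':  res += '\n'; break;
        case 't':  res += '\t'; break;
        case 'r':  res += '\r'; break;
        case '"':  res += '"';  break;
        case '\'': res += '\''; break;
        default:                // not a recognised escape: keep both characters
            res += '\\';
            res += next;
            break;
        }
    }
    return res;
}
```

With this variant, the question's convert_esc("\\n") would return a one-character string holding a newline, while an input such as "\\q" comes back untouched.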
泅渡 2024-11-07 13:43:02

You can do that fairly easily using the Boost String Algorithms library. For example:

#include <string>
#include <iostream>
#include <boost/algorithm/string.hpp>

void escape(std::string& str)
{
  boost::replace_all(str, "\\\\", "\\");
  boost::replace_all(str, "\\t",  "\t");
  boost::replace_all(str, "\\n",  "\n");
  // ... add others here ...
}

int main()
{
  std::string str = "This\\tis\\n \\\\a test\\n123";

  std::cout << str << std::endl << std::endl;
  escape(str);
  std::cout << str << std::endl;

  return 0;
}

This is surely not the most efficient way to do this (because it iterates the string multiple times), but it is compact and easy to understand.

Update:
As ybungalobill has pointed out, this implementation gives wrong results whenever a replacement produces a character sequence that a later replacement searches for, or whenever a replacement removes or modifies a character sequence that should have been replaced.

An example of the first case is "\\\\n" -> "\\n" -> "\n". If you put the "\\\\" -> "\\" replacement last (which seems to be the solution at first glance), you get an example of the second case: "\\\\n" -> "\\\n". There is no simple fix for this problem, so the technique is only feasible for very simple escape sequences.
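
Both orderings can be tried out in a few lines. The sketch below is an illustration, not the code from this answer: replace_all is a hand-written stand-in for boost::replace_all, so nothing beyond the standard library is needed, and the names backslash_first and backslash_last are invented here.

```cpp
#include <string>

// Stand-in for boost::replace_all using std::string only
// (an assumption of this sketch, to avoid the Boost dependency).
void replace_all(std::string& str, const std::string& from, const std::string& to)
{
    if (from.empty()) return;
    std::string::size_type pos = 0;
    while ((pos = str.find(from, pos)) != std::string::npos) {
        str.replace(pos, from.size(), to);
        pos += to.size();   // continue searching after the inserted text
    }
}

// Ordering used in the answer: handle the escaped backslash first.
std::string backslash_first(std::string s)
{
    replace_all(s, "\\\\", "\\");
    replace_all(s, "\\n", "\n");
    return s;
}

// Alternative ordering: handle the escaped backslash last.
std::string backslash_last(std::string s)
{
    replace_all(s, "\\n", "\n");
    replace_all(s, "\\\\", "\\");
    return s;
}
```

For the input "\\\\n" (backslash, backslash, n) the correct result is a backslash followed by n. backslash_first returns a single newline, and backslash_last returns a backslash followed by a newline, so neither ordering is right.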

If you need a generic (and more efficient) solution, you should implement a state machine that iterates the string, as proposed by davka.

素手挽清风 2024-11-07 13:43:02

I'm sure that there is, written by someone, but it's so trivial that I doubt it's been specifically published anywhere.

Just recreate it yourself from the various "find"/"replace"-esque algorithms in the standard library.

风筝有风,海豚有海 2024-11-07 13:43:02

Have you considered using printf? (or one of its relatives)

陪我终i 2024-11-07 13:43:02

Here's a cute way to do it on Unixy platforms.

It calls the operating system's echo command to make the conversion.

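#include <cassert>
#include <stdio.h>   // popen and pclose are POSIX, declared in <stdio.h>
#include <string>
using std::string;

string convert_escapes( string input )
   {
   string buffer(input.size()+1,0);
   // Caution: input is pasted into a shell command line, so use trusted input only.
   string cmd = "/usr/bin/env echo -ne \""+input+"\"";
   FILE * f = popen(cmd.c_str(),"r"); assert(f);
   buffer.resize(fread(&buffer[0],1,buffer.size()-1,f));
   pclose(f);   // a stream opened with popen must be closed with pclose, not fclose
   return buffer;
   }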