Convert a string with explicit escape sequences into the corresponding characters

Posted 2024-10-31 13:43:02

I need a function that converts "explicit" escape sequences into the corresponding non-printable characters. For example:

char str[] = "\\n";
cout << "Line1" << convert_esc(str) << "Line2" << endl;

would give this output:

Line1

Line2

Is there any function that does this?

忆离笙 2024-11-07 13:43:02

I think you will have to write such a function yourself, since escape sequences are a compile-time feature: when you write "\n" in source code, the compiler replaces the \n sequence with the end-of-line character. The resulting string has length 1 (excluding the terminating zero character).

In your case, the string "\\n" has length 2 (again excluding the terminating zero) and contains the two characters \ and n.
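The length difference can be checked directly. This is only an illustrative sketch; the two helper function names are made up for the example and are not part of the answer.

```cpp
#include <cstring>

// Hypothetical helper: length of the compile-time escape.
// The compiler has already turned \n into a single character.
std::size_t compiled_escape_length() { return std::strlen("\n"); }

// Hypothetical helper: length of the "explicit" sequence from the question,
// i.e. a backslash followed by the letter n.
std::size_t explicit_escape_length() { return std::strlen("\\n"); }
```

The first function returns 1 and the second returns 2, which is why the second string needs a run-time conversion step.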

You need to scan the string and, whenever you encounter a \, check the character that follows it. If the pair forms one of the legal escapes, replace both characters with the corresponding single character; otherwise skip them or leave them both as they are.

Example (http://ideone.com/BvcDE):

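#include <string>
using std::string;

// Returns a copy of s with backslash escape sequences replaced by the characters they denote.
string unescape(const string& s)
{
  string res;
  string::const_iterator it = s.begin();
  while (it != s.end())
  {
    char c = *it++;
    // A backslash followed by at least one more character starts an escape sequence.
    if (c == '\\' && it != s.end())
    {
      switch (*it++) {
      case '\\': c = '\\'; break;
      case 'n': c = '\n'; break;
      case 't': c = '\t'; break;
      // all other escapes
      default:
        // invalid escape sequence - skip it. alternatively you can copy it as is, throw an exception...
        continue;
      }
    }
    res += c;
  }

  return res;
}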
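The default branch above drops unknown escapes. As a rough sketch of the "copy it as is" alternative mentioned in the comment, the variant below keeps unknown sequences unchanged and adds two more escapes. The name unescape_keep_unknown and the choice of extra escapes (\r and \") are assumptions made for illustration, not part of the original answer.

```cpp
#include <string>

// Hypothetical variant of unescape(): handles a few more escapes and copies
// unknown sequences through unchanged instead of dropping them.
std::string unescape_keep_unknown(const std::string& s)
{
  std::string res;
  res.reserve(s.size());
  for (std::string::size_type i = 0; i < s.size(); ++i)
  {
    char c = s[i];
    if (c != '\\' || i + 1 == s.size())
    {
      res += c;            // ordinary character, or a lone trailing backslash
      continue;
    }
    char next = s[++i];    // character following the backslash
    switch (next) {
    case '\\': res += '\\'; break;
    case 'n':  res += '\n'; break;
    case 't':  res += '\t'; break;
    case 'r':  res += '\r'; break;
    case '"':  res += '"';  break;
    default:   res += '\\'; res += next; break;  // unknown: keep both characters
    }
  }
  return res;
}
```

Used in place of convert_esc in the question's cout line, this would put a real line break between Line1 and Line2.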

泅渡 2024-11-07 13:43:02

You can do that fairly easily using the Boost string algorithm library. For example:

#include <string>
#include <iostream>
#include <boost/algorithm/string.hpp>

void escape(std::string& str)
{
  boost::replace_all(str, "\\\\", "\\");
  boost::replace_all(str, "\\t",  "\t");
  boost::replace_all(str, "\\n",  "\n");
  // ... add others here ...
}

int main()
{
  std::string str = "This\\tis\\n \\\\a test\\n123";

  std::cout << str << std::endl << std::endl;
  escape(str);
  std::cout << str << std::endl;

  return 0;
}

This is surely not the most efficient way to do this (because it iterates the string multiple times), but it is compact and easy to understand.

Update:
As ybungalobill has pointed out, this implementation gives wrong results whenever a replacement produces a character sequence that a later replacement searches for, or whenever a replacement removes or modifies a character sequence that should itself have been replaced.

An example of the first case is "\\\\n" -> "\\n" -> "\n". If you move the "\\\\" -> "\\" replacement to the end (which looks like the fix at first glance), you get an example of the second case: "\\\\n" -> "\\\n". There is no simple way around this, which makes the technique feasible only for very simple escape sequences.

If you need a generic (and more efficient) solution, you should implement a state machine that iterates over the string, as proposed by davka.
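To make the ordering problem concrete, here is a small sketch that reproduces both failure cases without Boost. The replace_all helper is a hypothetical stand-in for boost::replace_all written with std::string only, and the two wrapper function names are likewise made up for the example.

```cpp
#include <string>

// Hypothetical stand-in for boost::replace_all, using std::string only.
void replace_all(std::string& str, const std::string& from, const std::string& to)
{
  std::string::size_type pos = 0;
  while ((pos = str.find(from, pos)) != std::string::npos)
  {
    str.replace(pos, from.size(), to);
    pos += to.size();   // continue after the inserted text so it is not rescanned
  }
}

// Backslash replacement first: the backslash it produces can be
// consumed by a later pass.
std::string unescape_backslash_first(std::string s)
{
  replace_all(s, "\\\\", "\\");
  replace_all(s, "\\t", "\t");
  replace_all(s, "\\n", "\n");
  return s;
}

// Backslash replacement last: an earlier pass can consume the second
// backslash of an escaped backslash.
std::string unescape_backslash_last(std::string s)
{
  replace_all(s, "\\t", "\t");
  replace_all(s, "\\n", "\n");
  replace_all(s, "\\\\", "\\");
  return s;
}
```

For the three-character input written as "\\\\n" in source (backslash, backslash, n), the correct result is a backslash followed by n. The first function returns a single newline and the second returns a backslash followed by a newline, so neither ordering is right.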

素手挽清风 2024-11-07 13:43:02

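I'm sure someone has written one, but it is so trivial that I doubt it has been published anywhere on its own.

Just recreate it yourself from the various "find"/"replace"-style algorithms in the standard library.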

风筝有风，海豚有海 2024-11-07 13:43:02

Have you considered using printf (or one of its relatives)?

陪我终i 2024-11-07 13:43:02

Here's a cute way to do it on Unixy platforms.

It calls the operating system's echo command to make the conversion.

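#include <cassert>
#include <stdio.h>   // popen and pclose are POSIX, declared here
#include <string>
using std::string;

// Caution: input is pasted into a shell command unescaped, so only use this with trusted input.
string convert_escapes( string input )
   {
   string buffer(input.size()+1,0);
   string cmd = "/usr/bin/env echo -ne \""+input+"\"";
   FILE * f = popen(cmd.c_str(),"r"); assert(f);
   buffer.resize(fread(&buffer[0],1,buffer.size()-1,f));
   pclose(f);   // a stream opened with popen must be closed with pclose, not fclose
   return buffer;
   }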
