How can I include an extremely long literal in C++ source?

Posted 2024-08-25 20:25:12

I've got a bit of a problem. Essentially, I need to store a large list of whitelisted entries inside my program, and I'd like to include such a list directly -- I don't want to have to distribute other libraries and such, and I don't want to embed the strings into a Win32 resource, for a bunch of reasons I don't want to go into right now.

I simply included my big whitelist in my .cpp file, and was presented with this error:

1>ServicesWhitelist.cpp(2807): fatal error C1091: compiler limit: string exceeds 65535 bytes in length

The string itself is about twice the limit VC++ allows. What's the best way to include such a large literal in a program?

EDIT:

I'm storing the string like this:

const std::wstring servicesWhitelist
(
 L".NETFRAMEWORK|"
 L"_IOMEGA_ACTIVE_DISK_SERVICE_|"
 L"{6080A529-897E-4629-A488-ABA0C29B635E}|"
 L"{834170A7-AF3B-4D34-A757-E05EB29EE96D}|"
 L"{85CCB53B-23D8-4E73-B1B7-9DDB71827D9B}|"
 L"{95808DC4-FA4A-4C74-92FE-5B863F82066B}|"
 L"{A7447300-8075-4B0D-83F1-3D75C8EBC623}|"
 L"{D31A0762-0CEB-444E-ACFF-B049A1F6FE91}|"
 L"{E2B953A6-195A-44F9-9BA3-3D5F4E32BB55}|"
 L"{EDA5F5D3-9E0F-4F4D-8A13-1D1CF469C9CC}|"
 L"2WIREPCP|"
//About 3800 more lines
);

EDIT2
It's used at runtime in a way similar to this:

static const boost::wregex servicesWhitelistRegex(servicesWhitelist);
std::wstring service;
//code to populate service
if (!boost::regex_match(service, servicesWhitelistRegex))
 //Do something to print service

Comments (5)

感情旳空白 2024-09-01 20:25:12

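How about an array? (You only need a comma wherever an element would otherwise exceed the compiler's limit; adjacent literals with no comma between them are still concatenated into a single element.)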

const std::wstring servicesWhitelist[] = {
 L".NETFRAMEWORK|",
 L"_IOMEGA_ACTIVE_DISK_SERVICE_|",
 L"{6080A529-897E-4629-A488-ABA0C29B635E}|",
 L"{834170A7-AF3B-4D34-A757-E05EB29EE96D}|",
 L"{85CCB53B-23D8-4E73-B1B7-9DDB71827D9B}|",
 L"{95808DC4-FA4A-4C74-92FE-5B863F82066B}|",
 L"{A7447300-8075-4B0D-83F1-3D75C8EBC623}|",
 L"{D31A0762-0CEB-444E-ACFF-B049A1F6FE91}|",
 L"{E2B953A6-195A-44F9-9BA3-3D5F4E32BB55}|",
 L"{EDA5F5D3-9E0F-4F4D-8A13-1D1CF469C9CC}|",
 L"2WIREPCP|",
...
};

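You could use the statement below to get the combined string. The initial value must be a std::wstring (a narrow "" will not compile against wide-string elements), and std::accumulate is declared in <numeric>.

std::accumulate(servicesWhitelist, servicesWhitelist+sizeof(servicesWhitelist)/sizeof(servicesWhitelist[0]), std::wstring())
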
李白 2024-09-01 20:25:12

Let's assume you actually need to store a string >64k characters (i.e. all of the above "just don't do that" solutions don't apply.)

To make MSVC happy, instead of saying:

const char *foo = "abcd...";

You can convert your >64k character string to individual characters represented as integers:

const char foo[] = { 97, 98, 99, 100, ..., 0 };

Where each letter has been converted to its ASCII equivalent (97 == 'a', etc.), and a NUL terminator has been added at the end.

MSVC 2010, at least, is happy with this.
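
As a small illustration of that idea, the sketch below spells a short string out as integer character codes and checks that it behaves like the ordinary literal. The names `kSpelledOut`, `kSpelledOutWide`, `matchesLiteral` and `matchesWideLiteral` are made up for this example, and for a real list the array would presumably be generated by a script rather than typed by hand. Since the question's data is wide, a `wchar_t` array initialized the same way is shown too.

```cpp
#include <cstring>
#include <cwchar>

// "abcd" written as individual character codes plus the NUL terminator.
// Hypothetical name; a real array would hold the whole whitelist.
const char kSpelledOut[] = { 97, 98, 99, 100, 0 };

// The same idea for wide strings, as used in the question.
const wchar_t kSpelledOutWide[] = { 97, 98, 99, 100, 0 };

// Returns true if the narrow array holds the same text as the literal.
bool matchesLiteral() {
    return std::strcmp(kSpelledOut, "abcd") == 0;
}

// Returns true if the wide array holds the same text as the literal.
bool matchesWideLiteral() {
    return std::wcscmp(kSpelledOutWide, L"abcd") == 0;
}
```

Because the array is an aggregate of integers rather than a string literal, the string-length limit reported by C1091 does not come into play.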

花开雨落又逢春i 2024-09-01 20:25:12

If it's only about twice the limit the obvious solution would seem to be to store 2 (or 3) such strings. :) I'm sure your code that reads them at runtime can deal with that easily enough.

EDIT: Do you need to use a regex for some reason? Could you break up the big strings into a list of individual tokens and do a simple string comparison?
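
One way the "2 (or 3) strings" idea might look is sketched below: each chunk is compiled into its own regex, and a service counts as whitelisted if any of them matches. This is only a sketch under some assumptions: it uses `std::wregex` from the standard library instead of the `boost::wregex` in the question, the entries are placeholders, and `kWhitelistParts` and `isWhitelisted` are hypothetical names. Each chunk has to be a complete alternation on its own, with no leading or trailing `|`.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical chunks: each one is a complete alternation on its own
// and stays below the compiler's string-length limit.
const wchar_t* const kWhitelistParts[] = {
    L"ALPHA|BETA",
    L"GAMMA|DELTA",
};

// Returns true if the service name fully matches any chunk.
bool isWhitelisted(const std::wstring& service) {
    // The regexes are compiled once, on the first call.
    static const std::vector<std::wregex> patterns = [] {
        std::vector<std::wregex> result;
        for (const wchar_t* part : kWhitelistParts)
            result.emplace_back(part);
        return result;
    }();
    for (const std::wregex& pattern : patterns)
        if (std::regex_match(service, pattern))
            return true;
    return false;
}
```

Real entries such as `.NETFRAMEWORK` or the GUIDs in braces contain regex metacharacters, so they would need escaping before being used as patterns.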

瑕疵 2024-09-01 20:25:12

I claim no credit for this one:

https://social.msdn.microsoft.com/Forums/vstudio/en-US/c573db8b-c9cd-43d7-9f89-202ba9417296/fatal-error-c1091

Use the STL instead.

Code Snippet

#include <sstream>

std::ostringstream oss;

oss << myString1 << myString2 << myString3 << myString4;

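oss.str() would now return an instance of the STL's std::string class, and oss.str().c_str() would return a const char*. Note that the pointer from c_str() is only valid while the std::string it came from is alive, so keep the result of oss.str() in a variable before calling c_str() on it.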
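
Since the question works with wide strings, the same approach would presumably use `std::wostringstream`. The sketch below assumes the list has been split into separate literals; `kPart1`, `kPart2` and `buildWhitelist` are names invented for this example, and only three entries are shown.

```cpp
#include <sstream>
#include <string>

// Hypothetical pieces, each a separate literal below the compiler limit.
const wchar_t* const kPart1 = L".NETFRAMEWORK|_IOMEGA_ACTIVE_DISK_SERVICE_|";
const wchar_t* const kPart2 = L"2WIREPCP";

// Joins the pieces at runtime into one wide string.
std::wstring buildWhitelist() {
    std::wostringstream oss;
    oss << kPart1 << kPart2;
    return oss.str();
}
```

The result could then be passed to the regex constructor in place of the single oversized literal.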

冰雪之触 2024-09-01 20:25:12

Your problem could be stripped down to (in Python):

whitelist_services = { ".NETFRAMEWORK", "_IOMEGA_ACTIVE_DISK_SERVICE_" }
service = ".NETFRAMEWORK"  # example value
if service in whitelist_services:
   print(service, "is a whitelisted service")

A direct translation to C++ would be:

// g++ *.cc -std=c++0x && ./a.out
#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_set>

namespace {
  // std::wstring rather than const wchar_t*: a set of raw pointers would
  // hash and compare addresses, not the characters they point to.
  typedef std::wstring str_t;
  const str_t servicesWhitelist[] = {
    L".NETFRAMEWORK",
    L"_IOMEGA_ACTIVE_DISK_SERVICE_",
  };
  const std::size_t N = sizeof(servicesWhitelist) / sizeof(*servicesWhitelist);

  // If you need to look up many services, a hash table gives O(1)
  // average-case searches. Otherwise std::find() on the array, O(N),
  // might be sufficient, or std::binary_search() on a sorted array,
  // O(log N).
  const std::unordered_set<str_t> services
    (servicesWhitelist, servicesWhitelist + N);
}

int main() {
  const str_t service = L".NETFRAMEWORK";
  if (services.find(service) != services.end())
    std::wcout << service << " is a whitelisted service" << std::endl;
}
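
The `std::binary_search()` alternative mentioned in the comment could look roughly like the sketch below. It assumes the array is kept sorted in the default `std::wstring` ordering (by character code, so `.` sorts before digits and `_` after uppercase letters); `kSortedWhitelist` and `isWhitelistedSorted` are hypothetical names, and only three entries from the question are shown.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical entries; the array must stay sorted in the default
// std::wstring ordering for std::binary_search to give correct results.
const std::wstring kSortedWhitelist[] = {
    L".NETFRAMEWORK",
    L"2WIREPCP",
    L"_IOMEGA_ACTIVE_DISK_SERVICE_",
};

// O(log N) lookup by exact string comparison.
bool isWhitelistedSorted(const std::wstring& service) {
    return std::binary_search(std::begin(kSortedWhitelist),
                              std::end(kSortedWhitelist), service);
}
```

This avoids building a second container at startup, at the cost of having to keep the source list sorted.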