What do I need to know to upgrade a complex application from C++Builder 2007 to 2010?

Posted 2024-08-04 07:36:52

My company's main application is mostly written in C++ (with some Delphi code and components). We are upgrading from RAD Studio 2007 to 2010 for the next release, starting in about a week. What do I need to know to ensure this upgrade goes smoothly?

Points I have thought of so far are:

Unicode. This one looks really complicated. Our app contains a horrible mix of std::string-s and AnsiString-s with casts to and from them. I have lots of questions about this, such as "is wstring capable of holding everything a UnicodeString can, and should we just do a search/replace", or "should we avoid all C++ string types altogether and use UnicodeString", "can we change all event handlers to use String though the existing event handler method prototypes (originally: .HPPs) were compiler-translated to AnsiString", right down to basics such as "should we prefix all strings with L, or is the compiler smart enough with Unicode enabled to use Unicode strings", etc. Any insight on this would be really appreciated.
We also need backwards compatibility. Our app uses its own binary tuple format that currently stores strings as an array of bytes. I need to upgrade this to read old files and, presumably, write new Unicode strings as well. How do I handle Unicode strings embedded in a binary format? Is there any generic way where I can point a UnicodeString at an array of bytes, that may be originally written as either ANSI bytes or Unicode, and it will figure out what they are?
Third-party components. We use SpTBX mainly, and it appears to be compatible.
Project upgrades. The standard advice in the Codegear forums seems to be to manually recreate all project files when upgrading. This is an awful lot of work (7 projects (mostly libs) in our main app, plus half a dozen DLLs, a lot of files.) Is there any way to automate this?
How's the linker look? We traditionally have a lot of trouble with the linker randomly crashing or running out of resources, though it got a lot better in 2007. This is one reason our main application is split into several libs - the linker cannot (hopefully, "could not, but now can"?) handle it otherwise.
I know there's a new type library editor and format (it stores the IDL, ie text, and generates the TLB dynamically?) How well does this handle upgrading existing COM projects with a TLB? We have Delphi code and TLB that are built into the C++ application.
Is there anything else I should be considering or be aware of?

I have found:

2007 and 2010 co-existing. I'm not sure I trust this answer since I have had issues with 2006 and 2007 on the same machine before.
Several answers about Unicode (writing strings with 2009, and a generic transition to Unicode text), but none of them address my concerns or the C++Builder-specific parts at all.
This question about guidelines for upgrading to 2009. The answers are helpful, but they don't cover all the Unicode-related issues above.
[Edit: added] Codegear documents for Unicode in RAD Studio and things to look for when converting to Unicode

幸福丶如此 2024-08-11 07:36:52

Project upgrades. The standard advice in the Codegear forums seems to be to manually recreate all project files when upgrading. This is an awful lot of work (7 projects (mostly libs) in our main app, plus half a dozen DLLs, a lot of files.) Is there any way to automate this?

There is: just use the IDE's project importer :)
Seriously, I would just try importing the projects, and then go investigate if it doesn't seem to work.

How's the linker look? We traditionally have a lot of trouble with the linker randomly crashing or running out of resources, though it got a lot better in 2007. This is one reason our main application is split into several libs - the linker cannot (hopefully, "could not, but now can"?) handle it otherwise.

I've had almost no trouble with ILINK since C++Builder 2009. I've occasionally read that others experienced out-of-memory errors, but someone in the newsgroups has discovered a workaround:

https://forums.embarcadero.com/thread.jspa?messageID=140012&tstart=0#140012

Also, the compiler got a new option (-Cx) to control the maximum amount of memory it allocates.

I know there's a new type library editor and format (it stores the IDL, ie text, and generates the TLB dynamically?) How well does this handle upgrading existing COM projects with a TLB?

Should work without a hitch.

I have lots of questions about this, such as "is wstring capable of holding everything a UnicodeString can, and should we just do a search/replace"

Yes. On Windows platforms wchar_t is usually 16 bits wide, which suffices for holding UTF-16, the encoding UnicodeString uses.
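To make the 16-bit point concrete, here is a minimal sketch in standard C++ (not C++Builder-specific). It uses char16_t and std::u16string as a portable stand-in for 16-bit wchar_t on Windows, and the helper name utf16_encode is hypothetical. The thing to notice is that characters outside the Basic Multilingual Plane take two 16-bit code units, so a string's length counts code units rather than characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16.
// Assumes cp is a valid scalar value (<= 0x10FFFF, not a surrogate).
std::u16string utf16_encode(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));                     // one code unit
    } else {
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
    }
    return out;
}

// Usage sketch: 'A' fits in one unit, U+1F600 needs a surrogate pair.
inline void utf16_demo() {
    assert(utf16_encode(U'A').size() == 1);
    assert(utf16_encode(0x1F600).size() == 2);
}
```

Any code that indexes or truncates such strings by position should keep surrogate pairs together, whichever string class ends up holding them.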

or "should we avoid all C++ string types altogether and use UnicodeString"

Depends on how portable your code needs to be. In any case, whenever you just need a string type, use "String", not "UnicodeString".

"can we change all event handlers to use String though the existing .HPPs were compiler-translated to AnsiString"

First, you should NEVER re-use .hpp files generated by older versions of DCC!
For event handlers that use the String type in Delphi, you must use UnicodeString. As above, simply use "String", and your code will work for both the ANSI and Unicode versions of C++Builder.

right down to basics such as "should we prefix all strings with L, or is the compiler smart enough with Unicode enabled to use Unicode strings"

The compiler doesn't convert your strings (that would conflict with the language standard), but both AnsiString and UnicodeString have constructor overloads for both char* and wchar_t* string literals. I.e., the following will work:

AnsiString as = L"foo";
UnicodeString us = "bar";

What will not work this way, though, is the whole bunch of printf()/scanf() functions; AnsiString::sprintf() takes const char*, UnicodeString::sprintf() takes const wchar_t*.
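The same split exists in the standard C library, which may help illustrate the point. The sketch below uses std::snprintf and std::swprintf rather than the VCL sprintf() members, and the function names format_narrow and format_wide are hypothetical. Each formatting function only accepts a format string of its own character type.

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>
#include <string>

// Narrow formatting: the format string must be const char*.
std::string format_narrow(int n) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "value=%d", n);
    return buf;
}

// Wide formatting: the format string must be const wchar_t*,
// and the size argument counts wchar_t elements, not bytes.
std::wstring format_wide(int n) {
    wchar_t buf[32];
    std::swprintf(buf, sizeof buf / sizeof buf[0], L"value=%d", n);
    return buf;
}

// Usage sketch.
inline void format_demo() {
    assert(format_narrow(7) == "value=7");
    assert(format_wide(7) == L"value=7");
}
```

Passing a narrow format string to the wide function (or the reverse) is a compile error, so these call sites tend to need the L prefix adjusted by hand during a migration.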

If you are using sprintf() a lot, you may find my CbdeFormat library useful; just read my article on the subject.

不乱于心 2024-08-11 07:36:52

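You don't say what the strings in your binary tuple format are used for: do they need to store Unicode? When I converted from D2007 to D2009, I was able to keep some parts of my system as ANSI strings only.

If you do need to store Unicode, then you need to check whether the existing data is compatible with something like UTF-8. If the range of values stored in existing data files is a problem, then I would have your next upgrade do a one-time conversion of any old data files: read the old AnsiString data and write it back as UTF-8, either under a different file name or extension or by modifying the appropriate file header data. I have versioned my data files for a long time precisely to allow for this kind of change in handling.

I'm only just starting a BCB2010 project, so I can't comment on your other questions, but I did have real difficulty upgrading Delphi projects from D2007 to D2009, although I was able to resolve that by editing the project files (which are just XML).

Good luck with the conversion ;-)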

记忆消瘦 2024-08-11 07:36:52

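2007 and 2010 co-existing. I'm not sure I trust this answer since I have had issues with 2006 and 2007 on the same machine before.

That is why I use virtual machines when I need multiple IDE versions on the same physical machine.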

Unicode. This one looks really complicated. Our app contains a horrible mix of std::string-s and AnsiString-s with casts to and from them. I have lots of questions about this, such as "is wstring capable of holding everything a UnicodeString can, and should we just do a search/replace"

std::wstring holds wchar_t characters, just like System::UnicodeString does.

should we avoid all C++ string types altogether and use UnicodeString

That is up to you to decide. char* strings are still supported. You are not forced to migrate everything to Unicode.

can we change all event handlers to use String though the existing .HPPs were compiler-translated to AnsiString

No, you cannot change auto-managed event handlers to use the System::String alias. All IDE versions will complain about that. You will have to manually update your event handler declarations and implementations to use UnicodeString parameters instead of AnsiString parameters when appropriate. That also means you cannot share DFMs and Unit .h files across multiple IDE versions, either (which you should not be doing anyway).

should we prefix all strings with L, or is the compiler smart enough with Unicode enabled to use Unicode strings

No. If you declare a string constant or character constant without an L prefix, the data will still be interpreted as Ansi. That has not changed. You can, however, pass Ansi data to System::UnicodeString (but not to std::wstring), and it will convert to Unicode automatically. But you have to be careful, because it will use the OS's default Ansi codepage to interpret the data. As long as your Ansi data uses only ASCII characters, you will be OK. Otherwise, if you are using non-ASCII characters, you are better off putting the data into a System::AnsiStringT or System::RawByteString (both were introduced in CB2009) that has been assigned the correct codepage, and then assigning that to your System::UnicodeString variable. The associated codepage will then be used for the conversion instead of the OS default codepage.
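The standard-library half of this can be checked at compile time. The sketch below is plain standard C++ and does not use the VCL string classes; widen_ascii is a hypothetical helper that is only valid for ASCII input, since it widens each byte without consulting any codepage.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// An unprefixed literal is an array of char; an L-prefixed one is an array of wchar_t.
static_assert(std::is_same<decltype("bar"), const char(&)[4]>::value, "narrow literal");
static_assert(std::is_same<decltype(L"bar"), const wchar_t(&)[4]>::value, "wide literal");

// std::wstring accepts wide data but has no constructor taking narrow char data.
static_assert(std::is_constructible<std::wstring, const wchar_t*>::value, "wide is accepted");
static_assert(!std::is_constructible<std::wstring, const char*>::value, "narrow is rejected");

// Hypothetical helper: widen byte by byte. Correct for ASCII only;
// non-ASCII Ansi data needs a real codepage conversion instead.
std::wstring widen_ascii(const std::string& s) {
    return std::wstring(s.begin(), s.end());
}

// Usage sketch.
inline void literal_demo() {
    assert(widen_ascii("bar") == L"bar");
}
```

This is why a blanket search/replace of std::string with std::wstring tends to stop compiling at every narrow literal, while assignments to UnicodeString keep compiling but convert silently.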

We also need backwards compatibility. Our app uses its own binary tuple format that currently stores strings as an array of bytes. I need to upgrade this to read old files and, presumably, write new Unicode strings as well. How do I handle Unicode strings embedded in a binary format?

If your tuple is expecting 8-bit characters, then you will have to make sure that any struct declarations and such are using char and not wchar_t characters. If you need to store Unicode strings, but need to maintain the 8-bit compatibility, then you should encode your Unicode strings to UTF-8 first (you can use the System::UTF8String string type to help you - starting in CB2009, it is a true UTF-8 string now). As long as you do not use non-ASCII characters, then your old apps will not know the difference, as ASCII characters are encoded as-is in UTF-8. If you want to store raw Unicode data, however, then your tuple would need a flag somewhere (if it does not already have one) indicating whether the string data is stored as Ansi or Unicode, and your apps would have to look for that flag.
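As a rough illustration of why UTF-8 keeps old readers working for ASCII data, here is a minimal hand-written encoder in standard C++. In C++Builder you would more likely rely on System::UTF8String as described above; the function name utf16_to_utf8 is hypothetical, and the sketch assumes well-formed UTF-16 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: convert UTF-16 code units to UTF-8 bytes.
// Assumes well-formed input (every high surrogate is followed by a low one).
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            // Combine a surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[++i] - 0xDC00);
        }
        if (cp < 0x80) {                       // ASCII: stored as-is
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2-byte sequence
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3-byte sequence
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4-byte sequence
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Usage sketch: ASCII text produces exactly the bytes an old reader expects.
inline void utf8_demo() {
    assert(utf16_to_utf8(u"abc") == "abc");
}
```

Non-ASCII characters expand to two or more bytes, so any fixed-size or length-prefixed field in the tuple format has to count bytes rather than characters.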

Is there any generic way where I can point a UnicodeString at an array of bytes, that may be originally written as either ANSI bytes or Unicode, and it will figure out what they are?

No. You have to know the actual encoding of the bytes beforehand. If you pass a memory address to System::AnsiString or std::string, it is going to assume Ansi characters. If you pass the same memory address to System::UnicodeString or std::wstring, it is going to assume Unicode characters instead.
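Since nothing can guess the encoding, one option is to store it explicitly, as the flag suggested earlier. The sketch below is standard C++ with entirely hypothetical names (Encoding, StoredString, decode), and it is not the asker's actual tuple format. It shows the same bytes producing different text depending on the declared encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical encoding tag stored next to each string in the file.
enum class Encoding : std::uint8_t { Ansi = 0, Utf16LE = 1 };

// Hypothetical record: the tag plus the raw bytes as read from disk.
struct StoredString {
    Encoding encoding;
    std::vector<std::uint8_t> bytes;
};

// Interpret the bytes as 16-bit little-endian code units.
std::u16string as_utf16le(const std::vector<std::uint8_t>& b) {
    std::u16string out;
    for (std::size_t i = 0; i + 1 < b.size(); i += 2)
        out.push_back(static_cast<char16_t>(b[i] | (b[i + 1] << 8)));
    return out;
}

// Interpret the bytes as 8-bit characters, widening each one.
// Only correct for ASCII; real Ansi data needs its codepage.
std::u16string as_ascii(const std::vector<std::uint8_t>& b) {
    return std::u16string(b.begin(), b.end());
}

// The reader never guesses: it dispatches on the stored tag.
std::u16string decode(const StoredString& s) {
    return s.encoding == Encoding::Utf16LE ? as_utf16le(s.bytes) : as_ascii(s.bytes);
}

// Usage sketch: identical bytes, two different results.
inline void decode_demo() {
    const std::vector<std::uint8_t> bytes = {0x48, 0x00, 0x69, 0x00};
    assert(decode(StoredString{Encoding::Utf16LE, bytes}) == u"Hi");
    assert(decode(StoredString{Encoding::Ansi, bytes}).size() == 4);
}
```

Old files that carry no tag would have to be treated as Ansi by convention, for example by keying the default off a file version number.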

Third-party components. We use SpTBX mainly, and it appears to be compatible.

Just like with all prior versions (except for the migration from 2006 to 2007), any third-party components you have will need to be re-compiled for 2010, either manually (if you have the source code for them) or by their respective vendors.

Project upgrades. The standard advice in the Codegear forums seems to be to manually recreate all project files when upgrading.

Yes. That still applies.

I know there's a new type library editor and format (it stores the IDL, ie text, and generates the TLB dynamically?)

.TLB files are not used at all anymore. The new system operates on .ridl (Reduced IDL) files now. During compiling, the .ridl produces the correct TypeLibrary information in the executable's binary resources directly. No .tlb files are generated.

How well does this handle upgrading existing COM projects with a TLB? We have Delphi code and TLB that are built into the C++ application.

I do not remember whether CB2010 (or CB2009, for that matter) can consume pre-existing .tlb files directly. I don't think they can. You can, however, run the .tlb file through tlibimp.exe and it will export a .ridl file. Or you can copy the IDL text from the TLB editor in a past version and paste it into a new .ridl file manually. Either way, you can then add that .ridl file to your CB2010 project.