What do I need to know to upgrade a complex application from C++Builder 2007 to 2010?
My company's main application is mostly written in C++ (with some Delphi code and components). We are upgrading from RAD Studio 2007 to 2010 for the next release, starting in about a week. What do I need to know to ensure this upgrade goes smoothly?
Points I have thought of so far are:
Unicode. This one looks really complicated. Our app contains a horrible mix of std::string-s and AnsiString-s with casts to and from them. I have lots of questions about this, such as "is wstring capable of holding everything a UnicodeString can, and should we just do a search/replace", or "should we avoid all C++ string types altogether and use UnicodeString", "can we change all event handlers to use String even though the existing .HPP event handler method prototypes were compiler-translated to AnsiString", right down to basics such as "should we prefix all strings with L, or is the compiler smart enough with Unicode enabled to use Unicode strings", etc. Any insight on this would be really appreciated.
We also need backwards compatibility. Our app uses its own binary tuple format that currently stores strings as an array of bytes. I need to upgrade this to read old files and, presumably, write new Unicode strings as well. How do I handle Unicode strings embedded in a binary format? Is there any generic way where I can point a UnicodeString at an array of bytes, that may be originally written as either ANSI bytes or Unicode, and it will figure out what they are?
Third-party components. We use SpTBX mainly, and it appears to be compatible.
Project upgrades. The standard advice in the Codegear forums seems to be to manually recreate all project files when upgrading. This is an awful lot of work (7 projects (mostly libs) in our main app, plus half a dozen DLLs, a lot of files.) Is there any way to automate this?
How's the linker look? We traditionally have a lot of trouble with the linker randomly crashing or running out of resources, though it got a lot better in 2007. This is one reason our main application is split into several libs - the linker cannot (hopefully, "could not, but now can"?) handle it otherwise.
I know there's a new type library editor and format (it stores the IDL, i.e. text, and generates the TLB dynamically?). How well does this handle upgrading existing COM projects with a TLB? We have Delphi code and a TLB that are built into the C++ application.
Is there anything else I should be considering or be aware of?
I have found:
- 2007 and 2010 co-existing. I'm not sure I trust this answer since I have had issues with 2006 and 2007 on the same machine before.
- Several answers about Unicode: writing strings with 2009 and generic transition to Unicode text, but none of them address my concerns or the C++Builder-specific parts at all.
- This question about guidelines for upgrading to 2009. Though the answers are helpful, they don't answer all the Unicode-related issues above.
- [Edit: added] Codegear documents for Unicode in RAD Studio and things to look for when converting to Unicode
Comments (4)
There is: just use the IDE's project importer :)
Seriously, I would just try importing the projects, and then go investigate if it doesn't seem to work.
I've had almost no trouble with ILINK since C++Builder 2009. I've occasionally read that others experienced out-of-memory errors, but someone in the newsgroups has discovered a workaround:
https://forums.embarcadero.com/thread.jspa?messageID=140012&tstart=0#140012
Also, as you can read here, the compiler got a new option (-Cx) to control the maximal amount of memory it allocates.
Should work without a hitch.
Yes, on Windows platforms wchar_t is usually 16 bits wide, which means it suffices for holding UTF-16, which is what UnicodeString uses.
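To make the 16-bit point concrete: UTF-16 stores code points above U+FFFF as two 16-bit units (a surrogate pair), so a string of 16-bit units can still carry any Unicode character. Below is a minimal portable sketch. It uses char16_t rather than wchar_t so that it behaves the same on every platform, and append_utf16 is a hypothetical helper written for illustration, not part of the C++Builder RTL.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: append one Unicode code point to a UTF-16 string.
// Assumes cp is a valid scalar value (at most 0x10FFFF, not a surrogate).
void append_utf16(std::u16string& out, char32_t cp) {
    if (cp < 0x10000) {
        // Basic Multilingual Plane: one 16-bit unit is enough.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split into a high and a low surrogate.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
}
```

One practical consequence is that the length of a UTF-16 string counts 16-bit units, not characters, so code that indexes or truncates strings by length may need a second look.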
Depends on how portable your code needs to be. In any case, whenever you just need a string type, use "String", not "UnicodeString".
First, you should NEVER re-use .hpp files generated by older versions of DCC!
For event handlers that use the String type in Delphi, you must use UnicodeString. As above, simply use "String", and your code will work for both the ANSI and Unicode versions of C++Builder.
The compiler doesn't convert your strings (it would conflict with the language standards), but both AnsiString and UnicodeString do have constructor overloads for both char* and wchar_t* string literals. I.e., either string type can be initialized from either kind of literal.
What will not work this way, though, is the whole family of printf()/scanf() functions; AnsiString::sprintf() takes const char*, UnicodeString::sprintf() takes const wchar_t*.
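The standard C library has the same narrow/wide split, which may serve as a rough analogy for the two sprintf() members: the format string has to match the character type of the destination, and nothing converts it for you. A small sketch in portable C++ (format_narrow and format_wide are names made up for this example):

```cpp
#include <cassert>
#include <cstdio>
#include <cwchar>
#include <string>

// Narrow formatting: the format string must be a const char*.
std::string format_narrow(int n) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "count = %d", n);
    return buf;
}

// Wide formatting: the format string must be a const wchar_t* (note the L prefix).
std::wstring format_wide(int n) {
    wchar_t buf[64];
    // swprintf takes the buffer size in wide characters, not bytes.
    std::swprintf(buf, sizeof buf / sizeof buf[0], L"count = %d", n);
    return buf;
}
```

Passing a narrow literal to the wide function (or the reverse) is a compile error rather than a silent conversion, so every format string in the code base has to be reviewed when the string type changes.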
If you are using sprintf() a lot, you may find my CbdeFormat library useful; just read my article on the subject.
You do not say what the data strings in your binary tuple format are for: is it necessary for them to store Unicode? When I transitioned from D2007 to D2009 I was able to keep some parts of the system ANSI-string only.
If storing Unicode is required, then you need to check whether your existing data is compatible with a format such as UTF-8. If the range of values stored in existing data files presents a problem, then I would make your next upgrade do a one-time conversion of any old data files, reading in the old AnsiString data and writing it back as UTF-8, either to a different file name or extension or with appropriately modified file header data. I have been versioning data files for a long time, just to allow this sort of processing change.
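The core of such a one-time conversion can be sketched in a few lines of portable C++. This sketch assumes the old files contain ISO-8859-1 (Latin-1) bytes, where each byte value equals its Unicode code point. That is an assumption made for illustration only: data written under another ANSI code page (Windows-1252, for example) needs a real code page conversion instead. The name latin1_to_utf8 is hypothetical.

```cpp
#include <cassert>
#include <string>

// ASSUMPTION: input bytes are ISO-8859-1, so byte value == code point.
// Other ANSI code pages need a proper code page conversion instead.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is encoded as-is in UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // Code points 0x80..0xFF become a two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Pure ASCII data comes out byte-for-byte identical, so files containing only ASCII would not need converting at all; only strings with bytes at or above 0x80 change.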
I am only just starting a BCB2010 project, so cannot comment on your other questions, but I certainly had difficulty upgrading a Delphi project from D2007 to D2009 - though I was able to fix this by editing the project file, which is just XML.
Good luck with the conversion ;-)
std::wstring contains wchar_t* strings, just like System::UnicodeString does.
That is up to you to decide. char* strings are still supported. You are not forced to migrate everything to Unicode.
No, you cannot change auto-managed event handlers to use the System::String alias. All IDE versions will complain about that. You will have to manually update your event handler declarations and implementations to use UnicodeString parameters instead of AnsiString parameters when appropriate. That also means you cannot share DFMs and unit .h files across multiple IDE versions, either (which you should not be doing anyway).
No. If you declare a string constant or character constant without an L prefix, the data will still be interpreted as Ansi. That has not changed. You can, however, pass Ansi data to System::UnicodeString (but not to std::wstring), and it will convert to Unicode automatically. But you have to be careful, because it will use the OS's default Ansi codepage to interpret the data. As long as your Ansi data uses only ASCII characters, you will be OK. Otherwise, if you are using non-ASCII characters, you are better off putting the data into a System::AnsiStringT or System::RawByteString (both were introduced in CB2009) that has been assigned the correct codepage, and then assigning that to your System::UnicodeString variable. The associated codepage will be used instead of the OS default codepage for the conversion.
If your tuple is expecting 8-bit characters, then you will have to make sure that any struct declarations and such are using char and not wchar_t characters. If you need to store Unicode strings, but need to maintain the 8-bit compatibility, then you should encode your Unicode strings to UTF-8 first (you can use the System::UTF8String string type to help you - starting in CB2009, it is a true UTF-8 string now). As long as you do not use non-ASCII characters, your old apps will not know the difference, as ASCII characters are encoded as-is in UTF-8. If you want to store raw Unicode data, however, then your tuple would need a flag somewhere (if it does not already have one) indicating whether the string data is stored as Ansi or Unicode, and your apps would have to look for that flag.
No. You have to know the actual encoding of the bytes beforehand. If you pass a memory address to System::AnsiString or std::string, it is going to assume Ansi characters. If you pass the same memory address to System::UnicodeString or std::wstring, it is going to assume Unicode characters instead.
Just like with all prior versions (except for the migration from 2006 to 2007), any third-party components you have will need to be re-compiled for 2010, either manually (if you have the source code for them) or by their respective vendors.
Yes. That still applies.
.TLB files are not used at all anymore. The new system operates on .ridl (Restricted IDL) files now. During compilation, the .ridl produces the correct TypeLibrary information in the executable's binary resources directly. No .tlb files are generated.
I do not remember whether CB2010 (or CB2009, for that matter) can consume pre-existing .tlb files directly. I do not think they can. You can, however, run the .tlb file through tlibimp.exe and it will export a .ridl file. Or you can copy the IDL text from the TLB editor in a past version and paste it into a new .ridl file manually. Either way, you can then add that .ridl file to your CB2010 project.
That is why I use virtual machines when installing multiple IDE versions on the same physical machine.
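On the binary tuple point earlier in this answer, the "flag somewhere" idea can be sketched as a tagged string field. Everything about the layout here is hypothetical (a one-byte encoding tag, a one-byte length, then the payload) and is not the poster's actual format. The point is only that the reader dispatches on an explicit tag instead of guessing the encoding from the bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tag values; a real format would reserve its own.
enum class Encoding : std::uint8_t { Ansi = 0, Utf8 = 1 };

struct StringField {
    Encoding encoding;
    std::string bytes;  // raw payload, still in the stored encoding
};

// Hypothetical layout: [1-byte tag][1-byte length][payload bytes]
std::vector<std::uint8_t> write_field(Encoding enc, const std::string& bytes) {
    if (bytes.size() > 255) throw std::length_error("payload too long");
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(enc));
    out.push_back(static_cast<std::uint8_t>(bytes.size()));
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}

StringField read_field(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < 2 || buf.size() < 2u + buf[1])
        throw std::runtime_error("truncated field");
    if (buf[0] > 1)
        throw std::runtime_error("unknown encoding tag");
    return { static_cast<Encoding>(buf[0]),
             std::string(buf.begin() + 2, buf.begin() + 2 + buf[1]) };
}
```

Old files that predate the tag have no such byte, so in practice the file version in the header would tell the reader whether to expect tagged fields or to treat every string as legacy Ansi.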
Is the cost of upgrading in line with the benefits?
Why not start a gradual upgrade, where new components are developed on the new platform and integrated into the old version via interop helpers?
This approach was suggested to vb6 developers who were thinking about upgrading to vb.net.