Delphi dbExpress 和 Interbase:UTF8 迁移步骤和风险?
目前,我们的数据库使用 Win1252 作为唯一的字符编码。我们很快就必须在数据库表中支持 Unicode,这意味着我们必须对四个数据库和大约 80 个在 24/7 环境中内部运行的 Delphi 应用程序执行此迁移。是否有针对 Delphi 应用程序的数据库迁移到 UTF-8(或 UNICODE_FSS)的建议?下面列出一些问题。非常感谢您的回答!
- 是否有工具可以通过转储数据、使用 UNICODE_FSS 或 UTF-8 重新创建数据库以及加载来帮助迁移现有数据库(大小在 250 MB 到 2 GB 之间,无 Blob 字段)数据回来了?
- Delphi 2009、dbExpress 和 Interbase 7.5 是否存在与 Unicode 字符集相关的已知问题?
- 您是否建议首先将数据库升级到Interbase 2009? (此升级已计划,但优先级不高)
- 我们是否可以简单地迁移数据库,Delphi 将自动处理 Unicode 字符集,或者我们是否必须更改每个 Datamodule 中的所有字符字段类型( dfm 和源代码)也是吗?
- 您建议采用哪种策略来与现有应用程序的正常开发和维护并行进行迁移?该应用程序在内部运行,因此开发和数据库管理是在内部完成的。
更新:来自 InterBase 讨论论坛主题:InterBase 中的 Unicode 数据库 - 真的吗? (这不是我的线程,但它表明 InterBase XE 中仍然存在一些问题)。
以下是我提交的一些报告: QC#92867 - 字符串字段为空 仅当视图来自视图 包括一个 Union,并且当使用 客户端数据集。这被发现为 我的一些报告缺少数据, 不再起作用。
QC#91494 - IB 字符列数据 字符字段(例如:Char(1))是 检索时用空格填充 通过存储过程。测试失败 - 例如:如果活动=“Y”。我在表单中大量使用存储过程 这些都不起作用。
QC#91355 - IBSqlMonitor 失败。这 IBSqlMonitor 的输出有点 乱码使这个工具毫无用处。 (所以, 连我的铲子都坏了!)
未报告 - 中的持久字段 TClientDataSet 对于 TWideString 失败。
其他相关 QC 条目:
QC#94455 SQL Unicode Char 类型失败 ( InterBase XE)
Currently, our database uses Win1252 as the only character encoding. We will have to support Unicode in the database tables soon, which means we have to perform this migration for four databases and around 80 Delphi applications which run in-house in a 24/7 environment. Are there recommendations for database migrations to UTF-8 (or UNICODE_FSS) for Delphi applications? Some questions listed below. Many thanks in advance for your answers!
- are there tools which help with the migration of the existing databases (sizes between 250 MB and 2 GB, no Blob fields), by dumping the data, recreating the database with UNICODE_FSS or UTF-8, and loading the data back?
- are there known problems with Delphi 2009, dbExpress and Interbase 7.5 related to Unicode character sets?
- would you recommend to upgrade the databases to Interbase 2009 first? (This upgrade is planned but does not have a high priority)
- can we simply migrate the database and Delphi will handle the Unicode character sets automatically, or will we have to change all character field types in every Datamodule (dfm and source code) too?
- which strategy would you recommend to work on the migration in parallel with the normal development and maintenance of the existing application? The application runs in-house so development and database administration is done internally.
Update: From InterBase discussion forum thread: Unicode Databases in InterBase - Really? (it is not a thread by me but it shows that some problems still exist in InterBase XE).
Here are some reports I've submitted:
QC#92867 - String fields are blank
coming from Views only if the View
includes a Union, and when using a
ClientDataSet. This was found as
missing data on a few of my reports,
which no longer work.QC#91494 - IB Character column data
Character fields (eg: Char(1)) are
padded with blanks when retrieved
through a stored procedure. Tests fail
- eg: If Active = "Y". I make heavy use of stored procedures with forms
and these do not work.QC#91355 - IBSqlMonitor fails. The
output of IBSqlMonitor is somewhat
garbled making this tool useless. (So,
even my shovel is broken!)Unreported - Persistent fields in
TClientDataSet fail for TWideString.
Other related QC entries:
QC#94455 SQL Unicode Char Type Failure (InterBase XE)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
Database Workbench 和 IBExpert可以为您完成数据迁移。
当我到达 Entwickler Tage 时,我会就其他问题回复您。
——杰罗恩
Both Database Workbench and IBExpert can do the data migration for you.
I'll get back to you on the other questions when I'm at the Entwickler Tage.
--jeroen
问题:空字符串字段上的 UPDATE 不再找到记录。如果 UTF8 字符字段为空,DataSetProvider 会为更新操作生成错误的 SELECT。
症状:消息“记录未找到或被其他用户编辑”
解决方案:升级到 Delphi 2010 Update 4 或使用 QC 中描述的解决方法
Problem: UPDATE on a empty string field no longer finds a record. If a UTF8 character field is empty, the DataSetProvider generates a wrong SELECT for the update action.
Symptom: Message 'record not found or edited by another user'
Solution: upgrade to Delphi 2010 Update 4 or use the workaround described in QC
问题:CHAR 字段不再起作用,必须替换为 VARCHAR。
症状:对现在使用 UTF8 并使用 ASCII 值从 WIN1252 导入的列进行 SELECT 查询不再返回任何值。也许这是一个我应该在质量控制中报告的错误。
解决方案:将数据库元数据 DDL 脚本中出现的所有
CHAR(
替换为VARCHAR(
Problem: CHAR fields no longer work and have to be replaced with VARCHAR.
Symptom: SELECT queries on a column which now uses UTF8 and is imported from WIN1252 with ASCII values no longer returns any value. Maybe this is a bug which I should report in QC.
Solution: replace all occurences of
CHAR(
in the database metadata DDL script withVARCHAR(
问题:持久字符串字段需要 Size 属性,该属性是字段的逻辑大小乘以四(另请参阅:是是否可以将 TStringField 调整为像 Delphi 中的 TWideStringField 一样工作?)
症状:访问冲突
解决方案:删除持久字段并再次添加它以更新 Size 属性。 (副作用:DisplayWidth 也会增加尺寸,导致 UI 出现问题)
Problem: persistent string fields require a Size property which is the logical size of the field multiplied by four (see also: Is it possible to tweak TStringField to work like TWideStringField in Delphi?)
Symptom: Access violations
Solution: delete the persistent field and add it again to update the Size property. (side effect: the DisplayWidth will also increase size, leading to problems with UI)
问题:带有字符串参数的 UDF(用户定义函数)可能会因大小限制而中断。
症状:
对于此 UDF:
解决方案:修复声明中的 UDF 参数。
Problem: UDF (user defined functions) with string parameters can break because of sizes limits.
Symptom:
for this UDF:
Solution: fix UDF parameters in the declaration.
问题:dbExpress在内部使用WideString作为数据类型,因此所有现有的读取/设置字段和参数的
.AsString
调用将不再起作用症状:特殊字符将不会被存储/正确阅读
解决方案:将所有出现的 .AsString 替换为 .AsWideString,但请注意不要更改未在字段或参数上调用 AsString 方法的位置。
Problem: dbExpress uses WideString as data type internally, so all existing
.AsString
calls for reading / setting field and parameter will no longer workSymptom: special characters will not be stored / read correctly
Solution: replace all occurences of .AsString with .AsWideString but be careful to not change where the AsString method is not called on a field or parameter.
问题:dbExpress 需要 TStringField 对象作为 WIN1252 字段。对于 UTF8 数据库字段,dbExpress 需要 TWideStringField 对象。
症状:错误消息“预期:发现 WideString:字符串”
解决方案:将所有出现的 TStringField 替换为 TWideStringField。这要求所有表单文件(dfm)都是文本,而不是二进制。修改后的表单和数据模块将不向后兼容。
Problem: dbExpress needs TStringField objects for WIN1252 fields. For UTF8 database fields, dbExpress needs TWideStringField objects.
Symptom: error message 'expected: WideString found: string'
Solution: replace all occurences of TStringField with TWideStringField. This requires that all form files (dfm) are text, not binary. The modified forms and datamodules will not be backwards compatible.
问题:导出 WIN1252 数据库的元数据和表数据将创建 CP1252 编码文件,但导入时需要 UTF8 文件(使用 IBExpert 测试)
症状:导入到 InterBase 的脚本中出现错误
< strong>解决方案:使用 iconv 将脚本文件转换为 UTF8
Problem: exporting metadata and table data for a WIN1252 database will create a CP1252 encoded file, but for the import, a UTF8 file is required (tested with IBExpert)
Symptom: errors in the script import to InterBase
Solution: use iconv to convert the script file to UTF8