Problems reading dBase DBF files that contain non-English characters

Posted on 2024-10-22 04:47:44

I have a tool which reads dBase files and uploads the contents to SQL Server, part of a system to import shapefiles. It works but now we have a requirement to import files that include non-English characters (Norwegian in this case, could be other languages later) and they're being corrupted.

The dBase files are being read using an OleDbDataAdapter. Stepping through the code I can see that the text is wrong as it is read in. I'm assuming it's something to do with code pages or Unicode but I have no idea how to fix it.

A dBase Reader application tells me the DBFs are in code page 1252 - I don't know if this is correct. My upload tool runs on Win7 with English (UK) regional settings.

Examples:

ÅSGARD in DBF becomes +SGARD in VB.Net & SQL Server.

RINGHORNE ØST in DBF becomes RINGHORNE ÏST in VB.Net & SQL Server.

The code that reads the DBF:

dbfConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=dBASE IV"
Cnn.ConnectionString = dbfConnectionString
Cnn.Open()

strSQL = "SELECT * FROM [" & strDBF & "]"
DA = New OleDb.OleDbDataAdapter(strSQL, Cnn)
DS = New DataSet
DA.Fill(DS)

If DS.Tables(0).Rows.Count > 0 Then
  dtDBF = DS.Tables(0)
Else
  dtDBF = Nothing
End If

Data is read like: Name = dtDBF.Rows(index)("NAME_1")

Is there a way to tell OleDbDataAdapter what code page to use or a better way to read dBase files from VB.Net?
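
A quick way to test the code page suspicion, sketched in Python because it only manipulates bytes: both corrupted examples are consistent with Windows-1252 bytes being interpreted as OEM code page 850, which is the usual OEM code page for English (UK) Windows. The helper name `misread_as_oem` is made up for illustration, and the idea that "+" is a best-fit substitute for the box-drawing character is a guess, not something confirmed in this thread.

```python
# Demonstration only: reproduce the reported corruption by decoding
# Windows-1252 bytes with the wrong (OEM) code page.

def misread_as_oem(text, stored="cp1252", assumed="cp850"):
    """Encode text as it would be stored, then decode it with the wrong code page."""
    raw = text.encode(stored)      # bytes as written into a Windows-1252 DBF
    return raw.decode(assumed)     # same bytes interpreted as OEM code page 850

for original in ["ÅSGARD", "RINGHORNE ØST"]:
    raw = original.encode("cp1252")
    print(f"{original!r}: bytes {raw.hex(' ')} read as cp850 give {misread_as_oem(original)!r}")
```

The Ø byte (0xD8) is Ï in code page 850, matching the second example exactly. The Å byte (0xC5) is a box-drawing cross in code page 850, which a later conversion may have flattened to "+".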

Comments (3)

桃扇骨 2024-10-29 04:47:44

Try adding this to your DSN:

CollatingSequence=Norwegian-Danish

You might also be able to use:

CollatingSequence=International

2024-10-29 04:47:44

Check whether the shapefile contains code page information. There are two places to look:

  • The language driver ID (LDID), which is stored in the header of the shapefile's DBF table (byte offset 29 counting from 0, often described as the 29th byte).
  • An associated separate file with the extension .cpg.

If the code page is not specified in either of those locations, it defaults to the code page of the PC that generated the shapefile. You will just have to know that :(
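
Both checks can be scripted. The following is a minimal Python sketch, assuming the LDID sits at byte offset 29 of the 32-byte DBF header and that the .cpg file shares the DBF's base name. The function names and the small `LDID_TO_CODEC` table are illustrative: the table covers only a few commonly listed values and should be checked against your own DBF documentation.

```python
from pathlib import Path

# Partial, assumed mapping from LDID values to Python codec names.
LDID_TO_CODEC = {
    0x01: "cp437",   # DOS USA
    0x02: "cp850",   # DOS Multilingual
    0x03: "cp1252",  # Windows ANSI
    0x57: "cp1252",  # "ANSI", commonly treated as Windows-1252
}

def read_ldid(dbf_path):
    """Return the language driver ID byte from the DBF header (offset 29)."""
    with open(dbf_path, "rb") as f:
        header = f.read(32)
    if len(header) < 32:
        raise ValueError("file is too short to be a DBF")
    return header[29]

def read_cpg(dbf_path):
    """Return the stripped text of a sibling .cpg file, or None if there is none."""
    cpg = Path(dbf_path).with_suffix(".cpg")
    if not cpg.exists():
        return None
    return cpg.read_text(encoding="ascii", errors="replace").strip() or None

def describe_code_page(dbf_path):
    """Collect both hints so they can be compared side by side."""
    ldid = read_ldid(dbf_path)
    return {"ldid": ldid, "codec_guess": LDID_TO_CODEC.get(ldid), "cpg": read_cpg(dbf_path)}
```

An LDID of 0 means no code page was recorded in the header, which is the "you will just have to know" case. On a case-sensitive file system an upper-case .CPG would need a separate check.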

I've never used it, but maybe Shape2SQL takes care of this for you? Or shp2text? I believe the PostGIS shapefile loader handles code pages, so maybe you could import into PostGIS and then export in another format?
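
If none of those tools fit, another route is to read the DBF bytes directly and decode the character fields with a code page you choose yourself. Below is a minimal Python sketch under several assumptions: a plain dBase III/IV layout (32-byte header, 32-byte field descriptors terminated by 0x0D), character fields no longer than 255 bytes, and no memo fields. `read_dbf` is a hypothetical helper, not part of any library.

```python
import struct

def read_dbf(path, encoding="cp1252"):
    """Yield one dict per non-deleted record, decoding character fields with `encoding`."""
    with open(path, "rb") as f:
        header = f.read(32)
        # Bytes 4-7: record count, 8-9: header length, 10-11: record length (little-endian).
        num_records, header_len, record_len = struct.unpack("<IHH", header[4:12])

        fields = []  # list of (name, type, length)
        while True:
            desc = f.read(32)
            if len(desc) < 32 or desc[0] == 0x0D:  # 0x0D ends the descriptor array
                break
            name = desc[:11].split(b"\x00", 1)[0].decode("ascii")
            fields.append((name, chr(desc[11]), desc[16]))

        f.seek(header_len)  # records start right after the header
        for _ in range(num_records):
            record = f.read(record_len)
            if len(record) < record_len:
                break                    # truncated file
            if record[0:1] == b"*":      # '*' in the first byte marks a deleted record
                continue
            row, pos = {}, 1             # position 0 is the deletion flag
            for name, ftype, length in fields:
                raw = record[pos:pos + length]
                pos += length
                if ftype == "C":
                    row[name] = raw.decode(encoding).rstrip(" \x00")
                else:
                    # Other types are returned as trimmed text for the caller to convert.
                    row[name] = raw.decode("ascii", "replace").strip()
            yield row
```

Usage would look like `for row in read_dbf("fields.dbf", "cp1252"): print(row["NAME_1"])`, where the file name is only an example. Passing the code page explicitly avoids relying on whatever the provider or the machine's regional settings assume.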

旧城烟雨 2024-10-29 04:47:44

Old question, but this may answer it for future readers...

You might try adding a property setting in your connection string:

Locale Identifier=1044

This property (and a list of values including this one) is documented for ADO in conjunction with Jet 4.0's OLE DB Provider, but I have no reason to believe it isn't also supported by ADO.Net. This value (1044) is Norwegian/Danish.

Untested, but something else to try.
