Problem reading dBase DBF files containing non-English characters
I have a tool which reads dBase files and uploads the contents to SQL Server, part of a system to import shapefiles. It works but now we have a requirement to import files that include non-English characters (Norwegian in this case, could be other languages later) and they're being corrupted.
The dBase files are being read using an OleDbDataAdapter. Stepping through the code I can see that the text is wrong as it is read in. I'm assuming it's something to do with code pages or Unicode but I have no idea how to fix it.
A dBase Reader application tells me the DBFs are in code page 1252 - I don't know if this is correct. My upload tool runs on Win7 with English (UK) regional settings.
Examples:
ÅSGARD in DBF becomes +SGARD in VB.Net & SQL Server.
RINGHORNE ØST in DBF becomes RINGHORNE ÏST in VB.Net & SQL Server.
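One reading of these two examples, offered as a hypothesis rather than a confirmed diagnosis: the file really is stored as Windows-1252, but the driver treats the bytes as OEM code page 850 (the usual OEM page for English (UK) Windows). The sketch below is in Python only because it is quick to run; the function name `misread_as_oem` is made up for illustration.

```python
# Assumption: text is stored as Windows-1252 but decoded as OEM code page 850.
def misread_as_oem(text, stored="cp1252", assumed="cp850"):
    """Encode with the code page the file uses, decode with the one the driver assumes."""
    return text.encode(stored).decode(assumed)

for original in ("ÅSGARD", "RINGHORNE ØST"):
    # Show the original next to what a CP850 reader would see.
    print(repr(original), "->", repr(misread_as_oem(original)))
```

Under that assumption Ø (byte 0xD8) comes back as Ï, matching the second example exactly. Å (byte 0xC5) comes back as the box-drawing character ┼, which has no Windows-1252 equivalent; a best-fit conversion to ANSI turning it into "+" would be consistent with the first example. If so, the "+" case is lossy and cannot be repaired after the fact, so the fix has to happen when the file is read.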
The code that reads the DBF:
    dbfConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=dBASE IV"
    Cnn.ConnectionString = dbfConnectionString
    Cnn.Open()
    strSQL = "SELECT * FROM [" & strDBF & "]"
    DA = New OleDb.OleDbDataAdapter(strSQL, Cnn)
    DS = New DataSet
    DA.Fill(DS)
    If DS.Tables(0).Rows.Count > 0 Then
        dtDBF = DS.Tables(0)
    Else
        dtDBF = Nothing
    End If
Data is read like: Name = dtDBF.Rows(index)("NAME_1")
Is there a way to tell OleDbDataAdapter what code page to use or a better way to read dBase files from VB.Net?
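On the "better way to read dBase files" half of the question, one option is to bypass the driver and decode the fixed-width records yourself, so the code page is an explicit parameter. The following is a minimal sketch in Python rather than VB.Net, assuming a plain dBase III/IV layout (32-byte header, 32-byte field descriptors ending in 0x0D, no memo fields). The names `parse_dbf` and `read_dbf` are hypothetical, and every value is returned as a stripped string regardless of field type.

```python
def parse_dbf(data, encoding="cp1252"):
    """Parse DBF bytes into (field_names, rows), decoding text with `encoding`."""
    num_records = int.from_bytes(data[4:8], "little")   # record count
    header_len = int.from_bytes(data[8:10], "little")   # offset of first record
    record_len = int.from_bytes(data[10:12], "little")  # bytes per record

    # Field descriptors start at byte 32 and end at the 0x0D terminator.
    fields = []
    pos = 32
    while pos < header_len and data[pos] != 0x0D:
        name = data[pos:pos + 11].split(b"\x00", 1)[0].decode("ascii")
        length = data[pos + 16]
        fields.append((name, length))
        pos += 32

    rows = []
    for i in range(num_records):
        start = header_len + i * record_len
        record = data[start:start + record_len]
        if record[:1] == b"*":  # first byte '*' marks a deleted record
            continue
        offset = 1
        row = {}
        for name, length in fields:
            row[name] = record[offset:offset + length].decode(encoding).strip()
            offset += length
        rows.append(row)
    return [name for name, _ in fields], rows


def read_dbf(path, encoding="cp1252"):
    """Convenience wrapper: read the whole file and parse it."""
    with open(path, "rb") as f:
        return parse_dbf(f.read(), encoding)
```

Calling `read_dbf(path, "cp1252")` and then looking at `rows[index]["NAME_1"]` mirrors the access pattern in the question. The same header layout could be read from VB.Net with a `BinaryReader` and `System.Text.Encoding.GetEncoding(1252)`, though that port is untested here.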
Comments (3)
Try adding this to your DSN:
You might also be able to use:
Check whether the shapefile contains codepage information. There are two places to look, one of which is an associated separate file with the extension .cpg. If the code page is not specified in those locations, it defaults to the codepage on the PC that generated the shapefile. You will just have to know that :(
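The check described above can be sketched as follows. This is a minimal sketch in Python, not something from the answer: it looks for a sibling .cpg file first and then falls back to the language driver ID (LDID) byte at offset 29 of the DBF header, which is a standard part of the DBF format and may be the second place the answer had in mind. The function name `detect_dbf_encoding` is hypothetical, and the LDID table is deliberately partial.

```python
from pathlib import Path

# Partial LDID table (assumption: values from commonly published dBase/ESRI lists).
LDID_TO_CODEC = {0x01: "cp437", 0x02: "cp850", 0x03: "cp1252", 0x57: "cp1252"}


def detect_dbf_encoding(dbf_path):
    """Return (source, value): ("cpg", raw text), ("ldid", codec name) or (None, None)."""
    dbf_path = Path(dbf_path)

    # 1. A sibling .cpg file, if present, names the code page as plain text.
    cpg = dbf_path.with_suffix(".cpg")
    if cpg.exists():
        declared = cpg.read_text(encoding="ascii", errors="ignore").strip()
        if declared:
            return ("cpg", declared)

    # 2. Otherwise inspect the language driver ID byte in the DBF header.
    with open(dbf_path, "rb") as f:
        header = f.read(32)
    if len(header) == 32:
        codec = LDID_TO_CODEC.get(header[29])
        if codec:
            return ("ldid", codec)

    # Nothing declared: the code page has to be known from elsewhere.
    return (None, None)
```

The .cpg value is returned verbatim (for example `UTF-8` or `1252`), so the caller still has to map it to whatever encoding name its own platform expects.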
I've never used it, but maybe Shape2SQL takes care of this for you? Or shp2text? I believe the PostGIS shapefile loader handles code pages: maybe you could import into PostGIS and then export in another format?
Old question, but this may answer it for future readers...
You might try adding a property setting in your connection string:
This property (and a list of values including this one) is documented for ADO in conjunction with Jet 4.0's OLE DB Provider, but I have no reason to believe it isn't also supported by ADO.Net. This value (1044) is Norwegian/Danish. Untested, but something else to try.