Converting scientific notation to float when importing a .CSV file with OpenRowSet

Posted 2024-12-05 11:30:15

I am using openrowset to import a csv file into SQL Server. One of the columns in the csv file contains numbers in scientific notation (1.08E+05), which are inserted into a column of the destination table.

By default it is importing the value as 1 and ignoring the .08E+05.

I have tried using cast() and convert() to convert the value directly when the query is executed as well as setting up the datatype in the table as a character string and importing it as such. All of these methods have the same behavior where the .08E+05 is ignored.

Is there a way to have the value imported as 108000 rather than 1, without having to change the csv file itself?

Setting up the datatype as a varchar and reading in the csv file appears to have the same effect with the following code:

CREATE TABLE #dataTemp (StartDate datetime, Value varchar(12))

DECLARE @insertDataQuery varchar(1000)

SET @insertDataQuery = 'SELECT Date, CSVValue from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir='
SET @insertDataQuery = @insertDataQuery + 'C:\Data\;'',''SELECT * FROM '+ '11091800.csv' + ''')'

INSERT INTO #dataTemp EXEC(@insertDataQuery)

SELECT * FROM #dataTemp

Not all of the values in the CSV file are in scientific notation, and values without it, e.g. 81000, come across without issue.

Comments (3)

甜心 2024-12-12 11:30:15

For BULK INSERT methodologies I've often found it simpler to first move the data into a table of all varchars, then get rid of extraneous things like quoted delimiters and fix the formatting. I remember having a heck of a time getting rid of the scientific notation; with the varchar table you can just experiment until you get it right. I tried all kinds of precision/scale combinations until I finally found one that was compatible. I think for me it was FLOAT, then DECIMAL(24,12):

SELECT CONVERT(DECIMAL(24, 12), CONVERT(FLOAT, '1.08E+05'));
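
As a rough illustration of why the intermediate FLOAT step matters, the sketch below compares direct conversions with the two-step one. It assumes SQL Server 2012 or later, because it uses TRY_CONVERT (not part of the original answer) so that a failed conversion shows up as NULL instead of raising an error.

```sql
-- TRY_CONVERT returns NULL where CONVERT would raise a conversion error.
SELECT
    TRY_CONVERT(INT, '1.08E+05')             AS DirectToInt,      -- NULL: INT does not parse exponent notation
    TRY_CONVERT(DECIMAL(24, 12), '1.08E+05') AS DirectToDecimal,  -- NULL: DECIMAL does not either
    TRY_CONVERT(FLOAT, '1.08E+05')           AS ViaFloat,         -- 108000
    -- FLOAT parses the exponent; DECIMAL then gives a fixed-point result.
    CONVERT(DECIMAL(24, 12), TRY_CONVERT(FLOAT, '1.08E+05')) AS FloatThenDecimal;
```

Only the string-to-FLOAT conversion understands the E+05 suffix, which is why the other types are reached through FLOAT rather than directly.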

EDIT: adding what I did to try to reproduce the problem and/or demonstrate a less convoluted way.

I created a very simple CSV file:

StartDate,Value
20110808,81000
20110808,1.08E+05

Then I ran the following code (for some reason I can't get MSDASQL to run on my machine to save my life):

CREATE TABLE #dataTemp(StartDate DATETIME, Value VARCHAR(32));

BULK INSERT #dataTemp FROM 'C:\data\whatever.csv' 
    WITH (ROWTERMINATOR='\n', FIELDTERMINATOR=',', FIRSTROW = 2);

SELECT * FROM #dataTemp
GO
SELECT StartDate, CONVERT(INT, CONVERT(FLOAT, Value)) FROM #dataTemp;
GO
DROP TABLE #dataTemp;

Results:

StartDate               Value
----------------------- --------
2011-08-08 00:00:00.000 81000
2011-08-08 00:00:00.000 1.08E+05

StartDate               (No column name)
----------------------- ----------------
2011-08-08 00:00:00.000 81000
2011-08-08 00:00:00.000 108000
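
If the staging column might also contain text that is not numeric at all, the same conversion can be guarded. This is a minimal sketch, assuming SQL Server 2012 or later for TRY_CONVERT; the table variable @staging and the 'n/a' row are hypothetical stand-ins for the varchar staging table and a bad CSV value.

```sql
-- Hypothetical staging data: two valid values and one non-numeric entry.
DECLARE @staging TABLE (StartDate DATETIME, Value VARCHAR(32));

INSERT INTO @staging (StartDate, Value)
VALUES ('20110808', '81000'),
       ('20110808', '1.08E+05'),
       ('20110808', 'n/a');

-- Convert through FLOAT; rows that cannot be parsed yield NULL instead of an error.
SELECT StartDate,
       Value,
       CONVERT(INT, TRY_CONVERT(FLOAT, Value)) AS ValueAsInt
FROM @staging;

-- List the rows that need manual attention before the final load.
SELECT StartDate, Value
FROM @staging
WHERE Value IS NOT NULL
  AND TRY_CONVERT(FLOAT, Value) IS NULL;
```

The second query returns only the 'n/a' row, so problem values can be reviewed without aborting the whole import.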

吖咩 2024-12-12 11:30:15

Would casting it to REAL work?

select cast('1.08E+05' as real)
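
A small sketch of what that cast does, and of one caveat: REAL is a single-precision type with roughly 7 significant digits, so longer values may be rounded where FLOAT would keep them. The second sample value is made up for illustration.

```sql
SELECT
    CAST('1.08E+05' AS REAL)              AS AsReal,       -- 108000
    CAST(CAST('1.08E+05' AS REAL) AS INT) AS AsInt,        -- 108000
    -- Hypothetical 9-digit value: REAL cannot hold every digit, FLOAT can.
    CAST('1.23456789E+08' AS REAL)        AS RealRounded,
    CAST('1.23456789E+08' AS FLOAT)       AS FloatKept;
```

For values like the ones in the question either type parses the notation; FLOAT is the safer choice when the CSV may carry more digits.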

梦旅人picnic 2024-12-12 11:30:15

First of all, the fact that you have scientific notation means it's likely that Excel, or whatever other program created the value, has LOST some data. In other words, the original number was converted into the notation, so some digits and accuracy were lost. That's a problem with many Microsoft products that convert between Excel and CSV.

Second, here is a better conversion snippet that turns the number into a string:

CONVERT(nvarchar(255),LTRIM(RTRIM(str(ISNULL(YOUR_NUMBER,0),20,0))))
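
A minimal usage sketch of that expression, with a variable @value (a hypothetical name) standing in for YOUR_NUMBER. It compares the STR-based conversion with a plain CONVERT to a string, which for larger FLOAT values may come back in exponent form again.

```sql
-- @value plays the role of YOUR_NUMBER from the expression above.
DECLARE @value FLOAT = CONVERT(FLOAT, '1.08E+06');

SELECT
    -- Plain conversion of a FLOAT to a string; may use exponent notation.
    CONVERT(NVARCHAR(255), @value) AS DefaultConversion,
    -- STR pads to 20 characters with 0 decimals; LTRIM/RTRIM strip the padding.
    CONVERT(NVARCHAR(255), LTRIM(RTRIM(STR(ISNULL(@value, 0), 20, 0)))) AS ViaStr;  -- '1080000'
```

Note that ISNULL(..., 0) turns missing values into '0', which may or may not be what the import should do.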