CSV validation in C# - making sure each row has the same number of commas
I wish to implement a fairly simple CSV checker in my C#/ASP.NET application. My project automatically generates CSVs from GridViews for users, but I want to be able to quickly run through each line, check that every line has the same number of commas, and throw an exception if any differences occur. So far I have this, which does work, but there are some issues I'll describe below:
int? CommaCount = null;
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
String Str = null;
//This loops through all the headerrow cells and writes them to the stringbuilder
for (int k = 0; k <= (grd.Columns.Count - 1); k++)
{
    sw.Write(grd.HeaderRow.Cells[k].Text + ",");
}
sw.WriteLine(",");
//This loops through all the main rows and writes them to the stringbuilder
for (int i = 0; i <= grd.Rows.Count - 1; i++)
{
    StringBuilder RowString = new StringBuilder();
    for (int j = 0; j <= grd.Columns.Count - 1; j++)
    {
        //We'll need to strip meaningless junk such as <br /> and &nbsp;
        Str = grd.Rows[i].Cells[j].Text.ToString().Replace("<br />", "");
        if (Str == "&nbsp;")
        {
            Str = "";
        }
        Str = "\"" + Str + "\"" + ",";
        RowString.Append(Str);
        sw.Write(Str);
    }
    sw.WriteLine();
    //The below code block ensures that each row contains the same number of commas, which is crucial
    int RowCommaCount = CheckChar(RowString.ToString(), ',');
    if (CommaCount == null)
    {
        CommaCount = RowCommaCount;
    }
    else
    {
        if (CommaCount != RowCommaCount)
        {
            throw new Exception("CSV generated is corrupt - line " + i + " has " + RowCommaCount + " commas when it should have " + CommaCount);
        }
    }
}
sw.Close();
And my CheckChar method:
protected static int CheckChar(string Input, char CharToCheck)
{
    int Counter = 0;
    foreach (char StringChar in Input)
    {
        if (StringChar == CharToCheck)
        {
            Counter++;
        }
    }
    return Counter;
}
Now my problem is that if a cell in the grid contains a comma, my CheckChar method still counts it as a delimiter, so the check fails. As you can see in the code, I wrap all the values in " characters to 'escape' them. How simple would it be to ignore commas inside values in my method? I assume I'll need to rewrite the method quite a lot.
Comments (4)
You could just use a regular expression that matches one item and count the number of matches in your line. An example of such a regex is the following:
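The pattern this answer refers to is not shown here. A minimal sketch of the idea, assuming every value is wrapped in double quotes as in the question's row-building code, might look like the following. The pattern itself and the names CsvFieldCounter and CountQuotedFields are illustrative assumptions, not the answerer's original regex.

```csharp
using System.Text.RegularExpressions;

public static class CsvFieldCounter
{
    // Assumed pattern: one item is an opening quote, then any run of
    // non-quote characters or doubled quotes (""), then a closing quote.
    // Commas inside the quotes are part of the item and are never counted.
    private static readonly Regex QuotedItem =
        new Regex("\"(?:[^\"]|\"\")*\"", RegexOptions.Compiled);

    // Returns how many quoted items appear in one CSV line.
    public static int CountQuotedFields(string line)
    {
        return QuotedItem.Matches(line).Count;
    }
}
```

Comparing CountQuotedFields for each row against the first row's result would replace the comma comparison in the question. Unquoted fields (such as the header cells in the question's code) are not matched by this pattern, so it only suits rows where every value is quoted.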
Just do something like the following (assuming you don't want to have " inside your fields; otherwise these need some extra handling):
This will cause inValue to be true while inside fields. E.g. pass '"' as fieldDelimiter to ignore everything between "...". Just note that this won't handle escaped " (like "" or \"). You'd have to add such handling yourself.
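The snippet that the inValue / fieldDelimiter answer introduces is not reproduced here. One way the described approach might look is sketched below. It reuses the question's CheckChar name and the answer's inValue and fieldDelimiter names, but the body and the QuoteAwareCounter class are assumptions rather than the answerer's code.

```csharp
public static class QuoteAwareCounter
{
    // Counts occurrences of charToCheck that lie outside quoted values.
    // fieldDelimiter is the character that wraps each value, e.g. '"'.
    public static int CheckChar(string input, char charToCheck, char fieldDelimiter)
    {
        int counter = 0;
        bool inValue = false; // true while we are between two fieldDelimiter characters

        foreach (char c in input)
        {
            if (c == fieldDelimiter)
            {
                // Entering or leaving a quoted value.
                inValue = !inValue;
            }
            else if (!inValue && c == charToCheck)
            {
                // Only separators outside a value are counted.
                counter++;
            }
        }
        return counter;
    }
}
```

In the question's loop, the call would become CheckChar(RowString.ToString(), ',', '"'). In this particular sketch a doubled quote ("") toggles the flag twice, so it leaves the count unchanged, while a backslash-escaped quote (\") would still throw the state off.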
Instead of checking the resulting string (the cake) you should check the fields (ingredients) before you concatenate (mix) them. That would give you the chance to do something constructive (escaping/replacing) and throw an exception only as a last resort.
In general, commas are legal in .csv fields, as long as the string fields are quoted. So an internal "," should not be a problem, but quotes inside a field may well be.
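Checking the fields before joining them could be sketched roughly as below. It assumes the common CSV convention of doubling an embedded quote ("" for "), and the names CsvRowBuilder, EscapeField, BuildRow and expectedCount are hypothetical, introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvRowBuilder
{
    // Wraps one field in quotes and doubles any quote it contains,
    // so commas and quotes inside the value stay harmless.
    public static string EscapeField(string value)
    {
        string text = value ?? string.Empty;
        return "\"" + text.Replace("\"", "\"\"") + "\"";
    }

    // Validates the number of fields first, then joins the escaped fields.
    // The exception is the last resort when the row shape itself is wrong.
    public static string BuildRow(IEnumerable<string> fields, int expectedCount)
    {
        List<string> escaped = fields.Select(EscapeField).ToList();
        if (escaped.Count != expectedCount)
        {
            throw new InvalidOperationException(
                "Row has " + escaped.Count + " fields when it should have " + expectedCount);
        }
        return string.Join(",", escaped);
    }
}
```

Because the count is taken from the list of fields rather than from the finished string, a comma inside a cell can no longer affect the check.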