Simple CSV reader?
Hi all,
I started out with what I thought was going to be a pretty simple task (converting a CSV file to "wiki" format), but I'm hitting a few snags that I'm having trouble working through.
I have 3 main problems:
1) Some of the cells contain \r\n, so when reading line by line, each embedded line break is treated as the start of a new cell.
2) Some of the rows contain "," inside a cell (I tried switching to \t-delimited files, but I still run into an escaping problem when the delimiter sits between two quotes "").
3) Some rows are completely blank except for the delimiter ("," or "\t"), and others are incomplete (which is fine; I just need to make sure that each cell goes in the correct place).
I've tried a few of the CSV reader classes, but they run up against the problems listed above.
I'm trying to keep this app as small as possible, so I'm also trying to avoid DLLs and large classes of which only a small portion does what I want.
So far I have two attempts, neither of which works.
Attempt 1 (doesn't handle \r\n in a cell):
    OpenFileDialog openFileDialog1 = new OpenFileDialog();
    openFileDialog1.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
    openFileDialog1.Filter = "tab sep file (*.txt)|*.txt|All files (*.*)|*.*";
    openFileDialog1.FilterIndex = 1;
    openFileDialog1.RestoreDirectory = true;
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        if (cb_sortable.Checked)
        {
            header = "{| class=\"wikitable sortable\" border=\"1\" \r\n|+ Sortable table";
        }
        StringBuilder sb = new StringBuilder();
        string line;
        bool firstline = true;
        StreamReader sr = new StreamReader(openFileDialog1.FileName);
        sb.AppendLine(header);
        while ((line = sr.ReadLine()) != null)
        {
            if (line.Replace("\t", "").Length > 1)
            {
                string[] hold;
                string lead = "| ";
                if (firstline && cb_header.Checked == true)
                {
                    lead = "| align=\"center\" style=\"background:#f0f0f0;\"| ";
                }
                hold = line.Split('\t');
                sb.AppendLine(table);
                foreach (string row in hold)
                {
                    sb.AppendLine(lead + row.Replace("\"", ""));
                }
                firstline = false;
            }
        }
        sb.AppendLine(footer);
        Clipboard.SetText(sb.ToString());
        MessageBox.Show("Done!");
    }
}
string header = "{| class=\"wikitable\" border=\"1\" ";
string footer = "|}";
string table = "|-";
Attempt 2 (can handle \r\n, but shifts cells over blank cells; it's not complete yet):
    OpenFileDialog openFileDialog1 = new OpenFileDialog();
    openFileDialog1.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
    openFileDialog1.Filter = "txt file (*.txt)|*.txt|All files (*.*)|*.*";
    openFileDialog1.FilterIndex = 1;
    openFileDialog1.RestoreDirectory = true;
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        if (cb_sortable.Checked)
        {
            header = "{| class=\"wikitable sortable\" border=\"1\" \r\n|+ Sortable table";
        }
        using (StreamReader sr = new StreamReader(openFileDialog1.FileName))
        {
            string text = sr.ReadToEnd();
            string[] cells = text.Split('\t');
            int columnCount = 0;
            foreach (string cell in cells)
            {
                if (cell.Contains("\r\n"))
                {
                    break;
                }
                columnCount++;
            }
        }
Basically, all I need is a "split if not between quotes", but I'm just at a loss right now.
Any tips or tricks would be greatly appreciated.
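For illustration, a "split if not between quotes" can be sketched as a small character-by-character state machine. This is only a minimal sketch under stated assumptions, not a complete CSV implementation: the names `MiniCsv` and `Parse` are hypothetical (they are not from the post), a doubled quote `""` inside a quoted cell is assumed to mean a literal quote, and malformed quoting is not validated.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not from the original post.
static class MiniCsv
{
    // Splits delimited text into rows of cells. Delimiters and line breaks
    // that appear between double quotes are kept as cell data.
    public static List<List<string>> Parse(string text, char delimiter)
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var cell = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < text.Length && text[i + 1] == '"')
                {
                    cell.Append('"'); // assumption: "" inside quotes is a literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false; // closing quote
                }
                else
                {
                    cell.Append(c);   // includes \r, \n and the delimiter
                }
            }
            else if (c == '"')
            {
                inQuotes = true;      // opening quote
            }
            else if (c == delimiter)
            {
                row.Add(cell.ToString()); // blank cells are kept, so columns stay aligned
                cell.Clear();
            }
            else if (c == '\r' || c == '\n')
            {
                // Treat \r\n as a single row terminator.
                if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n') i++;
                row.Add(cell.ToString());
                cell.Clear();
                rows.Add(row);
                row = new List<string>();
            }
            else
            {
                cell.Append(c);
            }
        }

        // Flush a final row that has no trailing line break.
        if (cell.Length > 0 || row.Count > 0)
        {
            row.Add(cell.ToString());
            rows.Add(row);
        }
        return rows;
    }
}
```

Calling `MiniCsv.Parse(File.ReadAllText(path), '\t')` (or with `','`) would return one list per row, and each row could then be written out with the `|-` and `| ` prefixes from attempt 1. Because empty cells are added rather than skipped, a row such as `,,` still yields three cells.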
Comments (5)
Check out this project instead of rolling your own CSV parser.
You might take a look at http://www.filehelpers.com/ as well...
Don't try to do it by yourself if you can use libraries!
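As one illustration of the library route that needs no third-party DLL, the .NET Framework ships a `TextFieldParser` class in the `Microsoft.VisualBasic` assembly. Note that this is a different library from the FileHelpers project linked above; it is swapped in here only because it is built in. The sketch below assumes a project reference to `Microsoft.VisualBasic`, and the names `TextFieldParserSketch` and `ReadAll` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // assumption: the project references Microsoft.VisualBasic

// Hypothetical wrapper, not from the original post.
static class TextFieldParserSketch
{
    // Reads every row from a delimited source into an array of fields.
    public static List<string[]> ReadAll(TextReader reader, string delimiter)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(delimiter);
            parser.HasFieldsEnclosedInQuotes = true; // quoted cells may contain the delimiter
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

A call such as `TextFieldParserSketch.ReadAll(new StreamReader(openFileDialog1.FileName), "\t")` would replace the manual `Split('\t')` in the attempts above. One behaviour worth checking against your data: as far as I know, completely empty lines are skipped by `ReadFields`, while lines that contain only delimiters still come back as empty fields.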
Try taking a look here. Your code doesn't make web requests, but effectively this shows you how to parse a csv that is returned from a web service.
There's a decent implementation here...
It makes much more sense in this case to use tried-and-tested code rather than trying to roll your own.
For a specification that's essentially two pages long, the CSV format is deceptive in its simplicity. The majority of short parser implementations that can be found on the internet are blatantly incorrect in one way or another. That notwithstanding, the format hardly seems to call for 1k+ SLOC implementations.
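A small example of how a short implementation goes wrong: splitting on the delimiter alone ignores quoting, so a quoted cell that contains a comma is cut in two. The sample line and the class name `NaiveSplitDemo` are made up for illustration.

```csharp
using System;

// Hypothetical demo, not from the original post.
static class NaiveSplitDemo
{
    static void Main()
    {
        // Three logical cells: 1 | Smith, John | 42
        string line = "1,\"Smith, John\",42";

        // string.Split knows nothing about quotes, so the quoted cell is broken apart.
        string[] pieces = line.Split(',');

        Console.WriteLine(pieces.Length); // prints 4, not 3
    }
}
```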