Need to use regular-expression find and replace on a CSV file in TextWrangler (grep)

Posted on 2024-10-10 14:14:08

I have this CSV file; the plain text is here: http://pastie.org/1425970

What it looks like in Excel: http://cl.ly/3qXk

An example of what I would like it to look like (using just the first row as an example): http://cl.ly/3qYT

Plain text of the first row: http://pastie.org/1425979

I need to create a CSV file so that I can import all of the information into a database table.

I could create the CSV manually, but I wanted to see whether this can be done with regular expressions in TextWrangler's (grep) find and replace.

Comments (3)

小镇女孩 2024-10-17 14:14:08

Regular expressions aren't really the best way to accomplish this. As others have noted, you're better off writing some code to parse the file into the format you want.

With that said, this ugly regex should get you halfway there:

Find:

(\d+),"?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?"?

Replace:

\1,\2\r\1,\3\r\1,\4\r\1,\5\r\1,\6\r\1,\7\r\1,\8

Which will leave you with some extra rows, like below:

1,1
1,8
1,11
1,13
1,
1,
1,
2,10
2,11
2,12
2,
2,
...

You can clean up the extra rows by hand, or with the following regex:

Find:

\d+,\r

Replace:

(empty string)
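
The two passes can be tried outside the editor before running them on the real file. Below is a minimal sketch in Python, not the answerer's own method: the sample rows are an assumption reconstructed from the output listed above (the actual file is only available through the links in the question), `two_pass` is a hypothetical helper name, and `\n` stands in for the editor's `\r` line break.

```python
import re

# Assumed sample rows, reconstructed from the output listed above; the real
# file is only reachable through the links in the question.
SAMPLE = '1,"1, 8, 11, 13"\n2,"10, 11, 12"\n'

# Pass 1: an id, then up to seven optional values (same pattern as the answer).
FIND = r'(\d+),"?' + r'(?:(\d+),? ?)?' * 7 + r'"?'
# "\n" stands in for the editor's "\r" line break.
REPLACE = r'\1,\2\n\1,\3\n\1,\4\n\1,\5\n\1,\6\n\1,\7\n\1,\8'

# Pass 2: rows that ended up with an id but no value.
EMPTY_ROW = re.compile(r'^\d+,\n', re.MULTILINE)


def two_pass(text):
    """Apply both find-and-replace passes and return the cleaned text."""
    expanded = re.sub(FIND, REPLACE, text)  # unmatched groups become ""
    return EMPTY_ROW.sub('', expanded)


if __name__ == "__main__":
    print(two_pass(SAMPLE), end="")
```

Two limits of the approach show up here: a row with more than seven values would lose the extras, and the cleanup pass only removes an empty row that is followed by a line break, so a file without a trailing newline may keep one stray `id,` line at the end.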

人疚 2024-10-17 14:14:08

Using Perl, you could do something like this:

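use strict;
use warnings;

open(my $read, "<", "input.csv") or die("Gah, couldn't read input.csv: $!\n");
open(my $write, ">", "output.csv") or die("WHAAAARGARBL! Couldn't write output.csv: $!\n");
while (my $line = <$read>)
{
    chomp $line;
    # Capture the id and the (optionally quoted) comma-separated list of values.
    if (my ($id, $list) = $line =~ /^(\d+),"?([^"]*)"?/)
    {
        # Split on commas, allowing spaces around them.
        foreach my $value (split(/\s*,\s*/, $list))
        {
            # One output row per value: the id, a comma, then the value.
            print $write "$id,$value\n";
        }
    }
}
close($read);
close($write);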
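
For comparison, the same row expansion can lean on a CSV parser to handle the quoting instead of a hand-written pattern. This is a minimal sketch in Python under stated assumptions: the layout (an id column followed by one quoted, comma-separated list) is inferred from the answers rather than taken from the real file, and `explode_rows` is a hypothetical helper name.

```python
import csv
import io


def explode_rows(source, dest):
    """Write one "id,value" row for each value in the second column."""
    writer = csv.writer(dest, lineterminator="\n")
    for row in csv.reader(source):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        row_id, values = row[0], row[1]
        for value in values.split(","):
            value = value.strip()
            if value:  # ignore empty pieces such as a trailing comma
                writer.writerow([row_id, value])


if __name__ == "__main__":
    # Assumed sample input; the last row has a single unquoted value.
    sample = io.StringIO('1,"1, 8, 11, 13"\n2,"10, 11, 12"\n3,5\n')
    out = io.StringIO()
    explode_rows(sample, out)
    print(out.getvalue(), end="")
```

For real files, `source` and `dest` would be file objects opened with `newline=""`, as the `csv` module documentation recommends.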

失去的东西太少 2024-10-17 14:14:08

I don't know textmate. But in general I can describe what it takes to do this in pseudo-code.

loop, read each line
   strip off the newline
   split into an array using /[, "]+/ as the delimiter regex
   loop over the result, an array slice from element 1 to the last element
       print element 0, a comma, then the iterator value
   end loop
end loop

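In Perl, something like this:

use strict;
use warnings;

# Reads from the DATA handle, i.e. the lines after a __DATA__ marker at the end of the script.
while (my $line = <DATA>) {
    chomp $line;
    my @data_array = split /[, "]+/, $line;
    for my $otherfield (@data_array[1 .. $#data_array]) {
        print "$data_array[0],$otherfield\n";
    }
}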

It should be easy if you have a split capability.
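
The pseudocode above carries over to any language with a regex split. Here is a minimal sketch in Python, where `explode_line` is a hypothetical helper name and the sample line is an assumption based on the output shown in the first answer. One difference from Perl is worth noting: `re.split` keeps the empty string produced by the closing quote, so empty pieces are filtered out explicitly.

```python
import re


def explode_line(line):
    """Return "first,value" strings for every field after the first one."""
    # Split on runs of commas, spaces and double quotes, then drop the
    # empty pieces that re.split leaves at the edges.
    fields = [f for f in re.split(r'[, "]+', line.strip()) if f]
    if not fields:
        return []
    first, rest = fields[0], fields[1:]
    return [f"{first},{value}" for value in rest]


if __name__ == "__main__":
    # Assumed sample line, not taken from the real file.
    for out_row in explode_line('1,"1, 8, 11, 13"'):
        print(out_row)
```

Because every comma, space and quote is treated as a delimiter, this only suits files whose fields are plain tokens; a field containing spaces would be broken apart.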
