Command-line file parsing tools in Cygwin
I have to deal with text files in a motley selection of formats. Here's an example (columns A and B are tab-delimited):
A B
a Name1=Val1, Name2=Val2, Name3=Val3
b Name1=Val4, Name3=Val5
c Name1=Val6, Name2=Val7, Name3=Val8
The files may or may not have headers, may mix delimiting schemes, may have columns with name/value pairs as above, etc.
I often have an ad-hoc need to extract data from such files in various ways. For example, from the above data I might want the value associated with Name2 where it is present, i.e.:
A B
a Val2
c Val7
What tools/techniques are there for performing such manipulations as one-line commands, using the above as an example but extensible to other cases?
Comments (6)
You have all the basic Unix command-line tools at your disposal in the bash shell, for example grep, cut, sed and awk. You can also use Perl or Ruby for more complex things.
From what I've seen I'd start with Awk for this sort of thing and then if you need something more complex, I'd progress to Python.
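As a rough illustration of the Awk route, here is a minimal sketch for the Name2 example from the question. The function names `sample_data` and `extract_name2_awk` are made up for this illustration, and the sketch assumes a header row on line 1, a tab between columns A and B, and pairs separated by a comma plus optional spaces.

```shell
# Recreate the sample data from the question (tab between columns A and B).
sample_data() {
  printf 'A\tB\n'
  printf 'a\tName1=Val1, Name2=Val2, Name3=Val3\n'
  printf 'b\tName1=Val4, Name3=Val5\n'
  printf 'c\tName1=Val6, Name2=Val7, Name3=Val8\n'
}

# Hypothetical helper: print column A and the value of Name2 where present.
extract_name2_awk() {
  awk -F'\t' -v key='Name2' '
    BEGIN { OFS = "\t" }
    NR == 1 { print; next }              # pass the header row through
    {
      n = split($2, pairs, /, */)        # break column B into name=value pairs
      for (i = 1; i <= n; i++) {
        eq = index(pairs[i], "=")        # position of the first "="
        if (eq && substr(pairs[i], 1, eq - 1) == key)
          print $1, substr(pairs[i], eq + 1)
      }
    }'
}

sample_data | extract_name2_awk
```

Changing the `key` variable picks a different name, and dropping the `NR == 1` rule adapts the sketch to files without a header.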
I would use sed:
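One possible sed approach, offered as a sketch rather than the answerer's exact command. It assumes a sed that supports `-E` (GNU sed, as shipped with Cygwin, does), a header on line 1, and a tab between the two columns; `sample_data` and `extract_name2_sed` are names invented for this illustration.

```shell
# Recreate the sample data from the question (tab between columns A and B).
sample_data() {
  printf 'A\tB\n'
  printf 'a\tName1=Val1, Name2=Val2, Name3=Val3\n'
  printf 'b\tName1=Val4, Name3=Val5\n'
  printf 'c\tName1=Val6, Name2=Val7, Name3=Val8\n'
}

# Hypothetical helper: keep the header, then print "A<TAB>value" for rows
# whose column B contains a Name2=... pair.
extract_name2_sed() {
  tab=$(printf '\t')   # literal tab, so the pattern does not rely on \t support
  sed -n -E \
    -e '1p' \
    -e "s/^([^${tab}]*)${tab}(.*, *)?Name2=([^,]*).*\$/\1${tab}\3/p"
}

sample_data | extract_name2_sed
```

With `-n`, only the header (`1p`) and the lines where the substitution succeeds (the trailing `p` flag) are printed, so rows without Name2 drop out.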
Since you have Cygwin, I'd go with Perl. It's the easiest to learn (check out the O'Reilly book: Learning Perl) and widely applicable.
I would use Perl. Write a small module (or more than one) for dealing with the different formats. You could then run Perl one-liners using that library. An example of what it would look like follows:
Don't quote me on the syntax, but that's the general idea. Abstract the task at hand so that you can think in terms of what you need to do, not how you need to do it. Ruby would be another option; it tends to have a cleaner syntax, but either language would work.
I don't like sed too much, but it works for such things:
Gives you: