Passing meta-characters to Python as arguments from the command line
I'm making a Python program that will parse the fields in some input lines. I'd like to let the user enter the field separator as an option from the command line. I'm using optparse to do this. I'm running into the problem that entering something like \t will separate literally on \t, rather than on a tab, which is what I want. I'm pretty sure this is a Python thing and not the shell, since I've tried every combo of quotes, backslashes, and t's that I can think of.
If I could get optparse to let the argument be plain input (is there such a thing?) rather than raw_input, I think that would work. But I have no clue how to do that.
I've also tried various substitutions and regex tricks to turn the string from the two-character "\t" into the one-character tab, but without success.
Example, where input.txt is:
field 1[tab]field\t2
(Note: [tab] is a tab character and field\t2 is an 8-character string)
parseme.py:
#!/usr/bin/python
from optparse import OptionParser
parser = OptionParser()
parser.add_option("-d", "--delimiter", action="store", type="string",
dest="delimiter", default='\t')
parser.add_option("-f", dest="filename")
(options, args) = parser.parse_args()
Infile = open(options.filename, 'r')
Line = Infile.readline()
Fields = Line.split(options.delimiter)
print Fields[0]
print options.delimiter
Infile.close()
This gives me:
$ parseme.py -f input.txt
field 1
[tab]
Hey, great, the default setting worked properly. (Yes, I know I could just make \t the default and forget about it, but I'd like to know how to deal with this type of problem.)
$ parseme.py -f input.txt -d '\t'
field 1[tab]field
\t
This is not what I want.
Comments (4)
The quick and dirty way is to eval it, passing two extra empty dicts (as eval's globals and locals arguments) to prevent accidental clobbering of your program.
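A minimal sketch of the eval approach described above, assuming the delimiter arrives as the raw two-character string typed on the command line. The helper name unescape_with_eval is hypothetical, and wrapping the value in double quotes before evaluating it is an assumption about how the expression is built. Even with empty dicts, eval still has access to the builtins, so this is only reasonable for input you trust.

```python
def unescape_with_eval(raw):
    """Interpret backslash escapes in raw by evaluating it as a string literal.

    Hypothetical helper: it builds the source text of a double-quoted
    string literal around the raw argument. A raw value containing a
    double quote will break the literal, so this is for trusted input only.
    """
    # The two empty dicts are the globals and locals passed to eval, so the
    # expression cannot read or overwrite names defined in the program.
    return eval('"' + raw + '"', {}, {})


# The two characters backslash and t become a single tab character.
delimiter = unescape_with_eval("\\t")
fields = "field 1\tfield\\t2".split(delimiter)
```

Here fields[0] is the text before the real tab, which is the behaviour the question asks for.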
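Solving it from within your script: use a regular expression substitution (re.sub) to replace the two-character sequence \t in the option value with a real tab.
You can adapt the re to match more escaped chars (\n, \r, etc.).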
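One way the regular expression substitution could look, extended to a few more escapes as suggested. This is a sketch, not the code from the original answer: the ESCAPES table and the unescape function are hypothetical names, and the set of escapes handled is an assumption.

```python
import re

# Hypothetical table of the escape letters this sketch understands.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}


def unescape(raw):
    """Replace backslash escapes such as \\t in raw with the real character."""
    # Match one backslash followed by one of the known escape letters and
    # look up the replacement; any other text is left untouched.
    return re.sub(r"\\([tnr\\])", lambda m: ESCAPES[m.group(1)], raw)


# The raw command-line value (backslash + t) becomes a single tab.
delimiter = unescape("\\t")
```

In the script from the question, this would be applied to options.delimiter right after parser.parse_args() and before the call to split.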
Another way is to solve the problem outside Python. When you call your script from the shell, type the delimiter argument as ^V followed by the normal Tab key (^V means "press Ctrl+V"). This will properly pass the tab character to your Python script.
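The callback option is a good way to handle tricky cases: add the delimiter option with action="callback" and a corresponding function (defined before the parser) that converts the value before it is stored.
You can check that this works from the command line:
python test.py -f test.txt -d \t
(No quotes around the \t are needed in a shell that passes backslashes through unchanged, such as the Windows command prompt. In a POSIX shell such as bash, an unquoted \t reaches the script as a plain t, so quote it as '\t' there.)
It has the advantage of handling the option via the optparse module, not via post-processing the parsing results.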
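A sketch of how the callback approach might be wired up, since the answer's own code is not shown here. The names unescape_delimiter and build_parser are hypothetical, and converting only \t is an assumption; the optparse pieces used (action="callback", the four-argument callback signature, and parser.values) are part of the standard module.

```python
from optparse import OptionParser


def unescape_delimiter(option, opt_str, value, parser):
    """optparse callback: store a real tab when the user typed backslash + t."""
    if value == "\\t":
        value = "\t"
    # A callback action does not store anything by itself, so the value is
    # written to the destination attribute explicitly.
    setattr(parser.values, option.dest, value)


def build_parser():
    parser = OptionParser()
    parser.add_option("-d", "--delimiter", action="callback", type="string",
                      callback=unescape_delimiter, dest="delimiter",
                      default="\t")
    parser.add_option("-f", dest="filename")
    return parser


# Simulate: test.py -f test.txt -d '\t' (the script receives backslash + t).
options, args = build_parser().parse_args(["-f", "test.txt", "-d", "\\t"])
```

After parsing, options.delimiter holds a single tab character, and any other delimiter value is stored unchanged.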