Passing meta-characters to Python as arguments from the command line

Posted on 2024-11-02 23:59:38

I'm making a Python program that will parse the fields in some input lines. I'd like to let the user enter the field separator as an option from the command line. I'm using optparse to do this. I'm running into the problem that entering something like \t will separate literally on \t, rather than on a tab, which is what I want. I'm pretty sure this is a Python thing and not the shell, since I've tried every combo of quotes, backslashes, and t's that I can think of.

If I could get optparse to let the argument be plain input (is there such a thing?) rather than raw_input, I think that would work. But I have no clue how to do that.

I've also tried various substitutions and regex tricks to turn the string from the two character "\t" into the one character tab, but without success.

Example, where input.txt is:

field 1[tab]field\t2

(Note: [tab] is a tab character and field\t2 is an 8 character string)

parseme.py:

#!/usr/bin/python
from optparse import OptionParser  
parser = OptionParser()  
parser.add_option("-d", "--delimiter", action="store", type="string",  
    dest="delimiter", default='\t')  
parser.add_option("-f", dest="filename")  
(options, args) = parser.parse_args()  
Infile = open(options.filename, 'r')  
Line = Infile.readline()  

Fields = Line.split(options.delimiter)  
print Fields[0]  
print options.delimiter  

Infile.close()  

This gives me:

$ parseme.py -f input.txt  
field 1  
[tab]

Hey, great, the default setting worked properly. (Yes, I know I could just make \t the default and forget about it, but I'd like to know how to deal with this type of problem.)

$ parseme.py -f input.txt -d '\t'  
field 1[tab]field  
\t

This is not what I want.

蹲墙角沉默 2024-11-09 23:59:38
>>> r'\t\n\v\r'.decode('string-escape')
'\t\n\x0b\r'
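
Note that the 'string-escape' codec only exists in Python 2. As a rough sketch of the same idea on Python 3, one option is to go through the unicode_escape codec. The helper name unescape below is made up for illustration and is not part of optparse or of the answer above:

```python
def unescape(raw):
    """Turn backslash escapes typed on the command line into real characters."""
    # Hypothetical helper (an assumption, not from the original answer).
    # Encoding with backslashreplace keeps non-ASCII characters intact:
    # they become \x.. or \u.... escapes, which unicode_escape decodes back.
    return raw.encode("ascii", "backslashreplace").decode("unicode_escape")


print(repr(unescape(r"\t\n\v\r")))  # prints '\t\n\x0b\r'
```

An input that is not a valid escape sequence (a lone trailing backslash, for example) raises UnicodeDecodeError, so you may want to catch that and fall back to the raw string.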
弄潮 2024-11-09 23:59:38

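The quick and dirty way is to eval it, like this:

eval(options.delimiter, {}, {})

The extra empty dicts are there to prevent accidental clobbering of your program. Note that this only works if the argument is itself a quoted Python string literal (for example -d "'\t'" from the shell); a bare \t is not a valid expression and eval raises a SyntaxError.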
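
The empty dicts keep eval away from your own variables, but they do not stop it from running arbitrary expressions. If that worries you, a possible alternative is ast.literal_eval, which only accepts literals. The sketch below is an assumption, not something from the answer above; parse_delimiter is a hypothetical name, and it wraps the raw argument in quotes so the user does not have to:

```python
import ast


def parse_delimiter(raw):
    """Interpret backslash escapes in raw; fall back to raw if that fails."""
    # Hypothetical helper. Escape double quotes, then wrap the text in
    # double quotes so literal_eval sees a single string literal.
    literal = '"' + raw.replace('"', r'\"') + '"'
    try:
        value = ast.literal_eval(literal)
    except (SyntaxError, ValueError):
        # e.g. a lone trailing backslash: keep the argument as typed
        return raw
    return value if isinstance(value, str) else raw


print(repr(parse_delimiter(r"\t")))  # prints '\t'
```

Plain delimiters such as a comma pass through unchanged, since they are already valid string contents.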

失去的东西太少 2024-11-09 23:59:38

Solving it from within your script (this needs import re):

options.delimiter = re.sub("\\\\t","\t",options.delimiter)

You can adapt the regex above to match more escaped characters (\n, \r, etc.).
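
As a sketch of what "more escaped chars" could look like, one regex plus a lookup table handles several escapes in a single pass. The names ESCAPES and unescape_delimiter are made up for this example, and the set of supported escapes is an arbitrary choice:

```python
import re

# Assumed set of escapes to support; extend the table as needed.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}

# A backslash followed by one of the supported escape letters.
_ESCAPE_RE = re.compile(r"\\([tnr\\])")


def unescape_delimiter(raw):
    """Replace \\t, \\n, \\r and \\\\ in raw with the characters they name."""
    return _ESCAPE_RE.sub(lambda m: ESCAPES[m.group(1)], raw)


print(repr(unescape_delimiter(r"\t")))  # prints '\t'
```

Including the doubled backslash in the table means a user can still ask for a literal backslash followed by t by typing \\t.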

Another way is to solve the problem outside Python. When you call your script from the shell, do it like this:

parseme.py -f input.txt -d '^V<tab>'

Here ^V means "press Ctrl+V" and <tab> means pressing the normal Tab key right after it. This passes a literal tab character to your Python script.

梦里梦着梦中梦 2024-11-09 23:59:38

The callback option is a good way to handle tricky cases:

parser.add_option("-d", "--delimiter", action="callback", type="string",
                  callback=my_callback, default='\t')

with the corresponding function (which has to be defined before the parser is set up):

def my_callback(option, opt, value, parser):
    val = value
    if value == '\\t':
        val = '\t'
    elif value == '\\n':
        val = '\n'
    parser.values.delimiter = val

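You can check that this works from the command line: python test.py -f test.txt -d '\t'. Whether the quotes are needed depends on the shell: in a POSIX shell such as bash, an unquoted \t reaches the script as a plain t, so keep the quotes there.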

It has the advantage of handling the option via the 'optparse' module, not via post-processing the parsing results.
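
For reference, here is a self-contained sketch of the callback approach that can be run as is. It is written for Python 3, uses a lookup table instead of the if/elif chain, and passes an explicit argument list to parse_args so it does not depend on the real command line. The names ESCAPES, unescape_callback and build_parser are assumptions for this example:

```python
from optparse import OptionParser

# Assumed mapping from the two-character text the shell passes in
# to the single character it stands for.
ESCAPES = {r"\t": "\t", r"\n": "\n", r"\r": "\r"}


def unescape_callback(option, opt_str, value, parser):
    # Store the translated value; unknown values are kept as typed.
    setattr(parser.values, option.dest, ESCAPES.get(value, value))


def build_parser():
    parser = OptionParser()
    parser.add_option("-d", "--delimiter", action="callback", type="string",
                      dest="delimiter", callback=unescape_callback,
                      default="\t")
    parser.add_option("-f", dest="filename")
    return parser


# Simulates: parseme.py -f input.txt -d '\t'
options, args = build_parser().parse_args(["-f", "input.txt", "-d", r"\t"])
line = "field 1\tfield\\t2"
print(line.split(options.delimiter)[0])  # prints field 1
```

With this in place, the example from the question splits on the real tab and leaves the literal \t inside the second field alone.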
