Why does '{x[1:3]}'.format(x="asd") cause a TypeError?

Posted 2025-01-20 06:38:46

Consider this:

>>> '{x[1]}'.format(x="asd")
's'
>>> '{x[1:3]}'.format(x="asd")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers

What could be the cause for this behavior?

Comments (2)

疑心病 2025-01-27 06:38:46

An experiment based on your comment, checking what value the object's __getitem__ method actually receives:

class C:
    def __getitem__(self, index):
        print(repr(index))

'{c[4]}'.format(c=C())
'{c[4:6]}'.format(c=C())
'{c[anything goes!@#$%^&]}'.format(c=C())
C()[4:6]

Output:

4
'4:6'
'anything goes!@#$%^&'
slice(4, 6, None)

So while the 4 gets converted to an int, the 4:6 isn't converted to slice(4, 6, None) as in usual slicing. Instead, it remains simply the string '4:6'. And that's not a valid type for indexing/slicing a string, hence the TypeError: string indices must be integers you got.
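If the goal is simply to get the slice into the output, two common workarounds are to slice before calling format, or to use an f-string, which evaluates a real Python expression. The snippet below is a minimal sketch; the variable names are only illustrative.

```python
x = "asd"

# Slice first, then pass the already-sliced value to format.
sliced_first = '{}'.format(x[1:3])

# An f-string evaluates the expression as ordinary Python, so slicing works.
via_fstring = f'{x[1:3]}'

# Inside a str.format field, '1:3' reaches __getitem__ as a plain string.
try:
    '{x[1:3]}'.format(x=x)
    error = None
except TypeError as exc:
    error = exc

print(sliced_first, via_fstring, type(error).__name__)
```

Both workarounds produce the same text; only the str.format field lookup fails.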

Update:

Is that documented? Well... I don't see anything really clear, but @GACy20 pointed out something subtle. The grammar has these rules:

field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
element_index     ::=  digit+ | index_string
index_string      ::=  <any source character except "]"> +

Our c[4:6] is the field_name, and we're interested in the element_index part 4:6. I think it would be clearer if digit+ had its own rule with a meaningful name:

field_name        ::=  arg_name ("." attribute_name | "[" element_index "]")*
element_index     ::=  index_integer | index_string
index_integer     ::=  digit+
index_string      ::=  <any source character except "]"> +

I'd say having index_integer and index_string would more clearly indicate that digit+ is converted to an integer (instead of staying a digit string), while <any source character except "]"> + would stay a string.

That said, looking at the rules as they are, perhaps we should think "what would be the point of separating the digits case out of the any-characters case, which would match it as well?" and conclude that the point is to treat pure digits differently, presumably to convert them to an integer. Or maybe some other part of the documentation even states that digit+ in general gets converted to an integer.
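The digit+ versus index_string split can be observed directly with a dict that has both an int key and a str key. This is a small sketch with made-up keys; it also shows that an index like -1 is not pure digits, so it stays a string.

```python
d = {1: 'int key', '1': 'str key', 'a': 'letter key'}

# Pure digits match digit+ and are converted to the int 1.
digits = '{d[1]}'.format(d=d)

# Anything else matches index_string and is passed as the str 'a'.
letters = '{d[a]}'.format(d=d)

# '-1' contains a non-digit character, so it is passed as the string '-1',
# which a str cannot be indexed with.
try:
    '{x[-1]}'.format(x="asd")
    negative_error = None
except TypeError as exc:
    negative_error = exc

print(digits, letters, type(negative_error).__name__)
```

One consequence is that the '1' string key of the dict cannot be reached from a format field at all.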

故事还在继续 2025-01-27 06:38:46

In '{x[1]}'.format(x="asd"), the [1] syntax is not the "normal" string indexing syntax, even if in this case it appears to work the same way.

It uses the field_name part of the Format String Syntax (not the Format Specification Mini-Language, which only covers what comes after the colon). This is the same mechanism that allows passing objects and accessing an arbitrary attribute inside the format string (e.g. '{x.name}'.format(x=some_object)).

This "fake" indexing syntax also lets you pass indexable objects to format and get the element you want directly from within the format string:

'{x[0]}'.format(x=('a', 'tuple'))
# 'a'
'{x[1]}'.format(x=('a', 'tuple'))
# 'tuple'
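Attribute and index lookups can also be chained in one field. The sketch below uses types.SimpleNamespace as a stand-in for some_object; the attribute names are assumptions made up for illustration.

```python
from types import SimpleNamespace

# Hypothetical object with one plain attribute and one indexable attribute.
some_object = SimpleNamespace(name='widget', tags=('a', 'tuple'))

# '.name' is resolved with getattr().
by_attribute = '{x.name}'.format(x=some_object)

# '.tags' uses getattr(), then '[1]' calls __getitem__ with the int 1.
chained = '{x.tags[1]}'.format(x=some_object)

print(by_attribute, chained)
```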

The only reference (that I could find, at least) for this in the docs is this paragraph:

The field_name itself begins with an arg_name that is either a number or a keyword. If it’s a number, it refers to a positional argument, and if it’s a keyword, it refers to a named keyword argument. If the numerical arg_names in a format string are 0, 1, 2, … in sequence, they can all be omitted (not just some) and the numbers 0, 1, 2, … will be automatically inserted in that order. Because arg_name is not quote-delimited, it is not possible to specify arbitrary dictionary keys (e.g., the strings '10' or ':-]') within a format string. The arg_name can be followed by any number of index or attribute expressions. An expression of the form '.name' selects the named attribute using getattr(), while an expression of the form '[index]' does an index lookup using __getitem__().

While it mentions

while an expression of the form '[index]' does an index lookup using __getitem__().

it does not mention anything about slicing syntax not being supported.

For me this feels like an oversight in the docs, especially because '{x[1:3]}'.format(x="asd") generates such a cryptic error message, and even more so due to __getitem__ already supporting slicing.
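If slice-like fields are really wanted, one possible workaround is a wrapper object that interprets the string it receives. SliceAware below is a hypothetical helper, not part of the standard library, and only a sketch: it assumes integer bounds and does no validation beyond what int() and slice() do.

```python
class SliceAware:
    """Hypothetical wrapper that turns 'start:stop[:step]' strings into slices."""

    def __init__(self, obj):
        self._obj = obj

    def __getitem__(self, index):
        # str.format passes non-digit indices through as plain strings,
        # so a slice arrives here as text such as '1:3'.
        if isinstance(index, str) and ':' in index:
            parts = [int(p) if p.strip() else None for p in index.split(':')]
            index = slice(*parts)
        return self._obj[index]


middle = '{x[1:3]}'.format(x=SliceAware("asd"))
head = '{x[:2]}'.format(x=SliceAware("asd"))
single = '{x[1]}'.format(x=SliceAware("asd"))

print(middle, head, single)
```

Slicing before the call or using an f-string is usually simpler; the wrapper mainly illustrates that the field machinery hands the raw text to __getitem__.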
