
How do I use Python's built-in slice object?

Posted 2024-09-27 05:15:24, 104 words, 6 views, 0 comments

I know the Pythonic slicing notation: l1[start:stop:step].

What is the built-in function slice for?
How can I use it?


谢绝鈎搭 2024-10-04 05:15:24

You create a slice object by calling slice() with the same values you would use in [start:stop:step] notation:

sl = slice(0, 4)    # same as [0:4]

To use the slice, pass it as the index into a list or string:

>>> s = "ABCDEFGHIJKL"
>>> sl = slice(0, 4)
>>> print(s[sl])
ABCD
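
As a small illustration of why a stored slice can be handy, here is a sketch that names two slices and reuses them on a string, a list, and a tuple. The names FIRST_FOUR and REVERSED are made up for this example and are not part of the original answer.

```python
# Hypothetical named slices; the names are illustrative only.
FIRST_FOUR = slice(0, 4)            # same as [0:4]
REVERSED = slice(None, None, -1)    # same as [::-1]; None means "omitted"

text = "ABCDEFGHIJKL"
numbers = list(range(10))

# The same slice object works on any sequence type.
first_text = text[FIRST_FOUR]
first_numbers = numbers[FIRST_FOUR]
reversed_triple = ("a", "b", "c")[REVERSED]

print(first_text)         # ABCD
print(first_numbers)      # [0, 1, 2, 3]
print(reversed_triple)    # ('c', 'b', 'a')
print(FIRST_FOUR.start, FIRST_FOUR.stop, FIRST_FOUR.step)    # 0 4 None
```

Giving a slice a name documents what the numbers mean, which plain [0:4] scattered through the code does not.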

Say you have a file of fixed-width text fields. You can define a list of slices to easily extract the values from each "record" in this file.

data = """\
0010GEORGE JETSON    12345 SPACESHIP ST   HOUSTON       TX
0020WILE E COYOTE    312 ACME BLVD        TUCSON        AZ
0030FRED FLINTSTONE  246 GRANITE LANE     BEDROCK       CA
0040JONNY QUEST      31416 SCIENCE AVE    PALO ALTO     CA""".splitlines()


# One slice per fixed-width column, given as (start, stop) character offsets.
fieldslices = [slice(*fielddef) for fielddef in [
    (0, 4), (4, 21), (21, 42), (42, 56), (56, 58),
    ]]
fields = "id name address city state".split()

for rec in data:
    for field, sl in zip(fields, fieldslices):
        print("{} : {}".format(field, rec[sl]))
    print('')

# Or the same code using itemgetter, to make a function that
# extracts all slices from a string into a tuple of values.
import operator
rec_reader = operator.itemgetter(*fieldslices)
for rec in data:
    for field, field_value in zip(fields, rec_reader(rec)):
        print("{} : {}".format(field, field_value))
    print('')

Each of the two loops prints:

id : 0010
name : GEORGE JETSON    
address : 12345 SPACESHIP ST   
city : HOUSTON       
state : TX

id : 0020
name : WILE E COYOTE    
address : 312 ACME BLVD        
city : TUCSON        
state : AZ

id : 0030
name : FRED FLINTSTONE  
address : 246 GRANITE LANE     
city : BEDROCK       
state : CA

id : 0040
name : JONNY QUEST      
address : 31416 SCIENCE AVE    
city : PALO ALTO     
state : CA
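
If the layout is known as field widths rather than offsets, the slices can be derived instead of typed by hand. The following is only a sketch under that assumption: FIELD_WIDTHS, make_field_slices and parse_record are hypothetical names, and the widths are the ones implied by the offsets used above.

```python
from itertools import accumulate

# Hypothetical layout for the sample records: (field name, width in characters).
FIELD_WIDTHS = [("id", 4), ("name", 17), ("address", 21), ("city", 14), ("state", 2)]


def make_field_slices(field_widths):
    """Turn (name, width) pairs into a {name: slice} mapping."""
    widths = [width for _, width in field_widths]
    stops = list(accumulate(widths))    # running end offsets: 4, 21, 42, 56, 58
    starts = [0] + stops[:-1]           # each field starts where the previous one ended
    return {
        name: slice(start, stop)
        for (name, _), start, stop in zip(field_widths, starts, stops)
    }


def parse_record(line, field_slices):
    """Extract every field from one fixed-width line, trimming the padding."""
    return {name: line[sl].strip() for name, sl in field_slices.items()}


FIELD_SLICES = make_field_slices(FIELD_WIDTHS)
record = parse_record(
    "0010GEORGE JETSON    12345 SPACESHIP ST   HOUSTON       TX", FIELD_SLICES
)
print(record)
```

Keeping the slices in a dict keyed by field name means a change to the record layout only touches FIELD_WIDTHS.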


紫罗兰の梦幻 2024-10-04 05:15:24

Square brackets following a sequence denote either indexing or slicing depending on what's inside the brackets:

>>> "Python rocks"[1]    # index
'y'
>>> "Python rocks"[1:10:2]    # slice
'yhnrc'

Both of these cases are handled by the sequence's __getitem__() method (or __setitem__() when the expression is on the left of an equals sign). The index or slice is passed to the method as a single argument: Python converts the slice notation (1:10:2 in this case) into a slice object, slice(1, 10, 2).

So if you are defining your own sequence-like class or overriding the __getitem__ or __setitem__ or __delitem__ methods of another class, you need to test the index argument to determine if it is an int or a slice, and process accordingly:

def __getitem__(self, index):
    if isinstance(index, int):
        ...    # process index as an integer
    elif isinstance(index, slice):
        start, stop, step = index.indices(len(self))    # index is a slice
        ...    # process slice
    else:
        raise TypeError("index must be int or slice")

A slice object has three attributes, start, stop and step, and one method, indices, which takes a single argument (the length of the object) and returns a 3-tuple (start, stop, step) with omitted or out-of-range values adjusted to fit that length.
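
To make the skeleton above concrete, here is a minimal sketch of a sequence-like class that handles both cases. The Countdown class is invented for this illustration; it is not from any library.

```python
class Countdown:
    """Hypothetical read-only sequence holding n, n-1, ..., 1."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        if isinstance(index, slice):
            # indices() fills in defaults and clamps to this length,
            # so the result can be passed straight to range().
            start, stop, step = index.indices(len(self))
            return [self._n - i for i in range(start, stop, step)]
        if isinstance(index, int):
            if index < 0:                   # support negative indexes like a list
                index += len(self)
            if not 0 <= index < len(self):
                raise IndexError("Countdown index out of range")
            return self._n - index
        raise TypeError("index must be int or slice")


c = Countdown(5)                            # represents 5, 4, 3, 2, 1
print(c[0], c[-1])                          # 5 1
print(c[1:4])                               # [4, 3, 2]
print(c[::-1])                              # [1, 2, 3, 4, 5]
print(slice(None, None, -1).indices(5))     # (4, -1, -1)
```

The last line shows what indices does with omitted values: for a reversed slice over five items it yields a start of 4 and a stop of -1, which is exactly what range needs.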


三生路 2024-10-04 05:15:24

>>> class sl:
...     def __getitem__(self, *keys): print(keys)
...
>>> s = sl()
>>> s[1:3:5]
(slice(1, 3, 5),)
>>> s[1:2:3, 1, 4:5]
((slice(1, 2, 3), 1, slice(4, 5, None)),)
>>>


好倦 2024-10-04 05:15:24

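The slice function returns slice objects. Slice objects are one of Python's internal types and are optimized for read performance: all of their attributes are read-only.

Altering slice handling can be useful if you want to change the default behaviour. For example, lxml uses slice notation to access DOM elements (however, I haven't confirmed for myself how they did that).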
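
The read-only behaviour is easy to check. This sketch tries to reassign an attribute, catches the resulting AttributeError, and then builds a modified copy instead; the variable names are illustrative only.

```python
sl = slice(1, 10, 2)
error_message = None

try:
    sl.start = 0                    # attributes of a slice cannot be reassigned
except AttributeError as exc:
    error_message = str(exc)
    print("AttributeError:", error_message)

# Build a modified copy instead of mutating the original.
shifted = slice(sl.start + 1, sl.stop, sl.step)
print(shifted)                      # slice(2, 10, 2)
```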


热情消退 2024-10-04 05:15:24

While trying to answer the question "Subset a string based on variable", I recalled that numpy has a syntactically nice way to define slice objects:

>>> import numpy as np
>>> s = "The long-string instrument is a musical instrument in which the string is of such a length that the fundamental transverse wave is below what a person can hear as a tone."
>>> z = np.s_[18:26]  # in this case same as slice(18, 26, None)
>>> s[z]
'strument'

The problem solved here is how to store a slice in a variable for later use, and np.s_ lets you do just that. Yes, it's not built in, but since that original question was redirected here, I feel my answer belongs here as well. Also, numpy was one of the reasons such advanced slicing abilities were added to Python, IIRC.

An example of a more complex "slicing":

>>> data = np.array(range(6)).reshape((2, 3))
>>> z = np.s_[:1, 1:2]
>>> data[z]
array([[1]])
>>> data
array([[0, 1, 2],
       [3, 4, 5]])
>>> z
(slice(None, 1, None), slice(1, 2, None))

where z is now a tuple of slices.
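
For readers without numpy, the same trick can be sketched in a few lines of plain Python: an object whose __getitem__ simply returns its key. SliceMaker is a hypothetical stand-in written for this example; it is not how numpy implements np.s_.

```python
class SliceMaker:
    """Hypothetical helper: returns whatever key is written inside the brackets."""

    def __getitem__(self, key):
        return key


s_ = SliceMaker()    # named after np.s_, but this is not numpy's implementation

single = s_[18:26]
multi = s_[:1, 1:2]

print(single)        # slice(18, 26, None)
print(multi)         # (slice(None, 1, None), slice(1, 2, None))

text = "The long-string instrument"
print(text[single])  # strument
```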


梦里人 2024-10-04 05:15:24

Slice objects let you programmatically generate and manipulate slices. With multidimensional numpy arrays in particular, and especially if you don't know the dimensionality in advance, you might want to construct slices on the fly to specify the axes or dimensions that you want.

import numpy as np
dimension = np.random.randint(1, 5)  # 1 to 4 dimensions, kept low so the array stays small
shape = []
for d in range(dimension):
    shape.append(np.random.randint(1, 10))  # each axis gets 1 to 9 entries
zz = np.random.rand(*shape)  # rand takes the dimensions as separate arguments
print(repr(zz))
# One run printed:
# array([[0.68379351, 0.50854469, 0.64578775, 0.73441699, 0.28977396],
#        [0.88797164, 0.81603025, 0.63978659, 0.22677299, 0.93455738],
#        [0.0892855 , 0.28048706, 0.04262895, 0.9353467 , 0.13062249],
#        [0.88561035, 0.93378367, 0.12124208, 0.25600301, 0.96035638]])

Here our data ended up being two dimensional (4-by-5), but there was no guarantee of that. How will you request slices from zz?

One problem is that I can't manipulate Python's slice notation. It's not valid syntax outside of a slicing operation.

my_slice = 2:3:1
# SyntaxError: invalid syntax

What if I could just build up the exact slice request I wanted in a loop, the way I can build up a string? Wouldn't that be great? I mean, sure you can use a string to do it, but it would be messy and requires eval.

your_slice_definitions = [(2, 3, 1), *[(None, None, None)] * (zz.ndim - 1)]
my_slice_str = ""
for slice_start, slice_end, slice_step in your_slice_definitions:
    my_slice_str += "{}:{}:{},".format(slice_start, slice_end, slice_step)
eval("zz[" + my_slice_str + "]")    # for 2-D data this evaluates zz[2:3:1,None:None:None,]

So here we are: slice objects let you do this. You can assemble lists and tuples of them on-the-fly, pass them as function parameters, sort them, shuffle them, and so on.

my_slices = []
for slice_start, slice_end, slice_step in your_slice_definitions:
    my_slices.append(slice(slice_start, slice_end, slice_step))
print(repr(zz[tuple(my_slices)]))    # numpy expects a tuple of slices, not a list
# For the 4-by-5 run shown earlier:
# array([[0.0892855 , 0.28048706, 0.04262895, 0.9353467 , 0.13062249]])
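
The same idea can be packaged as a small helper that slices one chosen axis and leaves every other axis untouched. This is a sketch in plain Python so it runs without numpy; axis_index is a hypothetical name, not a numpy function.

```python
def axis_index(ndim, axis, sl):
    """Build an index tuple that applies `sl` on one axis and ':' on the rest.

    Hypothetical helper for illustration; `ndim` is the number of dimensions.
    """
    if not -ndim <= axis < ndim:
        raise ValueError("axis {} is out of range for {} dimensions".format(axis, ndim))
    index = [slice(None)] * ndim    # slice(None) is the object form of ':'
    index[axis] = sl
    return tuple(index)


print(axis_index(2, 0, slice(2, 3, 1)))
# (slice(2, 3, 1), slice(None, None, None))
print(axis_index(4, -1, slice(None, None, 2)))
```

With a numpy array this would be used as zz[axis_index(zz.ndim, 0, slice(2, 3, 1))], which selects the same rows as the loop above regardless of how many dimensions zz turned out to have.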
