Pythonic way to verify that a parameter is a sequence but not a string

Posted 2024-10-03 09:55:20

I have a function that gets a list of DB tables as parameter, and returns a command string to be executed on these tables, e.g.:

pg_dump( file='/tmp/dump.sql',
         tables=('stack', 'overflow'),
         port=5434,
         name='europe')

Should return something like:

pg_dump -t stack -t overflow -f /tmp/dump.sql -p 5434 europe

This is done using tables_string='-t '+' -t '.join(tables).

The fun begins when the function is called with: tables=('stackoverflow') (a string) instead of tables=('stackoverflow',) (a tuple), which yields:

pg_dump -t s -t t -t a -t c -t k -t o -t v -t e -t r -t f -t l -t o -t w
        -f /tmp/dump.sql -p 5434 europe

Because the string itself is being iterated.
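
The pitfall can be reproduced in isolation. The following is a minimal sketch, where the helper name tables_flags is hypothetical and simply wraps the join expression from the question:

```python
def tables_flags(tables):
    # Same expression as in the question: one -t flag per item yielded by `tables`.
    return '-t ' + ' -t '.join(tables)

# Parentheses alone do not create a tuple; the trailing comma does.
print(type(('stackoverflow')).__name__)
print(type(('stackoverflow',)).__name__)

print(tables_flags(('stack', 'overflow')))  # two table names
print(tables_flags(('abc')))                # a plain string: join walks its characters
print(tables_flags(('abc',)))               # a one-element tuple: a single table name
```

Because str.join accepts any iterable of strings, the string case raises no error and silently produces one flag per character.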

This SO question suggests using asserts on the type, but I'm not sure it's Pythonic enough because it breaks the duck-type convention.

Any insights?

Adam

君勿笑 2024-10-10 09:55:20

Asserting the type seems appropriate in this case - handling a common misuse that seems legal because of duck typing.

Another way to handle this common case would be to test for string and handle it correctly as a special case.
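
As a rough illustration of the special-case approach, the sketch below wraps a lone string in a list before building the command. The function name pg_dump_command and its signature are assumptions made for this example (the question does not show the real function), and no shell quoting is attempted:

```python
def pg_dump_command(file, tables, port, name):
    """Build a pg_dump command line (hypothetical helper mirroring the question)."""
    # Special case: a lone string is treated as a single table name.
    if isinstance(tables, str):
        tables = [tables]
    parts = ['pg_dump']
    for table in tables:
        parts += ['-t', table]
    parts += ['-f', file, '-p', str(port), name]
    return ' '.join(parts)

print(pg_dump_command('/tmp/dump.sql', ('stack', 'overflow'), 5434, 'europe'))
print(pg_dump_command('/tmp/dump.sql', 'stackoverflow', 5434, 'europe'))
```

With this check, tables='stackoverflow' and tables=('stackoverflow',) produce the same command.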

Finally, you could encourage passing the table names as positional parameters which would make this scenario less likely:

def pg_dump(*tables, **kwargs):
  file = kwargs['file']
  port = kwargs['port']
  name = kwargs['name']
  ...

pg_dump('stack', 'overflow', file='/tmp/dump.sql', port=5434, name='europe')
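
On Python 3 the same idea could be written with keyword-only parameters instead of **kwargs. This is a sketch under that assumption; the body that formats the command is illustrative and not taken from the answer:

```python
def pg_dump(*tables, file, port, name):
    # Parameters after *tables are keyword-only, so a missing or misspelled
    # option raises TypeError at the call site rather than KeyError later.
    flags = ' '.join('-t ' + table for table in tables)
    return 'pg_dump {} -f {} -p {} {}'.format(flags, file, port, name)

print(pg_dump('stack', 'overflow', file='/tmp/dump.sql', port=5434, name='europe'))
print(pg_dump('stackoverflow', file='/tmp/dump.sql', port=5434, name='europe'))
```

A single table name now needs neither parentheses nor a trailing comma, which removes the original source of confusion.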

转角预定愛 2024-10-10 09:55:20

You can use ABCs to assert that an object is iterable but not a string:

from collections.abc import Iterable  # the ABCs live in collections.abc on Python 3

# On Python 2, test against basestring instead of (str, bytes).
assert isinstance(x, Iterable) and not isinstance(x, (str, bytes))
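
Since assert statements are skipped when Python runs with -O, the same check could instead raise an explicit exception. A minimal Python 3 sketch, where the function name require_table_names is hypothetical:

```python
from collections.abc import Iterable

def require_table_names(tables):
    """Return `tables` as a list, or raise TypeError for a string or non-iterable."""
    if isinstance(tables, (str, bytes)) or not isinstance(tables, Iterable):
        raise TypeError(
            'tables must be an iterable of names, not {}'.format(type(tables).__name__))
    return list(tables)

print(require_table_names(('stack', 'overflow')))
try:
    require_table_names('stackoverflow')
except TypeError as exc:
    print('rejected:', exc)
```

Rejecting the string outright, rather than silently wrapping it, keeps the mistake visible to the caller.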

深陷 2024-10-10 09:55:20

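A common Python 2 idiom to detect whether an argument is a sequence (a list or tuple) rather than a string is to check whether it has the __iter__ attribute. This works only because Python 2's str lacks __iter__; on Python 3, str defines __iter__, so all three calls below report the attribute and an isinstance check against str is needed instead:

def func(arg):
    # print(...) with a single formatted argument runs on both Python 2 and 3.
    if hasattr(arg, '__iter__'):
        print('%r has __iter__ attribute' % (arg,))
    else:
        print('%r has no __iter__ attribute' % (arg,))

# The outputs shown are for Python 2.
func('abc')
# 'abc' has no __iter__ attribute

func(('abc'))
# 'abc' has no __iter__ attribute

func(('abc',))
# ('abc',) has __iter__ attribute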

When it's not a sequence, it's also common to change it into one to simplify the rest of the code (which only has to deal with one kind of thing). In the sample it could have been done with a simple arg = [arg].
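
On Python 3, where str does have __iter__, the normalization step could be sketched with isinstance instead. The helper name as_list is an assumption made for this example:

```python
from collections.abc import Iterable

def as_list(arg):
    """Wrap a lone value in a list; convert any other iterable to a list."""
    if isinstance(arg, (str, bytes)) or not isinstance(arg, Iterable):
        return [arg]
    return list(arg)

# On Python 3 the attribute check cannot single out strings.
print(hasattr('abc', '__iter__'))

print(as_list('abc'))      # a lone string becomes a one-element list
print(as_list(('abc',)))   # a tuple is converted to a list
print(as_list(5434))       # a non-iterable value is wrapped as well
```

The rest of the function then only has to deal with lists.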

你的他你的她 2024-10-10 09:55:20

Can you not use a list rather than a tuple?

pg_dump( file='/tmp/dump.sql',
         tables=['stack', 'overflow'],
         port=5434,
         name='europe')

夕嗳→ 2024-10-10 09:55:20

The cleanest way I can think of is to create a new abstract collection type, NonStrSequence, by overriding __subclasshook__. See the implementation and tests below:

  from abc import ABC
  from collections.abc import Sequence

  class NonStrSequence(ABC):
      @classmethod
      def __subclasshook__(cls, C):
          # Exclude str and the byte-string types explicitly
          # (typing.ByteString is deprecated; not possible to do with AnyStr).
          if issubclass(C, (str, bytes, bytearray)):
              # Defer to the default ABC check, which finds no registration.
              return NotImplemented
          return issubclass(C, Sequence)

  tests = {
      'list_of_strs': ['b', 'c'],
      'str': 'abc',
      'bytes': b'bytes',
      'tuple': ([1, 2], 'a'),
      'str_in_parens': ('a'),  # Not a tuple
      'str_in_tuple': ('a',),
  }
  # Compare the plain Sequence ABC with the string-excluding variant.
  for abc_type in [Sequence, NonStrSequence]:
      for k, v in tests.items():
          print(f'{k}: isinstance({v}, {abc_type}): {isinstance(v, abc_type)}')
