Return the lowest index of the first non-whitespace character in a Python string

Posted on 2024-08-23 23:53:05

What's the shortest way to do this in Python?

string = "   xyz"

must return index = 3

Comments (7)

九局 2024-08-30 23:53:05
>>> s = "   xyz"
>>> len(s) - len(s.lstrip())
3
情深如许 2024-08-30 23:53:05
>>> next(i for i, j in enumerate('   xyz') if j.strip())
3

or

>>> import string
>>> next(i for i, j in enumerate('   xyz') if j not in string.whitespace)
3

In versions of Python < 2.6, which lack the built-in next(), you'll have to do:

(...).next()
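
One caveat with the generator approach: on an empty or all-whitespace string, next() raises StopIteration. A minimal sketch of one way around that, assuming Python 2.6+ (where the built-in next() accepts a default value). The helper name first_non_ws is hypothetical, not part of the original answer:

```python
def first_non_ws(s):
    """Return the index of the first non-whitespace character, or None."""
    # The second argument to next() is returned when the generator
    # is exhausted, i.e. when no non-whitespace character exists.
    return next((i for i, ch in enumerate(s) if not ch.isspace()), None)


print(first_non_ws("   xyz"))  # 3
print(first_non_ws("   "))     # None
```

Returning None rather than 0 or len(s) keeps "no such character" distinguishable from a valid index.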
就此别过 2024-08-30 23:53:05

Looks like the "regexes can do anything" brigade have taken the day off, so I'll fill in:

>>> tests = [u'foo', u' foo', u'\xA0foo']
>>> import re
>>> for test in tests:
...     print(len(re.match(r"\s*", test, re.UNICODE).group(0)))
...
0
1
1
>>>

FWIW: time taken is O(the_answer), not O(len(input_string))

呆° 2024-08-30 23:53:05

Many of the previous solutions iterate at several points, and some make copies of the data (the string). re.match(), strip(), enumerate() and isspace() all duplicate work behind the scenes. The expressions

next(idx for idx, ch in enumerate(s) if not ch.isspace())
next(idx for idx, ch in enumerate(s) if ch not in string.whitespace)

are good choices for testing strings against various leading whitespace types such as vertical tabs, but that adds cost too. (The second form needs import string, so the input is named s here to avoid shadowing the module.)

However, if your string uses only space or tab characters, then the following more basic solution is clear and fast, and also uses less memory.

def get_indent(astr):

    """Return index of first non-space character of a sequence else False."""

    # Reject inputs that cannot be indexed (e.g. None or an int)
    # instead of raising an exception.
    if not hasattr(astr, '__getitem__'):
        return False

    idx = 0
    while idx < len(astr) and astr[idx] == ' ':
        idx += 1
    # Empty or all-space input: there is no non-space character.
    if idx == len(astr):
        return False
    return idx

Although this may not be the absolute fastest or visually simplest approach, one benefit is that you can easily port it to other languages and Python versions. It is also likely the easiest to debug, as there is little magic behavior. If you put the body of the function inline in your code instead of in a function, you remove the function-call overhead, which makes this solution similar in bytecode to the other solutions.

Additionally, this solution allows for more variations, such as adding a test for tabs to the loop condition:

or astr[idx] == '\t':

Or you can test once whether the entire data is iterable instead of checking each line. Remember that ""[0] raises an exception whereas ""[0:] does not.
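
The tab variation mentioned above might look like the following. This is only a sketch: the name get_indent_ws and the indent_chars parameter are made up for illustration, and unlike get_indent it returns a plain count (the string's length for all-whitespace input) rather than False.

```python
def get_indent_ws(line, indent_chars=" \t"):
    """Return how many leading characters of `line` are in `indent_chars`."""
    idx = 0
    # Checking the length first means an empty string never gets indexed.
    while idx < len(line) and line[idx] in indent_chars:
        idx += 1
    return idx


print(get_indent_ws(" \t xyz"))  # 3
print(get_indent_ws(""))         # 0
```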

If you wanted to push the solution inline, you could go the non-Pythonic route:

s = "   xyz"
i = 0
while i < len(s) and s[i] == ' ': i += 1

print(i)
3


孤寂小茶 2024-08-30 23:53:05
import re
def prefix_length(s):
    m = re.match(r'(\s+)', s)
    if m:
        return len(m.group(0))
    return 0
2024-08-30 23:53:05
>>> string = "   xyz"
>>> next(idx for idx, chr in enumerate(string) if not chr.isspace())
3
天涯离梦残月幽梦 2024-08-30 23:53:05
>>> string = "   xyz"
>>> list(map(str.isspace, string)).index(False)
3
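
The answers in this thread agree on "   xyz" but differ when the string contains no non-whitespace character at all. A rough comparison, written for Python 3. The via_* function names are hypothetical wrappers around the techniques shown in the answers, not code from the answers themselves:

```python
import re


def via_lstrip(s):
    return len(s) - len(s.lstrip())


def via_regex(s):
    # \s* always matches (possibly the empty string), so .end() is safe.
    return re.match(r"\s*", s).end()


def via_next(s):
    return next(i for i, ch in enumerate(s) if not ch.isspace())


def via_index(s):
    return [ch.isspace() for ch in s].index(False)


results = {}
for name, func in [("lstrip", via_lstrip), ("regex", via_regex),
                   ("next", via_next), ("index", via_index)]:
    try:
        results[name] = func("   ")  # all-whitespace input
    except (StopIteration, ValueError) as exc:
        results[name] = type(exc).__name__
    print(name, results[name])
```

The lstrip and regex versions report the string's length (3 here), while the next() and index() versions raise StopIteration and ValueError respectively, so which one suits you depends on how you want that case handled.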