Python title() with apostrophes

Posted on 2024-12-16 18:17:33

Is there a way to use .title() to get the correct output from a title with apostrophes? For example:

"john's school".title() --> "John'S School"

How would I get the correct title here, "John's School"?

玩世 2024-12-23 18:17:33

If your titles do not contain several whitespace characters in a row (which would be collapsed), you can use string.capwords() instead:

>>> import string
>>> string.capwords("john's school")
"John's School"

EDIT: As Chris Morgan rightly points out, you can avoid the whitespace collapsing issue by passing " " as the sep argument:

>>> string.capwords("john's    school", " ")
"John's    School"

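To make the difference concrete, here is a small sketch comparing str.title() with string.capwords() on the question's input, including the whitespace behaviour described in the edit. The variable names are only illustrative.

```python
import string

text = "john's school"

# str.title() starts a new "word" after every non-letter, so the
# letter right after the apostrophe is capitalized as well.
print(text.title())            # John'S School

# string.capwords() splits on whitespace, capitalizes each piece,
# and joins the pieces with a single space.
print(string.capwords(text))   # John's School

spaced = "john's    school"
collapsed = string.capwords(spaced)       # runs of whitespace collapse
preserved = string.capwords(spaced, " ")  # an explicit sep keeps them
print(repr(collapsed))
print(repr(preserved))
```

With an explicit sep, the empty strings between consecutive spaces survive the split and the join, which is why the spacing is kept.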

旧城空念 2024-12-23 18:17:33

This is difficult in the general case, because some apostrophes are legitimately followed by an uppercase character, such as Irish names starting with "O'". string.capwords() will work in many cases, but it capitalizes only the first character of each whitespace-separated word, so a word that begins with a quote, or that follows punctuation without a space, is left lowercase. string.capwords("john's principal says,'no'") will not return the result you may be expecting.

>>> from string import capwords
>>> capwords("John's School")
"John's School"
>>> capwords("john's principal says,'no'")
"John's Principal Says,'no'"
>>> capwords("John O'brien's School")
"John O'brien's School"

A more annoying issue is that title casing itself does not produce the proper results. For example, in American English usage, articles and prepositions are generally not capitalized in titles or headlines (Chicago Manual of Style).

>>> capwords("John clears school of spiders")
'John Clears School Of Spiders'
>>> "John clears school of spiders".title()
'John Clears School Of Spiders'

You can install the titlecase module (for example with easy_install or pip), which will be much more useful to you and does what you want without capwords's issues. There are still many edge cases, of course, but you'll get much further than you would with a hand-written version.

>>> from titlecase import titlecase
>>> titlecase("John clears school of spiders")
'John Clears School of Spiders'
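For the "O'" names mentioned above, one possible fix-up is to run string.capwords() first and then re-capitalize the letter after a word-initial "O'". This is only a sketch: capwords_irish and _O_PREFIX are hypothetical names, and the rule will also turn "o'clock" into "O'Clock", so it needs a real exception list before serious use.

```python
import re
import string

# Hypothetical helper: capwords plus a fix-up for the "O'" name prefix.
_O_PREFIX = re.compile(r"\bO'([a-z])")

def capwords_irish(s):
    # Capitalize whitespace-separated words, keeping runs of spaces.
    out = string.capwords(s, " ")
    # Re-capitalize the letter that follows a word-initial "O'".
    return _O_PREFIX.sub(lambda m: "O'" + m.group(1).upper(), out)

print(capwords_irish("john o'brien's school"))  # John O'Brien's School
```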

乙白 2024-12-23 18:17:33

I think that can be tricky with title().

Let's try something different:

def titlize(s):
    # Capitalize each space-separated word; apostrophes are left alone.
    return ' '.join(word.capitalize() for word in s.split(' '))

titlize("john's school")
# -> "John's School"

Hope that helps!
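One thing to keep in mind with the titlize helper is that str.capitalize() also lowercases the rest of each word, so acronyms are flattened. A minimal variant, under the assumption that existing capitals should be kept, uppercases only the first character; titlize_keep_caps is a hypothetical name.

```python
def titlize_keep_caps(s):
    # Uppercase only the first character of each space-separated word
    # and leave the rest of the word untouched.
    return ' '.join(word[:1].upper() + word[1:] for word in s.split(' '))

print(titlize_keep_caps("john's school at MIT"))  # John's School At MIT
print("MIT".capitalize())                         # Mit
```

The word[:1] slice avoids an IndexError on the empty strings that split(' ') produces for consecutive spaces.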

今天小雨转甜 2024-12-23 18:17:33

Although the other answers are helpful and more concise, you may run into some problems with them, for example if there are newlines or tabs in your string. Hyphenated words (whether with regular or non-breaking hyphens) may also be a problem in some instances, as may words that begin with apostrophes. However, by using regular expressions (passing a function as the replacement argument of re.sub), you can solve these problems:

import re

def title_capitalize(match):
    # Uppercase the first character that is not an apostrophe; lowercase the rest.
    text = match.group()
    new_text = ""
    capitalized = False
    for ch in text:
        if ch not in {"’", "'"} and not capitalized:
            new_text += ch.upper()
            capitalized = True
        else:
            new_text += ch.lower()
    return new_text

def title(the_string):
    # Runs of word characters, apostrophes and hyphens count as one word.
    return re.sub(r"[\w'’‑-]+", title_capitalize, the_string)

s="here's an apostrophe es. this string has multiple         spaces\nnew\n\nlines\nhyphenated words: and non-breaking   spaces, and a non‑breaking hyphen, as well as 'ords that begin with ’strophies; it\teven\thas\t\ttabs."
print(title(s))

Anyway, you can edit this to compensate for any further problems, such as backticks and what-have-you, if needed.

If you're of the opinion that title casing should keep words such as prepositions, conjunctions and articles lowercase unless they're at the beginning or end of the title, you can try code like this (a few words are ambiguous and have to be resolved by context, such as "when"):

import re

lowers={'this', 'upon', 'altogether', 'whereunto', 'across', 'between', 'and', 'if', 'as', 'over', 'above', 'afore', 'inside', 'like', 'besides', 'on', 'atop', 'about', 'toward', 'by', 'these', 'for', 'into', 'beforehand', 'unlike', 'until', 'in', 'aft', 'onto', 'to', 'vs', 'amid', 'towards', 'afterwards', 'notwithstanding', 'unto', 'while', 'next', 'including', 'thru', 'a', 'down', 'after', 'with', 'afterward', 'or', 'those', 'but', 'whereas', 'versus', 'without', 'off', 'among', 'because', 'some', 'against', 'before', 'around', 'of', 'under', 'that', 'except', 'at', 'beneath', 'out', 'amongst', 'the', 'from', 'per', 'mid', 'behind', 'along', 'outside', 'beyond', 'up', 'past', 'through', 'beside', 'below', 'during'}

def title_capitalize(match, use_lowers=True):
    text = match.group()
    lower = text.lower()
    # Small words stay lowercase unless the caller overrides this.
    if use_lowers and lower in lowers:
        return lower
    new_text = ""
    capitalized = False
    for ch in text:
        # Uppercase the first character that is not an apostrophe.
        if ch not in {"’", "'"} and not capitalized:
            new_text += ch.upper()
            capitalized = True
        else:
            new_text += ch.lower()
    return new_text

def title(the_string):
    first = re.sub(r"[\w'’‑-]+", title_capitalize, the_string)
    # Re-capitalize the first and last words, even if they are small words.
    return re.sub(r"(^[\w'’‑-]+)|([\w'’‑-]+$)", lambda match: title_capitalize(match, use_lowers=False), first)
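As a rough illustration of the first-and-last-word rule, the sketch below does the same job in a single pass by indexing the matched words instead of anchoring on ^ and $. SMALL_WORDS is a deliberately shortened, hypothetical list, the word pattern is simplified, and small_word_title and cap are made-up names.

```python
import re

# Shortened, illustrative list; a fuller one would be used in practice.
SMALL_WORDS = {"a", "an", "and", "of", "the", "to", "in", "on"}
WORD = re.compile(r"[\w'’-]+")

def cap(word):
    # Uppercase the first character that is not an apostrophe.
    for i, ch in enumerate(word):
        if ch not in "'’":
            return word[:i] + ch.upper() + word[i + 1:].lower()
    return word

def small_word_title(s):
    words = list(WORD.finditer(s))
    last = len(words) - 1
    pieces, pos = [], 0
    for n, m in enumerate(words):
        pieces.append(s[pos:m.start()])  # keep separators untouched
        w = m.group()
        if 0 < n < last and w.lower() in SMALL_WORDS:
            pieces.append(w.lower())     # interior small word
        else:
            pieces.append(cap(w))        # first, last, or regular word
        pos = m.end()
    pieces.append(s[pos:])
    return "".join(pieces)

print(small_word_title("the return of john's school"))
print(small_word_title("what is it made of?"))
```

Counting words rather than anchoring on the end of the string means a final small word is still capitalized when the title ends in punctuation, which a $ anchor on the word pattern would miss.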

凉月流沐 2024-12-23 18:17:33

IMHO, the best answer is @Frédéric's. But if you already have your string split into words, and you know how string.capwords is implemented, you can skip joining them into a single string first:

def capwords(s, sep=None):
    return (sep or ' ').join(
        x.capitalize() for x in s.split(sep)
    )

As a result, you can just do this:

# here my_words == ['word1', 'word2', ...]
s = ' '.join(word.capitalize() for word in my_words)

二智少女 2024-12-23 18:17:33

If you also have to cater for dashes, then use:

import string
" ".join(
    string.capwords(word, sep="-")
    for word in string.capwords(
        "john's school at bel-red"
    ).split()
)
# "John's School At Bel-Red"
