Truncate a long string in Python, but only after specific characters
I use textwrap to split a long string into chunks, limited to 280 characters each. I don't want the split to occur at random though; it should only occur after a specific character. In my case, that is after the € sign and a single line break \n.
This is my code:
import math
import textwrap

query = 'Lorem ipsum dolor\n\n Lorem ipsum 0.5€\n Lorem ipsum 0.2€\n (...)'
for item in [query]:
    # obtain length of string
    item_length = len(item)
    # check length
    if item_length <= 280:
        pass  # do something here
    else:
        # determine the number of items
        item_length_limit = item_length / 280
        # determine the target length of each chunk
        item_chunk_length = item_length / math.ceil(item_length_limit)
        # chunk the item into individual pieces
        item_chunks = textwrap.wrap(item, math.ceil(item_chunk_length),
                                    break_long_words=False,
                                    replace_whitespace=False)
        # iterate over the chunks and append a counter such as 1/3
        for x, chunk in enumerate(item_chunks, start=1):
            print(f'{chunk} {x}/{len(item_chunks)}')
Current output (split at 60 characters for convenience):
Lorem ipsum dolor\n\n Lorem ipsum 0.5€\n Lorem ipsum 1/3
dolor 0.2€\n Lorem ipsum 0.4€\n Lorem ipsum 0.4€\n Lorem 2/3
Ipsum 0.4€ 3/3
Desired output:
Lorem ipsum dolor\n\n Lorem ipsum 0.5€\n 1/4
Lorem ipsum dolor 0.2€\n 2/4
Lorem ipsum 0.4€\n Lorem ipsum 0.4€\n 3/4
Lorem Ipsum 0.4€ 4/4
Comments (3)
This won't be the best algorithm out there, but it gets the job done.
To print the values, just print the return value of this function.
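The function this comment refers to did not survive on this page. As a rough sketch of what such a function could look like (not the commenter's original code), the version below splits the text after every '€\n' and then greedily packs the pieces into chunks of at most 280 characters. The name chunk_after, its parameters, and the sample string are assumptions made for illustration. A single piece that is longer than the limit is kept whole rather than cut.

```python
def chunk_after(text, delimiter="€\n", limit=280):
    """Hypothetical helper: split after each delimiter, then pack pieces into chunks."""
    # str.split drops the delimiter, so put it back at the end of every piece.
    pieces = [part + delimiter for part in text.split(delimiter)]
    # The last piece was not followed by a delimiter in the input, so remove it again.
    pieces[-1] = pieces[-1][:-len(delimiter)]

    chunks, current = [], ""
    for piece in pieces:
        if not piece:
            continue  # happens when the text ends with the delimiter
        # Start a new chunk when adding this piece would exceed the limit.
        if current and len(current) + len(piece) > limit:
            chunks.append(current)
            current = ""
        current += piece
    if current:
        chunks.append(current)
    return chunks


# Made-up sample text; limit=60 is used only to keep the demo short.
sample = ('Lorem ipsum dolor\n\n Lorem ipsum 0.5€\n Lorem ipsum dolor 0.2€\n'
          ' Lorem ipsum 0.4€\n Lorem ipsum 0.4€\n Lorem Ipsum 0.4€')
result = chunk_after(sample, limit=60)
for number, chunk in enumerate(result, start=1):
    print(f'{chunk!r} {number}/{len(result)}')
```

Because the packing is greedy, it may produce fewer chunks than the "desired output" in the question, which breaks earlier than strictly necessary in some places.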
I'm not 100% sure I understood your question, but are you looking for something like this?
It creates a list in which each entry is the snippet of your string between two consecutive occurrences of the '€\n' characters.
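The snippet itself is missing from this page. A minimal sketch of the idea described, assuming a plain str.split on '€\n' is what was meant, could look like this. The helper name split_after_marker and the sample string are made up for illustration, and the marker is re-attached to each snippet so that no characters are lost.

```python
def split_after_marker(text, marker="€\n"):
    """Hypothetical helper: list of snippets, each ending with the marker."""
    parts = text.split(marker)
    # str.split removes the marker, so add it back to every snippet but the last.
    snippets = [part + marker for part in parts[:-1]]
    # Keep the trailing remainder only if there is one.
    if parts[-1]:
        snippets.append(parts[-1])
    return snippets


# Made-up sample text, not the full string from the question.
sample = 'Lorem ipsum dolor\n\n Lorem ipsum 0.5€\n Lorem ipsum 0.2€\n Lorem ipsum 0.4€'
print(split_after_marker(sample))
```

Note that this only splits the string; it does not enforce the 280-character limit by itself.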
This will work.
It splits the string into an array and then prints the result.
Hope this helps :)
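The code for this comment is also missing here. One possible way to split a string into a list and print it, assuming Python 3.7 or newer (where re.split accepts patterns that match an empty string), is a regular expression with a lookbehind. The function name split_keep_delimiter and the sample string are assumptions, not the commenter's code.

```python
import re


def split_keep_delimiter(text):
    """Hypothetical helper: split right after every '€\\n', keeping it in place."""
    # (?<=€\n) is a zero-width lookbehind: it matches the position just after "€\n",
    # so nothing is consumed and the delimiter stays at the end of each piece.
    return [piece for piece in re.split(r"(?<=€\n)", text) if piece]


# Made-up sample text, not the full string from the question.
sample = 'Lorem ipsum 0.5€\n Lorem ipsum 0.2€\n Lorem ipsum 0.4€'
pieces = split_keep_delimiter(sample)
for number, piece in enumerate(pieces, start=1):
    print(f'{piece!r} {number}/{len(pieces)}')
```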