How to split a very long string into a list of shorter strings in Python
In my current Django project I have a model that stores very long strings (5000-10000 or even more characters per DB entry), and I need to split them when a user requests the record (it really needs to stay in one record in the DB). What I need is to return a list (a queryset? It depends on whether this is done in the "SQL" part, or by fetching the whole string as is and doing the parsing in the view) of shorter strings (100-500 characters per string in the list I return to the template).
I couldn't find a Python split command for this anywhere, nor an example or any kind of answer.
I could always count words and append, but counting words... I am sure there has to be some kind of function for that sort of thing.
EDIT: Thank you everyone, but I guess I wasn't understood.
Example:
The String: "This is a very long string with many many many many and many more sentences and there is not one character that i can use to split by, just by number of words"
The string is a TextField of a Django model.
I need to split it, let's say every 5 words, so I would get something like:
['This is a very long string','with many many many many','and many more sentences and','there is not one character','that i can use to','split by, just by number','of words']
The thing is that in almost every programming language there is a "split per number of words" kind of utility function, but I can't find one in Python.
Thanks,
Erez
Comments (2)
Here is an idea:
This tries to split strings into chunks of at most chunksize characters. It tries to split at spaces, but if it can't, it splits in the middle of a word.
I guess it requires some tweaking, though (for instance, how to handle splits that occur inside words), but it should give you a starting point.
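The chunking idea described above could be sketched roughly as follows. This is only an illustration under assumptions, not necessarily the code the answer originally had in mind: the function name split_into_chunks is hypothetical, and only chunksize comes from the text.

```python
def split_into_chunks(text, chunksize):
    """Split text into pieces of at most chunksize characters.

    Prefers to cut at a space; if no space is available inside the
    window, it cuts in the middle of the word.
    (split_into_chunks is a hypothetical name used for illustration.)
    """
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")

    chunks = []
    while len(text) > chunksize:
        # Last space among the first chunksize + 1 characters, so a
        # space sitting right after a full-length chunk still counts.
        cut = text.rfind(" ", 0, chunksize + 1)
        if cut <= 0:
            # No usable space: fall back to a hard cut inside the word.
            cut = chunksize
        chunks.append(text[:cut])
        # Drop the space(s) we split on before continuing.
        text = text[cut:].lstrip(" ")
    if text:
        chunks.append(text)
    return chunks
```

For a Django TextField value this would be called on the string in the view, for example split_into_chunks(record.body, 500), where record.body is an assumed field name. The standard library's textwrap.wrap(text, width) also returns a list of lines of at most width characters and breaks overlong words by default, so it may be worth trying before hand-rolling anything.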
To split by number of words, do this:
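A minimal sketch of splitting by word count, assuming words are separated by whitespace. The name split_by_word_count and its parameter words_per_chunk are hypothetical; only str.split and str.join from the standard library are used.

```python
def split_by_word_count(text, words_per_chunk):
    """Return a list of strings, each holding up to words_per_chunk words.

    (split_by_word_count is a hypothetical name used for illustration.)
    """
    if words_per_chunk < 1:
        raise ValueError("words_per_chunk must be at least 1")

    # str.split() with no arguments splits on any run of whitespace.
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]
```

With the example sentence from the question and words_per_chunk=5, the first chunk would be 'This is a very long' and the last one 'number of words'. Note that runs of whitespace are collapsed to single spaces, since the words are re-joined with " ".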