How to split a line at a non-printing ASCII character in Python

Posted 2024-09-03 14:38:26, 79 characters, 13 views, 0 comments

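How can I split a line in Python at a non-printing ASCII character (for example the long minus sign, hex 0x97, octal 227)? I don't need the character itself. The information after it will be saved as a variable.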

月野兔 2024-09-10 14:38:26

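You can use re.split.

>>> import re
>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']

Adjust the pattern so that it matches only the characters you want to split on, that is, everything except the characters you want to keep.

See also: stripping-non-printable-characters-from-a-string-in-python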
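One way to adjust the pattern for this question is a character class that matches anything outside printable ASCII. The following is a minimal sketch rather than a definitive recipe: the names NON_PRINTABLE and split_on_non_printable are made up for illustration, and it assumes that "printable" means the range from space (0x20) to tilde (0x7e), so tabs and newlines also count as separators.

```python
import re

# Any run of characters outside printable ASCII (space through tilde).
# Assumption: tabs and newlines should also act as separators.
NON_PRINTABLE = re.compile(r"[^\x20-\x7e]+")


def split_on_non_printable(line):
    """Split line wherever one or more non-printable characters occur."""
    return NON_PRINTABLE.split(line)


fields = split_on_non_printable("name\x97Alice\x00\x01end")
print(fields)
```

Because the pattern ends in +, a run of several non-printable characters is treated as a single separator and produces no empty strings in between.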

Example (with the long dash, in a Python 2 session where str holds bytes):

>>> # \xe2\x80\x93 is the UTF-8 encoding of the en dash, U+2013 (a long dash or long minus)
>>> s = 'hello – world'
>>> s
'hello \xe2\x80\x93 world'
>>> import re
>>> re.split("\xe2\x80\x93", s)
['hello ', ' world']

Or, the same with unicode:

>>> # \u2013 is the en dash (sometimes called a long dash or long minus)
>>> s = u'hello – world'
>>> s
u'hello \u2013 world'
>>> import re
>>> re.split(u"\u2013", s)
[u'hello ', u' world']
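The sessions above are Python 2. In Python 3 the same idea needs a choice between bytes and str, sketched below. This assumes the line arrives as Windows-1252 bytes, where byte 0x97 is the em dash (U+2014); the sample data and variable names are invented for illustration.

```python
import re

# Assumption: the line arrives as Windows-1252 bytes, where 0x97 is the em dash.
raw = b"key \x97 value"

# Option 1: split the bytes directly; pattern and subject must both be bytes.
parts_bytes = re.split(b"\x97", raw)

# Option 2: decode first, then split on the resulting code point (U+2014).
text = raw.decode("cp1252")
parts_text = text.split("\u2014")

print(parts_bytes)
print(parts_text)
```

Decoding first is usually the more convenient route if the rest of the program works with text, but it only gives the right character if the assumed encoding matches the data.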

似最初 2024-09-10 14:38:26

_, _, your_result = your_input_string.partition('\x97')

Or

your_result = your_input_string.partition('\x97')[2]

If your_input_string does not contain a '\x97', then your_result will be empty. If your_input_string contains multiple '\x97' characters, your_result will contain everything after the first '\x97' character, including the other '\x97' characters.
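The three cases described above can be checked with a small sketch. The helper name text_after and the sample strings are hypothetical, and the code assumes Python 3 str input.

```python
SEP = "\x97"


def text_after(line, sep=SEP):
    """Return everything after the first sep, or '' if sep is absent."""
    _, _, tail = line.partition(sep)
    return tail


one_separator = text_after("label\x97payload")
no_separator = text_after("no separator here")
many_separators = text_after("a\x97b\x97c")

print(repr(one_separator))
print(repr(no_separator))
print(repr(many_separators))
```

partition always returns a 3-tuple, so the unpacking never fails even when the separator is missing.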

浮生未歇 2024-09-10 14:38:26

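Just use the str/unicode split method. It does not care what string you split on, as long as it is a constant; if you want to split on a regular expression, use re.split instead.

To get the separator string, either escape it as the other answers show:
"\x97"

or

use chr(0x97) for byte strings (values 0-255) or unichr(0x97) for unicode. (These are the Python 2 names; in Python 3, chr(0x97) returns a str and unichr no longer exists.)

An example: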

'will not be split'.split(chr(0x97))

'will be split here:\x97 and this is the second string'.split(chr(0x97))
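For reference, here is a hedged sketch of what those two calls return, written for Python 3. The variable names are illustrative, and the final maxsplit example is an addition not found in the answer itself.

```python
SEP = chr(0x97)  # the same character as "\x97" in Python 3

line = "will be split here:\x97 and this is the second string"

# No separator present: the result is a one-element list.
unsplit = "will not be split".split(SEP)

# One separator present: the result has two pieces, separator dropped.
pieces = line.split(SEP)

# maxsplit=1 keeps any later separators inside the second piece.
head, tail = "a\x97b\x97c".split(SEP, 1)

print(unsplit)
print(pieces)
print(repr(head), repr(tail))
```

Since the question only needs the text after the separator, pieces[-1] or tail is the value to store.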

About the author

我ぃ本無心為│何有愛
