Removing line breaks at a specific position in a text file

Posted on 2024-08-26 00:50:05

I have a large text file that has line breaks at column 80 due to console width. Many of the lines in the file are not 80 characters long and so are not affected by the wrapping. In pseudocode, this is what I want:

  • Iterate through lines in file
  • If line matches this regex pattern: ^(.{80})\n(.+)
    • Replace this line with a new string consisting of match.group(1) and match.group(2). Just remove the linebreak from this line.
  • If line doesn't match the regex, skip!

Maybe I don't need regex to do this?


固执像三岁 2024-09-02 00:50:05
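# Python 3: join each wrapped line with the line that follows it
with open("file") as f:
    for line in f:
        if len(line) == 81:  # 80 characters plus "\n": the line was wrapped
            # next(f, "") avoids StopIteration when the wrapped line is the last one
            line = line.rstrip("\n") + next(f, "")
        print(line.rstrip("\n"))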
小霸王臭丫头 2024-09-02 00:50:05

Here's some code which should do the trick

def remove_linebreaks(textfile, position=81):
    """
    textfile : a file opened in 'r' mode
    position : the 1-based position on a line at which \n must be removed

    return a string with the \n at position removed
    """
    fixed_lines = []
    for line in textfile:
        if len(line) == position:
            # drop the final character, i.e. the \n at `position`
            line = line[:position - 1]
        fixed_lines.append(line)
    return ''.join(fixed_lines)

Note that compared to your pseudo code, this will merge any number of consecutive folded lines.
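
To make the "any number of consecutive folded lines" remark concrete, here is a minimal sketch. The function name `join_folded` and its `width` parameter are made up for this illustration and are not part of the answer above; a width of 5 stands in for 80 so the sample stays readable.

```python
import io

def join_folded(lines, width=80):
    """Drop the newline of every line that is exactly `width` characters long.

    Hypothetical helper for illustration only.
    """
    pieces = []
    for line in lines:
        if len(line) == width + 1 and line.endswith("\n"):
            line = line[:-1]  # strip only the newline added by the wrapping
        pieces.append(line)
    return "".join(pieces)

# A 12-character line wrapped at width 5 becomes three physical lines.
sample = io.StringIO("abcde\nfghij\nkl\nend\n")
result = join_folded(sample, width=5)
print(repr(result))
```

Both folded lines are joined back into one line in a single pass. One caveat applies to every length-based approach in this thread: a line that was genuinely exactly 80 characters long in the original text cannot be told apart from a wrapped one.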

短暂陪伴 2024-09-02 00:50:05

Consider this.

def merge_lines(line_iter):
    buffer = ''
    for line in line_iter:
        if len(line) <= 80:
            # shorter than a full 80-column line plus its '\n': not wrapped
            yield buffer + line
            buffer = ''
        else:
            buffer += line[:-1]  # wrapped line: remove '\n'
    if buffer:
        yield buffer  # the input ended on a wrapped line

with open('myFile', 'r') as source, open('copy of myFile', 'w') as destination:
    for line in merge_lines(source):
        destination.write(line)

I find that an explicit generator function makes it much easier to test and debug the essential logic of the script without having to create mock filesystems or do lots of fancy setup and teardown for testing.
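
As a rough illustration of that testability point, the sketch below feeds a plain list of strings to a generator of the same shape, with no file involved. The name `merge_wrapped` and the `width` parameter are assumptions added for this example (a width of 5 stands in for 80), and the final flush of pending text is an addition that the generator as originally posted does not have.

```python
def merge_wrapped(lines, width=80):
    """Yield logical lines, joining physical lines that were cut at `width`.

    Hypothetical variant for illustration only.
    """
    pending = ""
    for line in lines:
        if len(line) == width + 1 and line.endswith("\n"):
            pending += line[:-1]  # wrapped: hold it back without its newline
        else:
            yield pending + line
            pending = ""
    if pending:  # the input ended on a wrapped line
        yield pending

# Any iterable of strings works, so a list can replace an open file in a test.
physical = ["abcde\n", "fghij\n", "kl\n", "end\n", "vwxyz\n"]
logical = list(merge_wrapped(physical, width=5))
print(logical)
```

Because the generator only needs an iterable, the same function can be pointed at an open file object in the real script without changes.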

吃→可爱长大的 2024-09-02 00:50:05

Here is an example of how to use regular expressions to achieve this. Regular expressions aren't the best solution everywhere, though, and in this case I think not using them is more efficient. Anyway, here is the solution:

import re

# re.MULTILINE makes ^ match at the start of every line, not only at the start of the string
text = re.sub(r'(?<=^.{80})\n', '', text, flags=re.MULTILINE)

You can also use your own regular expression when you call re.sub with a callable:

text = re.sub(r'^(.{80})\n(.+)', lambda m: m.group(1) + m.group(2), text, flags=re.MULTILINE)
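
The two patterns behave differently when several wrapped lines follow each other. The following sketch shows this on a small sample; the constant `WIDTH` and the variable names are assumptions made for the illustration, with a width of 5 standing in for 80. Both calls pass `re.MULTILINE`, without which `^` only matches at the very start of the string.

```python
import re

WIDTH = 5  # small stand-in for 80, chosen only for this illustration
text = "abcde\nfghij\nkl\nend\n"

# Lookbehind pattern: removes the newline after every line of exactly WIDTH characters.
lookbehind = re.sub(r"(?<=^.{%d})\n" % WIDTH, "", text, flags=re.MULTILINE)

# Two-group pattern: each match consumes the following line as well,
# so a second wrapped line in a row keeps its newline.
pairwise = re.sub(
    r"^(.{%d})\n(.+)" % WIDTH,
    lambda m: m.group(1) + m.group(2),
    text,
    flags=re.MULTILINE,
)

print(repr(lookbehind))
print(repr(pairwise))
```

On this sample the lookbehind version rejoins all three pieces, while the two-group version only joins the first pair, which matches the one-line-at-a-time behaviour of the pseudocode in the question.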