用Python解析文本文件？

发布于 2024-08-08 10:54:55 字数 299 浏览 3 评论 0原文

我必须做一个作业，其中我有一个包含类似内容的 .txt 文件
p
没有人热爱痛苦本身、追求痛苦并想要拥有它，只是因为它很痛苦...

h1
这是这个文本文件的另一个例子，

我想编写一个 python 代码来解析这个文本文件并创建 xhtml 文件
我需要为这个项目找到一个起点，因为我对 python 很陌生，对很多东西都不熟悉。
这个Python代码应该从这个文本文件中获取每个“标签”并将它们放入一个xhtml文件中我希望我的要求对你有意义。
非常感谢任何帮助，
预先感谢！
-博扬

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

伤感在游骋 2024-08-15 10:54:55

你说你对 Python 很陌生，所以我将从非常低的级别开始。您可以在 Python 中非常简单地迭代文件中的行，

fyle = open("contents.txt")
for lyne in fyle :
    # Do string processing here
fyle.close()

现在如何解析它。如果每个格式指令（例如 p、h1）位于单独的行上，您可以轻松检查。我会建立一个处理程序字典并像这样获取处理程序：

handlers= {"p": # p tag handler
           "h1": # h1 tag handler
          }

# ... in the loop
    if lyne.rstrip() in handlers :  # strip to remove trailing whitespace
        # close current handler?
        # start new handler?
    else :
        # pass string to current handler

您可以执行以下操作 Daniel Pryden 建议首先创建一个内存数据结构，然后序列化该 XHTML。在这种情况下，处理程序将知道如何构建与每个标签相对应的对象。但我认为更简单的解决方案，特别是如果您没有很多时间，您可以直接转到 XHTML，保留当前包含的标签的堆栈。在这种情况下，您的“处理程序”可能只是一些将标签写入输出文件/字符串的简单逻辑。

在不了解您问题的具体情况的情况下，我无法说更多。此外，我不想为你做所有的作业。这应该会给你一个好的开始。

You say you're very new to Python, so I'll start at the very low-level. You can iterate over the lines in a file very simply in Python

fyle = open("contents.txt")
for lyne in fyle :
    # Do string processing here
fyle.close()

Now how to parse it. If each formatting directive (e.g. p, h1), is on a separate line, you can check that easily. I'd build up a dictionary of handlers and get the handler like so:

handlers= {"p": # p tag handler
           "h1": # h1 tag handler
          }

# ... in the loop
    if lyne.rstrip() in handlers :  # strip to remove trailing whitespace
        # close current handler?
        # start new handler?
    else :
        # pass string to current handler

You could do what Daniel Pryden suggested and create an in-memory data structure first, and then serialize that the XHTML. In that case, the handlers would know how to build the objects corresponding to each tag. But I think the simpler solution, especially if you don't have lots of time, you have is just to go straight to XHTML, keeping a stack of the current enclosed tags. In that case your "handler" may just be some simple logic to write the tags to the output file/string.

I can't say more without knowing the specifics of your problem. And besides, I don't want to do all your homework for you. This should give you a good start.

回复收藏 0 原文

他是夢罘是命 2024-08-15 10:54:55

我不会直接从您描述的文本文件转换为 XHTML 文件，而是首先将其转换为中间内存表示形式。

因此，我将构建类来表示 p 和 h1 标记，然后遍历文本文件并构建这些对象并将它们放入列表中（或者甚至是更复杂的列表）对象，但从文件的外观来看，列表应该足够了）。然后，我将该列表传递给另一个函数，该函数将循环遍历 p 和 h1 对象并将它们输出为 XHTML。

作为额外的好处，我将使每个标记对象（例如 Paragraph 和 Heading1 类）实现一个 as_xhtml() 方法，并委托对该方法的实际格式化。那么 XHTML 输出循环可能类似于：

for tag in input_tags:
    xhtml_file.write(tag.as_xhtml())

Rather than going directly from the text file you describe to an XHTML file, I would transform it into an intermediate in-memory representation first.

So I would build classes to represent the p and h1 tags, and then go through the text file and build those objects and put them into a list (or even a more complex object, but from the looks of your file a list should be sufficient). Then I would pass the list to another function that would loop through the p and h1 objects and output them as XHTML.

As an added bonus, I would make each tag object (say, Paragraph and Heading1 classes) implement an as_xhtml() method, and delegate the actual formatting to that method. Then the XHTML output loop could be something like:

for tag in input_tags:
    xhtml_file.write(tag.as_xhtml())

回复收藏 0 原文

~没有更多了~

关于作者

我为君王

暂无简介

0 文章

0 评论

22 人气

关注发私信

友情链接

文江博客

用Python解析文本文件？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（2）

关于作者

相关话题

热门标签

推荐作者

lioqio

Single

禾厶谷欠

alipaysp_2zg8elfGgC

qq_N6d4X7

放低过去

友情链接

用Python解析文本文件？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（2）

关于作者

相关话题

热门标签

推荐作者

lioqio

Single

禾厶谷欠

alipaysp_2zg8elfGgC

qq_N6d4X7

放低过去

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。