python-markdown htmlStash placeholder is not being replaced

Posted on 2024-11-29 16:46:19

I am currently developing a web application using django, and I am using python-markdown to convert markdown into HTML. There are a couple of situations that markdown currently doesn't handle, so I have written a couple of basic extensions.

"""

Helps make paras for Less framework

@div large-column float-left

# This is an H1

this is a paragraph right here!

and a new one

## Heading 2

and yet another one

--> becomes -->

<div class="large-column float-left">
    <h1>This is an H1</h1>
    <p>this is a paragraph right here!</p>
    <p>and a new one</p>
    <h2>Heading 2</h2>
    <p>and yet another one</p>
</div>

"""

import re
import markdown

# Global vars

LESS_BLOCK_RE = re.compile( \
    r'@(?P<tag>div|span)[ ]*(?P<class>[a-zA-z0-9-\ ^\n]+)[ ]*\n(?P<inner>.*)(?=div|span)?',
    re.MULTILINE|re.DOTALL
    )

class LessFrameworkExtension(markdown.Extension):

    def extendMarkdown(self, md, md_globals):
        md.registerExtension(self)

        md.preprocessors.add('less_framework', LessBlockPreprocessor(md),'_begin')

    def reset(self):
        print 'resetting'

class LessBlockPreprocessor(markdown.preprocessors.Preprocessor):

    def __init__(self, md):
        markdown.preprocessors.Preprocessor.__init__(self, md)

    def getConfig(self, key):
        if key in self.config:
            return self.config[key][0]
        else:
            return None

    def run(self, lines):
        """ Match and store Less Framework Blocks in the HTML Stash """

        text = "\n".join(lines)

        while 1:
            m = LESS_BLOCK_RE.search(text)
            if m:
                less_tag = m.group('tag')
                less_class = m.group('class')
                less_inner = m.group('inner')

                print less_tag
                print less_class
                print less_inner

                placeholder = self.markdown.htmlStash.store(less_inner, safe=True)
                text = '<%s class="%s">\n%s\n</%s>' % (less_tag, less_class, placeholder, less_tag)
            else:
                break
        return text.split("\n")

    def _escape(self, txt):
        """ basic html escaping """
        txt = txt.replace('&', '&amp;')
        txt = txt.replace('<', '&lt;')
        txt = txt.replace('>', '&gt;')
        txt = txt.replace('"', '&quot;')
        return txt

def makeExtension(configs):
    return LessFrameworkExtension(configs)

The above extension works partially, but the output is:

<div class="large-column float-left
">
wzxhzdk:0
</div>'

This appears to be the htmlStash store placeholder. Perhaps I am missing a call to python-markdown? Looking at similar extensions in the python-markdown project, it appears that my approach is consistent.
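
As an aside, the line break inside the class attribute in the output above looks unrelated to the placeholder problem. The sketch below (Python 3, standard library only) uses just the tag and class portion of the pattern from the code above. Inside the character class, ^ is a literal because it is not the first character, and \n is listed explicitly, so the class group is allowed to swallow a newline. The range A-z also covers a few punctuation characters between Z and a. TIGHTER_RE is a hypothetical alternative written for illustration; it is not part of the original extension.

```python
import re

# Tag and class portion of the pattern from the question, unchanged.
ORIGINAL_RE = re.compile(
    r'@(?P<tag>div|span)[ ]*(?P<class>[a-zA-z0-9-\ ^\n]+)[ ]*\n'
)

# Hypothetical tightened variant: letters, digits, spaces and hyphens only,
# matched lazily so trailing spaces are left to the [ ]* that follows.
TIGHTER_RE = re.compile(
    r'@(?P<tag>div|span)[ ]*(?P<class>[a-zA-Z0-9 -]+?)[ ]*\n'
)

sample = "@div large-column float-left\n\n# This is an H1\n"

original_class = ORIGINAL_RE.search(sample).group('class')
tighter_class = TIGHTER_RE.search(sample).group('class')

print(repr(original_class))  # 'large-column float-left\n'
print(repr(tighter_class))   # 'large-column float-left'
```

The first result ends with a newline, which would explain why the closing quote of the class attribute lands on its own line in the output.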

Any help would be much appreciated!

Example Input and Expected Output

@div large-column float-left

# This is an H1

this is a paragraph right here!

and a new one

## Heading 2

and yet another one

Extended Markdown --> becomes --> HTML

<div class="large-column float-left">
    <h1>This is an H1</h1>
    <p>this is a paragraph right here!</p>
    <p>and a new one</p>
    <h2>Heading 2</h2>
    <p>and yet another one</p>
</div>


Comments (1)

塔塔猫 2024-12-06 16:46:19

I know this is from a long time ago, but for anyone else (like me) who runs into this issue and finds this post: you need to register the preprocessor after the normalize_whitespace step. That step strips the STX and ETX control characters, which are exactly what htmlStash uses as delimiters around its placeholders, so a placeholder created before it can no longer be matched and replaced later.

In this case:

md.preprocessors.add('less_framework', LessBlockPreprocessor(md),'_begin')

should be:

md.preprocessors.add('less_framework', LessBlockPreprocessor(md),'>normalize_whitespace')
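
To illustrate why the order matters, here is a small standalone simulation in Python 3. It does not import python-markdown. The STX/ETX values and the placeholder format are assumptions meant to mirror the constants in markdown.util, and FakeStash, less_block, restore and render are made-up names for this sketch only.

```python
# Assumed to mirror markdown.util.STX / ETX / HTML_PLACEHOLDER (not imported).
STX = "\u0002"
ETX = "\u0003"
PLACEHOLDER = STX + "wzxhzdk:%d" + ETX


class FakeStash:
    """Toy stand-in for htmlStash: stores HTML and hands out placeholders."""

    def __init__(self):
        self.blocks = []

    def store(self, html):
        self.blocks.append(html)
        return PLACEHOLDER % (len(self.blocks) - 1)


def normalize_whitespace(text):
    # The relevant part of the real step: drop STX and ETX characters.
    return text.replace(STX, "").replace(ETX, "")


def less_block(text, stash):
    # Stand-in for the question's preprocessor: stash the inner text.
    return '<div class="c">\n%s\n</div>' % stash.store(text)


def restore(text, stash):
    # Stand-in for the final step that swaps placeholders back for HTML.
    for i, html in enumerate(stash.blocks):
        text = text.replace(PLACEHOLDER % i, html)
    return text


def render(order):
    stash = FakeStash()
    text = "# This is an H1"
    for step in order:
        if step == "less":
            text = less_block(text, stash)
        else:
            text = normalize_whitespace(text)
    return restore(text, stash)


broken = render(["less", "normalize"])  # like registering at '_begin'
fixed = render(["normalize", "less"])   # like '>normalize_whitespace'
print(broken)
print(fixed)
```

With the "less" step first, the delimiters are stripped and the final substitution no longer finds the placeholder, which matches the wzxhzdk:0 seen in the question. With the steps swapped, the stored text is put back as expected.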

More info here: https://github.com/Python-Markdown/markdown/issues/222
