使用 Pygments 过滤空格和换行符

发布于 2024-10-25 05:42:34 字数 1030 浏览 6 评论 0原文

我一直在尝试向我的 django 站点添加语法突出显示。问题是我也对   和字符进行了格式化。有没有办法保留这些字符？这是我正在使用的代码：

from BeautifulSoup  import BeautifulSoup
from django import template
from django.template.defaultfilters import stringfilter
import pygments
import pygments.formatters
import pygments.lexers


register = template.Library()

@register.filter
@stringfilter
def pygmentized(html):
    soup = BeautifulSoup(html)
    codeblocks = soup.findAll('code')
    for block in codeblocks:
        if block.has_key('class'):
            try:
                code = ''.join([unicode(item) for item in block.contents])
                lexer = pygments.lexers.get_lexer_by_name(block['class'], stripall=True)
                formatter = pygments.formatters.HtmlFormatter()
                code_hl = pygments.highlight(code, lexer, formatter)
                block.contents = [BeautifulSoup(code_hl)]
                block.name = 'code'
            except:
                raise
    return unicode(soup)

原文

I've been trying to add syntax highlighting to my django site. The problem is I'm getting the and <br /> characters formatted as well. Is there a way to preserve these characterss? Here is the code I'm using:

from BeautifulSoup  import BeautifulSoup
from django import template
from django.template.defaultfilters import stringfilter
import pygments
import pygments.formatters
import pygments.lexers


register = template.Library()

@register.filter
@stringfilter
def pygmentized(html):
    soup = BeautifulSoup(html)
    codeblocks = soup.findAll('code')
    for block in codeblocks:
        if block.has_key('class'):
            try:
                code = ''.join([unicode(item) for item in block.contents])
                lexer = pygments.lexers.get_lexer_by_name(block['class'], stripall=True)
                formatter = pygments.formatters.HtmlFormatter()
                code_hl = pygments.highlight(code, lexer, formatter)
                block.contents = [BeautifulSoup(code_hl)]
                block.name = 'code'
            except:
                raise
    return unicode(soup)

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

娇纵 2024-11-01 05:42:34

嗯，Petri 是对的，pre 用于代码块。在他指出我刚刚编写了一个函数来清理第一个输出之前，它很混乱，但也许只需要从最终输出中删除某些内容的人可能会发现它没问题：

from BeautifulSoup  import BeautifulSoup
from django import template
from django.template.defaultfilters import stringfilter
import pygments
import pygments.formatters
import pygments.lexers


register = template.Library()
wanted = {'br': '<br />', 'BR': '<BR />', 'nbsp': ' ', 'NBSP': '&NBSP;', '/>': ''}

def uglyfilter(html):
    content = BeautifulSoup(html)
    for node in content.findAll('span'):
        data = ''.join(node.findAll(text=True))
        if wanted.has_key(data):
            node.replaceWith(wanted.get(data))
    return unicode(content)     


@register.filter
@stringfilter
def pygmentized(html):
    soup = BeautifulSoup(html)
    codeblocks = soup.findAll('pre')
    for block in codeblocks:
        if block.has_key('class'):
            try:
                code = ''.join([unicode(item) for item in block.contents])
                lexer = pygments.lexers.get_lexer_by_name(block['class'], stripall=True)
                formatter = pygments.formatters.HtmlFormatter()
                code_hl = pygments.highlight(code, lexer, formatter)
                clean = uglyfilter(code_hl)
                block.contents = [BeautifulSoup(clean)]
                block.name = 'pre'
            except:
                raise
    return unicode(soup)

Well, Petri is right, pre is for code blocks. Before he pointed that out I just wrote a function to clean the first output, it's messy but maybe someone who just needs to remove certain things from the final output might find it ok:

from BeautifulSoup  import BeautifulSoup
from django import template
from django.template.defaultfilters import stringfilter
import pygments
import pygments.formatters
import pygments.lexers


register = template.Library()
wanted = {'br': '<br />', 'BR': '<BR />', 'nbsp': ' ', 'NBSP': '&NBSP;', '/>': ''}

def uglyfilter(html):
    content = BeautifulSoup(html)
    for node in content.findAll('span'):
        data = ''.join(node.findAll(text=True))
        if wanted.has_key(data):
            node.replaceWith(wanted.get(data))
    return unicode(content)     


@register.filter
@stringfilter
def pygmentized(html):
    soup = BeautifulSoup(html)
    codeblocks = soup.findAll('pre')
    for block in codeblocks:
        if block.has_key('class'):
            try:
                code = ''.join([unicode(item) for item in block.contents])
                lexer = pygments.lexers.get_lexer_by_name(block['class'], stripall=True)
                formatter = pygments.formatters.HtmlFormatter()
                code_hl = pygments.highlight(code, lexer, formatter)
                clean = uglyfilter(code_hl)
                block.contents = [BeautifulSoup(clean)]
                block.name = 'pre'
            except:
                raise
    return unicode(soup)

回复收藏 0 原文

~没有更多了~