OpenTBS: convert HTML tags to MS Word tags

Posted on 2025-01-06 06:38:25

I am using OpenTBS, http://www.tinybutstrong.com/plugins/opentbs/tbs_plugin_opentbs.html.

I have a template.docx and can replace fields with content, but if the content contains HTML, the tags show up as literal text in the document generated from the template.

First Line <br /> Second Line

I tried:

$TBS->LoadTemplate('document.docx', OPENTBS_ALREADY_XML); 

I thought this would let me replace my HTML tags with MS Office tags, but the MS Office tags just appeared as literal text in the document instead:

First Line<w:br/> Second Line

How do I convert the HTML tags into their MS Office XML equivalents?

Comments (3)

等待圉鍢 2025-01-13 06:38:25

If you have a function that converts HTML to DOCX, you can plug it into OpenTBS with a custom PHP function and the "onformat" parameter.

The following function only converts line breaks:

function f_html2docx($FieldName, &$CurrVal) {
  $CurrVal= str_replace('<br />', '<w:br/>', $CurrVal);
} 

Use it in the DOCX template like this:

[b.thetext;onformat=f_html2docx]
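
The function above only matches the exact string '<br />'. As a rough sketch, a more tolerant variant could accept the other common spellings too. The name f_html2docx_br is made up for this example, and it assumes the same onformat signature ($FieldName, &$CurrVal) as the function above.

```php
<?php
// Hypothetical variant of f_html2docx (not part of OpenTBS):
// converts <br>, <br/> and <br />, in any letter case, to <w:br/>.
function f_html2docx_br($FieldName, &$CurrVal) {
    // "<br", optional whitespace, optional "/", then ">".
    $CurrVal = preg_replace('#<br\s*/?>#i', '<w:br/>', (string) $CurrVal);
}

// Standalone check, outside of any template merge.
$text = 'First Line<br>Second Line<BR/>Third Line<br />Fourth Line';
f_html2docx_br('thetext', $text);
echo $text, "\n";
```

In the template it would be referenced the same way, with onformat=f_html2docx_br in place of onformat=f_html2docx.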

About converting HTML to DOCX:

Converting formatted text into another formatted text is quite often a nightmare. That is why it is wise to store the pure data instead of formatted data.

Converting HTML to DOCX is a real nightmare because the formatting is not structured the same way.

For example, in HTML, tags may be nested, like this:

<i> hello <b> this is important </b> to know </i>

In DOCX there is no nesting: the same text is presented as a flat sequence of runs, each carrying its full set of properties, like this:

  <w:r>
    <w:rPr><w:i/></w:rPr>
    <w:t>hello</w:t>
  </w:r>

  <w:r>
    <w:rPr><w:b/><w:i/></w:rPr>
    <w:t>this is important</w:t>
  </w:r>

  <w:r>
    <w:rPr><w:i/></w:rPr>
    <w:t>to know</w:t>
  </w:r>

For now I have no solution for converting tags other than line breaks. Sorry about that.
I also think one would be quite difficult to code.

ぃ弥猫深巷。 2025-01-13 06:38:25

Thanks Skrol for your input on all my OpenTBS issues. I just noticed that you are its creator; it's a great class. What you said above is true. After a day of plowing through the MS Word format I had a brainwave, and I can now produce the format you described above with bold, italic and underline, which is all I require. I hope this gives you a foundation to improve upon.

Looking at your example, I noticed that you only need an array of the currently active styles: when you find an opening tag you add its style to the array, and when you find a closing tag you remove it. Each time you find a tag you close the current <w:r> and create a new one. I have tested it and it works wonderfully.

class printClass {
    // Styles that are open at the current position in the string.
    private static $currentStyles = array();

    public function __construct() {}

    // Replaces <b>, <u>, <i> and their closing tags with run boundaries.
    public function format($string) {
        if ($string != "") {
            // Start each string with no open styles.
            self::$currentStyles = array();
            return preg_replace_callback("#<b>|<u>|<i>|</b>|</u>|</i>#",
                                        'printClass::replaceTags',
                                        $string);
        } else {
            return false;
        }
    }

    // Builds the <w:rPr> element for the styles that are currently open.
    private static function applyStyles() {
        if (count(self::$currentStyles) == 0) {
            return "";
        }

        // Initialise before appending, otherwise PHP raises a notice.
        $styles = "";

        foreach (self::$currentStyles as $value) {
            if ($value == "b") {
                $styles .= "<w:b/>";
            }
            if ($value == "u") {
                $styles .= "<w:u w:val=\"single\"/>";
            }
            if ($value == "i") {
                $styles .= "<w:i/>";
            }
        }

        return "<w:rPr>" . $styles . "</w:rPr>";
    }

    // Callback: updates the style array, then closes the current run and opens a new one.
    private static function replaceTags($matches) {
        $tag = $matches[0];

        if ($tag == "<b>" || $tag == "<u>" || $tag == "<i>") {
            // Opening tag: remember the style ("<b>" becomes "b").
            array_push(self::$currentStyles, substr($tag, 1, 1));
        } else {
            // Closing tag: forget the style ("</b>" becomes "b").
            self::$currentStyles = array_diff(self::$currentStyles, array(substr($tag, 2, 1)));
        }

        return "</w:t></w:r><w:r>" . self::applyStyles() . "<w:t xml:space=\"preserve\">";
    }
}
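
For comparison, here is a rough sketch of the same idea without shared static state. It emits complete <w:r> runs instead of closing and reopening the run that surrounds the field. The function name html_inline_to_runs is made up for this example, and it assumes the input contains only <b>, <i> and <u> tags plus ordinary HTML text.

```php
<?php
// Hypothetical helper (not part of OpenTBS): turns a string limited to
// <b>, <i>, <u> into a flat sequence of complete <w:r> runs.
function html_inline_to_runs(string $html): string {
    $props  = ['b' => '<w:b/>', 'i' => '<w:i/>', 'u' => '<w:u w:val="single"/>'];
    $active = [];   // tag name => true for every style that is currently open
    $out    = '';

    // Split into tag tokens and text tokens, keeping the tags.
    $tokens = preg_split('#(</?[biu]>)#i', $html, -1,
                         PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    foreach ($tokens as $token) {
        if (preg_match('#^<(/?)([biu])>$#i', $token, $m)) {
            $tag = strtolower($m[2]);
            if ($m[1] === '/') {
                unset($active[$tag]);      // closing tag: style ends
            } else {
                $active[$tag] = true;      // opening tag: style starts
            }
            continue;
        }

        // Text token: one run carrying every style that is open right now.
        $rPr = '';
        foreach ($props as $tag => $xml) {
            if (isset($active[$tag])) {
                $rPr .= $xml;
            }
        }
        if ($rPr !== '') {
            $rPr = '<w:rPr>' . $rPr . '</w:rPr>';
        }

        // Decode HTML entities, then escape the text for XML.
        $text = html_entity_decode($token, ENT_QUOTES | ENT_HTML5, 'UTF-8');
        $text = htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');

        $out .= '<w:r>' . $rPr . '<w:t xml:space="preserve">' . $text . '</w:t></w:r>';
    }

    return $out;
}

echo html_inline_to_runs('<i>hello <b>this is important</b> to know</i>'), "\n";
```

Because the result consists of whole runs, it would have to be placed where runs are allowed (directly inside a paragraph), not inside an existing <w:t> as the class above assumes.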
请远离我 2025-01-13 06:38:25
// Both functions are written as methods, so they are assumed to live inside a class.
public function f_html2docx($currVal) {

    // Run property to apply for each supported inline tag.
    $styles = array(
        'i' => '<w:i/>',
        'b' => '<w:b/>',
        'u' => '<w:u w:val="single"/>',
    );

    foreach ($styles as $el => $rPr) {
        $tag_open  = '<' . $el . '>';
        $tag_close = '</' . $el . '>';
        $nb = substr_count($currVal, $tag_open);

        // Convert only when every opening tag has a matching closing tag.
        if ( ($nb > 0) && ($nb == substr_count($currVal, $tag_close)) ) {
            // Opening tag: end the current run and start a styled one.
            $currVal = str_replace($tag_open,  '</w:t></w:r><w:r><w:rPr>' . $rPr . '</w:rPr><w:t>', $currVal);
            // Closing tag: end the styled run and start a plain one.
            $currVal = str_replace($tag_close, '</w:t></w:r><w:r><w:t>', $currVal);
        }
    }

    // handling <br> tag
    $currVal = str_replace('<br />', '<w:br/>', $currVal);

    return $currVal;
}

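public function f_handleUnsupportedTags($fieldValue){
    // Keep only the tags that f_html2docx knows how to convert.
    $fieldValue = strip_tags($fieldValue, '<b><i><u><br>');

    // Replace non-breaking space entities with plain spaces.
    $fieldValue = str_replace('&nbsp;', ' ', $fieldValue);
    // Normalise <br> to the form that f_html2docx expects.
    $fieldValue = str_replace('<br>', '<br />', $fieldValue);

    return $fieldValue;
}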

Now call these functions like this:

$fieldVal = $this->f_html2docx($this->f_handleUnsupportedTags($fieldVal));
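
The guard in f_html2docx is meant to skip a tag whose opening and closing counts differ, since converting only one side would leave a broken run. As a small sketch, that check can be pulled out and tried on its own. The name tags_balanced is made up for this example, and it only compares counts, not the order of the tags.

```php
<?php
// Hypothetical helper: true when <el> and </el> occur the same number of times.
// It compares counts only, so "</b>text<b>" still passes.
function tags_balanced(string $html, string $el): bool {
    $open  = substr_count($html, '<' . $el . '>');
    $close = substr_count($html, '</' . $el . '>');
    return $open === $close;
}

var_dump(tags_balanced('<b>bold</b> text', 'b'));   // balanced pair
var_dump(tags_balanced('<b>bold text', 'b'));       // missing closing tag
```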