Automatically generate a nested table of contents based on heading tags

Posted on 2024-10-16 08:36:09

Which one of you crafty programmers can show me an elegant PHP solution for automatically generating a nested table of contents from the heading tags on a page?

So I have an HTML document like this:

<h1> Animals </h1>

Some content goes here.
Some content goes here.

<h2> Mammals </h2>

Some content goes here.
Some content goes here.

<h3> Terrestrial Mammals </h3>
Some content goes here.
Some content goes here.

<h3> Marine Mammals </h3>
Some content goes here.
Some content goes here.

<h4> Whales </h4>
Some content goes here.
Some content goes here.

More specifically, I want a linked table of contents in the form of a nested list of links to headings on the same page:

Table of Contents (automatically generated by PHP code)

  1. Animals
    1. Mammals
      1. Terrestrial_Mammals
      2. Marine_Mammals
        1. Whales

Comments (7)

[旋木] 2024-10-23 08:36:09

I don't find it elegant, but it might help in getting the general idea of how to create one ;)

It uses simple_html_dom to find and manipulate elements in the original HTML:

$htmlcode = <<< EOHTML
<h1> Animals </h1>
Some content goes here.
Some content goes here.
<h2> Mammals </h2>
Some content goes here.
Some content goes here.
<h3> Terrestrial Mammals </h3>
Some content goes here.
Some content goes here.
<h3> Marine Mammals </h3>
Some content goes here.
Some content goes here.
<h4> Whales </h4>
Some content goes here.
Some content goes here.
EOHTML;
// simple_html_dom or another DOM-manipulating library
require_once 'simple_html_dom.php';

$html = str_get_html($htmlcode);

$toc = '';
$last_level = 0;

foreach($html->find('h1,h2,h3,h4,h5,h6') as $h){
    $innerTEXT = trim($h->innertext);
    $id =  str_replace(' ','_',$innerTEXT);
    $h->id= $id; // add id attribute so we can jump to this element
    $level = intval($h->tag[1]);

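    if($level > $last_level)
        // open one list per level, so a skipped level (e.g. h1 -> h3) still closes correctly
        $toc .= str_repeat('<ol><li>', $level - $last_level - 1) . '<ol>';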
    else{
        $toc .= str_repeat('</li></ol>', $last_level - $level);
        $toc .= '</li>';
    }

    $toc .= "<li><a href='#{$id}'>{$innerTEXT}</a>";

    $last_level = $level;
}

$toc .= str_repeat('</li></ol>', $last_level);
$html_with_toc = $toc . "<hr>" . $html->save();
花开柳相依 2024-10-23 08:36:09

Here’s an example using DOMDocument:

$doc = new DOMDocument();
$doc->loadHTML($code); // $code holds the HTML source to index

// create document fragment
$frag = $doc->createDocumentFragment();
// create initial list
$frag->appendChild($doc->createElement('ol'));
$head = $frag->firstChild; // objects are handles, so no reference is needed
$xpath = new DOMXPath($doc);
$last = 1;

// get all H1, H2, …, H6 elements
foreach ($xpath->query('//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]') as $headline) {
    // get level of current headline
    sscanf($headline->tagName, 'h%u', $curr);

    // move head reference if necessary
    if ($curr < $last) {
        // move upwards
        for ($i=$curr; $i<$last; $i++) {
            $head = $head->parentNode->parentNode;
        }
    } else if ($curr > $last && $head->lastChild) {
        // move downwards and create new lists
        for ($i=$last; $i<$curr; $i++) {
            $head->lastChild->appendChild($doc->createElement('ol'));
            $head = $head->lastChild->lastChild;
        }
    }
    $last = $curr;

    // add list item
    $li = $doc->createElement('li');
    $head->appendChild($li);
    $a = $doc->createElement('a', $headline->textContent);
    $head->lastChild->appendChild($a);

    // build ID
    $levels = array();
    $tmp = $head;
    // walk subtree up to fragment root node of this subtree
    while (!is_null($tmp) && $tmp != $frag) {
        $levels[] = $tmp->childNodes->length;
        $tmp = $tmp->parentNode->parentNode;
    }
    $id = 'sect'.implode('.', array_reverse($levels));
    // set destination
    $a->setAttribute('href', '#'.$id);
    // add anchor to headline
    $a = $doc->createElement('a');
    $a->setAttribute('name', $id);
    $a->setAttribute('id', $id);
    $headline->insertBefore($a, $headline->firstChild);
}

// append fragment to document
$doc->getElementsByTagName('body')->item(0)->appendChild($frag);

// echo markup
echo $doc->saveHTML();
她说她爱他 2024-10-23 08:36:09

I found this method, by Alex Freeman (http://www.10stripe.com/articles/automatically-generate-table-of-contents-php.php):

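    // match h4 to h6 elements only (a stray "*" after the digit class would also match tags such as <hr>)
    preg_match_all('#<h[4-6][^>]*>.*?<\/h[4-6]>#',$html_string,$resultats);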

    //reformat the results to be more usable
    $toc = implode("\n",$resultats[0]);
    $toc = str_replace('<a name="','<a href="#',$toc);
    $toc = str_replace('</a>','',$toc);
    $toc = preg_replace('#<h([4-6])>#','<li class="toc$1">',$toc);
    $toc = preg_replace('#<\/h[4-6]>#','</a></li>',$toc);

    //plug the results into appropriate HTML tags
    $toc = '<div id="toc"> 
    <p id="toc-header">Table des matières</p>
    <hr />
    <ul>
    '.$toc.'
    </ul>
    </div><br /><br />';

    return $toc;

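In the HTML, each heading has to contain a named anchor, written as follows (note that the pattern above only matches h4 to h6, so adjust the character class for other levels):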

<h2><a name="target"></a>Text</h2>
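
To make the chain of replacements easier to follow, here is a minimal worked sketch that runs the same steps on a single sample heading. The sample string and the `$matches` variable name are assumptions for illustration; the heading is an h4 because the pattern only covers h4 to h6.

```php
<?php
// Sample input in the form the answer requires: heading with an empty named anchor.
$html_string = '<h4><a name="whales"></a>Whales</h4>';

// Collect every h4..h6 element.
preg_match_all('#<h[4-6][^>]*>.*?</h[4-6]>#', $html_string, $matches);

$toc = implode("\n", $matches[0]);
$toc = str_replace('<a name="', '<a href="#', $toc);    // anchor target becomes a link
$toc = str_replace('</a>', '', $toc);                    // drop the early closing tag
$toc = preg_replace('#<h([4-6])>#', '<li class="toc$1">', $toc);
$toc = preg_replace('#</h[4-6]>#', '</a></li>', $toc);   // close the link after the text

echo $toc, "\n";
// prints: <li class="toc4"><a href="#whales">Whales</a></li>
```

The result is a flat list whose nesting is left to CSS via the `toc4` to `toc6` classes, rather than nested list elements.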
最笨的告白 2024-10-23 08:36:09

I combined some of the above to build a nested index of the headings. The function also inserts named anchors into the HTML itself so that the index entries have something to link to. Pure PHP, no library needed.

function generateIndex($html)
{
    // capture the heading level (1-6) and the heading content
    preg_match_all('/<h([1-6])[^>]*>(.*?)<\/h[1-6]>/',$html,$matches);

    $index = "<ul>";
    $prev = 2;

    foreach ($matches[0] as $i => $match){

        $curr = $matches[1][$i];
        $text = strip_tags($matches[2][$i]);
        $slug = strtolower(str_replace("--","-",preg_replace('/[^\da-z]/i', '-', $text)));
        $anchor = '<a name="'.$slug.'">'.$text.'</a>';
        $html = str_replace($text,$anchor,$html);

        $prev <= $curr ?: $index .= str_repeat('</ul>',($prev - $curr));
        $prev >= $curr ?: $index .= "<ul>";

        $index .= '<li><a href="#'.$slug.'">'.$text.'</a></li>';

        $prev = $curr;
    }

    $index .= "</ul>";

    return ["html"=>$html,"index"=>$index];
}
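
One caveat with the function above: `str_replace($text, $anchor, $html)` replaces every occurrence of the heading text, including any in body paragraphs, and two headings with the same text get the same slug. A possible alternative, sketched below, rewrites only the heading tags with `preg_replace_callback` and numbers repeated slugs. The helper name `addHeadingIds` and its return shape are hypothetical, and the sketch assumes headings do not already carry an `id` attribute.

```php
<?php
// Hypothetical helper: adds an id attribute to every h1..h6 element and
// returns the rewritten HTML plus a flat list of [level, id, text] entries.
function addHeadingIds(string $html): array
{
    $seen = [];
    $entries = [];
    $out = preg_replace_callback(
        '#<h([1-6])([^>]*)>(.*?)</h\1>#is',
        function (array $m) use (&$seen, &$entries): string {
            $text = trim(strip_tags($m[3]));
            // Lowercase, then collapse every run of other characters into one dash.
            $slug = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($text)), '-');
            // Number repeated headings: slug, slug-2, slug-3, ...
            $seen[$slug] = ($seen[$slug] ?? 0) + 1;
            if ($seen[$slug] > 1) {
                $slug .= '-' . $seen[$slug];
            }
            $entries[] = [(int) $m[1], $slug, $text];
            // Only the heading tag itself is rewritten; body text is untouched.
            return "<h{$m[1]}{$m[2]} id=\"{$slug}\">{$m[3]}</h{$m[1]}>";
        },
        $html
    );
    return ['html' => $out, 'entries' => $entries];
}

$result = addHeadingIds('<h2>Marine Mammals</h2><p>Marine Mammals live at sea.</p>');
echo $result['html'], "\n";
```

The `entries` list can then be fed to whichever nesting loop you prefer to build the index.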
迷途知返 2024-10-23 08:36:09

Have a look at the TOC class. It generates a table of contents from nested headings. An h1 tag can be followed by any lower-level heading tag. The class uses recursion to extract the headings from the article text.
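
The recursive idea can be sketched roughly as follows. This is not the TOC class's actual code: `buildTree` and `renderTree` are hypothetical helpers, and the sketch assumes the headings have already been extracted into a flat list of `[level, text]` pairs.

```php
<?php
// Hypothetical helper: turns a flat list of [level, text] pairs into a tree.
// $pos is shared by reference so each recursive call resumes where the last stopped.
function buildTree(array $headings, int &$pos, int $level): array
{
    $nodes = [];
    while ($pos < count($headings)) {
        [$lvl, $text] = $headings[$pos];
        if ($lvl < $level) {
            break; // belongs to an ancestor list: unwind
        }
        if ($lvl > $level && $nodes) {
            // Deeper heading: recurse and attach the result to the previous node.
            $last = count($nodes) - 1;
            $nodes[$last]['children'] = array_merge(
                $nodes[$last]['children'],
                buildTree($headings, $pos, $lvl)
            );
            continue;
        }
        $nodes[] = ['text' => $text, 'children' => []];
        $pos++;
    }
    return $nodes;
}

// Hypothetical helper: renders the tree as nested ordered lists.
function renderTree(array $nodes): string
{
    if (!$nodes) {
        return '';
    }
    $out = '<ol>';
    foreach ($nodes as $node) {
        $out .= '<li>' . htmlspecialchars($node['text'])
              . renderTree($node['children']) . '</li>';
    }
    return $out . '</ol>';
}

$headings = [
    [1, 'Animals'], [2, 'Mammals'], [3, 'Terrestrial Mammals'],
    [3, 'Marine Mammals'], [4, 'Whales'],
];
$pos = 0;
$tree = buildTree($headings, $pos, 1);
echo renderTree($tree), "\n";
```

Because a deeper heading is always attached to the previous node, a skipped level (an h1 followed directly by an h3) still produces well-formed nesting.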

花开柳相依 2024-10-23 08:36:09

A short solution using SimpleHTMLDom:

public function getSummary($body) 
{
    $dom  = new Htmldom($body);
    $summ = "<ul>";
    $prev = 2;

    foreach($dom->find("h2,h3,h4") as $x => $htag) 
    {
        $curr = intval(substr($htag->tag, -1));

        $prev <= $curr ?: $summ .= "</ul>";
        $prev >= $curr ?: $summ .= "<ul>";

        $summ .= "<li>$htag->plaintext</li>";

        $prev = $curr;
    }

    $summ .= "</ul>";

    return $summ;
}
后eg是否自 2024-10-23 08:36:09

There is a very simple library for this: caseyamcl/toc

$html='<h1>Title</h1>text<h2>...</h2>...';
$tocGenerator = new TOC\TocGenerator();
$toc = $tocGenerator->getHtmlMenu($html);
echo $toc;

Bonus: the library can also fix headings that have no id attribute. Run this code before generating the menu:

$markupFixer = new TOC\MarkupFixer();
$html  = $markupFixer->fix($html);