Return the first "n" characters of an HTML string using PHP's DOM implementation
Given an HTML string, I would like to return a modified string with the following properties:
- The first n characters of the text contents (HTML tags aside) should remain.
- Elements after n characters have been met should be removed entirely.
- If n characters is not at the end of an element, text afterwards in the same element should not remain.
- Tags on elements at and before n characters should remain.
Basically, I just want to return a shortened version of the HTML, without the DOM structure being interrupted, and based on the length of the text contents only.
Using PHP's DOM implementation, it seems this will be overly complex. Using a pattern match isn't ideal as the conditions of the modified string might change over time, and it would require rewriting each time.
Am I missing an easier way of doing this? Thanks in advance.
Really?
Here's a very simple DOM implementation if you want the first 100 characters from inside the `<body>` tag and its child nodes. You could further massage this to remove newline characters and superfluous space/tab characters, or check the length of the `$content` string inside the `foreach` to break the loop and stop concatenation once you've reached a certain number of characters.
UPDATE
As per your comment, here's a simple way to count the characters inside HTML nodes and delete all the tags after the specified character limit is reached. Note that you can't perform the delete operation inside the original `foreach` because it causes the DOM to reindex the nodes and you won't get the results you expect. Instead, we store the nodes we want to delete in an array and delete them after the initial iteration.
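The original code for this answer did not survive in the text, so what follows is only a minimal sketch of the two approaches described, not the author's actual implementation. The function names `loadHtmlDocument`, `bodyTextPrefix`, and `truncateHtml` are hypothetical, and using the XPath `following` axis to find everything after the cut-off point is my own assumption about one reasonable way to do it. The sketch counts bytes with `strlen`, so it assumes single-byte (ASCII) text, and whitespace between tags counts toward the limit.

```php
<?php
// Hypothetical helper: parse an HTML string, silencing warnings about imperfect markup.
// loadHTML() wraps fragments in <html><body> by default.
function loadHtmlDocument(string $html): DOMDocument
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// First approach: concatenate the text inside <body> and stop once $limit is reached.
function bodyTextPrefix(string $html, int $limit = 100): string
{
    $body = loadHtmlDocument($html)->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $content = '';
    foreach ($body->childNodes as $node) {
        $content .= $node->textContent;
        if (strlen($content) >= $limit) {
            break; // enough text collected, stop concatenating
        }
    }
    return substr($content, 0, $limit);
}

// Second approach: keep the first $limit characters of text and the tags around them,
// and drop every node that comes after the cut-off point.
function truncateHtml(string $html, int $limit): string
{
    $dom = loadHtmlDocument($html);
    $xpath = new DOMXPath($dom);

    // Walk the text nodes in document order until the limit is used up.
    $remaining = $limit;
    $cutNode = null;
    foreach ($xpath->query('//body//text()') as $text) {
        $length = strlen($text->data);
        if ($length >= $remaining) {
            $text->data = substr($text->data, 0, $remaining);
            $cutNode = $text;
            break;
        }
        $remaining -= $length;
    }

    if ($cutNode !== null) {
        // Collect the nodes to delete in an array first, then delete them afterwards.
        // "following" excludes ancestors, so the enclosing tags stay in place.
        $toDelete = iterator_to_array($xpath->query('following::node()', $cutNode));
        foreach ($toDelete as $node) {
            if ($node->parentNode !== null) {
                $node->parentNode->removeChild($node);
            }
        }
    }

    // Serialize only the children of <body>, not the wrapper loadHTML() added.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Calling `truncateHtml('<div><p>Hello <b>world</b> and more</p><p>Second paragraph</p></div>', 8)` should keep the `<div>`, `<p>`, and `<b>` tags around "Hello wo" and drop the rest, including the second paragraph. For UTF-8 input you would probably want the `mb_*` string functions and an explicit charset hint for `loadHTML()`, which this sketch leaves out.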