Okay... I know <- can't be excluded from strip_tags through the allowable tags argument per se, but I'm trying to use a workaround. The workaround works fine on character sequences that wouldn't be valid HTML to begin with, such as << or <~. However, it fails when I use the code below to convert <- and -> to digits before strip_tags runs and then convert the digits back to <- and -> afterwards: whenever those symbols show up, all HTML from that point on is removed, or at least not processed. I understand I can't leave it alone through allowable tags, which is why I convert it before strip_tags and back after. But it's almost as if strip_tags still removes it, even though it's converted back on the line after the strip_tags call, since it removes <- and takes everything to the right of it too. Any ideas or other ways to try? I've also tried defining <- as <—
and tried replacing it with other symbols as well, such as #-, but no matter what I get the same outcome.
I should also mention that <- and -> aren't used together; they are used to point to things in the text. Like: internt <- is misspelled there.
`<?php
$data = file_get_contents("test.html");
$data = str_replace("<-", "999", $data);
$data = str_replace("->", "998", $data);
$data = strip_tags($data, '');
$data = str_replace("999", "<-", $data);
$data = str_replace("998", "->", $data);
echo $data;
?>`
I was gathering sample data and realized that if I remove a good chunk of the sample HTML, everything works fine. It turns out that if I strip actual HTML comments such as <!-- Header //-->
myself, the conversion goes fine, so I'm going to look for a regex to remove the HTML comments before the conversion and the strip_tags call.
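A possible explanation, offered as a guess rather than something confirmed in the thread: the closing `-->` of an HTML comment itself contains `->`, so the second str_replace rewrites it and the comment is never closed. strip_tags then tends to discard everything after the unterminated comment. Here is a minimal sketch of the replacement step, using a made-up sample string in place of the contents of test.html:

```php
<?php
// Hypothetical sample standing in for the contents of test.html.
$data = "<p>before</p><!-- Header //--><p>internt <- is misspelled</p>";

// Same replacements as in the question.
$step = str_replace("<-", "999", $data);
$step = str_replace("->", "998", $step);

// Inspect the intermediate string: the comment's "-->" has become "-998".
echo $step, PHP_EOL;

// False here means the comment no longer has a closing "-->".
$comment_is_closed = strpos($step, "-->") !== false;
var_dump($comment_is_closed);
```

If that is what is happening, it would also explain why swapping in other placeholder symbols made no difference: the damage is done by replacing `->`, not by the choice of replacement text.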
Update
I used the code below to remove the HTML comments first, and it works. Thanks for your help.
`$data = preg_replace('/<!--(.*)-->/', '', $data);`
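One caveat with that pattern: `(.*)` is greedy and `.` does not match newlines by default, so it can swallow text between two comments on the same line and miss comments that span several lines. Below is a hedged sketch of a variant with a lazy quantifier and the `s` modifier. The function name, the placeholder tokens and the sample string are all illustrative assumptions, not part of the original post:

```php
<?php
// Hypothetical helper name; not from the original post.
function strip_tags_keep_arrows(string $html): string
{
    // Remove HTML comments first. ".*?" is lazy; the "s" modifier lets "." match newlines.
    $html = preg_replace('/<!--.*?-->/s', '', $html);

    // Placeholders assumed not to occur in real input (plain digits like "999" could clash with text).
    $left  = "\x01LARROW\x01";
    $right = "\x01RARROW\x01";
    $html  = str_replace(['<-', '->'], [$left, $right], $html);

    // Strip every tag, then restore the arrows.
    $text = strip_tags($html);

    return str_replace([$left, $right], ['<-', '->'], $text);
}

// Made-up input with a single-line and a multi-line comment.
$sample = "<!-- Header //--><p>internt <- is misspelled</p>\n<!-- multi\nline --><b>see -> here</b>";
echo strip_tags_keep_arrows($sample), PHP_EOL;
```

Removing the comments before the arrow replacement matters here, because otherwise the `->` inside each `-->` would be replaced as well.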
Update:
Output (Source):
Output (HTML):