Adding a numeric attribute to every HTML tag
I need to add a custom attribute with an incrementing number to every HTML tag in a document, similar to this question, but for an HTML file rather than an XML file.
I tried to do it with HTML Agility Pack. Here is my code:
HtmlDocument htmldoc = new HtmlDocument();
htmldoc.LoadHtml(text);
var num = 1;
foreach (HtmlNode node in htmldoc.DocumentNode.DescendantNodes())
{
    node.Attributes.Add("gist_num", (num++).ToString());
}
var numberedfilename = Path.GetDirectoryName(fname) + @"\" + Path.GetFileNameWithoutExtension(fname) + "-num.htm";
htmldoc.Save(numberedfilename);
But I get a stack overflow exception in the HTML Agility Pack HtmlTextNode class.
I tried several ways to fix this by changing the class, but to no avail.
What would you suggest here?
--- edit ---
So, the exception is just "Stack Overflow" written to the console:
"Process is terminated due to StackOverflowException."
Since it is a stack overflow, there is no way to get any stack values.
Here is the code where VS shows the exception happening:
/// <summary>
/// Gets or Sets the text of the node.
/// </summary>
public string Text
{
    get
    {
        if (_text == null)
        {
            return base.OuterHtml;
        }
        return _text;
    }
    set { _text = value; }
}
So, any ideas?
Comments (1)
You need to filter the nodes so that you select only the elements. Enumerating the descendants in HTML Agility Pack also returns non-element nodes, such as text and comment nodes. Because you add attributes to every node indiscriminately, the library chokes when it serializes the non-element nodes.
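
As a rough illustration of that filtering, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the class name ElementNumbering and the method NumberElements are hypothetical names made up for this example, and it returns the markup as a string instead of saving to a file as the question does.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class ElementNumbering
{
    // Hypothetical helper: adds an incrementing "gist_num" attribute
    // to element nodes only, leaving text and comment nodes untouched.
    public static string NumberElements(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Keep only real elements; materialize the list before modifying nodes.
        var elements = doc.DocumentNode
            .DescendantNodes()
            .Where(n => n.NodeType == HtmlNodeType.Element)
            .ToList();

        var num = 1;
        foreach (HtmlNode node in elements)
        {
            node.Attributes.Add("gist_num", (num++).ToString());
        }

        // Serialize the modified document back to a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

In the code from the question, the same Where filter could be applied directly to htmldoc.DocumentNode.DescendantNodes() before the foreach loop, with the rest left unchanged.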