How to read/write text without escaping special characters (<, >, etc.)

Posted 2024-10-03 14:07:18, 525 words, 5 views, 0 comments

I am currently parsing some C# scripts that are stored in a database, extracting the body of some methods in the code, and then writing an XML file that shows the id, the body of the extracted methods, etc.

The problem I have right now is that when I write the code into the XML I have to write it as a literal string, so I thought I'd need to add " at the beginning and end:

new XElement("MethodName", @"""" + Extractor.GetMethodBody(rule.RuleScript, "MethodName") + @"""")

This works, but I have a problem, things that are written in the DB as

for (int n = 1; n < 10; n++)

are written into the XML file (or printed to console) as:

for (int n = 1; n &lt; 10; n++)

How can I get it to print the actual character and not its code? The code in the database is written with the actual characters, not a "safe" version like &lt;.

Comments (3)

明媚如初 2024-10-10 14:07:18

Inside XML (as a text value) it is correct for < to be encoded as &lt;. How the value is represented inside the XML doesn't change the value itself, so let it be encoded. You can get around this by forcing a CDATA section, but in all honesty it isn't worth it. Still, here is an example using CDATA:

string noEncoding = new XElement("foo", new XCData("a < b")).ToString();
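
To make the point about encoding concrete, here is a minimal sketch that serializes the same value once as plain text and once as CDATA, then parses both back. It assumes LINQ to XML (`System.Xml.Linq`) and C# top-level statements; the variable names are illustrative only.

```csharp
using System;
using System.Xml.Linq;

string code = "a < b";

// Plain text node: the serializer escapes '<' as &lt;
string asText = new XElement("foo", code).ToString();

// CDATA node: the raw character is kept inside the CDATA markers
string asCData = new XElement("foo", new XCData(code)).ToString();

Console.WriteLine(asText);   // <foo>a &lt; b</foo>
Console.WriteLine(asCData);  // <foo><![CDATA[a < b]]></foo>

// Both forms decode to the same value when read back
string textValue = XElement.Parse(asText).Value;
string cdataValue = XElement.Parse(asCData).Value;
Console.WriteLine(textValue == cdataValue); // True
```

The two documents look different on disk, but a reader gets the identical string from either one, which is why the CDATA form is rarely needed.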
2024-10-10 14:07:18

Why do you think that you have to write it as a literal string? That is not so. Besides, you are not writing it as a literal string at all; it's still a dynamic string value, only now with quotation marks added around it.

A literal string is a string that is written literally in the code, like "Hello world". If you get the string in any other way, it's not a literal string.

The quotation marks that you have added to the string simply become part of the value; they don't do anything else to the string. You can add the string without the quotation marks just fine:

new XElement("MethodName", Extractor.GetMethodBody(rule.RuleScript, "MethodName"))

Now, the characters are encoded when they are put in the XML because they need to be encoded. You can't put a < character inside a value without encoding it.

If you show the XML, you will see the encoded values, and that is just a sign that it works as it should. When you read the XML, the encoded characters will be decoded, and you end up with the original string.
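
The round trip described above can be sketched as follows. This is an illustration rather than the asker's actual code: `body` stands in for whatever `Extractor.GetMethodBody(...)` returns, and the example uses C# top-level statements.

```csharp
using System;
using System.Xml.Linq;

// Stand-in for the extracted method body (hypothetical value)
string body = "for (int n = 1; n < 10; n++)";

// No extra quotation marks: the string is passed as the element value as-is
var element = new XElement("MethodName", body);

// The serialized XML shows the encoded form
string serialized = element.ToString();
Console.WriteLine(serialized);
// <MethodName>for (int n = 1; n &lt; 10; n++)</MethodName>

// Reading the XML back decodes the entity and returns the original string
string roundTripped = XElement.Parse(serialized).Value;
Console.WriteLine(roundTripped == body); // True
```

The `&lt;` only exists in the serialized text; any consumer that reads the value through an XML parser sees the original `<`.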

紅太極 2024-10-10 14:07:18

I don't know what software he's going to use to read the XML, but any that I know of will throw an error on parsing any XML that does not escape < and > chars which aren't used as tag starts and ends. It's just part of the XML specification; these chars are reserved as part of the structure.

If I were you, then, I'd part ways with the System.Xml utilities and write this file myself. Any decent XML tool is going to encode those characters for you, so if you really need them unencoded you should probably not use such a tool. Go with a StreamWriter and create the output the way you are being told to. That way you control the XML output yourself, even if it means breaking the XML specification.

using (StreamWriter sw = new StreamWriter("c:\\xmlText.xml", false, Encoding.UTF8))
{
 sw.WriteLine("<?xml version=\"1.0\"?>");
 sw.WriteLine("<Class>");

 sw.Write("\t<Method Name=\"MethodName\">");
 sw.Write(@"""" + Extractor.GetMethodBody(rule.RuleScript, "MethodName") + @"""");
 sw.WriteLine("</Method>");

 // ... and so on and so forth

 sw.WriteLine("</Class>");
}
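
As a caution about the trade-off mentioned above, the following sketch shows what a standard parser does with hand-written output that contains a raw `<`, and one way to escape by hand if the file still has to be readable as XML. It assumes `System.Security.SecurityElement.Escape` as the escaping helper and uses an in-memory string instead of a file; `body` is a hypothetical stand-in for the extracted method body.

```csharp
using System;
using System.Security;
using System.Xml;
using System.Xml.Linq;

// Stand-in for the extracted method body (hypothetical value)
string body = "for (int n = 1; n < 10; n++)";

// Hand-written markup with the raw '<' left in place
string raw = "<Method>" + body + "</Method>";
bool rawParses;
try
{
    XElement.Parse(raw);
    rawParses = true;
}
catch (XmlException)
{
    // The parser treats "< 10" as the start of a malformed tag
    rawParses = false;
}
Console.WriteLine(rawParses); // False

// Escaping by hand keeps the output well-formed
string escaped = "<Method>" + SecurityElement.Escape(body) + "</Method>";
string decoded = XElement.Parse(escaped).Value;
Console.WriteLine(decoded == body); // True
```

So writing the file manually does give full control over the bytes, but output with unescaped characters is only usable by consumers that do not parse it as XML.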