Remove anchor from URL in C#
I'm trying to pull in an src value from an XML document, and in one that I'm testing it with, the src is:
<content src="content/Orwell - 1984 - 0451524934_split_2.html#calibre_chapter_2"/>
That creates a problem when trying to open the file. I'm not sure what that #(stuff) suffix is called, so I had no luck searching for an answer. I'd just like a simple way to remove it if possible. I suppose I could write a function to search for a # and remove anything after, but that would break if the filename contained a # symbol (or can a file even have that symbol?)
Thanks!
Comments (4)
If you had the src in a string you could use
Which would return the src without the # part. If the values you are retrieving are all web URLs, then this should work: the # introduces a bookmark in a URL that takes you to a specific part of the page.
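The snippet this answer referred to did not survive on this page, so the following is only a minimal sketch of one way to do what it describes, assuming the src is already in a string. The class and method names (`FragmentHelper`, `StripFragment`) are hypothetical and not from the original answer.

```csharp
using System;

public static class FragmentHelper
{
    // Hypothetical helper: returns src up to (not including) the first '#'.
    // If src contains no '#', it is returned unchanged.
    public static string StripFragment(string src)
    {
        int hashIndex = src.IndexOf('#');
        return hashIndex < 0 ? src : src.Substring(0, hashIndex);
    }
}
```

For the src in the question, `FragmentHelper.StripFragment("content/Orwell - 1984 - 0451524934_split_2.html#calibre_chapter_2")` returns `content/Orwell - 1984 - 0451524934_split_2.html`.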
You should be OK assuming that URLs won't contain a "#".
Source (search for "#" or "unsafe").
Therefore just use
String.Split()
with the "#" as the split character. This should give you 2 parts. In the highly unlikely event it gives more, just discard the last one and rejoin the remainder.
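The split-and-rejoin idea above could be sketched roughly as follows. The names `SplitHelper` and `DropLastFragment` are made up for illustration; the only library calls used are `String.Split` and `String.Join`.

```csharp
using System;

public static class SplitHelper
{
    // Hypothetical helper: splits on '#', discards the last part,
    // and rejoins whatever remains with '#'.
    public static string DropLastFragment(string src)
    {
        string[] parts = src.Split('#');

        // No '#' at all: nothing to remove.
        if (parts.Length == 1)
        {
            return src;
        }

        // Rejoin every part except the last one.
        return string.Join("#", parts, 0, parts.Length - 1);
    }
}
```

With this sketch, `"file.html#chapter"` becomes `"file.html"`, and the unlikely `"a#b.html#chapter"` becomes `"a#b.html"`, since only the text after the last "#" is treated as the anchor.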
From Wikipedia:
# is used in a URL of a webpage or other resource to introduce a "fragment identifier" – an id which defines a position within that resource. For example, in the URL http://en.wikipedia.org/wiki/Number_sign#Other_uses the portion after the # (Other_uses) is the fragment identifier, in this case indicating that the display should be moved to show the tag marked by ... in the HTML
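For absolute URLs like the Wikipedia one above, .NET's `System.Uri` class can separate the fragment identifier for you. The sketch below is one possible way to use it, with the hypothetical names `UriFragmentDemo`, `WithoutFragment` and `FragmentOf`. It assumes the input is an absolute URL: `Uri.Fragment` is only valid for absolute URIs, so a relative src such as the one in the question would first have to be resolved against a base URI.

```csharp
using System;

public static class UriFragmentDemo
{
    // Returns scheme, authority, path and query, i.e. everything before the fragment.
    public static string WithoutFragment(string absoluteUrl)
    {
        var uri = new Uri(absoluteUrl);
        return uri.GetLeftPart(UriPartial.Query);
    }

    // Returns the fragment identifier, including the leading '#',
    // or an empty string if the URL has no fragment.
    public static string FragmentOf(string absoluteUrl)
    {
        return new Uri(absoluteUrl).Fragment;
    }
}
```

For `http://en.wikipedia.org/wiki/Number_sign#Other_uses`, `WithoutFragment` gives `http://en.wikipedia.org/wiki/Number_sign` and `FragmentOf` gives `#Other_uses`.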
It's not always safe to remove the anchor from a URL. AJAX-style sites use the anchor to keep track of context. Gmail is an example: if you go to http://www.gmail.com/#inbox, you go directly to your inbox, but if you go to http://www.gmail.com/#all, you go to all your mail.
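So what you end up seeing can differ based on the anchor, even when the URL points at a single file. Note that the fragment is interpreted on the client (for example, by the page's script); it is not sent to the server as part of the HTTP request.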