Convert a file path to a file URI?
Does the .NET Framework have any methods for converting a path (e.g. "C:\whatever.txt") into a file URI (e.g. "file:///C:/whatever.txt")?
The System.Uri class has the reverse (from a file URI to absolute path), but nothing as far as I can find for converting to a file URI.
Also, this is not an ASP.NET application.
The System.Uri constructor has the ability to parse full file paths and turn them into URI-style paths. So you can just pass the full path to the constructor and read the resulting URI back from it.
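The snippet this answer refers to is not preserved here. A minimal sketch of the approach, assuming AbsoluteUri is the property you want to read back (the helper name ToFileUri is made up for illustration), written as C# top-level statements:

```csharp
using System;

// Hypothetical helper: let the Uri constructor parse an absolute file path,
// then read back the escaped URI string.
static string ToFileUri(string absolutePath) => new Uri(absolutePath).AbsoluteUri;

Console.WriteLine(ToFileUri(@"C:\whatever.txt")); // file:///C:/whatever.txt
```

The answers that follow describe inputs (percent signs, rooted Linux paths) where this simple form misbehaves.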
What no one seems to realize is that none of the System.Uri constructors correctly handles certain paths with percent signs in them. For a path such as C:\%51.txt, the constructor gives you "file:///C:/Q.txt" instead of "file:///C:/%2551.txt". Neither value of the deprecated dontEscape argument makes any difference, and specifying the UriKind gives the same result too. Trying with the UriBuilder doesn't help either: it returns "file:///C:/Q.txt" as well.
As far as I can tell, the framework is actually lacking any way of doing this correctly.
We can try to do it by replacing the backslashes with forward slashes and feeding the path to Uri.EscapeUriString. This seems to work at first, but if you give it the path C:\a b.txt then you end up with file:///C:/a%2520b.txt instead of file:///C:/a%20b.txt. Somehow it decides that some sequences should be decoded but not others.
Now we could just prefix with "file:///" ourselves, but this fails to take UNC paths like \\remote\share\foo.txt into account. What seems to be generally accepted on Windows is to turn them into pseudo-URLs of the form file://remote/share/foo.txt, so we should take that into account as well.
EscapeUriString also has the problem that it does not escape the '#' character. It would seem at this point that we have no other choice but to write our own method from scratch. What I suggest is a method that intentionally leaves + and : unencoded, as that seems to be how it's usually done on Windows. It also only encodes Latin-1, as Internet Explorer can't understand Unicode characters in file URLs if they are encoded.
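The method this answer proposes is not preserved here. The following is an illustrative reconstruction from the description above, not the author's original code: it keeps unreserved ASCII plus + and :, leaves characters above U+00FF untouched, percent-encodes everything else, and special-cases UNC paths. The name FilePathToFileUrl is an assumption, and the sketch assumes Windows-style input (drive-letter or UNC paths) only.

```csharp
using System;
using System.Text;

// Illustrative reconstruction; assumes Windows drive-letter or UNC paths.
static string FilePathToFileUrl(string filePath)
{
    var uri = new StringBuilder();
    foreach (char c in filePath)
    {
        // Characters passed through as-is: ASCII letters and digits, a few
        // punctuation marks (including '+' and ':'), and anything above Latin-1.
        bool keep =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') ||
            c == '+' || c == '/' || c == ':' || c == '.' || c == '-' || c == '_' || c == '~' ||
            c > '\xFF';
        if (keep)
            uri.Append(c);
        else if (c == '\\')
            uri.Append('/');                                  // normalize separators
        else
            uri.Append('%').Append(((int)c).ToString("X2"));  // percent-encode the rest
    }
    // After normalization a UNC path starts with "//", so the host follows "file:".
    bool isUnc = uri.Length >= 2 && uri[0] == '/' && uri[1] == '/';
    uri.Insert(0, isUnc ? "file:" : "file:///");
    return uri.ToString();
}

Console.WriteLine(FilePathToFileUrl(@"C:\a b.txt"));
Console.WriteLine(FilePathToFileUrl(@"C:\%51.txt"));
Console.WriteLine(FilePathToFileUrl(@"\\remote\share\foo.txt"));
```

Backslashes are matched literally rather than through Path.DirectorySeparatorChar so that the sketch behaves the same on any platform; that is a choice made for this example.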
The solutions above do not work on Linux.
Using .NET Core, attempting to execute new Uri("/home/foo/README.md") results in an exception. You need to give the CLR some hints about what sort of URL you have.
Resolving the path against a base Uri of "file://" works, and the string returned by fileUri.ToString() is "file:///home/foo/README.md".
This works on Windows, too:
new Uri(new Uri("file://"), @"C:\Users\foo\README.md").ToString()
...emits "file:///C:/Users/foo/README.md".
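The Linux snippet from this answer is not preserved, so here is a small sketch of the same base-URI pattern used in the Windows line above. The helper name ToFileUriWithBase is hypothetical, and the behaviour is the one this answer reports rather than something guaranteed across all runtime versions.

```csharp
using System;

// Hypothetical helper: resolve the path against an empty "file://" base so the
// runtime is told up front that a file URI is wanted.
static Uri ToFileUriWithBase(string absolutePath) =>
    new Uri(new Uri("file://"), absolutePath);

Uri fileUri = ToFileUriWithBase("/home/foo/README.md");
Console.WriteLine(fileUri.ToString());
```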
VB.NET:
Different outputs:
One liner:
At least in .NET 4.5+ you can also do:
Unfortunately, @poizan42's answer does not take into account the fact that we live in a Unicode world, and it is too restrictive according to RFC 3986.
The accepted answer from @pierre-arnaud and @jaredpar relies on the System.Uri constructor, which has to take care of too many components of the URI to be able to manage the variability of file names, and it fails badly on the percent character and other cases.
The other answers are simplistic or simply not useful. The best one would have been @is4's, but after posting the first version of this post I tested it in the test case I wrote for mine, and it fails on many Unicode characters.
In my case I started by looking into @poizan42's code and the various comments on what was working and what was not, so I took a slightly different approach.
First, I consider the input string to be a valid file path, so in my test I programmatically created paths using all the valid Unicode characters and surrogate pairs. With this I verified that Path.GetInvalidFileNameChars() seems to return the correct set, at least on Windows.
Then I passed these paths to a method that I implemented following the ABNF rules for path that you can find on page 22 of https://www.ietf.org/rfc/rfc3986.txt.
I compared its results with what the UriBuilder was generating, and this is the resulting fix:
This is totally unoptimized and performs three replaces, so feel free to convert it to Span or StringBuilder.
UrlCreateFromPath to the rescue! Well, not entirely, as it doesn't support extended and UNC path formats, but that's not so hard to overcome: in case the path starts with a special prefix, it gets removed. Although the documentation doesn't mention it, the function outputs the length of the URL even if the buffer is smaller, so I first obtain the length and then allocate the buffer.
One very interesting observation I had is that while \\device\path is correctly transformed to file://device/path, \\localhost\path specifically is transformed to just file:///path.
The WinApi function manages to encode special characters but leaves Unicode-specific characters unencoded, unlike the Uri constructor. In that case, AbsoluteUri contains the properly encoded URL, while OriginalString can be used to retain the Unicode characters.
The workaround is simple. Just use the Uri().ToString() method and percent-encode whitespace, if any, afterwards.
For the path C:\my exampleㄓ.txt, this properly returns file:///C:/my%20exampleㄓ.txt
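The one-liner for this workaround is not preserved, so the following is only a sketch of what the description suggests; the helper name ToFileUriKeepingUnicode is made up for illustration. It handles spaces only, so characters such as '#' or '%' would still need separate treatment.

```csharp
using System;

// Hypothetical helper: Uri.ToString() keeps Unicode characters readable,
// so only the spaces are percent-encoded afterwards.
static string ToFileUriKeepingUnicode(string absolutePath) =>
    new Uri(absolutePath).ToString().Replace(" ", "%20");

Console.WriteLine(ToFileUriKeepingUnicode(@"C:\my exampleㄓ.txt"));
```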