Parsing a URI that contains curly braces: URI::InvalidURIError: bad URI(is not URI?)

Posted 2024-12-26 14:01:37

I'm using Ruby 1.9.2-p290 and ran into a problem when trying to parse a URI like the following:

require 'uri'
my_uri = "http://www.anyserver.com/getdata?anyparameter={330C-B5A2}"
the_uri = URI.parse(my_uri)

This raises the following error:

URI::InvalidURIError: bad URI(is not URI?)

I need a solution other than encoding the curly braces every time, like this:

new_uri = URI.encode("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
=> "http://www.anyserver.com/getdata?anyparameter=%7B330C-B5A2%7D"

Now I can parse new_uri as usual, but I have to do this every time I need it. What is the simplest way to achieve the same result without encoding at every call site?

I'm posting my own solution because I hadn't seen the problem solved quite this way.


# Accepts URIs that contain curly braces.
# This overrides URI.parse so that it uses a parser whose UNRESERVED
# pattern also includes '{' and '}'.
module URI
  def self.parse(uri)
    URI::Parser.new(:UNRESERVED => URI::REGEXP::PATTERN::UNRESERVED + '{}').parse(uri)
  end
end

Now URI.parse(uri) accepts a uri containing curly braces and no error is raised.
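
If patching the URI module is not desirable, the encoding step can be kept in one place instead. The following is only a sketch: parse_with_braces and BRACE_ESCAPES are hypothetical names, not part of the URI library, and it assumes that only '{' and '}' need escaping. It uses String#gsub rather than URI.encode, which was removed in Ruby 3.0.

```ruby
require 'uri'

# Hypothetical lookup table (not part of the URI library): maps each brace
# to its percent-encoded form.
BRACE_ESCAPES = { '{' => '%7B', '}' => '%7D' }.freeze

# Hypothetical helper: percent-encodes only '{' and '}' and then delegates
# to the stock URI.parse, so the URI module itself is left untouched.
def parse_with_braces(uri_string)
  URI.parse(uri_string.gsub(/[{}]/, BRACE_ESCAPES))
end

uri = parse_with_braces("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
puts uri.query  # prints "anyparameter=%7B330C-B5A2%7D"
```

The trade-off is that callers must use the helper rather than URI.parse, and the braces arrive at the server percent-encoded.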

Comments (2)

心在旅行 2025-01-02 14:01:37
# Don't fail when a URI contains curly braces.
# This replaces DEFAULT_PARSER with a parser whose UNRESERVED pattern also includes '{' and '}'.
# DEFAULT_PARSER is used everywhere, so it's better to override it once.
module URI
  remove_const :DEFAULT_PARSER
  unreserved = REGEXP::PATTERN::UNRESERVED
  DEFAULT_PARSER = Parser.new(:UNRESERVED => unreserved + "\{\}")
end

Following up on the same issue: since DEFAULT_PARSER is used everywhere, it's better to replace it completely instead of overriding only the URI.parse method. Additionally, this avoids allocating a new Parser object on every call.
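
The "build the parser once" idea can also be sketched without touching DEFAULT_PARSER at all. This is an illustration under assumptions, not the answer's code: BRACE_TOLERANT_PARSER is a made-up constant name, and the sketch assumes a Ruby where the RFC 2396 parser is exposed as URI::RFC2396_Parser with its patterns under URI::RFC2396_REGEXP (on 1.9.2 the same classes are reached as URI::Parser and URI::REGEXP).

```ruby
require 'uri'

# Hypothetical constant name: a single shared parser whose UNRESERVED
# character set additionally contains '{' and '}'. It is built once and
# reused, so no parser object is allocated per call.
BRACE_TOLERANT_PARSER = URI::RFC2396_Parser.new(
  :UNRESERVED => URI::RFC2396_REGEXP::PATTERN::UNRESERVED + '{}'
)

uri = BRACE_TOLERANT_PARSER.parse("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
puts uri.host
puts uri.query
```

Unlike replacing DEFAULT_PARSER, this only affects code that calls the shared parser explicitly, so other users of the URI library keep the stock behaviour.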

画▽骨i 2025-01-02 14:01:37

RFC 1738 (http://www.faqs.org/rfcs/rfc1738.html) says that you do have to encode the braces:

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
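
As a rough illustration of the quoted rule, the sketch below lists the characters in a URL that fall outside the allowed set. The names ALLOWED_UNENCODED and chars_needing_encoding are hypothetical. It is a simplification: it accepts the RFC 1738 reserved characters (";", "/", "?", ":", "@", "=", "&") anywhere, without checking that they are used for their reserved purpose, and it lets "%" through as the escape introducer.

```ruby
# Hypothetical pattern: alphanumerics, the special characters $-_.+!*'(),
# the RFC 1738 reserved characters ; / ? : @ = & and the '%' escape sign.
ALLOWED_UNENCODED = /[A-Za-z0-9$\-_.+!*'(),;\/?:@=&%]/

# Hypothetical helper: returns the distinct characters of the URL that
# would have to be percent-encoded under the simplified rule above.
def chars_needing_encoding(url)
  url.chars.reject { |c| c =~ ALLOWED_UNENCODED }.uniq
end

p chars_needing_encoding("http://www.anyserver.com/getdata?anyparameter={330C-B5A2}")
# prints ["{", "}"]
```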