Hpricot error when parsing special characters in a URI
I'm working on a Ruby script to grab historical stock prices from Yahoo, using Hpricot to parse the pages. This is mostly straightforward: the URL is "http://finance.yahoo.com/q/hp?s=TickerSymbol". For example, to look up Google, I would use "http://finance.yahoo.com/q/hp?s=GOOG".
Unfortunately, it breaks down when I'm looking up the price of an index. The indexes are prefixed with a caret, such as "http://finance.yahoo.com/q/hp?s=^DJI" for the Dow.
The line:
ticker_symbol = '^DJI'
doc = Hpricot(open("http://finance.yahoo.com/q/hp?s=#{ticker_symbol}"))
throws this exception:
bad URI(is not URI?): http://finance.yahoo.com/q/hp?s=^DJI
Hpricot chokes on the caret (I think because the underlying Ruby URI library does). Is there a way to escape that character or force the library to try it?
Comments (2)
Well, don't I feel dumb. Five more minutes and I got this working:
So if anyone else is wondering, that's how you do it. facepalm
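The working code did not survive in this copy of the answer. As a rough sketch of one way it could be done, assuming the fix is to percent-encode the ticker symbol before interpolating it into the URL: `history_url` below is a hypothetical helper name, while `URI.encode_www_form_component` is part of Ruby's standard `uri` library.

```ruby
require 'uri'

# Hypothetical helper (not from the original answer): builds the history-page
# URL for a ticker, percent-encoding characters such as '^' that the URI
# parser rejects.
def history_url(ticker_symbol)
  "http://finance.yahoo.com/q/hp?s=#{URI.encode_www_form_component(ticker_symbol)}"
end

url = history_url('^DJI')

# The encoded URL now parses instead of raising URI::InvalidURIError.
parsed = URI.parse(url)
puts parsed

# The fetch itself would then look like the line in the question, e.g.
#   doc = Hpricot(open(url))
# (with open-uri loaded; newer Ruby versions spell this URI.open(url)).
```

Plain tickers such as `GOOG` pass through unchanged, so the same helper can be used for stocks and indexes alike.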
The escape for ^ is %5E; you could do a straight substitution on the URL.
http://finance.yahoo.com/q/hp?s=%5EDJI
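The straight substitution could be sketched like this. The variable names are illustrative only, and the approach assumes the caret is the only character in the URL that needs escaping.

```ruby
require 'uri'

# Illustrative names; the raw URL is the one from the question.
raw_url = 'http://finance.yahoo.com/q/hp?s=^DJI'

# A String pattern (not a Regexp) makes gsub match the literal '^'.
escaped_url = raw_url.gsub('^', '%5E')

# The substituted URL parses instead of raising URI::InvalidURIError.
URI.parse(escaped_url)
puts escaped_url
```

This is fine for a one-off, but it handles only the caret; a ticker containing other reserved characters would need a general-purpose encoder.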