YQL robots.txt restricted URL issue
I'm developing a webapp that includes the following YQL query:
SELECT * FROM html WHERE url="{URL}" and xpath="*"
I deployed a new version last week and noticed that the page was hanging on the YQL query. When I came back yesterday, the problem seemed to have fixed itself over the weekend. I just deployed a new version to the server and the problem has come back again. The server stack is Nginx / Passenger / Sinatra.
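For reference, the YQL query above maps onto a plain REST call. A minimal Ruby sketch that builds the request URL, assuming the historical public YQL endpoint at query.yahooapis.com (which Yahoo has since retired):

```ruby
require "uri"

# Build the REST form of the YQL query in question. The endpoint host is
# the historical public YQL endpoint (an assumption; Yahoo retired it).
def yql_url(target)
  query = %Q{SELECT * FROM html WHERE url="#{target}" and xpath="*"}
  "https://query.yahooapis.com/v1/public/yql" \
    "?q=#{URI.encode_www_form_component(query)}&format=json"
end

puts yql_url("http://example.com/")
```

Punching that URL into a browser reproduces what the YQL Console does, which makes the robots.txt error easier to poke at.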
Punching the query into YQL Console I get an error:
"Requesting a robots.txt restricted URL:"
I've added the following robots.txt:
User-agent: Yahoo Pipes 2.0
Allow: /
But that doesn't seem to do anything.
Thoughts? It's pretty curious to me why YQL is reporting that the URL is robots.txt-restricted when it isn't.
Comments (2)
I've had the same problem. I have a suspicion that this is in part a problem on Yahoo's end.
In my Sinatra apps I added...
get '/robots.txt' do
  content_type :text
  "User-agent: *\nAllow: /"
end
This would work occasionally... and then access would be denied again for a period of time.
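One pitfall worth calling out: a robots.txt body served as the single line `User-agent: * Allow: /` is misread, because robots.txt parsers consume one directive per line. A quick Ruby sketch of that line-by-line split (the `directives` helper is hypothetical, not part of any library):

```ruby
# Naive robots.txt directive split: one "Field: value" pair per line,
# mirroring how real parsers tokenize the file.
def directives(body)
  body.lines.map { |l| l.strip.split(":", 2).map(&:strip) }
end

p directives("User-agent: * Allow: /")
# => [["User-agent", "* Allow: /"]] -- the Allow rule is swallowed
p directives("User-agent: *\nAllow: /")
# => [["User-agent", "*"], ["Allow", "/"]]
```

With the one-line body, the whole string becomes the User-agent value and no Allow rule exists at all, so the file grants nothing.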
If you are using this to avoid cross-domain issues with javascript... I eventually gave in and used a local PHP script to retrieve data rather than use YQL.
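The same workaround can live in the Sinatra stack itself rather than a PHP script: fetch the remote page server-side so the browser's JavaScript only ever makes a same-origin request. A sketch (the `fetch_remote` helper and `/proxy` route are hypothetical names):

```ruby
require "net/http"
require "uri"

# Fetch a remote page server-side so browser JavaScript can request it
# same-origin. Wire it to a (hypothetical) Sinatra route such as:
#   get "/proxy" do
#     fetch_remote(params["url"]) or halt 400
#   end
def fetch_remote(url)
  target = URI(url)
  return nil unless %w[http https].include?(target.scheme)
  Net::HTTP.get(target)  # returns the response body as a String
end
```

Rejecting non-HTTP(S) schemes is a minimal guard; a real deployment would also want an allowlist of target hosts so the route can't be used as an open proxy.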
Consider adding &diagnostics=true to the YQL query. It worked for me.
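In practice that just means appending the parameter to the request URL; the response then carries a diagnostics element describing, among other things, the robots.txt fetch. A trivial sketch (the helper name is hypothetical):

```ruby
# Append diagnostics=true to an existing YQL request URL so the response
# includes the diagnostics element (which reports the robots.txt check).
def with_diagnostics(yql_url)
  sep = yql_url.include?("?") ? "&" : "?"
  "#{yql_url}#{sep}diagnostics=true"
end

puts with_diagnostics("https://query.yahooapis.com/v1/public/yql?q=...")
```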