Creating dynamic, Facebook-scraper-friendly URLs/pages
I'm creating a website with an image gallery that loads images through AJAX requests, and users can navigate between photos with the arrow keys and so on. To make the URLs easy to share, I change the hash in the address bar, and on page load I check the hash with JavaScript and redirect to the appropriate location if needed (just like Facebook does). The system works, but I can't figure out how to make it fetcher/crawler friendly.

For example, a user may copy the address http://mysite.com/photos#photo/123, where 123 is the photo ID. A normal browser will redirect to http://mysite.com/photo/123 and display the page without any problem, but I want this to keep working when a visitor pastes the address into Facebook too (as a link on their wall, etc.).

What is the best practice for doing this? Does Facebook have any "knowledge" of handling URL hashes outside its own scope? I currently don't have the chance to try it, and I don't think the crawler would parse and execute JavaScript to get to the right page.
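For reference, the client-side hash check described above could look roughly like the sketch below. The function name `canonicalPhotoPath` and the exact `#photo/<id>` pattern are assumptions made for illustration, not the site's actual code. Note that the part of a URL after `#` is not sent to the server in the HTTP request, so a fetcher that does not run JavaScript would only ever request `/photos`.

```javascript
// Hypothetical helper: map a gallery URL carrying a hash route to its
// canonical path. Returns null when the hash is not of the form "#photo/<id>".
function canonicalPhotoPath(url) {
  const parsed = new URL(url);
  // parsed.hash includes the leading "#", e.g. "#photo/123"
  const match = /^#photo\/(\d+)$/.exec(parsed.hash);
  return match ? `/photo/${match[1]}` : null;
}

// In a browser, redirect only when a matching hash route is present.
// The guard keeps the snippet harmless outside a browser (e.g. in Node.js).
if (typeof window !== "undefined") {
  const target = canonicalPhotoPath(window.location.href);
  if (target !== null) {
    window.location.replace(target);
  }
}
```

Keeping the mapping in a small pure function makes it easy to check separately from the redirect itself.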
Comments (1)
If you are, or your web hosting provider is, running an Apache HTTP server, this can be accomplished with URL rewrites, either in your httpd.conf or on a per-directory basis with .htaccess files (the most common way, particularly in a shared hosting environment where you have limited control over Apache's configuration).

Try putting this in a .htaccess file in your base directory. (Note: this is off the top of my head; use it only as a starting point.)