SEO and AJAX (Twitter-style)

Published 2024-10-27 11:46:52


Okay, so I'm trying to figure something out. I am in the planning stages of a site, and I want to implement "fetch data on scroll" via jQuery, much like Facebook and Twitter, so that I don't pull all the data from the DB at once.

But I have some problems regarding SEO: how will Google be able to see all the data? Because the page fetches more data automatically when the user scrolls, I can't include any "go to page 2"-style links; I want Google to just index that one page.

Any ideas for a simple and clever solution?


Comments (3)

喵星人汪星人 2024-11-03 11:46:52


Put links to page 2 in place.

Use JavaScript to remove them if you detect that your autoloading code is going to work.

Progressive enhancement is simply good practice.
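The progressive-enhancement idea above might be sketched as follows. The URL scheme (`/posts?page=N`), the `.pagination` class and the `#posts` container are hypothetical placeholders, not anything from the question; the jQuery wiring is guarded so the pure helper can run anywhere:

```javascript
// Pure helper: given the current page number, build the URL of the next
// page of results. "/posts?page=N" is a hypothetical URL scheme.
function nextPageUrl(currentPage) {
  return '/posts?page=' + (currentPage + 1);
}

// Browser-only wiring (skipped outside a browser): hide the server-rendered
// "go to page 2" links and fetch the next page on scroll instead. Crawlers
// and no-JS visitors still see and follow the plain pagination links.
if (typeof jQuery !== 'undefined') {
  jQuery(function ($) {
    var page = 1;
    $('.pagination').hide();

    $(window).on('scroll', function () {
      var nearBottom =
        $(window).scrollTop() + $(window).height() >
        $(document).height() - 200;
      if (nearBottom) {
        $.get(nextPageUrl(page), function (html) {
          $('#posts').append(html); // server returns an HTML fragment
          page += 1;
        });
      }
    });
  });
}
```

Because the pagination links are only removed once the script runs, the fallback costs nothing when JavaScript is available and everything still works when it isn't.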

伏妖词 2024-11-03 11:46:52


You could use PHP (or another server-side script) to detect the user agent of webcrawlers you specifically want to target such as Googlebot.

In the case of a webcrawler, you would have to use non-JavaScript-based techniques to pull down the database content and lay out the page. I would recommend not paginating the search-engine-targeted content - assuming that you are not paginating the "human" version. The URLs discovered by the webcrawler should be the same as those your (human) visitors will visit. In my opinion, the page should only deviate from the "human" version by having more content pulled from the DB in one go.

A list of webcrawlers and their user agents (including Google's) is here:

http://www.useragentstring.com/pages/Crawlerlist/

And yes, as stated by others, don't rely on JavaScript for content you want to be seen by search engines. In fact, JavaScript is quite frequently used precisely when a developer *doesn't* want something to appear in search engines.

All of this comes with the rider that it assumes you are not paginating at all. If you are, then you should use a server-side script to paginate your pages so that they are picked up by search engines. Also, remember to put sensible limits on the amount of your DB that you pull for the search engine. You don't want it to time out before it gets the page.
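The answer describes the detection in PHP; the same idea server-side in JavaScript might look like the sketch below. The token list is illustrative, not exhaustive (Googlebot, bingbot, Slurp and DuckDuckBot are real crawler user-agent tokens, but there are many more):

```javascript
// Known crawler tokens to look for in the User-Agent header.
// Illustrative subset only - real deployments should use a maintained list.
var CRAWLER_TOKENS = ['Googlebot', 'bingbot', 'Slurp', 'DuckDuckBot'];

// Return true when the User-Agent string matches a known crawler.
// Matching is case-insensitive substring search, which is how most
// server-side UA checks work in practice.
function isCrawler(userAgent) {
  if (!userAgent) return false;
  var ua = userAgent.toLowerCase();
  return CRAWLER_TOKENS.some(function (token) {
    return ua.indexOf(token.toLowerCase()) !== -1;
  });
}
```

A request handler could then branch on `isCrawler(req.headers['user-agent'])` and render the full, non-paginated content for crawlers. Note the caveat in the answer still applies: the crawler version should contain the same content humans eventually see, just delivered without the scroll-triggered fetches.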

千秋岁 2024-11-03 11:46:52


Create a Google Webmaster Tools account, generate a sitemap for your site (manually, automatically or with a cronjob - whatever suits) and tell Google Webmaster Tools about it. Update the sitemap as your site gets new content. Google will crawl this and index your site.

The sitemap will ensure that all your content is discoverable, not just the stuff that happens to be on the homepage when Googlebot visits.

Given that your question is primarily about SEO, I'd urge you to read this post from Jeff Atwood about the importance of sitemaps for Stack Overflow and the effect it had on traffic from Google.

You should also add pagination links that are hidden by your stylesheet and serve as a fallback for when your endless scroll is disabled by someone not using JavaScript. If you're building the site right, these will just be the partials that your endless scroll loads anyway, so making sure they're on the page is a no-brainer.
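Generating the sitemap itself is straightforward. Here is a minimal sketch that turns a list of page URLs into `sitemap.xml` markup following the sitemaps.org protocol; how you collect the URLs (from the DB, from your router, etc.) is up to you:

```javascript
// Build sitemap.xml content from a list of absolute URLs, following the
// sitemaps.org 0.9 protocol. Run this from a cronjob or whenever new
// content is published, then submit the file in Google Webmaster Tools.
function buildSitemap(urls) {
  var entries = urls.map(function (u) {
    return '  <url>\n    <loc>' + u + '</loc>\n  </url>';
  });
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries.join('\n') +
    '\n</urlset>\n'
  );
}
```

A real generator would also escape XML special characters in the URLs and split the output into multiple files once it exceeds the protocol's 50,000-URL limit; this sketch omits both for brevity.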
