Automatically detecting product data feeds on arbitrary e-commerce sites?

Posted on 2024-08-23 10:57:51

My web app needs to access an arbitrary e-commerce store and determine whether or not it has a product data feed (e.g. a Google Base feed, or an RSS/Atom feed of all products in the store). I also need to extract the location of this feed.

The best solution I can think of so far is to maintain a comprehensive list of known locations of these feeds for given e-commerce platforms and check them one by one against the site, crossing them off the list as they come back 404.

Two questions:

  1. Can you think of a better approach?
  2. How would I go about generating this list of known product data feed locations? In my experience, they are generally not made public (unlike blog RSS feeds).

Thanks so much! :)

-Rich

Comments (1)

森罗 2024-08-30 10:57:51

Can you think of a better approach?

Use search engine APIs to discover feeds. You could try using the Google, Bing and Yahoo search APIs to discover product feeds on the domains you are interested in. This could be done as follows:

  1. List the public feed formats you are interested in (e.g. Google Base, Shopzilla).
  2. Examine each feed spec for unique strings you can search on.
  3. Craft search API queries that return relevant results (restrict by domain, file type and so on).
  4. Test the links you get back to see whether they are product feeds.

Obviously, this assumes that the feeds have been found and indexed by the search engines.
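As a rough illustration of steps 2 to 4, here is a minimal Python sketch. It assumes the search API you pick accepts the common `site:` and `filetype:` operators, and it uses the Google Base XML namespace URI as the "unique string" for that format. The names `FEED_MARKERS`, `build_queries` and `looks_like_product_feed` are made up for this example, and whether a search engine actually indexes a namespace URI inside an XML file is something you would have to test.

```python
import xml.etree.ElementTree as ET

# Assumption: one hand-picked marker string per feed format (step 2).
# The Google Base namespace URI is the only marker included here; add
# markers for other formats after reading their specs.
FEED_MARKERS = {
    "google_base": "http://base.google.com/ns/1.0",
}


def build_queries(domain, markers=FEED_MARKERS, filetypes=("xml",)):
    """Build one query string per (feed format, file type) pair (step 3)."""
    queries = []
    for marker in markers.values():
        for filetype in filetypes:
            queries.append(f'site:{domain} filetype:{filetype} "{marker}"')
    return queries


def looks_like_product_feed(body, markers=FEED_MARKERS):
    """Return the matching format name for a fetched document, or None (step 4).

    `body` is the raw response as bytes. The document must be well-formed
    XML and contain one of the marker strings.
    """
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return None
    for name, marker in markers.items():
        if marker.encode("utf-8") in body:
            return name
    return None
```

You would pass each string from `build_queries("example.com")` to whichever search API you use, fetch the result links, and keep only those where `looks_like_product_feed` returns a format name.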

How would I go about generating this list of known product data feed locations?

I don't believe there is such a thing as a "known location" for a product data feed. However, you could try including the following patterns in your algorithm:

  • URL patterns from any feeds you already know about.
  • URL patterns you have guessed (put yourself in the webmaster's shoes and think what he/she would name them).
  • URL patterns taken from the documentation of commonly used e-commerce software and product data feed plugins, which should tell you their default feed locations.
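To show how such patterns could be plugged into the "check them one by one" approach from the question, here is a small Python sketch. The paths in `CANDIDATE_PATHS` are placeholder guesses for illustration only, not documented defaults of any platform, and the function names are hypothetical. The HTTP check is passed in as a parameter so the logic can be tried without network access.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Hypothetical guesses for illustration; replace with patterns collected
# from known feeds and from platform or plugin documentation.
CANDIDATE_PATHS = ["/feed.xml", "/products.xml", "/products.rss", "/googlebase.xml"]


def candidate_urls(base_url, paths=CANDIDATE_PATHS):
    """Combine the store's base URL with each candidate path."""
    return [urljoin(base_url, path) for path in paths]


def http_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None on network failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a path that does not exist
    except OSError:
        return None  # DNS failure, refused connection, timeout


def find_feeds(base_url, paths=CANDIDATE_PATHS, status_of=http_status):
    """Return the candidate URLs that respond with HTTP 200."""
    return [url for url in candidate_urls(base_url, paths) if status_of(url) == 200]
```

A 200 response only means that something is served at that URL. Some sites return 200 for missing pages, so the body still needs to be checked for feed content before you record the location.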