A question about caching for a high-traffic website

Posted on 2024-08-20 17:39:19

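Suppose we are building an e-commerce website that lets consumers search for products by entering keywords. Assume there are at most 200,000 products and millions of consumers using the system. Assume also that the product table is updated fairly frequently. Since the number of products is not that large, we could keep the entire product table in memory and search against it instead of hitting the database. We would like to create distributed caches that store the same data but reside on different servers (for high availability and performance), and we need to be able to synchronize data between these caches and invalidate them when the product table is modified.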

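Our application is built with ASP.NET MVC and NHibernate. I am trying to understand whether NHibernate's second-level cache would help in my situation. I would really appreciate it if you could shed some light on this.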

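I understand that the second-level cache helps cache query results, so if two different users search with the same keyword, the results are served from the cache rather than from the database. But that does not help us much, because the product table is updated frequently and the cached results would become stale. My question is whether I understand the L2 cache correctly, and whether there is anything out there that would help manage the cache the way I want (multiple caches, same data, synchronization between caches, and cache invalidation). Any thoughts are highly appreciated.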

虚拟世界 2024-08-27 17:39:19

Having used both the second-level cache (with the memcached provider) and the NHibernate.Search add-on, it seems to me you could benefit from both.

The NHibernate.Search component depends on Lucene.Net, and keyword search is decoupled from the database itself. A separate index file is created per mapped class, and optimizations can be set at the property level using attributes, giving you an extra level of granularity. Additionally, you can implement best-match results and suggestions (check "Lucene in Action" and/or "Hibernate Search in Action"). Note that you don't have to maintain the index yourself (unless you explicitly request an index rebuild); the implementation manages everything behind the scenes, although you can manipulate the index if you wish to do so. So adding, deleting, or updating a product will automatically update the corresponding index.
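To make the "index stays in sync with add/delete/update" idea a bit more concrete, here is a minimal Python sketch of an in-memory inverted index. It is only an illustration of the general technique, not how Lucene.Net or NHibernate.Search actually work internally; the `ProductIndex` class and its method names are made up for this example.

```python
import re
from collections import defaultdict


class ProductIndex:
    """Toy inverted index: keyword -> set of product ids (hypothetical, not Lucene)."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> ids of products containing it
        self._tokens_by_id = {}            # product id -> tokens currently indexed

    @staticmethod
    def _tokenize(text):
        # Lowercase the text and split it on non-alphanumeric characters.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def upsert(self, product_id, text):
        """Index a new product, or re-index an updated one."""
        self.remove(product_id)
        tokens = self._tokenize(text)
        self._tokens_by_id[product_id] = tokens
        for token in tokens:
            self._postings[token].add(product_id)

    def remove(self, product_id):
        """Drop a product from the index; a no-op if it was never indexed."""
        for token in self._tokens_by_id.pop(product_id, set()):
            ids = self._postings[token]
            ids.discard(product_id)
            if not ids:
                del self._postings[token]

    def search(self, query):
        """Return the ids of products that contain every keyword in the query."""
        tokens = self._tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self._postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result
```

In this sketch the application would call `upsert` after every insert or update and `remove` after every delete, which is roughly the bookkeeping the add-on takes off your hands.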

With the second-level cache you get an instant performance boost. In a test environment with a data set of approximately 2 million rows, I saw more than a 20% improvement even at an extremely low request count. The boost grows as the request count increases - the application first hits the second-level cache and, if it does not find what it needs there, hits the DB to fetch the required rows and inserts them into the cache for future queries. Again, you can manage things like cache duration and other configuration settings, and you can explicitly clear the cache (all of it, part of it, or particular entries) if you wish to do so. Note that cache state is managed by the application during save/update/delete.
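The read path and the invalidation on writes described above follow what is usually called the cache-aside pattern. Below is a rough Python sketch of it, with plain dictionaries standing in for both the database and memcached. The `CachedProductStore` class, its methods, and the `db_reads` counter are all hypothetical names for illustration; this is not NHibernate's API.

```python
class CachedProductStore:
    """Cache-aside sketch: dicts stand in for the database and for memcached."""

    def __init__(self, database):
        self._db = database   # product id -> row; stand-in for the real table
        self._cache = {}      # stand-in for the second-level cache
        self.db_reads = 0     # counts how often the "database" was hit

    def get(self, product_id):
        # 1. Try the cache first.
        if product_id in self._cache:
            return self._cache[product_id]
        # 2. On a miss, read from the database...
        self.db_reads += 1
        row = self._db.get(product_id)
        # 3. ...and populate the cache for future reads.
        if row is not None:
            self._cache[product_id] = row
        return row

    def save(self, product_id, row):
        # Write to the database, then evict the now-stale cache entry.
        self._db[product_id] = row
        self._cache.pop(product_id, None)

    def delete(self, product_id):
        # Remove the row and its cached copy.
        self._db.pop(product_id, None)
        self._cache.pop(product_id, None)

    def clear_cache(self):
        # Explicitly drop every cached entry.
        self._cache.clear()
```

Because every write goes through `save` or `delete`, a stale entry is evicted as part of the write, and the next read repopulates the cache from the database.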

For scalability:
* The second-level cache depends on the provider (e.g. memcached is highly performant and scalable and supports distributed instances).
* For Lucene.Net/NHibernate.Search, you will need to set up a specific location where the indexes reside, and that location must be accessible for read/write by all web-application instances. Note that the weak link here is I/O and file contention, so putting the indexes on a machine with an extremely fast file system will help avoid that (I am speaking of your scenario, with many thousands of search requests per second).

As a side note, I would highly recommend NHibernate.Search, since it is much faster than LIKE queries and easier to use than implementing SQL Server's full-text search inside the application (which I have done).
