RDBMS 作为缓存，需要设计建议

发布于 2024-12-17 16:59:40 字数 813 浏览 2 评论 0原文

我有一个“黑匣子”应用程序，它获取值映射作为参数，执行繁重且长时间（最多 5 秒）的计算，并生成可以保存在数据库中的单个 Result。我对该应用程序的了解是：

结果相对于提供的映射 af 值是唯一的
参数是一个 String->String 映射，两者的最大长度都已知键和值
参数映射的长度可变（从 2-3 到 1000 个条目或 so)
可能的键值列表的大小约为 1000 个

示例参数为：

Map: {'k1'->'a', 'k2'->'b'} 
Map: {'k1'->'a', 'k2'->'b', ... 'k100'->'zzz'}
Map: {'k1'->'x', 'k8'->'y'}
Map: {'k6'->'z'}

上述每个参数都会生成唯一的 Result 对象。

现在想象另一个服务，它构建在那个缓慢的库之上，并且需要上线并每秒处理数十个计算请求。如果不缓存已经计算的结果，这是不可能的。我对可能的缓存大小总数的估计约为 100-5 亿条记录，这使我倾向于使用 RDBMS 作为缓存存储。

由于结果由提供的映射唯一标识，我可以按键对参数映射进行排序并将其连接到字符串“k1:a:k2:b....”。这肯定是缓存键，但是：

缓存键将会很大，超过许多 RDBMS 的键大小限制，并且需要索引 CLOB
我不会利用键值限制的事实可能的值。

你有什么建议？性能是我在这里主要关心的问题。

原文

I have a 'black box' application that gets a map of values as parameters, performs heavy and long (up to 5s) calculations and generates single Result which can be persisted in a database.
All I know about that application is that:

Result is unique with respect to provided map af values
Argument is a String->String map with known maximun length for both
key and value
Argument map is of variable length (from 2-3 up to 1000 entries or
so)
The size of list of possible key values is around 1000

Sample arguments are:

Map: {'k1'->'a', 'k2'->'b'} 
Map: {'k1'->'a', 'k2'->'b', ... 'k100'->'zzz'}
Map: {'k1'->'x', 'k8'->'y'}
Map: {'k6'->'z'}

Each of the above will produce unique Result object.

Now imagine another service, which is built on top of that slow library, and which needs to go online and handle dozens of calculation requests per second.
This is impossible without caching of already calculated results.
My estimation of total number of possible cache size is somewhat around 100-500 millions of records, which leads me towards using RDBMS as cache storage.

As the result is uniquely identified by provided map, I could sort argument map by key and concatenate it into the string 'k1:a:k2:b....'. That will definetely be the cache key, but:

Cache key will be huge, above key size limits for many RDBMS and
require indexed CLOB's
I will make no use of the fact that key values are limited in
possible values.

What'd be your advice? Performance is my main concern here.

分享到QQ

分享到微博