I'm learning Cassandra's read path.
According to some sources:
"When Cassandra receives the read request, data will be searched first in the Memtable, then data will be searched in SSTables and if data exists it is returned"
Also, I know that Memtables are periodically flushed to SSTables on disk.
My questions:
- Are memtables fully removed from RAM after they are flushed to SSTables?
- Suppose a node receives a read request, and the node contains both memtables and SSTables. Can Cassandra get the required data from memtables alone, without accessing SSTables? If yes, when is that possible, and how can Cassandra determine that the required data is stored only in memtables and that no related data is stored on disk (SSTables)?
Comments (1)
Short answer to the 2nd question: no. Cassandra will always check SSTables even if the data is in the memtable. The reason is that the data in the memtable could be older than the data in an SSTable, for example if you explicitly set the write timestamp for records, or if data is replayed from hints on another node. When a memtable is flushed, its data is removed from memory. In some cases, though, you can use the row cache if you have data that is accessed often.
You can read more about the read path in the DSE arch guide.
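
To illustrate why the memtable alone is not enough, here is a minimal toy sketch in Python. It is not Cassandra's actual code: the `read` function, the dictionary-based memtable and SSTables, and the sample keys and values are all hypothetical. It only assumes that each write carries a timestamp and that the newest timestamp wins when versions are merged.

```python
from typing import Dict, List, Optional, Tuple

# Toy model: every source maps a key to (value, write_timestamp).
# These structures are illustrative assumptions, not Cassandra internals.
Cell = Tuple[str, int]


def read(key: str,
         memtable: Dict[str, Cell],
         sstables: List[Dict[str, Cell]]) -> Optional[str]:
    """Return the value with the highest write timestamp across all sources."""
    candidates: List[Cell] = []
    if key in memtable:
        candidates.append(memtable[key])
    # The SSTables are consulted even when the memtable has the key,
    # because an SSTable may hold a version with a newer timestamp.
    for sstable in sstables:
        if key in sstable:
            candidates.append(sstable[key])
    if not candidates:
        return None
    value, _timestamp = max(candidates, key=lambda cell: cell[1])
    return value


# An SSTable on disk already holds a write with timestamp 200.
sstables = [{"user:1": ("new@example.com", 200)}]
# A write that arrived later (for example a replayed hint, or a write with an
# explicitly set timestamp) sits in the memtable with the older timestamp 100.
memtable = {"user:1": ("old@example.com", 100)}

result = read("user:1", memtable, sstables)
print(result)  # new@example.com
```

If the read stopped at the memtable, it would return the stale `old@example.com` value, which is the situation the answer describes.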