When should I use REQ_OP_FLUSH in my kernel blockdev driver? (Do REQ_OP_FLUSH bios flush dirty RAID controller caches?)
When should I use REQ_OP_FLUSH in my kernel blockdev driver, and what is the expected behavior of the hardware that receives the REQ_OP_FLUSH (or equivalent SCSI cmd)?
In the Linux kernel, when a struct bio flagged as REQ_OP_FLUSH is passed to a RAID controller volume in writeback mode, is the RAID controller supposed to flush its dirty caches?
It seems to me that this is the purpose of REQ_OP_FLUSH, but that is at odds with wanting to be fast with writeback: if the cache is battery-backed, shouldn't the controller ignore the flush?
In ext4's super.c, the ext4_sync_fs() function skips its call to blkdev_issue_flush() when barriers are disabled via the barrier=0 mount option. This seems to imply that RAID controllers will flush their caches when they are told to... but does RAID firmware ever break the rules?
- Is the flush behavior dependent on the firmware implementation and manufacturer?
- Where is the SAS/SCSI specification on the subject?
- Other considerations?
Comments (1)
Christoph Hellwig on the linux-block mailing list said:
Keith Busch at kernel.org:
If this sounds backwards, then consider a RAID controller cache as an
example:
1. A RAID controller with a non-volatile "writeback" cache (from the
controller's perspective, i.e., with battery) is a "write through"
device as far as the kernel is concerned because the controller will
return the write as complete as soon as it is in the persistent cache.
2. A RAID controller with a volatile "writeback" cache (from the
controller's perspective, i.e., without battery) is a "write back"
device as far as the kernel is concerned because the controller will
return the write as complete as soon as it is in the cache, but the
cache is not persistent! So in that case flush/FUA is necessary.
[ Reference: https://lore.kernel.org/all/[email protected]/ ]
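The kernel's view described above can be inspected from userspace through the queue/write_cache attribute in sysfs. The following is a minimal sketch, not a definitive tool: the function name `kernel_sends_flushes` and the `sysfs_root` parameter are hypothetical, and it assumes the attribute reads either "write back" or "write through".

```python
from pathlib import Path


def kernel_sends_flushes(disk, sysfs_root="/sys/block"):
    """Return True if the kernel treats `disk` as having a volatile write cache.

    `disk` is a block device name such as "sda" (hypothetical example name).
    `sysfs_root` exists only so the function can be pointed at a test directory.
    """
    attr = Path(sysfs_root) / disk / "queue" / "write_cache"
    mode = attr.read_text().strip()
    if mode == "write back":
        # Volatile cache from the kernel's point of view: flush/FUA is needed.
        return True
    if mode == "write through":
        # No volatile cache reported: the kernel does not need to flush.
        return False
    raise ValueError(f"unexpected write_cache value: {mode!r}")
```

For example, `kernel_sends_flushes("sda")` would report how the kernel currently classifies the first SCSI disk, assuming such a device exists on the machine.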
From personal experience, not all RAID controllers will properly set queue/write_cache as indicated by Keith above. If you know your array has a non-volatile cache running in write-back mode, check that /sys/block/<disk>/queue/write_cache reads "write through" so that flushes will be dropped, and fix it if it isn't in the proper mode. The settings below might seem backwards, but if they do, then re-read #1 and #2 above because these are correct:
If you have a non-volatile cache (i.e., with BBU), queue/write_cache should be "write through".
If you have a volatile cache (i.e., without BBU), queue/write_cache should be "write back".
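As a rough illustration of applying the two settings above, the sketch below writes the matching value into queue/write_cache. The function name `set_kernel_cache_view` and its parameters are hypothetical. On a real system the write requires root, and it only changes how the kernel treats the device, not the controller's own cache configuration.

```python
from pathlib import Path


def set_kernel_cache_view(disk, cache_is_nonvolatile, sysfs_root="/sys/block"):
    """Set queue/write_cache for `disk` and return the value that was written.

    cache_is_nonvolatile=True  (e.g. BBU present) -> "write through"
    cache_is_nonvolatile=False (no BBU)           -> "write back"
    `sysfs_root` is a hypothetical parameter used to allow testing.
    """
    mode = "write through" if cache_is_nonvolatile else "write back"
    attr = Path(sysfs_root) / disk / "queue" / "write_cache"
    attr.write_text(mode)  # needs root privileges under the real /sys/block
    return mode
```

Calling `set_kernel_cache_view("sda", cache_is_nonvolatile=True)` would correspond to the BBU case, again assuming a device named sda.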
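So the answer to the question about when to flag REQ_OP_FLUSH in your kernel code is this: whenever you think your code should commit to disk. Since the block layer can re-order any bio request, a flush only covers writes that have already completed: send the write and wait for its completion, then send the flush and wait for its completion, and then you are guaranteed to have the IO from the write on disk.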
However, if the device being written to has queue/write_cache in "write through" mode, then the flush will complete immediately, and it's up to your controller to do its job and keep the non-volatile cache intact, even after a power loss (BBU, supercap, flashcache, etc.).
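The write-then-flush ordering has a close userspace analogue, sketched below: complete the write first, then call fsync() and wait for it to return. This is not kernel code and the helper name `write_durably` is hypothetical. Whether fsync() results in a flush request reaching the device depends on the filesystem, the mount options, and the write_cache setting discussed above.

```python
import os


def write_durably(path, data, offset=0):
    """Write `data` at `offset`, then ask the kernel to commit it to stable storage.

    Returns the number of bytes written. `path` must already exist.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        view = memoryview(data)
        done = 0
        # Step 1: issue the write and wait until all of it has completed.
        while done < len(view):
            done += os.pwrite(fd, view[done:], offset + done)
        # Step 2: only now request the flush, and wait for it to return.
        os.fsync(fd)
        return done
    finally:
        os.close(fd)
```

The point of the ordering is the same as in the kernel: the flush is requested only after the write it is meant to cover has completed.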