I am designing a file system in user space and need to test it. I do not want to use the available benchmarking tools because my requirements are different. So, to test the file system, I want to simulate file access operations. To do this, I first use the ftw() function to walk through one of my existing (experimental) file systems and write the list of all its files and directories to a file.
Then I invoke a simulator to simulate file access by a number of processes. The simulator randomly starts a "process", i.e. it spawns a thread that does what a real process would have done. The thread randomly selects a file operation (read, write, rename, etc.) and picks the arguments for that operation from the list generated by ftw(). The thread performs a number of such file operations and then exits, marking the end of the process. The simulator continues to spawn threads; thread execution can overlap, just as real processes do. As the threads perform operations, files get created, deleted, and renamed, and the list of files is updated accordingly.
I have not yet started coding. Does the plan seem sane? I am also not sure how to code the simulator: how should it spawn threads over a period of time? Should I use some random delay to do this?
Thanks
Comments (2)
Yep, that seems fairly reasonable to me. I would consider imposing a statistical distribution over your file operations (and over accesses to particular files) that roughly matches your expected workload. You might be able to find some statistics about typical filesystem workloads as a starting point.
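To make the idea of a distribution over operations concrete, here is a minimal sketch of weighted random selection in C. Everything in it is an assumption for illustration: the `file_op` enum, the names `pick_op` and `pick_op_from`, and especially the weights, which are placeholders and not measured workload data. The small `next_random` generator is only a stand-in that keeps per-thread state; swap in whatever generator you prefer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical set of operations the simulator can perform. */
enum file_op { OP_READ, OP_WRITE, OP_RENAME, OP_CREATE, OP_DELETE, OP_COUNT };

/* Placeholder weights (they sum to 100 here). These are NOT measured
 * numbers; replace them with statistics for your expected workload. */
static const unsigned op_weights[OP_COUNT] = { 60, 25, 5, 5, 5 };

/* Simple linear congruential generator with caller-owned state, so each
 * thread can keep its own seed. Only a stand-in for a better generator. */
static unsigned next_random(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Map a number r onto an operation by walking the cumulative weights.
 * r is reduced modulo the total weight first (slight modulo bias). */
enum file_op pick_op_from(unsigned r, const unsigned weights[OP_COUNT])
{
    unsigned total = 0;
    for (int i = 0; i < OP_COUNT; i++)
        total += weights[i];
    assert(total > 0);

    r %= total;
    for (int i = 0; i < OP_COUNT; i++) {
        if (r < weights[i])
            return (enum file_op)i;
        r -= weights[i];
    }
    return OP_READ; /* not reached when total > 0 */
}

/* Draw one operation using the caller's private generator state. */
enum file_op pick_op(unsigned *seed)
{
    return pick_op_from(next_random(seed), op_weights);
}
```

Each simulated process would call `pick_op(&my_seed)` once per step. The same cumulative-weight walk could be reused to favor some files over others, if you attach a weight to each entry in the file list.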
That sounds about right for a decent test case, just to make sure it's working. You could use sleep() to wait between spawning threads, or just spawn them all at once and have each one do an operation, wait a bit, do another operation, and so on. IMO, if you hit it hard with a lot of requests and it works, there's a good chance your filesystem will do just fine. Take PostMark as an example: all it does is append like crazy to different files. Other benchmarks do random-access reads and writes at different locations to make sure that pages have to be read from disk.
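As a rough sketch of the first option (a random pause between spawns, plus a short pause between operations inside each thread), something like the following could work with POSIX threads. The names `run_simulation`, `worker`, `OPS_PER_THREAD`, and `ops_done` are made up for this example, the file operation itself is left as a comment, and `nanosleep()` is used instead of `sleep()` only to get sub-second delays.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <time.h>

#define OPS_PER_THREAD 8        /* hypothetical number of ops per "process" */

static atomic_int ops_done;     /* counts simulated operations, for checking */

/* Simple generator with caller-owned state; only a stand-in. */
static unsigned next_random(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

static void sleep_ms(unsigned ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

struct worker_arg {
    unsigned seed;              /* private random state for this thread */
    unsigned max_delay_ms;      /* upper bound on the pause between ops */
};

/* One simulated process: do a few operations with random pauses, then exit. */
static void *worker(void *p)
{
    struct worker_arg *arg = p;
    for (int i = 0; i < OPS_PER_THREAD; i++) {
        /* A real worker would pick and perform a file operation here. */
        atomic_fetch_add(&ops_done, 1);
        sleep_ms(next_random(&arg->seed) % (arg->max_delay_ms + 1));
    }
    return NULL;
}

/* Spawn n_procs threads with a random delay between spawns, then wait
 * for all of them. Returns 0 on success, -1 on failure. */
int run_simulation(int n_procs, unsigned max_delay_ms, unsigned seed)
{
    if (n_procs <= 0)
        return -1;
    pthread_t *tids = malloc((size_t)n_procs * sizeof *tids);
    struct worker_arg *args = malloc((size_t)n_procs * sizeof *args);
    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1;
    }

    int started = 0;
    for (; started < n_procs; started++) {
        args[started].seed = seed + (unsigned)started + 1u;
        args[started].max_delay_ms = max_delay_ms;
        if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0)
            break;
        /* Random gap before the next "process" starts, so lifetimes overlap. */
        sleep_ms(next_random(&seed) % (max_delay_ms + 1));
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    free(args);
    return started == n_procs ? 0 : -1;
}
```

Calling `run_simulation(100, 50, 1234)` would start 100 simulated processes with up to 50 ms between spawns. Because the spawn gap and the per-operation pause are drawn from the same range here, several threads are usually alive at once; a shared file list updated by the workers would additionally need a mutex, which this sketch leaves out.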