File system shared by multiple threads

Posted on 2024-11-24 16:08:11

Here is the case:

I am talking about a general Linux concurrent programming environment.

Definitions:
  Node: a machine with a processor.
  File system: can be accessed both locally and remotely.
  It contains a large set of files of widely varying sizes.

Node I: Process A, with multiple threads, accesses its file system; the
operations include reads and writes. Process B is similar to A. Now imagine
more similar processes C, D, etc.

Then, think about scaling. The same file system is located on a separate node.
It is operated on by processes E, F, G, etc. on Node II, and by processes
A, B, C, D on Node I. Now imagine similar nodes III, IV, V, etc.

This is both a practical question and an interview question. Here is my solution:

I can use a mutex and signaling to coordinate multiple readers and writers
of the same file within a process, and IPC to handle communication and
synchronization between processes.
This code could work very well for multiple processes on a single node.

But when dealing with multiple nodes, we need a similar but more
complicated mechanism to detect whether any node is writing to the FS:
if yes, wait; otherwise, acquire the write mutex and write, then
notify the waiters.

After more thought, here is my idea:

From the point of view of NFS, we define file locks per file, of course.
My goal is:
at any moment, only one writer writes the file, and
there can be more than one reader reading the file.
Then, all processes on the different nodes are treated the same:
each should have its own mechanism to acquire either a read or a write lock,
and, of course, to deal with connections, failures, and retries.

I am wondering whether there is some prototype for this kind of problem?

Comments (1)

盛夏尉蓝 2024-12-01 16:08:11

I assume "node" means "network node", i.e. an entity running its own copy of the operating system. It could be an actual machine or a virtual machine.

The question is pretty badly worded, honestly. It's not really clear what is being asked; I could assume that they're not asking about inter-node synchronization or locking.

So you're good on the first part: mutexes between threads, IPC semaphores between processes on the same machine.
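For the thread-level part, a minimal sketch of a readers-writer lock built from one mutex and a condition variable could look like the following. The class name `ReadWriteLock` and its method names are made up for illustration, and it only covers threads inside one process; the inter-process case would need an IPC primitive instead.

```python
import threading


class ReadWriteLock:
    """Hypothetical readers-writer lock: many readers or one writer."""

    def __init__(self):
        # One mutex protects the counters; the condition is the "signal".
        self._cond = threading.Condition(threading.Lock())
        self._readers = 0       # threads currently reading
        self._writing = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            # Readers only wait while a writer is active.
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                # The last reader wakes any waiting writer.
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer waits until there is no writer and no reader.
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            # Wake both waiting readers and waiting writers.
            self._cond.notify_all()
```

Note that this simple version lets a steady stream of readers starve a writer; giving writers priority would need an extra counter of waiting writers.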

If you want to handle interactions between separate nodes, first you need to have a networked filesystem, such as NFS or CIFS. Second, you need file locks (or lock files) to manage access to shared files. File locks can also be used at the other levels, inter-thread and inter-process, though they're not as straightforward as mutexes and semaphores.
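As a rough illustration of the file-lock approach, the sketch below takes a shared (reader) or exclusive (writer) advisory lock with `fcntl.flock` and retries until a timeout. The helper names `acquire_file_lock` and `release_file_lock`, the timeout, and the polling interval are assumptions for the example. Whether such a lock is actually honored across nodes depends on the network filesystem, its version, and the mount options, so that should be verified on the real setup.

```python
import fcntl
import os
import time


def acquire_file_lock(path, exclusive, timeout=5.0, poll=0.05):
    """Open path and take an advisory lock, retrying until timeout.

    Returns the open file descriptor; releasing or closing it drops the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock fail immediately instead of blocking.
            fcntl.flock(fd, mode | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            if time.monotonic() >= deadline:
                os.close(fd)
                raise TimeoutError(f"could not lock {path}")
            time.sleep(poll)  # back off, then retry


def release_file_lock(fd):
    """Unlock and close a descriptor returned by acquire_file_lock."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A reader would call `acquire_file_lock(path, exclusive=False)` and a writer `acquire_file_lock(path, exclusive=True)`. These locks are advisory: they only protect against other processes that also take the lock before touching the file.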

You could also build up a synchronization system from sockets, but that requires either each node to have a socket to each other node, which means on the order of N^2 sockets with possible race conditions, or a central clearinghouse node, which becomes a single point of failure.
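To make the clearinghouse option concrete, here is a small sketch of a central lock server over TCP. Everything in it is hypothetical: the line protocol (`ACQUIRE name` / `RELEASE name`, answered with `OK`, `BUSY`, or `ERR`), the class names, and the helpers. It only grants exclusive locks, has no read/write modes and no failover, and frees a client's locks when its connection ends.

```python
import socket
import socketserver
import threading


class LockTable:
    """Exclusive named locks held by the clearinghouse."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._owners = {}  # lock name -> client address

    def try_acquire(self, name, client):
        with self._mutex:
            if name in self._owners:
                return False
            self._owners[name] = client
            return True

    def release(self, name, client):
        with self._mutex:
            if self._owners.get(name) == client:
                del self._owners[name]
                return True
            return False

    def drop_client(self, client):
        # Free everything a disconnected client still held.
        with self._mutex:
            for name in [n for n, c in self._owners.items() if c == client]:
                del self._owners[name]


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        client, table = self.client_address, self.server.table
        try:
            for raw in self.rfile:  # one command per line
                parts = raw.decode().split()
                if len(parts) == 2 and parts[0] == "ACQUIRE":
                    ok = table.try_acquire(parts[1], client)
                    reply = "OK" if ok else "BUSY"
                elif len(parts) == 2 and parts[0] == "RELEASE":
                    ok = table.release(parts[1], client)
                    reply = "OK" if ok else "ERR"
                else:
                    reply = "ERR"
                self.wfile.write((reply + "\n").encode())
        finally:
            table.drop_client(client)


def start_server(host="127.0.0.1", port=0):
    """Start the lock server in a background thread (port 0 = any free port)."""
    server = socketserver.ThreadingTCPServer((host, port), Handler)
    server.daemon_threads = True
    server.table = LockTable()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request(sock_file, command):
    """Send one command over a binary socket file and return the reply."""
    sock_file.write((command + "\n").encode())
    sock_file.flush()
    return sock_file.readline().decode().strip()
```

A client would connect with `socket.create_connection(server.server_address)`, wrap the socket with `makefile("rwb")`, and call `request(f, "ACQUIRE /data/file1")`, retrying while the reply is `BUSY`. With this design each node needs one connection instead of one per peer, at the cost of the single point of failure mentioned above.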
