What would you use to implement a fast and lightweight file server?
I need a file server as part of a desktop application; it should respond as fast as possible to file transfer requests from remote clients, usually located on the same LAN. There will be many requests for small files. The server should provide both upload and download services.
I am not tied to any particular technology, so I am open to any programming language, toolkit, or library, as long as it runs on Windows.
My initial take is a C/C++ implementation using Windows Sockets, or the services provided by a library such as Boost (Asio or similar). I have also thought of Erlang, but I would have to learn it first, so the performance benefits would have to justify the development time added by learning the language.
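To make the Windows Sockets option concrete, here is a minimal sketch of the kind of starting point I have in mind: a plain blocking TCP accept loop. The port number is arbitrary, the echo body is only a placeholder for real request parsing and file streaming, and it assumes linking against ws2_32.lib.

    #include <winsock2.h>

    int main() {
        WSADATA wsa;
        if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return 1;

        // Listen on an arbitrary port; a real server would make this configurable.
        SOCKET listener = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        sockaddr_in addr = {};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons(5555);
        bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
        listen(listener, SOMAXCONN);

        for (;;) {
            SOCKET client = accept(listener, nullptr, nullptr);
            if (client == INVALID_SOCKET) break;

            char buf[4096];
            int n;
            while ((n = recv(client, buf, sizeof(buf), 0)) > 0) {
                // Placeholder: echo the bytes back; a file server would parse a
                // request here and stream file contents instead.
                send(client, buf, n, 0);
            }
            closesocket(client);
        }

        closesocket(listener);
        WSACleanup();
        return 0;
    }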
LATER EDIT: I appreciate the answers that say to use FTP, HTTP, or basically anything that already exists, but supposing you still wanted to write one from scratch, what would you do?
8 Answers
Why not just go with FTP? You should be able to find an adequate server implementation in any language, and client access libraries too.
It sounds like a lot of wheel-reinvention. Granted, FTP is not ideal, and has a few odd spots, but ... it's there, it's standard, well-known, and already very widely implemented.
For frequent uploads of small files, the fastest way would be to implement your own proprietary protocol, but that would require a considerable amount of work. It would also be non-standard, meaning future integration will be difficult unless you can implement the protocol in every client you intend to support. If you choose to do it anyway, this is my suggestion for a simple protocol:
It could be implemented on top of a plain TCP socket. You could also use UDP, avoiding the cost of establishing a connection, but in that case you have to handle retransmission yourself.
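To make the idea concrete, here is a minimal C++ sketch of one possible framing. The opcodes and field layout are purely illustrative assumptions, not a standard: a one-byte opcode, a length-prefixed file name, then a length-prefixed payload, with lengths in network byte order.

    #include <cstdint>
    #include <string>
    #include <vector>

    enum class Op : uint8_t { Get = 1, Put = 2 };   // hypothetical opcodes

    // Serialize one request into a buffer ready to be written to a TCP (or UDP) socket.
    std::vector<uint8_t> encode_request(Op op, const std::string& name,
                                        const std::vector<uint8_t>& payload) {
        std::vector<uint8_t> buf;
        buf.push_back(static_cast<uint8_t>(op));

        // 2-byte big-endian name length, then the name itself.
        uint16_t name_len = static_cast<uint16_t>(name.size());
        buf.push_back(static_cast<uint8_t>(name_len >> 8));
        buf.push_back(static_cast<uint8_t>(name_len & 0xFF));
        buf.insert(buf.end(), name.begin(), name.end());

        // 4-byte big-endian payload length, then the payload.
        uint32_t data_len = static_cast<uint32_t>(payload.size());
        for (int shift = 24; shift >= 0; shift -= 8)
            buf.push_back(static_cast<uint8_t>((data_len >> shift) & 0xFF));
        buf.insert(buf.end(), payload.begin(), payload.end());
        return buf;
    }

    int main() {
        std::vector<uint8_t> data = {'h', 'i'};
        auto frame = encode_request(Op::Put, "notes.txt", data);
        return frame.empty() ? 1 : 0;   // a real client would send this buffer over the socket
    }

The server reads the fixed-size header fields first, so it knows exactly how many bytes of name and payload to expect, keeping the per-request overhead down to a handful of bytes.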
Before deciding to implement your own protocol, take a look at HTTP libraries like libcurl: you could make your server speak standard HTTP, with GET for download and POST for upload. This would save a lot of work, and you would be able to test downloads with any web browser.
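For example, a client-side download over a plain HTTP GET with libcurl could look roughly like this; the URL and file name are made-up placeholders for whatever scheme your server exposes.

    #include <curl/curl.h>
    #include <cstdio>

    // Stream the response body straight into a local file.
    static size_t write_to_file(char* ptr, size_t size, size_t nmemb, void* userdata) {
        return fwrite(ptr, size, nmemb, static_cast<FILE*>(userdata));
    }

    int main() {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL* curl = curl_easy_init();
        if (!curl) return 1;

        FILE* out = fopen("notes.txt", "wb");
        if (!out) { curl_easy_cleanup(curl); return 1; }

        // Hypothetical server address and path.
        curl_easy_setopt(curl, CURLOPT_URL, "http://192.168.1.10:8080/files/notes.txt");
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_to_file);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);

        CURLcode res = curl_easy_perform(curl);   // issues the GET, body goes through the callback

        fclose(out);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return res == CURLE_OK ? 0 : 1;
    }

An upload would follow the same pattern, using CURLOPT_POSTFIELDS (or a read callback) instead of the write callback.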
Another suggestion to improve performance is to use something like SQLite as the file repository instead of the filesystem. You can create a single table with a text column for the file name and a blob column for the file contents. Since SQLite is lightweight and caches efficiently, you will avoid most of the disk access overhead.
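A minimal sketch of that layout with the SQLite C API, assuming a database file named files.db and a hypothetical file called notes.txt:

    #include <sqlite3.h>
    #include <cstring>

    int main() {
        sqlite3* db = nullptr;
        if (sqlite3_open("files.db", &db) != SQLITE_OK) return 1;

        // One table: the file name plus the file contents as a blob.
        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, data BLOB)",
            nullptr, nullptr, nullptr);

        const char* contents = "hello";   // stand-in for the uploaded bytes
        sqlite3_stmt* stmt = nullptr;
        sqlite3_prepare_v2(db,
            "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
            -1, &stmt, nullptr);
        sqlite3_bind_text(stmt, 1, "notes.txt", -1, SQLITE_STATIC);
        sqlite3_bind_blob(stmt, 2, contents, static_cast<int>(std::strlen(contents)), SQLITE_STATIC);
        sqlite3_step(stmt);            // writes one row: the file name and its blob
        sqlite3_finalize(stmt);

        sqlite3_close(db);
        return 0;
    }

A download is then just a SELECT of the blob column by file name.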
I'm assuming you don't need client authentication.
Finally: although you prefer C++ for raw native-code speed, that is rarely the major bottleneck in this kind of application; disk access and network bandwidth usually are. I mention this because in Java you could probably write a servlet that does exactly the same thing (HTTP GET for download, POST for upload) in under 100 lines of code. Use Derby instead of SQLite in that case, drop the servlet into any container (Tomcat, GlassFish, etc.), and you're done.
If all the machines are running Windows on the same LAN, why do you need a server at all? Why not simply use Windows file sharing?
I would suggest not using FTP, SFTP, or any other connection-oriented technique. Instead, go for a connectionless protocol or technique.
The reason is that if you need lots of small files uploaded or downloaded, and the response should be as fast as possible, you want to avoid the cost of setting up and tearing down connections.
I would suggest looking at either using an existing implementation or implementing your own HTTP or HTTPS server/service.
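If you do go connectionless, the request path really is just one datagram out and one back. A rough client-side sketch with Winsock UDP follows; the server address, port, and one-line request format are assumptions for illustration, and, as noted above, retransmission and ordering become your responsibility. Link against ws2_32.lib.

    #include <winsock2.h>
    #include <ws2tcpip.h>
    #include <cstdio>
    #include <cstring>

    int main() {
        WSADATA wsa;
        if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) return 1;

        SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

        sockaddr_in server = {};
        server.sin_family = AF_INET;
        server.sin_port = htons(6000);                          // hypothetical server port
        inet_pton(AF_INET, "192.168.1.10", &server.sin_addr);   // hypothetical server address

        // Fire the request without any connection handshake.
        const char request[] = "GET notes.txt";
        sendto(s, request, static_cast<int>(std::strlen(request)), 0,
               reinterpret_cast<sockaddr*>(&server), sizeof(server));

        // Wait for the reply datagram (no timeout or retry handling in this sketch).
        char reply[65507];
        int n = recvfrom(s, reply, sizeof(reply), 0, nullptr, nullptr);
        if (n > 0) std::printf("received %d bytes\n", n);

        closesocket(s);
        WSACleanup();
        return 0;
    }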
Your bottlenecks are likely to come from one of the following sources:
Hard disk I/O - The WD VelociRaptor is supposed to have a random access speed of about 100 MB/s. It also matters whether you set it up as RAID 0, 1, 5, or something else. Some configurations read fast but write slowly. Trade-offs.
Network I/O - Assuming you have the fastest hard disks in a fast RAID setup, unless you use gigabit networking, the network will be slow. And even if the pipe is big, you still need to keep it supplied with data.
Memory cache - The in-memory file-system cache will need to be big enough to buffer all the network I/O so that it does not slow you down. That will require large amounts of memory for the kind of work you're looking at.
File-system structure - Assuming you have gigabytes of memory, the bottleneck will most likely be the data structure you use for the file system. If the file-system structure is cumbersome, it will slow you down.
Only once all the other problems are solved should you worry about the application itself. Notice that most of the bottlenecks are outside your software's control, so whether you code it in C/C++ or use specific libraries, you will still be at the mercy of the OS and hardware.
Sounds like you should use an SFTP (SSH) server: it's firewall/NAT friendly, secure, and already does what you want and more. You could also use Samba or Windows file sharing for an even simpler setup.
Why not use something that already exists? A normal web server, for example, handles lots of small files (images) very well and fast.
Lots of people have already spent time optimizing that code.
A second benefit is that the transfer is done over HTTP, an established protocol, and it is easily switched to SSL if you need more security.
Uploads are also no problem with a script or a custom module, and with the same approach you can add authorization.
As long as you don't need to seek within the files dynamically, I guess this would be one of the best solutions.
Is it a new part of an existing desktop application? What's the goal of the server? Does it protect the files that are uploaded/downloaded, providing authentication and/or authorisation? Does it provide some kind of structure for the uploads to be stored in?
One option may be to install Apache HTTP Server on the machine and serve the files via that: use POST to upload and GET to download.
If the clients are on the same LAN, could you not just share a drive?