Reading a file over the network is slow due to extra reads

Posted 2024-08-17 18:15:59

I'm reading a file, and I either read a row of data (1600 sequential reads of 17 bytes) or a column of data (1600 reads of 17 bytes, each separated by 1600*17 = 27,200 bytes). The file is on either a local drive or a remote drive. I do the reads 10 times, so in each case I expect to read 272,000 bytes of data.
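
The two access patterns can be sketched as offset lists. This is only an illustration of the arithmetic above, assuming a row-major layout of 17-byte records with 1600 records per row and 1600 rows; the function names and the zero-based `row`/`col` indices are hypothetical and not taken from the test program.

```python
RECORD_SIZE = 17          # bytes per read, from the question
RECORDS_PER_ROW = 1600    # reads per row, from the question
ROW_STRIDE = RECORD_SIZE * RECORDS_PER_ROW  # 27,200 bytes between column entries


def row_offsets(row, base=0):
    """Offsets of the 1600 back-to-back records in one row (assumed row-major)."""
    start = base + row * ROW_STRIDE
    return [start + i * RECORD_SIZE for i in range(RECORDS_PER_ROW)]


def column_offsets(col, rows=1600, base=0):
    """Offsets of one column: one record per row, ROW_STRIDE bytes apart."""
    start = base + col * RECORD_SIZE
    return [start + r * ROW_STRIDE for r in range(rows)]


def useful_bytes(passes=10, reads_per_pass=1600):
    """Bytes the application actually asks for across all passes."""
    return passes * reads_per_pass * RECORD_SIZE


if __name__ == "__main__":
    print(row_offsets(0)[:3])      # consecutive records
    print(column_offsets(0)[:3])   # records one row apart
    print(useful_bytes())          # expected payload in bytes
```

Either pattern requests the same 272,000 bytes in total; only the spacing between reads differs.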

On the local drive, I see what I expect. On the remote drive, I also see what I expect when reading sequentially, but when reading a column I see a ton of extra reads being done. They are 32,768 bytes long and don't seem to be used, but they make the amount of data being read jump from 272,000 bytes to anywhere from 79 MB to 106 MB. Here is the output from Process Monitor:

1:39:39.4624488 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,390,069, Length: 17
1:39:39.4624639 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,390,069, Length: 17
1:39:39.4624838 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,388,032, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:39:39.4633839 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,417,269, Length: 17
1:39:39.4634002 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,417,269, Length: 17
1:39:39.4634178 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,444,469, Length: 17
1:39:39.4634324 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,444,469, Length: 17
1:39:39.4634529 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,441,280, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal
1:39:39.4642199 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,471,669, Length: 17
1:39:39.4642396 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,471,669, Length: 17
1:39:39.4642582 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,498,869, Length: 17
1:39:39.4642764 PM  DiskSpeedTest.exe   89628   FASTIO_CHECK_IF_POSSIBLE    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Operation: Read, Offset: 9,498,869, Length: 17
1:39:39.4642922 PM  DiskSpeedTest.exe   89628   ReadFile    \\BCCDC01\BCC-raid3\SeisWareInc Temp Dir\BPepers_Temp\Projects\PT_4\Horizons\BaseName3D_1\RR_AP SUCCESS Offset: 9,498,624, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

Notice the extra reads of 32,768 bytes with I/O Flags set to Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal. These extra reads are what take it from 272 KB to 106 MB and are causing the slowness. They don't happen when reading from a local file, or when I'm reading a row so that the reads are all sequential.

I've tried setting FILE_FLAG_RANDOM_ACCESS, but it doesn't seem to help. Any ideas on what is causing these extra reads and how to make them stop?

The tests are being run on a 64-bit Vista system. I can provide source code for a program that demonstrates the problem, as well as a console program that runs the tests.

Comments (5)

千笙结 2024-08-24 18:15:59

You might be running into oplock issues over SMB. Typically, when reading or saving a file over the network, Windows will pull the full file over to the client, work on it, and send back the changes. When you are working with flat-file databases or similar files, this can cause unnecessary reads across an SMB file share.

I'm not sure whether there is a way to just pull over the whole file, read the rows from the local copy, and then push back the changes.

You'll read some nightmares about oplocks and flat file databases.

http://msdn.microsoft.com/en-us/library/aa365433%28VS.85%29.aspx

Not sure if this solves your problem, but it might get you pointed in the right direction. Good luck!

蒲公英的约定 2024-08-24 18:15:59

I found the answer to this. Windows does file reads through the page cache, so when I read 17 bytes, it first has to transfer a full page of 32 KB over the network, and only then can it copy the 17 bytes I want out of the page cache. Nasty result on performance!

The same thing is actually happening the first time the reads are done on a local file since in that case it does still load a full page at a time into the page cache. But the second time I run the test locally, the files are all already in the page cache so I don't see it. And if SuperFetch is turned on and I've been doing these tests for a while, Windows will start loading the file into the cache before I even run my test application so again I don't see the page reads being done.

So the operating system is doing a lot of things behind the scenes that make it tough to get good performance testing done!

醉生梦死 2024-08-24 18:15:59

I see this all the time, and it's out of your control: the network does what it wants.

If you know the file is going to be less than 1 MB, just pull the whole thing into memory.
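
A minimal sketch of that suggestion, assuming the same row-major layout of fixed-size records described in the question: the file is fetched with one large sequential read and the column is then sliced out of memory. The function name and its parameters are hypothetical. The file in the question appears to be far larger than 1 MB, so whether this is practical depends on available memory.

```python
def read_column_from_memory(path, col, record_size=17, records_per_row=1600):
    """Read the whole file once, then slice one column out of the in-memory copy.

    Assumes fixed-size records stored row by row (an assumption based on the
    access pattern in the question).
    """
    with open(path, "rb") as f:
        data = f.read()  # a single sequential read instead of many tiny ones

    stride = record_size * records_per_row   # distance between column entries
    start = col * record_size                # offset of the column in row 0
    last_start = len(data) - record_size     # last offset that still fits a record
    return [data[off:off + record_size]
            for off in range(start, last_start + 1, stride)]
```

For example, `read_column_from_memory(path, 3)` would return one 17-byte record per row for the fourth column.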

一口甜 2024-08-24 18:15:59

My guess is that the OS is doing its own read-ahead of the file on the off chance you need the data at a later point. If it's not hurting you, then it shouldn't matter.

Check out the caching behavior section of the CreateFile API documentation.

You may like to try 'FILE_FLAG_NO_BUFFERING' to see if it stops the extra reads. Be warned, though: using this flag may slow your application down. Normally you use this flag if you understand how to stream data off the disk as fast as you can and the OS caching is only getting in the way.

Also, you may be able to get the same sort of behavior with local files as with the network file if you use the 'FILE_FLAG_SEQUENTIAL_SCAN' flag. This flag hints to the Windows cache manager what you will be doing, and the cache manager will try to get the data for you ahead of time.

凌乱心跳 2024-08-24 18:15:59

I think SMB always transfers a block, rather than a small set of bytes.

Some information on block size negotiation can be found here:
http://support.microsoft.com/kb/q223140

So you are seeing a read that copies the relevant block, followed by the local read(s) of 17 bytes within that block. (If you look at the pattern, there are some pairs of 17-byte reads where both reads fall within the same block.)
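
That pattern can be checked against the Process Monitor trace in the question with a small simulation. This is a simplified model, not a description of how Windows or SMB actually decide what to fetch: it assumes each miss fetches 32,768 bytes starting at the 4,096-byte boundary at or below the requested offset (the alignment is inferred from the three paging offsets in the trace), and it remembers only the most recently fetched block.

```python
PAGE_SIZE = 4096      # assumed alignment, inferred from the trace offsets
FETCH_SIZE = 32768    # length of each paging read in the trace


def simulate_fetches(read_offsets, read_len=17):
    """Return the start offsets of the block fetches a sequence of small reads triggers.

    Simplified model: only the most recent block is remembered, and a read is
    served locally when it lies entirely inside that block.
    """
    fetches = []
    window_start = window_end = None
    for off in read_offsets:
        covered = (window_start is not None
                   and window_start <= off
                   and off + read_len <= window_end)
        if not covered:
            window_start = off - off % PAGE_SIZE   # round down to a page boundary
            window_end = window_start + FETCH_SIZE
            fetches.append(window_start)
    return fetches


# The five 17-byte reads shown in the question's trace.
trace_reads = [9_390_069, 9_417_269, 9_444_469, 9_471_669, 9_498_869]

if __name__ == "__main__":
    print(simulate_fetches(trace_reads))  # → [9388032, 9441280, 9498624]
```

Under these assumptions the model yields the same three 32,768-byte read offsets that appear in the trace, with the second and fourth 17-byte reads served from the block fetched just before them.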

The fix obviously depends upon the control you have over the application and the size and structure of the database. (e.g. if the database had one column per file, then all the reads would be sequential. If you used a database server, you wouldn't be using SMB, etc.)

If it's any consolation, iTunes also performs abysmally when using a network drive.
