Does a context switch usually happen between calling a function and executing it?

Posted 2025-01-14 21:05:11


So I have been working on the source code of a complex application (written by hundreds of programmers) for a while now. Among other things, I have created some time-checking functions, along with suitable data structures, to measure the execution periods of different segments of the main loop and run some analysis on these measurements.

Here's some pseudocode to help explain:

main()
{
    TimeSlicingSystem::AddTimeSlice(0);
    FunctionA();
    TimeSlicingSystem::AddTimeSlice(3);
    FunctionB();
    TimeSlicingSystem::AddTimeSlice(6);
    PrintTimeSlicingValues();
}

void FunctionA()
{
    TimeSlicingSystem::AddTimeSlice(1);
    //...
    TimeSlicingSystem::AddTimeSlice(2);
}

void FunctionB()
{
    TimeSlicingSystem::AddTimeSlice(4);
    //...
    TimeSlicingSystem::AddTimeSlice(5);
}

void PrintTimeSlicingValues()
{
    // Prints the difference between each slice and the slice before it,
    // starting from slice number 1.
}
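To make the pseudocode concrete, here is a minimal, portable C++ sketch of what such a TimeSlicingSystem might look like. Only the names AddTimeSlice and PrintTimeSlicingValues come from the pseudocode; everything else (the array size, the helper MicrosecondsBetween, and the use of std::chrono::steady_clock rather than the original QueryPerformanceCounter) is an assumption:

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstdio>

// Hypothetical reconstruction: AddTimeSlice(n) stamps slot n with a monotonic
// clock reading, and PrintTimeSlicingValues() reports the delta between each
// pair of consecutive slices.
namespace TimeSlicingSystem {
    constexpr int kMaxSlices = 16;
    inline std::array<std::chrono::steady_clock::time_point, kMaxSlices> slices;

    inline void AddTimeSlice(int index) {
        slices[index] = std::chrono::steady_clock::now();
    }

    // Elapsed time between slice (index - 1) and slice index, in microseconds.
    inline double MicrosecondsBetween(int index) {
        return std::chrono::duration<double, std::micro>(
                   slices[index] - slices[index - 1]).count();
    }

    inline void PrintTimeSlicingValues(int count) {
        for (int i = 1; i < count; ++i)
            std::printf("slice %d -> %d: %.3f us\n", i - 1, i,
                        MicrosecondsBetween(i));
    }
}
```

Using steady_clock here means the deltas can never go negative, which also matters for QuestionB below.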

Most measurements were very reasonable; for instance, assigning a value to a local variable costs a fraction of a microsecond. Most functions execute from start to finish in a few microseconds, and rarely ever reach one millisecond.

I then ran a few tests for half an hour or so, and I found some strange results that I couldn't quite understand. Certain functions would be called, and the measured time from the moment of calling the function (the last line in the 'calling' code) to the first line inside the 'called' function would be very long, up to 30 milliseconds. That happens in a loop that otherwise completes a full iteration in less than 8 milliseconds.

To get a picture of that, in the pseudocode I included, think of the period between slice number 0 and slice number 1, or between slice number 3 and slice number 4. These are the sort of periods I am referring to: the measured time between calling a function and running the first line inside the called function.

QuestionA. Could this behavior be due to thread or process switching by the OS? Is calling a function a uniquely vulnerable spot for that? The OS I am working on is Windows 10.

Interestingly enough, the return path (from the last line in a function back to the first line after the call in the 'calling' code) never showed this problem at all (the periods from slice number 2 to 3, or from 5 to 6, in the pseudocode)! All of those measurements were always less than 5 microseconds.

QuestionB. Could this be, in any way, due to the time measurement method I am using? Could switching between different cores give an illusion of context switching being slower than it actually is, due to clock differences between cores? (Although I never found a single negative delta time so far, which seems to refute this hypothesis altogether.) Again, the OS I am working on is Windows 10.

My time measuring function looks like this:

FORCEINLINE double Seconds()
{
    Windows::LARGE_INTEGER Cycles;
    Windows::QueryPerformanceCounter(&Cycles);
    // add big number to make bugs apparent where return value is being passed to float
    return Cycles.QuadPart * GetSecondsPerCycle() + 16777216.0;
}
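For comparison, here is a portable sketch of the same idea using std::chrono::steady_clock instead of the Windows-specific QueryPerformanceCounter (GetSecondsPerCycle is part of the original code and is not reproduced here). steady_clock is guaranteed monotonic, and the 16777216.0 offset mirrors the original trick: a float cannot represent microsecond-scale increments on top of a number that large, so an accidental double-to-float truncation becomes immediately visible:

```cpp
#include <cassert>
#include <chrono>

// Portable stand-in for the Windows Seconds() function above.
// Assumption: steady_clock resolution is adequate for microsecond-scale deltas,
// which is true on mainstream desktop platforms.
inline double Seconds() {
    using namespace std::chrono;
    double s = duration<double>(steady_clock::now().time_since_epoch()).count();
    // Same debugging offset as the original: makes float truncation bugs apparent.
    return s + 16777216.0;
}
```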


Comments (1)

往日 (2025-01-21 21:05:11):


QuestionA. Could this behavior be due to thread, or process switching by the OS?

Yes. Thread switches can happen at any time (e.g. when a device sends an IRQ that causes a different higher priority thread to unblock and preempt your thread immediately) and this can/will cause unexpected time delays in your thread.

Is calling a function a uniquely vulnerable spot for that?

There's nothing particularly special about calling your own functions that makes them uniquely vulnerable. If the function involves the kernel's API, a thread switch is more likely, and some things (e.g. calling "sleep()") are almost guaranteed to cause a thread switch.
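As an illustration (not the application's actual code), any point where the thread blocks or is preempted between two adjacent time stamps shows up as a large delta, exactly like the 30 ms spikes described in the question. Here the preemption is simulated with an explicit sleep:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Measures the gap between two adjacent "slices", with a simulated preemption
// in between. The sleep is a stand-in for the OS suspending the thread; a real
// preemption would inflate the measured delta in exactly the same way.
double MeasureGapMs(std::chrono::milliseconds simulated_preemption) {
    auto before = std::chrono::steady_clock::now();    // "last line in the caller"
    std::this_thread::sleep_for(simulated_preemption); // thread is off the CPU here
    auto after = std::chrono::steady_clock::now();     // "first line in the callee"
    return std::chrono::duration<double, std::milli>(after - before).count();
}
```

With a 10 ms simulated preemption the measured gap is at least 10 ms, even though the thread itself did essentially no work between the two stamps.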

There's also a potential interaction with virtual memory management. Often things (e.g. your executable file, your code, your data) use "memory mapped files", where accessing a page for the first time may cause the OS to fetch the code or data from disk (and your thread can be blocked until the code or data it wanted arrives from disk); rarely used code or data can also be sent to swap space and need to be fetched back.

QuestionB. Could this be, in any way, due to the time measurement method I am using?

In practice it's likely that Windows' QueryPerformanceCounter() is implemented with an RDTSC instruction (assuming 80x86 CPUs) and doesn't involve the kernel at all, and on modern hardware it's likely that this is monotonic. In theory Windows could emulate RDTSC and/or implement QueryPerformanceCounter() in another way to guard against security problems (timing side channels), as has been recommended by Intel for about 30 years now, but this is unlikely (modern operating systems, including but not limited to Windows, tend to care more about performance than security). Also in theory, your hardware/CPU could be so old (roughly 10+ years) that Windows has to implement QueryPerformanceCounter() in a different way, or you could be using some other CPU (e.g. ARM and not 80x86).

In other words; it's unlikely (but not impossible) that the time measurement method you're using is causing any timing problems.
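A quick way to probe the "clock differences between cores" hypothesis from QuestionB is to read a monotonic clock in a tight loop and count negative deltas. This sketch uses std::chrono::steady_clock as a portable stand-in for QueryPerformanceCounter; on a correct implementation the count stays zero even if the thread migrates between cores:

```cpp
#include <cassert>
#include <chrono>

// Reads the monotonic clock `samples` times and counts how often a reading
// went backwards. steady_clock is required by the C++ standard to be
// monotonic, so this should always return 0.
int CountNegativeDeltas(int samples) {
    using clock = std::chrono::steady_clock;
    int negatives = 0;
    auto prev = clock::now();
    for (int i = 0; i < samples; ++i) {
        auto cur = clock::now();
        if (cur < prev) ++negatives;
        prev = cur;
    }
    return negatives;
}
```

This matches the questioner's own observation: never seeing a negative delta is strong evidence that cross-core clock skew is not what is producing the 30 ms spikes.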
