Question about cycle counting accuracy when emulating a CPU

Posted 2024-11-30 13:53:52

I am planning on creating a Sega Master System emulator over the next few months, as a hobby project in Java (I know it isn't the best language for this, but I find it very comfortable to work in, and as a frequent user of both Windows and Linux I thought a cross-platform application would be great). My question is about cycle counting.

I've looked over the source code for another Z80 emulator, and for other emulators as well, and in particular the execute loop intrigues me - when it is called, an int is passed as an argument (let's say 1000 as an example). Now I get that each opcode takes a different number of cycles to execute, and that as these are executed, the number of cycles is decremented from the overall figure. Once the number of cycles remaining is <= 0, the execute loop finishes.

My concern is that many of these emulators don't take account of the fact that the last instruction to be executed can push the number of cycles to a negative value, meaning that between execution loops one may end up with, say, 1002 cycles being executed instead of 1000. Is this significant? Some emulators account for this by compensating on the next execute loop and some don't. Which approach is best? Allow me to illustrate my question, as I'm not particularly good at putting myself across:

public void execute(int numOfCycles)
{ // this is an execution loop method, called with 1000.
   while (numOfCycles > 0)
   {
      instruction = readInstruction();
      switch (instruction)
      {
         case 0x40:
            // do whatever the opcode does, then charge its cost of 5 cycles
            numOfCycles -= 5;
            // let's say, for argument's sake, this case is executed when numOfCycles is 3.
            break;
      }
   }
}

After the end of this particular looping example, numOfCycles would be at -2. This will only ever be a small inaccuracy, but does it matter overall in people's experience? I'd appreciate anyone's insight on this one. I plan to interrupt the CPU after every frame as this seems appropriate, so I know 1000 cycles is low; it is just an example.
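For comparison, the "compensate on the next execute loop" variant could look roughly like the sketch below. It is a toy model under stated assumptions, not code from any particular emulator: the class name `CycleBudget`, the `carry` field and the `step()` helper are hypothetical, and every instruction is pretended to cost 5 cycles. The only point is that the overshoot of one slice is subtracted from the budget of the next.

```java
class CycleBudget {
    /** Overshoot of the previous slice (0 or negative), carried into the next one. */
    private int carry = 0;
    private long totalExecuted = 0;

    /** Hypothetical stand-in for decode + execute: returns the cost of one instruction. */
    private int step() {
        return 5; // assumption: every instruction costs 5 cycles in this toy model
    }

    /** Runs one slice and returns the number of cycles actually executed in it. */
    public int execute(int numOfCycles) {
        int remaining = numOfCycles + carry; // carry <= 0, so an overshoot shortens this slice
        int executed = 0;
        while (remaining > 0) {
            int cost = step();
            remaining -= cost;
            executed += cost;
        }
        carry = remaining; // 0 or negative: the debt owed to the next slice
        totalExecuted += executed;
        return executed;
    }

    public long totalExecuted() {
        return totalExecuted;
    }

    public static void main(String[] args) {
        CycleBudget cpu = new CycleBudget();
        for (int i = 0; i < 5; i++) {
            System.out.println("slice " + i + ": " + cpu.execute(1003) + " cycles");
        }
        System.out.println("total: " + cpu.totalExecuted() + " cycles");
    }
}
```

With five slices of 1003 cycles, individual slices still run 1000 or 1005 cycles, but the total comes out at exactly 5015. Without the carry, each slice would run 1005 cycles and the total would drift to 5025.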

Many thanks,
Phil


Comments (3)

失与倦" 2024-12-07 13:53:52
  1. Most emulators/simulators deal only with CPU clock ticks

    That is fine for games etc. You have some timer, and you run the CPU simulation until the CPU has simulated the duration of one timer interval. Then it sleeps until the next timer interval occurs. This is very easy to simulate. You can decrease the timing error with the approach you are asking about, but as said here, for games this is usually unnecessary.

    This approach has one significant drawback: your code runs for just a fraction of real time. If the timer interval (timing granularity) is big enough, this can be noticeable even in games. For example, if you hit a keyboard key while the emulation sleeps, it is not detected (keys sometimes don't work). You can remedy this by using a smaller timing granularity, but on some platforms that is very hard. In that case the timing error can be more "visible" in software-generated sound (at least for people who can hear it and are not deaf-ish to such things, like me).

  2. If you need something more sophisticated

    For example, if you want to connect real HW to your emulation/simulation, then you need to emulate/simulate the buses. Also, things like a floating bus or system contention are very hard to add to approach #1 (it is doable, but with big pain).

    If you port the timings and emulation to machine cycles, things get much, much easier, and suddenly things like contention, HW interrupts and floating buses almost solve themselves. I ported my ZX Spectrum Z80 emulator to this kind of timing and saw the light. Many things became obvious (like errors in Z80 opcode documentation, timings etc.). Contention also got very simple from there (just a few lines of code instead of horrible decoding tables with almost one entry per instruction type). The HW emulation also got pretty easy: I added things like FDC controller and AY chip emulation to the Z80 this way (no hacks, it really runs their original code ... even floppy formatting :)), so no more tape-loading hacks that fail for custom loaders like TURBO.

    To make this work I wrote my Z80 emulation/simulation so that it uses something like microcode for each instruction. As I very often corrected errors in the Z80 instruction set (there is no single 100% correct doc out there that I know of, even if some of them claim to be bug-free and complete), I came up with a way to deal with it without painfully reprogramming the emulator.

    Each instruction is represented by an entry in a table, with info about timing, operands, functionality... The whole instruction set is a table of these entries for all instructions. I then created a MySQL database for my instruction set, with similar tables for each instruction set listing I found, and painfully compared all of them, selecting/repairing what is wrong and what is correct. The result is exported to a single text file which is loaded at emulation startup. It sounds horrible, but in reality it simplifies things a lot and even speeds up the emulation, as instruction decoding is now just accessing pointers. An example of the instruction set data file can be found here: What's the proper implementation for hardware emulation
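To make the table idea concrete, here is a minimal sketch of table-driven decoding. It is not the data-file format or code described above: the class `TableDrivenCpu`, the `Op` row type and the toy register set are hypothetical, and the table is filled in by hand rather than loaded from a text file. It covers four Z80 opcodes with their usual T-state costs (NOP, INC B and LD A,B at 4 T, LD B,n at 7 T).

```java
import java.util.function.Consumer;

class TableDrivenCpu {
    /** One table row: mnemonic, cost in T-states, and the behaviour to run. */
    static final class Op {
        final String mnemonic;
        final int tStates;
        final Consumer<TableDrivenCpu> action;

        Op(String mnemonic, int tStates, Consumer<TableDrivenCpu> action) {
            this.mnemonic = mnemonic;
            this.tStates = tStates;
            this.action = action;
        }
    }

    int a, b;       // two 8-bit registers of the toy CPU
    int pc;         // program counter
    long cycles;    // total T-states elapsed
    final int[] memory;
    final Op[] table = new Op[256]; // indexed directly by opcode

    TableDrivenCpu(int[] memory) {
        this.memory = memory;
        // Assumption: rows are hard-coded here; they could equally be parsed from a file.
        table[0x00] = new Op("NOP", 4, cpu -> { });
        table[0x04] = new Op("INC B", 4, cpu -> cpu.b = (cpu.b + 1) & 0xFF);
        table[0x06] = new Op("LD B,n", 7, cpu -> cpu.b = cpu.memory[cpu.pc++] & 0xFF);
        table[0x78] = new Op("LD A,B", 4, cpu -> cpu.a = cpu.b);
    }

    /** Fetches one opcode, looks up its row, runs it and returns its cost in T-states. */
    int step() {
        int opcode = memory[pc++] & 0xFF;
        Op op = table[opcode];
        if (op == null) {
            throw new IllegalStateException("opcode not in table: " + opcode);
        }
        op.action.accept(this);
        cycles += op.tStates;
        return op.tStates;
    }

    public static void main(String[] args) {
        int[] program = {0x06, 0x2A, 0x04, 0x78, 0x00}; // LD B,0x2A; INC B; LD A,B; NOP
        TableDrivenCpu cpu = new TableDrivenCpu(program);
        while (cpu.pc < program.length) {
            cpu.step();
        }
        System.out.println("A=" + cpu.a + " cycles=" + cpu.cycles);
    }
}
```

Correcting a timing then means changing one table row rather than touching decode logic. A machine-cycle-accurate version would store a list of per-cycle steps in each row instead of a single cost.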

A few years back I also published a paper on this (sadly, the institution that held that conference no longer exists, so the servers hosting those old papers are down for good; luckily I still have a copy). Here is an image from it that describes the problem:

CPU Scheduling

  • a) Full throttle has no synchronization, just raw speed
  • b) #1 has big gaps, causing HW synchronization problems
  • c) #2 needs to sleep a lot with very small granularity (which can be problematic and slow things down), but the instructions are executed very near their real time ...
  • The red line is the host CPU processing speed (obviously, what is above it takes a bit more time, so it should be cut and inserted before the next instruction, but that would be hard to draw properly)
  • The magenta line is the emulated/simulated CPU processing speed
  • Alternating green/blue colors represent successive instructions
  • Both axes are time

[edit1] more precise image

The one above was hand painted... This one was generated by a VCL/C++ program:

timing

generated by these parameters:

const int iset[]={4,6,7,8,10,15,21,23}; // possible timings [T]
const int n=128,m=sizeof(iset)/sizeof(iset[0]); // number of instructions to emulate, size of iset[]
const int Tps_host=25;  // max possible simulation speed [T/s]
const int Tps_want=10;  // wanted simulation speed [T/s]
const int T_timer=500;  // simulation timer period [T]

So the host can simulate at 250% of the wanted speed, and the simulation granularity is 500T. The instructions were generated pseudo-randomly...
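Those parameters can be worked through by hand for approach #1 (run one timer slice as fast as possible, then sleep). The sketch below only redoes that arithmetic in Java, with units as given in the parameter comments ([T] and [T/s]); the class and method names are hypothetical and it is not the program that produced the image.

```java
class SliceTiming {
    /** Host time needed to emulate one timer slice at full speed, in seconds. */
    static double hostBusySeconds(int tTimer, int tpsHost) {
        return (double) tTimer / tpsHost;
    }

    /** Real time that one timer slice is supposed to span, in seconds. */
    static double slicePeriodSeconds(int tTimer, int tpsWant) {
        return (double) tTimer / tpsWant;
    }

    public static void main(String[] args) {
        final int tpsHost = 25;  // max possible simulation speed [T/s]
        final int tpsWant = 10;  // wanted simulation speed [T/s]
        final int tTimer = 500;  // simulation timer period [T]

        double busy = hostBusySeconds(tTimer, tpsHost);
        double period = slicePeriodSeconds(tTimer, tpsWant);
        double idle = period - busy; // time spent sleeping until the next timer tick

        System.out.println("speed ratio : " + (double) tpsHost / tpsWant);
        System.out.println("busy        : " + busy + " s");
        System.out.println("period      : " + period + " s");
        System.out.println("idle        : " + idle + " s");
    }
}
```

Each 500T slice is emulated in 20 s of host time but has to cover 50 s of wanted time, so the emulation sits idle for 30 s per slice. That idle stretch is the "big gap" of case b), during which the emulated machine cannot react to input or other hardware.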

梦里泪两行 2024-12-07 13:53:52

There was quite an interesting article on Ars Technica recently about console simulation; it also links to quite a few simulators that might make for good research:

Accuracy takes power: one man's 3GHz quest to build a perfect SNES emulator

The relevant bit is that the author mentions, and I am inclined to agree, that most games will appear to function pretty correctly even with timing deviations of +/-20%. The issue you mention is unlikely ever to introduce more than a fraction of a percent of timing error, which is probably imperceptible whilst playing the final game. The authors of those emulators probably didn't consider it worth dealing with.
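To put rough numbers on "a fraction of a percent": the 1000-cycle slice and the 2-cycle overshoot come from the question, and the longest instruction time of 23 T is taken from the timing list in the other answer. The per-frame budget of about 60,000 cycles (a clock near 3.58 MHz at 60 frames per second) is an assumption, not a figure from this thread.

```latex
\[
\frac{1002 - 1000}{1000} = 0.2\%
\qquad\text{and, per frame at worst,}\qquad
\frac{23 - 1}{60\,000} \approx 0.04\%
\]
```

The worst case is a 23 T instruction starting with 1 cycle left in the budget, which overshoots by 22 cycles.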

不回头走下去 2024-12-07 13:53:52

I guess that depends on how accurate you want your emulator to be. I do not think it has to be that accurate. Think of emulating the x86 platform: there are so many processor variants, and each has different execution latencies and issue rates.
