Is there a way to run tasks independently on heterogeneous multiple GPUs in Windows 7?
Can I have two AMD GPUs of different chipsets/generations in my desktop, a 6950 and a 4870, and dedicate one of them (the 4870) to OpenCL/GPGPU work only? I'd like the OS to exclude that device from video output and display driving entirely, so that the 4870 essentially stays in a deep sleep, or appears ejected/disabled, until its stream processors are called upon.
Compared to the 4870, the 6950 is a heavyweight in OpenCL calculations, enough so that it can crunch numbers and still allow an active user session, even web browsing. HOWEVER, as soon as I navigate to a webpage with embedded Flash video, or forget what I have running and open Media Player or Media Center (basically any GPU-accelerated video task that requires the 6950 to initialize UVD), the display system hangs.
I'm looking for a way to plug my 4870 into an open PCIe slot and have it sit in a dormant state with near-zero heat production and power consumption (essentially only maintaining the interface signalling, like an Ethernet card in a powered-off desktop that keeps the line up and waits for a WOL command), then attain a D0 state (I don't even care if the latency of this wake event is on the scale of seconds) to run OpenCL calculations ON ITS OWN. I do not wish to achieve a non-CF heterogeneous GPU teaming setup!

In my example of the UVD hang, my desired outcome would be to manually stop the OpenCL calculations on the 6950 and start them on the 4870 instead, freeing up the 6950 for multimedia usage/gaming (granted, with a hit to the calculation rate). Even better if the two GPUs could independently run similar calculations while no one is using the desktop.

I don't even mind if I have to initiate the power-state transitions of the 4870 from/into an 'OFF' state myself (say, by a shortcut on the desktop), as long as it doesn't require a system restart, ending the user session, or logging off, and as long as the manual ON/OFF 'switch' for the 4870 is something any proficient Windows end user could do, like clicking a shortcut to run a script, or even going into Device Manager and toggling enable/disable. I just don't want the 4870 wastefully idling for one sole use that may occur only sporadically.
I couldn't think of a solution to facilitate this besides writing a new ini for the 4870 to override the typical power management characteristics written for using the device as an ordinary graphics card (say, to drop in and out of the powered state without relinquishing the IRQ or other allocated resources, to 'hold the door open' on interface availability and addressing). But that endeavor is well above my abilities, and I can easily see it requiring additional licensing as well.
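The manual enable/disable toggle described above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes Microsoft's DevCon utility (`devcon.exe`) is installed and on the PATH, the hardware ID `PCI\VEN_1002&DEV_XXXX` is a placeholder to be replaced with the ID that Device Manager shows for the 4870, and the helper names `devcon_command` and `toggle` are hypothetical. Whether a disabled card actually drops to a low-power state depends on the driver and the board, so this only automates the Device Manager toggle.

```python
import subprocess

# Placeholder (an assumption): look up the real hardware ID of the 4870 in
# Device Manager (Details tab, "Hardware Ids") and substitute it here.
HD4870_ID = r"PCI\VEN_1002&DEV_XXXX"


def devcon_command(action, hardware_id=HD4870_ID):
    """Build the DevCon command line that enables or disables one device."""
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    return ["devcon", action, hardware_id]


def toggle(action, dry_run=True):
    """Print the command (dry run) or execute it; running needs admin rights."""
    cmd = devcon_command(action)
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd).returncode


# Dry run: only shows what a desktop shortcut would execute.
toggle("disable")
toggle("enable")
```

A desktop shortcut could call `toggle("enable", dry_run=False)` before starting the OpenCL job and `toggle("disable", dry_run=False)` afterwards.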
Comments (1)
Windows 7 (and maybe Windows 10) doesn't define a "selected device". It's each program's own responsibility to pick the right device. For example, Google Chrome's add-on software (for video decode) will pick whichever GPU is defined as its first target. If it is written to pick the first-indexed device, then you need to physically swap the two cards between PCIe slots to switch their roles.
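To make the difference concrete, here is a minimal sketch of software that lets the user name the device instead of blindly taking the first-indexed one. The `Device` class, the `pick_device` helper and the device names are hypothetical stand-ins; in a real program the list would come from the compute API's own enumeration (for OpenCL, `clGetDeviceIDs`), and the reported names may be chip codenames rather than board names.

```python
from dataclasses import dataclass


@dataclass
class Device:
    index: int   # enumeration order reported by the driver
    name: str    # device name as reported by the enumeration


def pick_device(devices, preferred_name=None):
    """Return the device whose name contains preferred_name, else the first one."""
    if not devices:
        raise ValueError("no devices found")
    if preferred_name:
        for dev in devices:
            if preferred_name.lower() in dev.name.lower():
                return dev
    # Fallback: the "first-indexed device" behaviour described above.
    return min(devices, key=lambda d: d.index)


# Hypothetical enumeration result for the two cards in the question.
devices = [Device(0, "HD 6950"), Device(1, "HD 4870")]
print(pick_device(devices).name)           # no preference: first-indexed device
print(pick_device(devices, "4870").name)   # explicit choice by name
```

Software written like the fallback branch alone is what forces a card swap; software that exposes something like `preferred_name` lets you move the work to the 4870 without touching the hardware.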
This OS is written to fit the majority (99%) of users, not multi-GPU users (1%?). Either it simply picks one of the GPUs, or the software has explicit control over the devices and may simply benchmark all GPUs and pick the fastest. So you should look at the software's abilities rather than the OS's.
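The "benchmark all GPUs and pick the fastest" strategy can be sketched as below. This is an illustration only: `time_workload` and `pick_fastest` are hypothetical helpers, and the two workloads are `time.sleep` stand-ins that imitate a slow and a fast device, where a real program would launch the same small test kernel on each device.

```python
import time


def time_workload(run, repeats=3):
    """Return the best wall-clock time of run() over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best


def pick_fastest(workloads):
    """workloads maps a device name to a callable that runs a test job on it."""
    timings = {name: time_workload(run) for name, run in workloads.items()}
    return min(timings, key=timings.get), timings


# Stand-in workloads (assumption): sleeps imitate a slow and a fast device.
workloads = {
    "slow_gpu": lambda: time.sleep(0.05),
    "fast_gpu": lambda: time.sleep(0.001),
}
fastest, timings = pick_fastest(workloads)
print(fastest)
```

Note that a policy like this always lands on the faster card, which is exactly why it does not help when you want the slower card to take the compute job.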
The same goes for games! When I play Dota 2 on the Vulkan API, it uses the HD7870 for compute (textures, particles, etc.) but the R7-240 for graphics! I would prefer the opposite, because the R7-240 can't draw fast. This happens because the game is written for the majority of people, who don't have more than one GPU.
Money controls development; I'm sorry to say it. Money, in turn, needs market penetration. 99% market penetration means writing code for the public, not for scientists or rich people. The public simply has one GPU, and a cheap one at that.
I wish for this:
but it is not guaranteed to come true, because money still talks.
If I were a driver developer, I would add a "rename" option (and become poor in return) that presents N virtual devices to the OS, so that the OS and other software could get just 1/N of the whole system's power through a single rename, or N/N by using all of the renames or the main devices. A rename could be a single compute unit of a GPU. When the OS tells the driver "give me 25% of all cores that share the same memory", the driver picks a device and hands over 25% of the system's total cores. That way even users could create renames for their own work.
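The arithmetic behind such a "rename" could look like the sketch below. Everything here is hypothetical: the function names, the device names and the compute-unit counts are made up for illustration, and no driver is claimed to work this way. (OpenCL 1.2 does expose a related idea through `clCreateSubDevices`, but whether a given GPU driver supports it varies.)

```python
import math


def split_equally(compute_units, n):
    """Split one device's compute units into n equal virtual devices."""
    if n <= 0 or compute_units % n != 0:
        raise ValueError("compute units must divide evenly into n parts")
    return [compute_units // n] * n


def grant_share(devices, fraction):
    """Pick one device able to supply `fraction` of the system's total units.

    devices maps a device name to its compute-unit count. The smallest
    device that is large enough is chosen, so bigger devices stay free.
    """
    total = sum(devices.values())
    needed = math.ceil(total * fraction)
    candidates = {name: cu for name, cu in devices.items() if cu >= needed}
    if not candidates:
        raise ValueError("no single device has enough compute units")
    name = min(candidates, key=candidates.get)
    return name, needed


# Hypothetical system: compute-unit counts are illustrative only.
system = {"big_gpu": 24, "small_gpu": 8}
print(split_equally(24, 4))        # four virtual devices from the big GPU
print(grant_share(system, 0.25))   # "give me 25% of all cores"
```

The request is served from a single device because the cores have to share the same memory, as the quoted request requires.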
I even sent a message to Microsoft about a "file system cache on my 2nd graphics card", but they did not reply!