ONNX Runtime Node.js: setting intraOpNumThreads and interOpNumThreads based on the execution mode
I am using onnxruntime in Node.js to run inference on converted ONNX models using the cpu backend.
According to the documentation, the optional session parameters are the following:
var options = {
/**
* The execution providers.
*/
executionProviders: ['cpu'],
/*
* The optimization level.
* 'disabled'|'basic'|'extended'|'all'
*/
graphOptimizationLevel: 'all',
/**
* The intra OP threads number.
* change the number of threads used in the threadpool for Intra Operator Execution for CPU operators
*/
intraOpNumThreads: 1,
/**
* The inter OP threads number.
* Controls the number of threads used to parallelize the execution of the graph (across nodes).
*/
interOpNumThreads: 1,
/**
* Whether enable CPU memory arena.
*/
enableCpuMemArena: false,
/**
* Whether enable memory pattern.
*
*/
enableMemPattern: false,
/**
* Execution mode.
* 'sequential'|'parallel'
*/
executionMode: 'sequential',
/**
* Log severity level
* @see ONNX.Severity
* 0|1|2|3|4
*/
logSeverityLevel: ONNX.Severity.kERROR,
/**
* Log verbosity level.
*
*/
logVerbosityLevel: ONNX.Severity.kERROR,
};
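For reference, here is a minimal sketch of how these options would be passed when creating a session, assuming the onnxruntime-node package, a placeholder 'model.onnx' path, and the options object declared above:

const ort = require('onnxruntime-node');

async function createSession() {
  // the session options above are passed as the second argument
  const session = await ort.InferenceSession.create('model.onnx', options);
  // inputNames / outputNames can be used to inspect the loaded model
  console.log('inputs:', session.inputNames, 'outputs:', session.outputNames);
  return session;
}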
Specifically, I can control (as in TensorFlow) the threading parameters intraOpNumThreads and interOpNumThreads, which are defined above. I want to optimize both of them for the sequential and parallel execution modes (controlled by the executionMode parameter defined above).
My approach so far is
var numCPUs = require('os').cpus().length;
options.intraOpNumThreads = numCPUs;
so that there are at least as many threads as the number of available CPUs. On my MacBook Pro this gives the following session configuration for the sequential execution mode:
{
executionProviders: [ 'cpu' ],
graphOptimizationLevel: 'all',
intraOpNumThreads: 8,
interOpNumThreads: 1,
enableCpuMemArena: false,
enableMemPattern: false,
executionMode: 'sequential',
logSeverityLevel: 3,
logVerbosityLevel: 3
}
while for the parallel execution mode I set both:
{
executionProviders: [ 'cpu' ],
graphOptimizationLevel: 'all',
intraOpNumThreads: 8,
interOpNumThreads: 8,
enableCpuMemArena: false,
enableMemPattern: false,
executionMode: 'parallel',
logSeverityLevel: 3,
logVerbosityLevel: 3
}
Another approach could be to use a percentage of the available CPUs:
var perc = (val, tot) => Math.round(tot * val / 100);
var numCPUs = require('os').cpus().length;
if (options.executionMode === 'parallel') { // parallel
  options.interOpNumThreads = perc(50, numCPUs);
  options.intraOpNumThreads = perc(10, numCPUs);
} else { // sequential
  options.interOpNumThreads = perc(100, numCPUs);
  options.intraOpNumThreads = 1;
}
However, I cannot find any documentation confirming that this is the right way to tune the threading parameters for the 'sequential' and 'parallel' execution modes. Is this approach correct in theory?
1 Answer
It really depends on the model structure. Usually I use the sequential execution mode, because most models are sequential: for a CNN model, for example, each layer depends on the previous layer, so you have to execute the layers one by one.
My suggestion is to try different configurations and pick one based on the perf numbers.
Another consideration is how you expect your application to behave: consume all CPUs for the best performance (lowest inference latency), or strike a balance between performance and power consumption. The choice is entirely up to you.
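To act on that advice, a rough benchmarking sketch could look like the following, assuming onnxruntime-node, a placeholder 'model.onnx', and a single float32 input named 'input' with shape [1, 3, 224, 224] (all of these are assumptions to adapt to your model):

const ort = require('onnxruntime-node');

async function timeConfig(options, runs = 20) {
  const session = await ort.InferenceSession.create('model.onnx', options);
  // dummy input tensor; replace the name, dtype and shape with your model's
  const input = new ort.Tensor('float32', new Float32Array(1 * 3 * 224 * 224), [1, 3, 224, 224]);
  const feeds = { input };

  await session.run(feeds); // warm-up so initialization does not skew the timing

  const start = process.hrtime.bigint();
  for (let i = 0; i < runs; i++) {
    await session.run(feeds);
  }
  return Number(process.hrtime.bigint() - start) / 1e6 / runs; // average ms per run
}

(async () => {
  const numCPUs = require('os').cpus().length;
  const configs = [
    { executionMode: 'sequential', intraOpNumThreads: numCPUs, interOpNumThreads: 1 },
    { executionMode: 'parallel',   intraOpNumThreads: numCPUs, interOpNumThreads: numCPUs },
  ];
  for (const cfg of configs) {
    const avg = await timeConfig({ executionProviders: ['cpu'], ...cfg });
    console.log(`${cfg.executionMode}: ${avg.toFixed(2)} ms/run`);
  }
})();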