TensorFlow.js prediction time differs between the first trial and subsequent trials

Published 2025-01-11 12:42:41


I am loading a TensorFlow.js model and trying to measure how many milliseconds a prediction takes. For example, the first prediction takes about 300 milliseconds, but from the second trial onward the time drops to 13~20 milliseconds. I am not including model loading in the measurement; I time only the prediction, after the model has already been loaded.

Can anyone explain why the prediction time decreases?

// Calling the TensorFlow.js model
const MODEL_URL = 'https://xxxx-xxxx-xxxx.xxx.xxx-xxxx-x.xxxxxx.com/model.json'
let model;
let prediction;
export async function getModel(input){
  console.log("From helper function: Model is being retrieved from the server...")
  model = await tf.loadLayersModel(MODEL_URL);

  // measure prediction time
  const startTime = Date.now();
  prediction = model.predict(input); // predict() is synchronous and returns a tensor; no await needed
  const elapsed = Date.now() - startTime;
  console.log("Prediction time for TensorFlow.js: " + elapsed + " ms")

  // note: with the WebGL backend, arraySync() is what forces the GPU work to finish
  console.log(prediction.arraySync())
  ...
}
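One common way to separate the one-time setup cost from the steady-state latency is to make a throwaway "warm-up" call before starting the timer. A minimal sketch of the pattern, using a stand-in `predict` function rather than a real TensorFlow.js model (the slow first call is simulated with a busy loop):

```javascript
// Sketch: exclude warm-up cost from the measurement. `predict` is a
// hypothetical stand-in for model.predict(input); its first call is made
// artificially slow to mimic one-time kernel compilation.
let warmedUp = false;
function predict() {
  if (!warmedUp) {
    for (let i = 0; i < 5e6; i++) Math.sqrt(i); // simulate one-time setup cost
    warmedUp = true;
  }
  return 42;
}

predict(); // throwaway warm-up call, NOT timed

const start = Date.now();
const result = predict(); // steady-state call, timed
const elapsed = Date.now() - start;
console.log("prediction: " + result + ", time: " + elapsed + " ms");
```

With a real model you would call `model.predict(dummyInput)` once (and read the result back, e.g. with `arraySync()`, to force the GPU work to complete) before timing the calls you care about.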


Comments (1)

后来的我们 2025-01-18 12:42:41


Usually the first prediction takes longer because of one-time setup work: TensorFlow.js has to allocate memory for the model and, with the WebGL backend, compile shaders and upload the weights to the GPU on the first call. Once that is done, the compiled kernels are cached, so subsequent predictions skip that work.

If you want to see the steady-state prediction time, repeat the timed prediction many times (perhaps 1000) and then take the 99th percentile, which shows the time that 99% of predictions fall under (you can also use other percentiles, such as the 90th or 50th).
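The repeat-and-take-a-percentile idea above can be sketched as a small generic helper. The `predictOnce` function below is a hypothetical stand-in for the real `model.predict(input)` call:

```javascript
// Sketch: time a function `runs` times and report latency percentiles.
function benchmark(fn, runs = 1000) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    fn();
    times.push(Date.now() - start);
  }
  times.sort((a, b) => a - b); // sort ascending so indexing gives percentiles
  const pct = p => times[Math.min(times.length - 1, Math.floor((p / 100) * times.length))];
  return { p50: pct(50), p90: pct(90), p99: pct(99) };
}

// Dummy workload standing in for model.predict(input)
const predictOnce = () => { let s = 0; for (let i = 0; i < 1e5; i++) s += i; return s; };
console.log(benchmark(predictOnce, 100));
```

For sub-millisecond predictions, `performance.now()` gives a higher-resolution clock than `Date.now()` in both browsers and Node.js.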
