@aaxis/coco-ssd 中文文档教程

发布于 3年前 浏览 25 项目主页 更新于 3年前

Object Detection (coco-ssd)

旨在定位和识别单个图像中的多个对象的对象检测模型。

此模型是 COCO-SSD 模型的 TensorFlow.js 端口。 有关 Tensorflow 对象检测 API 的更多信息,请查看此自述文件 tensorflow/object_detection

该模型检测在 COCO 数据集中定义的对象,COCO 数据集是一个大规模的对象检测、分割和字幕数据集。 您可以在此处找到更多信息。 该模型能够检测 80 类对象。 (SSD 代表单发多盒检测)。

此 TensorFlow.js 模型不需要您了解机器学习。 它可以将输入作为任何基于浏览器的图像元素( 元素,例如)并返回具有类名和置信度的边界框数组。

Usage

有两种主要方法可以在您的 JavaScript 项目中获取此模型:通过脚本标签或从 NPM 安装它并使用 Parcel、WebPack 或 Rollup 等构建工具。

via Script Tag

<!-- Load TensorFlow.js. This is required to use coco-ssd model. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"> </script>
<!-- Load the coco-ssd model. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/coco-ssd"> </script>

<!-- Replace this with your image. Make sure CORS settings allow reading the image! -->
<img id="img" src="cat.jpg"/>

<!-- Place your code in the script tag below. You can also use an external .js file -->
<script>
  // Notice there is no 'import' statement. 'cocoSsd' and 'tf' is
  // available on the index-page because of the script tag above.

  const img = document.getElementById('img');

  // Load the model.
  cocoSsd.load().then(model => {
    // detect objects in the image.
    model.detect(img).then(predictions => {
      console.log('Predictions: ', predictions);
    });
  });
</script>

via NPM

注意:下面展示了如何使用coco-ssd npm转译web 部署,而不是关于如何在节点环境中使用 coco-ssd 的示例。

// Note: Require the cpu and webgl backend and add them to package.json as peer dependencies.
require('@tensorflow/tfjs-backend-cpu');
require('@tensorflow/tfjs-backend-webgl');
const cocoSsd = require('@tensorflow-models/coco-ssd');

(async () => {
  const img = document.getElementById('img');

  // Load the model.
  const model = await cocoSsd.load();

  // Classify the image.
  const predictions = await model.detect(img);

  console.log('Predictions: ');
  console.log(predictions);
})();

您还可以查看演示应用

API

Loading the model

coco-ssd是模块名,当你使用

export interface ModelConfig {
  base?: ObjectDetectionBaseModel;
  modelUrl?: string;
}

cocoSsd.load(config: ModelConfig = {});

参数: config 具有以下属性的 ModelConfig 接口类型:

  • base 控制基本 cnn 模型,可以是 'mobilenetv1'、'mobilenetv2' 或'litemobilenetv2'。 默认为“litemobilenetv2”。 litemobilenetv2 体积最小,推理速度最快。 mobilenet_v2 具有最高的分类精度。

  • modelUrl: 一个可选字符串,用于指定模型的自定义 url。 这对于无法访问托管在 GCP 上的模型的地区/国家很有用。

返回一个 model 对象。

Detecting the objects

您可以使用模型检测对象,而无需创建张量。 model.detect 接受一个输入图像元素并返回一组具有类名和置信度的边界框。

此方法存在于从 cocoSsd.load 加载的模型上。

model.detect(
  img: tf.Tensor3D | ImageData | HTMLImageElement |
      HTMLCanvasElement | HTMLVideoElement, maxNumBoxes: number, minScore: number
)

Args:

  • img: A Tensor or an image element to make a detection on.
  • maxNumBoxes: The maximum number of bounding boxes of detected objects. There can be multiple objects of the same class, but at different locations. Defaults to 20.
  • minScore: The minimum score of the returned bounding boxes of detected objects. Value between 0 and 1. Defaults to 0.5.

返回类和概率的数组,如下所示:

[{
  bbox: [x, y, width, height],
  class: "person",
  score: 0.8380282521247864
}, {
  bbox: [x, y, width, height],
  class: "kite",
  score: 0.74644153267145157
}]

Technical details for advanced users

此模型基于 TensorFlow 对象检测 API。 您可以从此处下载原始模型。 我们应用了以下优化来提高浏览器执行的性能:

  1. Removed the post process graph from the original model.
  2. Used single class NonMaxSuppression instead of original multiple classes NonMaxSuppression for faster speed with similar accuracy.
  3. Executes NonMaxSuppression operations on CPU backend instead of WebGL to avoid delays on the texture downloads.

这是用于删除后处理图的转换器命令。

tensorflowjs_converter --input_format=tf_frozen_model \
                       --output_format=tfjs_graph_model \
                       --output_node_names='Postprocessor/ExpandDims_1,Postprocessor/Slice' \
                       ./frozen_inference_graph.pb \
                       ./web_model

Object Detection (coco-ssd)

Object detection model that aims to localize and identify multiple objects in a single image.

This model is a TensorFlow.js port of the COCO-SSD model. For more information about Tensorflow object detection API, check out this readme in tensorflow/object_detection.

This model detects objects defined in the COCO dataset, which is a large-scale object detection, segmentation, and captioning dataset. You can find more information here. The model is capable of detecting 80 classes of objects. (SSD stands for Single Shot MultiBox Detection).

This TensorFlow.js model does not require you to know about machine learning. It can take input as any browser-based image elements (<img>, <video>, <canvas> elements, for example) and returns an array of bounding boxes with class name and confidence level.

Usage

There are two main ways to get this model in your JavaScript project: via script tags or by installing it from NPM and using a build tool like Parcel, WebPack, or Rollup.

via Script Tag

<!-- Load TensorFlow.js. This is required to use coco-ssd model. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"> </script>
<!-- Load the coco-ssd model. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/coco-ssd"> </script>

<!-- Replace this with your image. Make sure CORS settings allow reading the image! -->
<img id="img" src="cat.jpg"/>

<!-- Place your code in the script tag below. You can also use an external .js file -->
<script>
  // Notice there is no 'import' statement. 'cocoSsd' and 'tf' is
  // available on the index-page because of the script tag above.

  const img = document.getElementById('img');

  // Load the model.
  cocoSsd.load().then(model => {
    // detect objects in the image.
    model.detect(img).then(predictions => {
      console.log('Predictions: ', predictions);
    });
  });
</script>

via NPM

Note: The following shows how to use coco-ssd npm to transpile for web deployment, not an example on how to use coco-ssd in the node env.

// Note: Require the cpu and webgl backend and add them to package.json as peer dependencies.
require('@tensorflow/tfjs-backend-cpu');
require('@tensorflow/tfjs-backend-webgl');
const cocoSsd = require('@tensorflow-models/coco-ssd');

(async () => {
  const img = document.getElementById('img');

  // Load the model.
  const model = await cocoSsd.load();

  // Classify the image.
  const predictions = await model.detect(img);

  console.log('Predictions: ');
  console.log(predictions);
})();

You can also take a look at the demo app.

API

Loading the model

coco-ssd is the module name, which is automatically included when you use the <script src> method. When using ES6 imports, coco-ssd is the module.

export interface ModelConfig {
  base?: ObjectDetectionBaseModel;
  modelUrl?: string;
}

cocoSsd.load(config: ModelConfig = {});

Args: config Type of ModelConfig interface with following attributes:

  • base: Controls the base cnn model, can be 'mobilenetv1', 'mobilenetv2' or 'litemobilenetv2'. Defaults to 'litemobilenetv2'. litemobilenetv2 is smallest in size, and fastest in inference speed. mobilenet_v2 has the highest classification accuracy.

  • modelUrl: An optional string that specifies custom url of the model. This is useful for area/countries that don't have access to the model hosted on GCP.

Returns a model object.

Detecting the objects

You can detect objects with the model without needing to create a Tensor. model.detect takes an input image element and returns an array of bounding boxes with class name and confidence level.

This method exists on the model that is loaded from cocoSsd.load.

model.detect(
  img: tf.Tensor3D | ImageData | HTMLImageElement |
      HTMLCanvasElement | HTMLVideoElement, maxNumBoxes: number, minScore: number
)

Args:

  • img: A Tensor or an image element to make a detection on.
  • maxNumBoxes: The maximum number of bounding boxes of detected objects. There can be multiple objects of the same class, but at different locations. Defaults to 20.
  • minScore: The minimum score of the returned bounding boxes of detected objects. Value between 0 and 1. Defaults to 0.5.

Returns an array of classes and probabilities that looks like:

[{
  bbox: [x, y, width, height],
  class: "person",
  score: 0.8380282521247864
}, {
  bbox: [x, y, width, height],
  class: "kite",
  score: 0.74644153267145157
}]

Technical details for advanced users

This model is based on the TensorFlow object detection API. You can download the original models from here. We applied the following optimizations to improve the performance for browser execution:

  1. Removed the post process graph from the original model.
  2. Used single class NonMaxSuppression instead of original multiple classes NonMaxSuppression for faster speed with similar accuracy.
  3. Executes NonMaxSuppression operations on CPU backend instead of WebGL to avoid delays on the texture downloads.

Here is the converter command for removing the post process graph.

tensorflowjs_converter --input_format=tf_frozen_model \
                       --output_format=tfjs_graph_model \
                       --output_node_names='Postprocessor/ExpandDims_1,Postprocessor/Slice' \
                       ./frozen_inference_graph.pb \
                       ./web_model
    我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
    原文