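Using a Kalman filter to track the position of an object, but needing to know that object's position as an input to the Kalman filter. What is going on?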

Posted on 2024-10-13 22:43:29

I am trying to teach myself how to use a Kalman filter to track an object (a ball) moving in a video sequence, so please explain it to me as if I were a child.

  • Using some algorithm (color analysis, optical flow, ...), I can get a binary image for each video frame that contains the tracked object (white pixels) and the background (black pixels) -> I know the object's size, centroid, and position -> I simply draw a bounding box around the object -> done. Why do I need a Kalman filter here?

  • OK, somebody told me that because noise prevents me from detecting the object in every video frame, I need a Kalman filter to estimate the object's position. OK, fine. But as far as I know, I need to provide inputs to the Kalman filter: the previous state and a measurement.

    • Previous state (so I think it is the position, velocity, acceleration, ... of the object in the previous frame) -> OK, this is fine with me.
    • Measurement of the current state: this is what I cannot understand. What can the measurement be?
      - The position of the object in the current frame? That seems odd, because if I know the position of the object, all I need to do is draw a simple bounding box (rectangle) around it. Why would I still need a Kalman filter? So it seems impossible to use the position of the object in the current frame as the measurement.
      - The article "Kalman Filter Based Tracking in an Video Surveillance System" says:

      The main role of the Kalman filtering block is to assign a tracking
      filter to each of the measurements entering the system from the
      optical flow analysis block.

      If you read the full paper, you will see that the author takes the maximum number of blobs and the minimum blob size as inputs to the Kalman filter. How can those parameters be used as measurements?

I think I am going in circles now. I want to use a Kalman filter to track the position of an object, but I need to know the position of that object as an input to the Kalman filter. What is going on?

One more question: I don't understand the term "number of Kalman filters". In a video sequence, if there are 2 objects to track, do I need 2 Kalman filters? Is that what it means?


Comments (6)

兔小萌 2024-10-20 22:43:29

You don't use the Kalman filter to give you an initial estimate of something; you use it to give you an improved estimate based on a series of noisy estimates.

To make this easier to understand, imagine you're measuring something that is not dynamic, like the height of an adult. You measure once, but you're not sure of the accuracy of the result, so you measure again for 10 consecutive days, and each measurement is slightly different, say a few millimeters apart. So which measurement should you choose as the best value? I think it's easy to see that taking the average will give you a better estimate of the person's true height than using any single measurement.

OK, but what has that to do with the Kalman filter?

The Kalman filter is essentially taking an average of a series of measurements, as above, but for dynamic systems. For instance, let's say you're measuring the position of a marathon runner along a race track, using information provided by a GPS + transmitter unit attached to the runner. The GPS gives you one reading per minute. But those readings are inaccurate, and you want to improve your knowledge of the runner's current position. You can do that in the following way:

Step 1) Using the last few readings, you can estimate the runner's velocity and estimate where he will be at any time in the future (this is the prediction part of the Kalman filter).

Step 2) Whenever you receive a new GPS reading, do a weighted average of the reading and of your estimate obtained in step 1 (this is the update part of the Kalman filter). The result of the weighted average is a new estimate that lies in between the predicted and measured position, and is more accurate than either by itself.

Note that you must specify the model you want the Kalman filter to use in the prediction part. In the marathon runner example you could use a constant velocity model.
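The link between plain averaging and the weighted average in step 2 can be sketched with the height example. This is a minimal illustration, not a full tracker: it assumes a constant quantity (no process noise), the same measurement variance for every reading, and made-up readings. The function name `running_kalman_average` is hypothetical.

```python
def running_kalman_average(measurements, meas_var=1.0):
    """Scalar Kalman filter for a constant quantity (no process noise)."""
    estimate = measurements[0]   # the first reading is the initial estimate
    variance = meas_var          # and it is as uncertain as a single reading
    for z in measurements[1:]:
        # Prediction: a static model predicts "same value as before".
        predicted, predicted_var = estimate, variance
        # Update: the gain is the weight given to the new reading.
        gain = predicted_var / (predicted_var + meas_var)
        estimate = (1.0 - gain) * predicted + gain * z
        variance = (1.0 - gain) * predicted_var
    return estimate, variance


heights_mm = [1752.0, 1749.0, 1751.0, 1750.0, 1748.0]  # made-up readings
estimate, variance = running_kalman_average(heights_mm)
print(estimate, sum(heights_mm) / len(heights_mm))
```

Under these assumptions the gain shrinks as 1/2, 1/3, 1/4, ..., so the filter reproduces the running mean. For a moving object the prediction step is no longer "same value as before", and that is where the motion model comes in.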

梓梦 2024-10-20 22:43:29

The purpose of the Kalman filter is to mitigate the noise and other inaccuracies in your measurements. In your case, the measurement is the x,y position of the object that has been segmented out of the frame. If you can perfectly segment out the ball, and only the ball, from the background in every frame, there is no need for the Kalman filter, since your measurements in effect contain no noise.

In most applications, perfect measurements cannot be guaranteed for a number of reasons (change in lighting, change in background, other moving objects, etc.) so there needs to be a way of filtering the measurements to produce the best estimate of the true track.

What the Kalman filter does is use a model to predict what the next position should be, assuming the model holds true, and then compare that prediction to the actual measurement you pass in. The actual measurement is combined with the prediction and the noise characteristics to form the final position estimate and to update the estimate's uncertainty (the error covariance, which the filter uses to weigh future measurements against the model).

The model could be anything that models the system you are trying to track. A common model is a constant velocity model which just assumes that the object will continue to move with the same velocity as in the previous estimate. This is not to say that this model will not track something with a changing velocity since the measurements will reflect the change in velocity and affect the estimate.
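For reference, the predict/compare/combine cycle described above is usually written as follows. This is the standard linear Kalman filter in common textbook notation, which the answer itself does not spell out: $\hat{x}$ is the state estimate, $P$ its covariance, $z_k$ the measurement, $F$ the motion model, $H$ the measurement model, and $Q$, $R$ the process and measurement noise covariances. The constant-velocity $F$ shown is for a single axis with frame interval $\Delta t$.

```latex
\begin{aligned}
\text{Predict:}\quad
\hat{x}_{k|k-1} &= F\,\hat{x}_{k-1|k-1}, &
P_{k|k-1} &= F\,P_{k-1|k-1}\,F^{\top} + Q \\
\text{Update:}\quad
y_k &= z_k - H\,\hat{x}_{k|k-1}, &
S_k &= H\,P_{k|k-1}\,H^{\top} + R \\
K_k &= P_{k|k-1}\,H^{\top} S_k^{-1}, &
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k\,y_k \\
P_{k|k} &= (I - K_k H)\,P_{k|k-1} \\[6pt]
\text{Constant velocity:}\quad
x_k &= \begin{bmatrix} \text{position} \\ \text{velocity} \end{bmatrix}, &
F &= \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}
\end{aligned}
```

The innovation $y_k$ is the comparison between prediction and measurement. A persistent non-zero innovation is what pulls the velocity estimate along when the object's real velocity changes.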

There are a number of ways you can attack the problem of tracking multiple objects at once. The simplest way is to use an independent Kalman filter for each track. This is where the Kalman filter really starts to pay off because if you are using the simple approach of just using the centroid of a bounding box, what happens if the two objects cross one another? Can you again differentiate which object is which after they separate? With the Kalman filter, you have the model and prediction that will help keep the track correct when other objects are interfering.
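A rough sketch of the "one independent filter per track" idea, using two objects whose paths cross. Everything here is an assumption made for illustration: the `Track` class, the noise values `q` and `r`, the greedy nearest-neighbour assignment, and the noise-free synthetic detections. A real tracker would also need gating plus track creation and deletion.

```python
import numpy as np


class Track:
    """Constant-velocity Kalman filter over the state [x, y, vx, vy]."""

    def __init__(self, xy, dt=1.0, q=1.0, r=4.0):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])
        self.P = np.diag([r, r, 100.0, 100.0])   # velocity is unknown at start
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt
        self.Q = q * np.diag([0.0, 0.0, 1.0, 1.0])
        self.H = np.array([[1.0, 0.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0, 0.0]])
        self.R = r * np.eye(2)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P


def step(tracks, detections):
    """Predict every track, then give each one its nearest unused detection."""
    unused = list(detections)
    for track in tracks:
        predicted = track.predict()
        if not unused:
            continue  # no detection left: the track coasts on its prediction
        dists = [np.hypot(*(np.asarray(d) - predicted)) for d in unused]
        track.update(unused.pop(int(np.argmin(dists))))


# Two objects that cross at (50, 50); detections arrive in arbitrary order.
tracks = [Track((0.0, 0.0)), Track((0.0, 100.0))]
for t in range(1, 11):
    step(tracks, [(10.0 * t, 100.0 - 10.0 * t), (10.0 * t, 10.0 * t)])
print(tracks[0].x[:2], tracks[1].x[:2])
```

Because each track carries a velocity estimate, its prediction just after the crossing lies on its own side of the crossing point, so each track picks up the correct detection again.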

There are also more advanced ways of tracking multiple objects jointly, such as the JPDAF (joint probabilistic data association filter).

梦里南柯 2024-10-20 22:43:29

Jason has given a good start on what a Kalman filter is. As for your question about how the paper can use the maximum number of blobs and the minimum blob size, this is exactly the power of the Kalman filter.

A measurement need not be a position, a velocity, or an acceleration. A measurement can be any quantity that you can observe at a given time instant. If you can define a model that predicts your measurement at the next time instant given the current measurement, the Kalman filter can help you mitigate the noise.
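As a purely hypothetical illustration (not taken from the paper in question), suppose the state holds a position $p_k$, a velocity $v_k$, and a blob size $s_k$, and that each frame yields a noisy position and a noisy blob size. The measurement model is then just a matrix $H$ that picks out the observed quantities:

```latex
x_k = \begin{bmatrix} p_k \\ v_k \\ s_k \end{bmatrix},
\qquad
z_k = H x_k + w_k,
\qquad
H = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
w_k \sim \mathcal{N}(0, R)
```

The velocity is never measured directly here. The filter infers it from how the measured position changes between frames.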

I would suggest you look into more introductory materials on image processing and computer vision. These materials will almost always cover the Kalman filter.

Here is a SIGGRAPH course on trackers. It is not introductory but should give you a more in-depth look at the topic.
http://www.cs.unc.edu/~tracker/media/pdf/SIGGRAPH2001_CoursePack_08.pdf

超可爱的懒熊 2024-10-20 22:43:29

I had this question a few weeks ago. I hope this answer helps other people.

  • If you can get a good segmentation in each frame (the whole ball), you don't need a Kalman filter. But segmentation can give you a set of unconnected blobs (only a few parts of the ball). The problem is then to know which parts (blobs) belong to the object and which are just noise. Using a Kalman filter, we can assign blobs near the estimated position to the object. E.g. if the ball has a radius of 10 pixels, blobs more than 15 pixels from the estimated position should not be considered part of the object.
  • The Kalman filter uses the previous state to predict the current state, and then uses the current measurement (the current object position) to correct that prediction and improve the next one. E.g. if a vehicle is at position 10 m (previous state) and moves at 5 m/s, then after one second the Kalman filter predicts position 15 m. If we then measure the object at position 18 m, the filter raises its velocity estimate (up to 8 m/s if the measurement is trusted completely) and moves the position estimate toward 18 m.
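The vehicle example can be worked through numerically. The numbers below that are not in the answer are assumptions chosen to keep the arithmetic simple: a 1 s time step, an identity starting covariance, no process noise, and a measurement variance of 1.

```python
import numpy as np

dt = 1.0
x = np.array([10.0, 5.0])             # previous state: position 10, velocity 5
P = np.eye(2)                         # assumed uncertainty of that state
F = np.array([[1.0, dt], [0.0, 1.0]]) # constant-velocity model
H = np.array([[1.0, 0.0]])            # only the position is measured
R = np.array([[1.0]])                 # assumed measurement variance

# Predict: position 10 + 5 * 1 = 15, velocity unchanged.
x_pred = F @ x
P_pred = F @ P @ F.T

# Update with the measurement z = 18.
z = np.array([18.0])
innovation = z - H @ x_pred           # 18 - 15 = 3
S = H @ P_pred @ H.T + R
K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain, here [[2/3], [1/3]]
x_new = x_pred + K @ innovation

print(x_pred, x_new)                  # [15. 5.] and [17. 6.]
```

With these assumed uncertainties the velocity moves from 5 to 6 m/s rather than jumping all the way to 8 m/s: the filter splits the 3 m surprise between "the measurement was noisy" and "the velocity was wrong". A smaller `R` or a larger velocity uncertainty would push the result closer to 8.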

In summary, the Kalman filter is mainly used to solve the data association problem in video tracking. It is also good for estimating the object position, because it takes into account the noise in the source and in the observation.
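The blob-assignment idea (keep only blobs close to the predicted position) might look like the sketch below. The function name `gate_blobs`, the area-weighted merge of the surviving blobs, and the sample numbers are assumptions; only the 15-pixel gate comes from the example above.

```python
import numpy as np


def gate_blobs(predicted_xy, blob_centroids, blob_areas, gate_radius=15.0):
    """Keep blobs within gate_radius of the prediction and merge them.

    Returns (measurement, keep_mask); measurement is None if nothing passes.
    """
    centroids = np.asarray(blob_centroids, dtype=float)
    areas = np.asarray(blob_areas, dtype=float)
    dist = np.linalg.norm(centroids - np.asarray(predicted_xy, dtype=float), axis=1)
    keep = dist <= gate_radius
    if not keep.any():
        return None, keep
    # Area-weighted centroid of the accepted blobs becomes the measurement.
    measurement = np.average(centroids[keep], axis=0, weights=areas[keep])
    return measurement, keep


# The ball was split into two fragments; a third blob far away is noise.
measurement, keep = gate_blobs(
    predicted_xy=(100.0, 100.0),
    blob_centroids=[(96.0, 100.0), (104.0, 100.0), (160.0, 40.0)],
    blob_areas=[30.0, 30.0, 12.0],
)
print(measurement, keep)
```

The merged centroid is then passed to the filter's update step as the measurement for that frame. If no blob passes the gate, the filter can simply skip the update and keep its prediction.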

As for your final question, you are right. It corresponds to the number of objects to track (one Kalman filter per object).

梦魇绽荼蘼 2024-10-20 22:43:29

If you can find the ball exactly in every frame, you don't need a Kalman filter. But just because you find some blob that is likely the ball, it doesn't mean that the center of that blob will be the perfect center of the ball. Think of that as your measurement error. Also, if you happen to pick out the wrong blob, using a Kalman filter helps prevent you from trusting that one wrong measurement. As you said before, if you can't find the ball in a frame, you can also use the filter to estimate where it is likely to be.

Here are some of the matrices you will need, and my guess at what they would be for you. Since the x and y positions of the ball are independent, I think it is easier to have two filters, one for each axis. Both would look something like this:

x = [position ; velocity] // This is the state, i.e. the output of the filter.
P = [1, 0 ; 0, 1] // This is the uncertainty (covariance) of the estimate. I am not quite sure what you should start with, but it will converge once the filter is running.
F = [1, dt ; 0, 1] // Computing F*x predicts the next state of the ball. Notice that this assumes the ball keeps moving with the same velocity as before and just updates the position.
Q = [0, 0 ; 0, vSigma^2] // This is the "process noise". It is one of the matrices you tune to make the filter perform well. In your system, velocity can change at any time, but position never changes except through the velocity. vSigma should be the standard deviation of those velocity changes per time step.
z = [position in x or y] // This is your measurement.
H = [1, 0] // This is how the state maps to your measurement. Since you are only measuring position, there is a 1 in the position column and a 0 in the velocity column.
R = [?] // You only need a scalar for R: the variance of the error in your measurement.

You should be able to plug those matrices into the standard Kalman filter formulas, which are easy to find.
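As a rough sketch, plugging the matrices above into the standard predict/update formulas for one axis could look like this in NumPy. The class name `AxisKalman`, the default values of `v_sigma` and `r`, and the synthetic measurements are assumptions; `H` is written as a 1x2 row so that `R` can be a scalar. Passing `None` for a frame where the ball was not found makes the filter coast on its prediction.

```python
import numpy as np


class AxisKalman:
    """One-axis constant-velocity Kalman filter: state = [position, velocity]."""

    def __init__(self, position, dt=1.0, v_sigma=1.0, r=4.0):
        self.x = np.array([position, 0.0])                     # state
        self.P = np.eye(2)                                      # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])              # motion model
        self.Q = np.array([[0.0, 0.0], [0.0, v_sigma ** 2]])    # process noise
        self.H = np.array([[1.0, 0.0]])                         # measure position only
        self.R = np.array([[r]])                                # measurement variance

    def step(self, z=None):
        """Predict one frame ahead; correct with z unless the detection is missing."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        if z is not None:
            y = np.array([z]) - self.H @ self.x                 # innovation
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)            # Kalman gain
            self.x = self.x + K @ y
            self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]


# Ball moving 3 px per frame along x; detection fails in frames 5 and 6.
kf_x = AxisKalman(position=0.0)
measurements = [3.0, 6.0, 9.0, 12.0, None, None, 21.0, 24.0, 27.0, 30.0]
estimates = [kf_x.step(z) for z in measurements]
print(estimates)
```

A second `AxisKalman` instance would handle the y coordinate in the same way. A single filter over [x, y, vx, vy] is an equally common design, and it becomes necessary if the noise on the two axes is correlated.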

Some good things to read:
Kalman filtering demo
Another great intro: read the page linked to in the third paragraph

維他命╮ 2024-10-20 22:43:29

In vision applications, it is common to use the result from each frame as the measurement; for example, the location of the ball in each frame is a good measurement.
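For the binary images described in the question, a per-frame measurement could be computed as in the sketch below. The function name `centroid_measurement` and the tiny test frame are made up for illustration; non-zero pixels are assumed to be the object.

```python
import numpy as np


def centroid_measurement(mask):
    """Return the (x, y) centroid of the non-zero pixels, or None if there are none."""
    ys, xs = np.nonzero(mask)          # row indices are y, column indices are x
    if xs.size == 0:
        return None                    # object not detected in this frame
    return float(xs.mean()), float(ys.mean())


frame = np.zeros((6, 8), dtype=np.uint8)
frame[2:5, 3:6] = 255                  # a 3x3 white "ball"
z = centroid_measurement(frame)
print(z)                               # (4.0, 3.0)
```

This centroid is the noisy `z` that gets fed to the filter each frame. A `None` result corresponds to a frame where only the prediction step is run.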
