Detecting and fixing overflows

Posted on 2024-07-22 14:11:45 · 624 words · 14 views · 0 comments

We have a particle detector hard-wired to use 16-bit and 8-bit buffers. Every now and then, there are certain [predicted] peaks of particle flux passing through it; that's okay. What is not okay is that these fluxes usually reach magnitudes above the capacity of the buffers to store them; thus, overflows occur. On a chart, they look like the flux suddenly drops and begins growing again. Can you propose a [mostly] accurate method of detecting data points that suffer from an overflow?

P.S. The detector is physically inaccessible, so fixing it the 'right way' by replacing the buffers doesn't seem to be an option.

Update: Some clarifications as requested. We use Python at the data processing facility; the technology used in the detector itself is pretty obscure (treat it as if it were developed by a completely unrelated third party), but it is definitely unsophisticated, i.e. not running a 'real' OS, just some low-level stuff to record the detector readings and to respond to remote commands like power cycle. Memory corruption and other problems are not an issue right now. The overflows occur simply because the designer of the detector used 16-bit buffers for counting the particle flux, and sometimes the flux exceeds 65535 particles per second.

Update 2: As several readers have pointed out, the intended solution would have something to do with analyzing the flux profile to detect sharp declines (e.g. by an order of magnitude) in an attempt to separate them from normal fluctuations. Another problem arises: can restorations (points where the original flux drops back below the overflow level) be detected by simply running the correction program against the flux profile reversed along the x axis?

Comments (6)

染墨丶若流云 2024-07-29 14:11:45
// Java: unwrap a wrapped 16-bit counter series into 32-bit values
static int[] unwrap(short[] x)
{
   int[] y = new int[x.length];
   if (x.length == 0) return y;
   y[0] = x[0] & 0xFFFF; // read the first raw count as unsigned 16-bit
   for (int i = 1; i < x.length; i++)
   {
      y[i] = y[i-1] + sign_extend(x[i] - x[i-1]);
      // works fine as long as the "real" value of x[i] and x[i-1]
      // differ by less than 1/2 of the span of allowable values
      // of x's storage type (=32768 in the case of int16)
      // Otherwise there is ambiguity.
   }
   return y;
}

static int sign_extend(int diff)
{
   // truncate the difference to 16 bits, then widen back to 32 bits
   // with sign extension, giving a step in [-32768, 32767]
   return (short)diff;
}

// exercise for the reader to write similar code to unwrap 8-bit arrays
// to a 16-bit or 32-bit array
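
Since the data processing is done in Python, the same idea can be sketched there as well. This is only a minimal sketch, assuming the raw readings arrive as non-negative integers below 2**bits and that the first reading has not wrapped; the `bits` parameter is added here for illustration so that the 8-bit buffers are covered too.

```python
def unwrap(raw, bits=16):
    """Rebuild a running series from wrapped counter readings.

    Assumes consecutive true values differ by less than half the
    counter span (32768 for 16 bits); otherwise the result is ambiguous.
    """
    span = 1 << bits
    half = span >> 1
    if not raw:
        return []
    out = [raw[0]]  # assumption: the first reading has not wrapped
    for prev, cur in zip(raw, raw[1:]):
        # Map the raw difference into [-half, half); this is the modular
        # equivalent of sign-extending a fixed-width difference.
        step = (cur - prev + half) % span - half
        out.append(out[-1] + step)
    return out


# Made-up readings: a peak that wraps past 65535 twice in a row.
readings = [5000, 10000, 30000, 50000, 4464, 24464, 60000, 30000, 10000]
print(unwrap(readings))
```

The modulo trick replaces the cast, because Python integers have no fixed width to truncate to.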
逐鹿 2024-07-29 14:11:45

Of course, ideally you'd fix the detector software to max out at 65535 to prevent wraparound of the sort that is causing your grief. I understand that this isn't always possible, or at least isn't always possible to do quickly.

When the particle flux exceeds 65535, does it do so quickly, or does the flux gradually increase and then gradually decrease? This makes a difference in what algorithm you might use to detect this. For example, if the flux goes up slowly enough:

true flux     measurement  
 5000           5000
10000          10000
30000          30000
50000          50000
70000           4464
90000          24464
60000          60000
30000          30000
10000          10000

then you'll tend to see a large negative drop at the moment of overflow, much larger than any drop you'd see at other times. This can serve as a signal that you've overflowed. To find the end of the overflow period, you could look for a large upward jump to a value not too far from 65535.

All of this depends on the maximum true flux that is possible and on how rapidly the flux rises and falls. For example, is it possible to get more than 128k counts in one measurement period? Is it possible for one measurement to be 5000 and the next measurement to be 50000? If the data is not well-behaved enough, you may be able to make only a statistical judgment about when you have overflowed.
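
The drop/jump heuristic described in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `flag_overflowed` and the default threshold of half the counter span are choices made here, not something the detector or the answer prescribes, and the threshold would need tuning against how fast the real flux can change.

```python
def flag_overflowed(measured, modulus=65536, threshold=None):
    """Return one boolean per sample: True while the counter is wrapped.

    A drop larger than `threshold` is taken as the start of an overflow,
    and a rise larger than `threshold` as the matching restoration.
    """
    if threshold is None:
        threshold = modulus // 2  # assumed default, tune for real data
    flags = []
    wraps = 0  # how many times the counter is currently wrapped
    for i, value in enumerate(measured):
        if i > 0:
            delta = value - measured[i - 1]
            if delta < -threshold:
                wraps += 1
            elif delta > threshold and wraps > 0:
                wraps -= 1
        flags.append(wraps > 0)
    return flags


# Made-up flux values; the modulo models what a 16-bit counter stores.
true_flux = [5000, 10000, 30000, 50000, 70000, 90000, 60000, 30000, 10000]
measured = [v % 65536 for v in true_flux]
flags = flag_overflowed(measured)
print(list(zip(measured, flags)))
```

Samples flagged True can then be corrected by adding 65536, or simply excluded if the correction is not trusted.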

太阳哥哥 2024-07-29 14:11:45

Your question needs to provide more information about your implementation - what language/framework are you using?

Data overflows in software (which is what I think you're talking about) are bad practice and should be avoided. What you are seeing (strange data output) is only one of the possible side effects of data overflows; it is merely the tip of the iceberg of the sorts of issues you can see.

You could quite easily experience more serious issues like memory corruption, which can cause programs to crash loudly, or worse, obscurely.

Is there any validation you can do to prevent the overflows from occurring in the first place?

好多鱼好多余 2024-07-29 14:11:45

I really don't think you can fix it without fixing the underlying buffers. How are you supposed to tell the difference between the sequences of values (0, 1, 2, 1, 0) and (0, 1, 65538, 1, 0)? You can't.
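
The ambiguity is easy to reproduce. The small sketch below (the helper name `stored` is made up for illustration) models a 16-bit counter as storing values modulo 65536 and shows that both sequences end up as the same readings:

```python
MODULUS = 1 << 16  # a 16-bit counter stores values modulo 65536


def stored(true_values, modulus=MODULUS):
    """Return what a wrapping counter would record for each true value."""
    return [v % modulus for v in true_values]


a = stored([0, 1, 2, 1, 0])
b = stored([0, 1, 65538, 1, 0])
print(a, b, a == b)
```

Any recovery method therefore has to rely on extra assumptions, such as a limit on how much the flux can change between two readings.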

煮茶煮酒煮时光 2024-07-29 14:11:45

How about using a hidden Markov model (HMM) where the hidden state is whether you are in an overflow and the emissions are the observed particle flux?

The tricky part would be coming up with the probability models for the transitions (which will basically encode the time-scale of peaks) and for the emissions (which you can build if you know how the flux behaves and how overflow affects measurement). These are domain-specific questions, so there probably aren't ready-made solutions out there.

But once you have the model, everything else (fitting your data, quantifying uncertainty, simulation, etc.) is routine.
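
To make the idea a bit more concrete, here is a rough Viterbi-style sketch in the spirit of this suggestion. Everything about the model is an assumption made for illustration: the hidden state is the number of times the counter has wrapped, the true flux is modelled as a Gaussian random walk with step scale `sigma`, and every change of wrap count pays a fixed `switch_cost`. A real model would have to come from knowledge of the detector and the flux.

```python
def decode_wraps(measured, max_wraps=2, modulus=65536,
                 sigma=10000.0, switch_cost=1.0):
    """Most likely wrap count per sample under the assumed toy model."""
    if not measured:
        return []
    states = range(max_wraps + 1)
    inf = float("inf")
    # cost[k]: lowest negative log-likelihood of any path ending in state k.
    cost = [0.0 if k == 0 else inf for k in states]  # assume we start unwrapped
    back = []
    for t in range(1, len(measured)):
        new_cost, pointers = [], []
        for k in states:
            true_now = measured[t] + k * modulus
            best_j, best = 0, inf
            for j in states:
                true_prev = measured[t - 1] + j * modulus
                step = (true_now - true_prev) ** 2 / (2 * sigma ** 2)
                c = cost[j] + step + switch_cost * abs(k - j)
                if c < best:
                    best_j, best = j, c
            new_cost.append(best)
            pointers.append(best_j)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final state.
    k = min(states, key=lambda s: cost[s])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


measured = [5000, 10000, 30000, 50000, 4464, 24464, 60000, 30000, 10000]
wraps = decode_wraps(measured)
recovered = [m + k * 65536 for m, k in zip(measured, wraps)]
print(wraps, recovered)
```

The result is sensitive to `sigma` and `switch_cost`, which is exactly the domain-specific part mentioned above.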

遮云壑 2024-07-29 14:11:45

You can only do this if the actual jumps between successive values are much smaller than 65536. Otherwise, an overflow-induced valley artifact is indistinguishable from a real valley, and you can only guess. You can try to match overflows to their corresponding restorations by analysing the signal from the left and from the right simultaneously (assuming that there is a recognizable baseline).

Other than that, all you can do is adjust your experiment by repeating it with different original particle flows, so that real valleys do not move while artifact valleys move to the new point of overflow.
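
One possible way to try the left/right idea is sketched below, under the assumption that both ends of the record sit on an unwrapped baseline. The helper names are made up for illustration. The series is unwrapped once from the left and once from the right; if the two passes agree, the detected overflows and restorations balance out, and if they disagree, at least one jump was too large to be resolved.

```python
def unwrap(raw, span=65536):
    """Unwrap assuming true steps are smaller than half the span."""
    half = span // 2
    out = list(raw[:1])
    for prev, cur in zip(raw, raw[1:]):
        out.append(out[-1] + (cur - prev + half) % span - half)
    return out


def unwrap_both_ways(raw, span=65536):
    """Return (left-to-right result, right-to-left result, do they agree)."""
    forward = unwrap(raw, span)
    backward = unwrap(raw[::-1], span)[::-1]
    return forward, backward, forward == backward


# Slow peak: every true step is below 32768, so both passes agree.
ok = [v % 65536 for v in [5000, 10000, 30000, 50000, 70000, 90000,
                          60000, 30000, 10000]]
# Fast peak: the true step 30000 -> 70000 exceeds 32768 and is misread.
bad = [v % 65536 for v in [1000, 30000, 70000, 90000, 60000, 30000, 1000]]

print(unwrap_both_ways(ok)[2], unwrap_both_ways(bad)[2])
```

Agreement is not proof of correctness, since a real valley can mimic a matched overflow and restoration, but a disagreement is a clear sign that the simple correction cannot be trusted for that stretch of data.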
