选择围绕给定平均值随机分布的行

发布于 2024-10-09 02:40:57 字数 430 浏览 0 评论 0原文

我有一个包含 value 字段的表。这些记录的值在 0 到 100 之间均匀分布。

我想在该表中查询 n 条记录,给定目标均值 x,以便我收到加权随机结果集,其中 avg(value) 约为 x

我可以轻松地执行类似

SELECT TOP n * FROM table ORDER BY abs(x - value)

...的操作,但是每次运行查询时都会给出相同的结果。

我想要做的是添加某种权重,以便可以选择任何记录,但随着与 x 距离的增加,概率逐渐减小,这样我最终会得到类似围绕我给定平均值的正态分布。

我将不胜感激任何有关我如何实现这一目标的建议。

I have a table that has a value field. The records have values somewhat evenly distributed between 0 and 100.

I want to query this table for n records, given a target mean, x, so that I'll receive a weighted random result set where avg(value) will be approximately x.

I could easily do something like

SELECT TOP n * FROM table ORDER BY abs(x - value)

... but that would give me the same result every time I run the query.

What I want to do is to add weighting of some sort, so that any record may be selected, but with diminishing probability as the distance from x increases, so that I'll end up with something like a normal distribution around my given mean.

I would appreciate any suggestions as to how I can achieve this.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

も让我眼熟你 2024-10-16 02:40:57

为什么不使用 RAND() 函数?

SELECT TOP n * FROM table ORDER BY abs(x - value) + RAND()

编辑

使用 Rand 不起作用,因为在选择中调用 RAND 往往会为大多数行生成相同的数字。 Heximal 使用 NewID 是正确的,但需要直接按顺序使用。

SELECT Top N value  
FROM  table  
ORDER BY
    abs(X - value) + (cast(cast(Newid()  as varbinary) as integer))/10000000000

大除数 10000000000 用于使 avg(value) 更接近 X,同时保持 AVG(x-值)低。

话虽如此,也许在 https://stats.stackexchange.com/ 上提出问题(不带 SQL 位)会让你更好结果。

why not use the RAND() function?

SELECT TOP n * FROM table ORDER BY abs(x - value) + RAND()

EDIT

Using Rand won't work because calls to RAND in a select have a tendency to produce the same number for most of the rows. Heximal was right to use NewID but it needs to be used directly in the order by

SELECT Top N value  
FROM  table  
ORDER BY
    abs(X - value) + (cast(cast(Newid()  as varbinary) as integer))/10000000000

The large divisor 10000000000 is used to keep the avg(value) closer to X while keeping the AVG(x-value) low.

With that all said maybe asking the question (without the SQL bits) on https://stats.stackexchange.com/ will get you better results.

歌枕肩 2024-10-16 02:40:57

尝试

SELECT TOP n * FROM table ORDER BY abs(x - value),  newid()

select * from (
    SELECT TOP n * FROM table ORDER BY abs(x - value)
  ) a order by newid()

try

SELECT TOP n * FROM table ORDER BY abs(x - value),  newid()

or

select * from (
    SELECT TOP n * FROM table ORDER BY abs(x - value)
  ) a order by newid()
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文