随机生成 MySQL 行时确保字段唯一
我正在生成大约 100,000 行随机数据,大约有 10 列,并且我需要所有值都是唯一的。
我能想到的唯一可靠的方法是对每个随机生成进行查询,以确保它尚不存在,但这会增加数十万个查询。另一种选择是在表上创建一个 UNIQUE 约束,但是我需要有某种方法来计算有多少重复项失败,以便我可以重新创建这些行(我需要知道我将生成完全相同数量的随机行)按照指定),或者在插入失败时使用新一代重试。有办法做到这一点吗?或者我是否错过了其他一些方法?
I'm generating somewhere around 100,000 rows of random data, with around 10 columns, and I need to have all of values to be unique.
The only solid way I could think of doing this would be doing a query on each random generation to make sure that it does not exist yet, but that adds hundreds of thousands of queries. The other alternative would be creating a UNIQUE constraint on the table, but then I would need to have some way of counting how many duplicates fail, so that I can recreate those rows (I need to know that I will generate exactly as many random rows as specified), or having something that retries with a new generation when that insert fails. Is there a way to do that? Or am I missing some other approach all together?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我应该在 python 中生成一个列表中的随机数。
然后,使用 set 来获取唯一值。
现在,您可以生成 200.000 个数字以获得至少 100.000 个唯一的数字,然后截断列表,等等。
我尝试了这个:
它给了我一组长度为 200.000 的数据。生成的随机数中没有重复项!
I should generate random numbers in python, in a list.
Then, use set to get uniques.
Now, you may generate 200.000 numbers to get at least 100.000 unique ones, and truncate the list, or so.
I tried this :
and it gave me a set of length 200.000. There were no dups within the generated random !
您在列中填充什么类型的数据?它必须遵循任何特定格式吗?
无论数据格式如何,我都会离线生成所有数据,然后将其一次性插入数据库中。
例如:
这样您就知道所有字符串都是唯一的,并且随机播放将使顺序(更加)随机。
What sorts of data are you filling the columns with? Does it have to follow any particular format?
Regardless of the data format, I'd generate all the data offline and then insert it into the DB all at once.
For example:
This way you know that all the strings will be unique, and the shuffle will make the order (more) random.