MySQL 查询从 3 个表接收随机组合时遇到一些问题
这是原来的问题:
好吧,这是我的问题,我有两个 表,一个名为名字和 其他姓氏。我是什么 在这里尝试做的是找到 100 个 这些可能的组合 测试数据的名称。名字 表中有 5494 个条目 列,姓氏表有 单列中有 88799 个条目。这 只查询我能够做到的 得出一些结果是:
<前><代码>选择*来自 (SELECT * FROM 名字 ORDER BY rand()) f 左连接 (从姓氏中选择 * ORDER BY rand()) l ON 1=1 limit 10;这段代码的问题是它选择 1 个名字并给出 每个可以与之搭配的姓氏。 虽然这是合理的,但我会 将限制设置为 500000000 为了得到所有的组合 没有可能只有 20 首 名字(我宁愿不杀掉我的 服务器)。然而我只需要100 随机生成条目 测试数据,我将无法 用这段代码得到它。任何人都可以 请给我任何建议吗?
- 上面的问题已经回答了,下面的问题我需要建议。我只是提供它作为背景。
我想将另一个表加入到名为 status 的组合中。该表的一列中有 5 个条目,每当我尝试将其连接到其他两个表时,它最终都会再次重复其他两个表,以便状态标签适合其中的每一个。我唯一取得一点成功的是:
SELECT *
FROM ( SELECT firstnames FROM firstnames ORDER BY RAND( ) LIMIT 5 ) AS First
JOIN ( SELECT lastnames FROM lastnames ORDER BY RAND( ) LIMIT 5 ) as Last
JOIN ( SELECT status FROM status ORDER BY RAND( ) LIMIT 1) AS Status ON 1=1;
虽然名字和姓氏在此查询中不重复,但每个名字只列出一个状态标签。非常感谢你们的帮助!
This was the original question:
Alright, here is my issue, I have two
tables, one named firstnames and the
other named lastnames. What I am
trying to do here is to find 100 of
the possible combinations from these
names for test data. The firstnames
table has 5494 entries in a single
column, and the lastnames table has
88799 entries in a single column. The
only query that I have been able to
come up with that has some results is:SELECT * FROM (SELECT * FROM firstnames ORDER BY rand()) f LEFT JOIN (SELECT * FROM lastnames ORDER BY rand()) l ON 1=1 limit 10; The problem with this code is
that it selects 1 firstname and gives
every lastname that could go with it.
While this is plausible, I will have
to set the limit to 500000000 in
order to get all the combinations
possible without having only 20 first
names(and I'd rather not kill my
server). However, I only need 100
random generations of entries for
test data, and I will not be able to
get that with this code. Can anyone
please give me any advice?
- The above question has already been answered, I need advice on the question below. I am just providing it as a background.
I want to join another table to the mix named status. This table has 5 entries in one column, and whenever I attempt to join it to the other two, it ends up repeating the two other tables over again so that the status tags fit with every one of them. The only one that I am getting mild success with is this:
SELECT *
FROM ( SELECT firstnames FROM firstnames ORDER BY RAND( ) LIMIT 5 ) AS First
JOIN ( SELECT lastnames FROM lastnames ORDER BY RAND( ) LIMIT 5 ) as Last
JOIN ( SELECT status FROM status ORDER BY RAND( ) LIMIT 1) AS Status ON 1=1;
While the first and last names do not repeat in this query, only one status tag is listed with each of them. Thank you guys very much for your help!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
使用MySQL变量,你应该能够通过类似的东西来完成......虽然没有明确测试,但这应该让你得到你想要的。您无法进行简单的笛卡尔交叉连接,因为如您所知,它将获得第一个名字并与所有姓氏连接,然后是下一个名字与所有姓氏...
此查询使用 MySQL 变量。内部查询(分别预先查询名字和姓氏)将预先查询您随机的 10 个名字(或姓氏)。然后,将其连接到名字序列 (@fns) 和姓氏序列 (@lns) 的 @variable。由于两者都只有 10 条记录,并且每条记录的计数器都从 0 开始,因此它们都会产生序列为 1-10 的记录,因此 JOIN 将位于 SEQUENCE 上,其中每个保证值中只会出现一个...与 rand() 相比,它返回一些浮点分数,其中您永远无法保证一个表中的数字与另一个表中的数字相匹配。
Using MySQL Variables, you should be able to accomplish via something like... Although not explicitly tested, this SHOULD get you what you want. You can't do a simple Cartesian cross-join because as you know it will get the first FirstName and join with ALL the Last Names, then the NEXT First Name with all the last names...
This query uses MySQL variables. The inner query (pre respective first and last names) will pre-query your random 10 first (or last) names. Then, join that to a @variable for first name sequence (@fns) and last name sequence (@lns). Since both will have only 10 records, and each will both start with their counter at 0, they will BOTH result in records with a sequence from 1-10 and thus the JOIN will be on the SEQUENCE where only one of each guaranteed value will occur... vs rand() which returns some floating point fraction where you are never guaranteed a number from one table will match that of another.
我首先会在两个表上添加一个(自动)id 字段。
然后我会编写一个存储过程:
创建一个带有 2 个字段(firstid、lastid)的临时表
randomid
。在两个字段中插入 100 行(或您需要的数字),其中包含随机整数(从 1 到 MAX(firstnames.id) 到第一个字段,从 1 到 MAX(lastnames.id) 到第二个字段) .
将
firstnames
和lastnames
表连接到randomid
删除临时表。
I would first add an (automunbered) id field on both tables.
Then I would write a stored procedure that:
Creates a temporary table
randomid
with 2 fields (firstid, lastid).Inserts 100 rows (or the number you need) with random integers in the two fields, (from 1 to MAX(firstnames.id) into the first field and from 1 to MAX(lastnames.id) to the second one).
Joins the
firstnames
andlastnames
tables torandomid
Drops the temp table.
把它反过来怎么样:
内连接比左连接等更快?
How about turning it on it's head:
Inner joins being faster than left joins etc?
我不知道你的状态表做什么,但这会给你随机名称:
别名选择在那里,所以查询在这个生命周期中返回 - 交叉连接中有 10000 行......可管理。如果没有它们,将会有无数的行 - 查询将不会返回您的行数
I have no idea what your status table does, but this will get you random names:
The aliased selects are there so the query returns in this lifetime - there are 10000 rows in the cross join... manageable. Without them there would be gazillions of rows - query wouldn't come back with your row counts
这应该可以解决状态问题:
This should solve the status problem: