In MySQL, what is an efficient way to get only the rows of a table with distinct values in one of that table's columns?
I'm relatively new to MySQL, so bear with me.
I have a table that looks a bit like this:
ID | Name | Location
0 | John | Los Angeles
1 | Joe | San Jose
2 | Jane | New York
3 | Sal | Boise
4 | Jay | New York
5 | Kate | San Jose
I need a SELECT statement that gets all rows, except that if a Location is repeated, the repeated row is ignored. The result should look something like this:
0 | John | Los Angeles
1 | Joe | San Jose
2 | Jane | New York
3 | Sal | Boise
The important thing is that my table is very, very large, with hundreds of thousands of rows. Most things I've tried have ended up as SELECT statements that take literally 30+ minutes to complete!
This is how I would do it:
The derived table gets the minimum ID for each Location. It is then joined back to the table to pull the rest of the data.
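A minimal sketch of the derived-table approach described above, not necessarily the exact query the answerer had in mind. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it can be run as-is. The table name `people` and the alias names are assumptions for illustration; the SQL itself is standard and should behave the same way in MySQL.

```python
import sqlite3

# SQLite stands in for MySQL here; the table name `people` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (ID INTEGER PRIMARY KEY, Name TEXT, Location TEXT)"
)
conn.executemany(
    "INSERT INTO people (ID, Name, Location) VALUES (?, ?, ?)",
    [
        (0, "John", "Los Angeles"),
        (1, "Joe", "San Jose"),
        (2, "Jane", "New York"),
        (3, "Sal", "Boise"),
        (4, "Jay", "New York"),
        (5, "Kate", "San Jose"),
    ],
)

# The derived table `firsts` holds the smallest ID for each Location.
# Joining it back to `people` pulls the remaining columns for those rows.
FIRST_ROW_PER_LOCATION = """
SELECT p.ID, p.Name, p.Location
FROM people AS p
JOIN (
    SELECT MIN(ID) AS min_id
    FROM people
    GROUP BY Location
) AS firsts ON p.ID = firsts.min_id
ORDER BY p.ID
"""

rows = conn.execute(FIRST_ROW_PER_LOCATION).fetchall()
for row in rows:
    print(row)
```

Because the join is on the minimum ID, the row kept for each Location is always the one with the lowest ID, which matches the result asked for in the question.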
Incidentally, ID is a horrible choice for naming the id field. Please think about using tablenameID instead. For reporting, it is helpful not to have the same name for the id fields in different tables, and it makes it FAR less likely that you will make an accidental join mistake and join to the ID in the wrong table. It also makes the PK/FK relationships easier to see, in my opinion.
Do you have an index on Location? That should help improve the speed a lot.
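If there is no such index yet, creating one could look roughly like the sketch below. It again uses Python's sqlite3 module as a stand-in for MySQL; the table name `people` and the index name `idx_people_location` are made up for illustration. The `CREATE INDEX` statement shown is standard SQL and is also accepted by MySQL.

```python
import sqlite3

# SQLite stands in for MySQL; `people` and `idx_people_location` are
# hypothetical names chosen for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (ID INTEGER PRIMARY KEY, Name TEXT, Location TEXT)"
)

# Index the column that the query groups or filters on.
conn.execute("CREATE INDEX idx_people_location ON people (Location)")

# List the indexes now defined on the table to confirm it was created.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'people'"
    )
]
print(index_names)
```

How much this helps depends on the query and the data, so it is worth checking the query plan (EXPLAIN in MySQL) before and after adding the index.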
You can use
SELECT * FROM tbl GROUP BY Location
Creating an index on Location can dramatically reduce the query time.
Also, if your table has a large number of columns, specifying only the required columns instead of using * will improve performance further.
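One caveat worth knowing: with the ONLY_FULL_GROUP_BY SQL mode, which is on by default from MySQL 5.7.5 onward, `SELECT *` combined with `GROUP BY Location` is rejected, and where it is allowed, which row is returned for each group is not guaranteed. A hedged variant that names its columns explicitly and aggregates the ID is sketched below. It uses Python's sqlite3 module as a stand-in for MySQL, and the alias `first_id` is an assumption for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; `tbl` is the table name used in the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (ID INTEGER PRIMARY KEY, Name TEXT, Location TEXT)"
)
conn.executemany(
    "INSERT INTO tbl (ID, Name, Location) VALUES (?, ?, ?)",
    [
        (0, "John", "Los Angeles"),
        (1, "Joe", "San Jose"),
        (2, "Jane", "New York"),
        (3, "Sal", "Boise"),
        (4, "Jay", "New York"),
        (5, "Kate", "San Jose"),
    ],
)

# Every selected column is either in GROUP BY or wrapped in an aggregate,
# so the result is well defined: one row per Location with its lowest ID.
rows = conn.execute(
    "SELECT Location, MIN(ID) AS first_id "
    "FROM tbl GROUP BY Location ORDER BY first_id"
).fetchall()
for row in rows:
    print(row)
```

This returns only Location and the lowest ID per Location. To get the other columns of those rows as well, the IDs need to be joined back to the table.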