How do I join two tables on three conditions?
I hope someone can figure out what I'm doing wrong here. The task seems quite simple, but it is apparently beyond me.
I have two tables and I am trying to append precipitation data. Three columns need to match between the two tables to get the correct precipitation data for each field site: year, latitude, and longitude. I tried the following queries (among other failed queries that I can't even remember any more):
SELECT f.*, g.*
FROM fieldSites f LEFT OUTER JOIN gpcp_precipitation2 g
ON f.date = g.year
AND f.d_lat = g.lat
WHERE f.d_lon = g.lon;
This one timed out.
I also tried:
SELECT *
FROM fieldSites
INNER JOIN gpcp_precipitation2
ON (fieldSites.d_lon = gpcp_precipitation2.lon
AND fieldSites.d_lat = gpcp_precipitation2.lat
AND fieldSites.date = gpcp_precipitation2.year);
This one also timed out on me.
I'd like to dump to a .csv file, but right now I just want to execute a successful query.
Here are my tables:
left table: fieldSites
siteId d_lat d_lon year data1 data2 country
1 -13.75 18.75 2009 0.598 0.351 Angola
1 -13.75 18.75 2008 0.654 0.330 Angola
1 -13.75 18.75 2007 0.489 0.381 Angola
1 -13.75 18.75 2006 0.554 0.389 Angola
1 -13.75 18.75 2005 0.321 0.321 Angola
1 -13.75 18.75 2004 0.598 0.351 Angola
1 -13.75 18.75 2003 0.654 0.330 Angola
1 -13.75 18.75 2002 0.489 0.381 Angola
1 -13.75 18.75 2001 0.554 0.389 Angola
2 -78.75 163.75 2009 0.285 0.155 Antarctica
2 -78.75 163.75 2008 0.285 0.155 Antarctica
2 -78.75 163.75 2007 0.285 0.155 Antarctica
2 -78.75 163.75 2006 0.285 0.155 Antarctica
2 -78.75 163.75 2005 0.285 0.155 Antarctica
...1052 sites, 11 years, 11496 rows
right table: gpcp_precipitation2
siteId lat lon year precipitation
1 81.5 1.25 2009 93.36571912
1 81.5 1.25 2008 93.36571912
1 81.5 1.25 2007 93.36571912
1 81.5 1.25 2006 93.36571912
1 81.5 1.25 2005 93.36571912
1 81.5 1.25 2004 93.36571912
1 81.5 1.25 2003 93.36571912
1 81.5 1.25 2002 93.36571912
1 81.5 1.25 2001 93.36571912
1 81.5 1.25 2000 93.36571912
1 81.5 3.75 2009 93.36571912
1 81.5 3.75 2008 93.36571912
1 81.5 3.75 2007 93.36571912
... 92300 rows
What I want is this:
siteId d_lat d_lon year data1 data2 country precipitation
1 13.75 18.75 2009 0.598 0.351 Angola 144.286
1 13.75 18.75 2008 0.654 0.330 Angola 114.970
1 13.75 18.75 2007 0.489 0.381 Angola 70.000
1 13.75 18.75 2006 0.554 0.389 Angola 174.179
1 13.75 18.75 2005 0.321 0.321 Angola 174.743
1 13.75 18.75 2004 0.598 0.351 Angola 70.506
1 13.75 18.75 2003 0.654 0.330 Angola 173.716
1 13.75 18.75 2002 0.489 0.381 Angola 74.162
1 13.75 18.75 2001 0.554 0.389 Angola 139.445
2 78.75 163.75 2009 0.285 0.155 Antarctica 0
2 78.75 163.75 2008 0.285 0.155 Antarctica 0
2 78.75 163.75 2007 0.285 0.155 Antarctica 0
2 78.75 163.75 2006 0.285 0.155 Antarctica 0
Am I doing something completely stupid? I am stumped.
Thanks so much for any advice.
Comments (3)
If that query is timing out, you have an indexing problem, not necessarily a query problem. This gives you several predicates in the where clause to filter on, so it should reduce your joins quite a bit, but you may need an index that includes siteId, year, lat, and lon on both tables.
Add an index on (date, d_lat, d_lon) to the first table and a (year, lat, lon) index to the second table. Then try the joins.
From your comments, I suggest you use the second query.
Can you also post the query plan for the above, now that you have added some indexes? (Use EXPLAIN SELECT ...)
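The index suggestion can be sketched roughly as follows. This is only an illustration, not a tested fix: the index names are made up, the statements assume a MySQL-style dialect, and the left-table column is written as year because that is what the sample fieldSites table shows (the queries in the question call it date, so adjust to whatever the real column is).

```sql
-- Composite indexes covering the three join columns of each table.
-- Index names are hypothetical; the column `year` on fieldSites is an
-- assumption taken from the sample table (the question's queries use `date`).
CREATE INDEX idx_fieldsites_year_lat_lon
    ON fieldSites (year, d_lat, d_lon);

CREATE INDEX idx_gpcp_year_lat_lon
    ON gpcp_precipitation2 (year, lat, lon);

-- Ask the database for the query plan to check whether the join
-- now uses the indexes instead of scanning both tables.
EXPLAIN
SELECT *
FROM fieldSites
INNER JOIN gpcp_precipitation2
    ON fieldSites.d_lon = gpcp_precipitation2.lon
   AND fieldSites.d_lat = gpcp_precipitation2.lat
   AND fieldSites.year  = gpcp_precipitation2.year;
```

Without an index on the join columns, the database may have to compare each of the 11496 field-site rows against all 92300 precipitation rows, which would be consistent with the timeouts described.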
The last line:
AND fieldSites.date = gpcp_precipitation2.year);
According to your tables, this should be fieldSites.year.
Is that a typo, or is it the error?
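If the column really is named year, as the sample table suggests, the join might look something like this. It is a sketch under that assumption, selecting only the precipitation column from the right table to match the desired output shown in the question.

```sql
-- Assumes fieldSites has a `year` column (as in the sample table), not `date`.
-- All three match conditions stay in the ON clause so the LEFT JOIN keeps
-- every field-site row, with NULL precipitation where no grid cell matches.
SELECT f.*, g.precipitation
FROM fieldSites AS f
LEFT JOIN gpcp_precipitation2 AS g
       ON f.year  = g.year
      AND f.d_lat = g.lat
      AND f.d_lon = g.lon;
```

Keeping the longitude test in ON rather than in WHERE matters for the first query in the question: a WHERE condition on a right-table column discards the unmatched rows, so the LEFT OUTER JOIN there behaves like an inner join.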