查询大量数据进行网页展示
我是 Oracle 新手(了解一些基本 SQL),但任务不知何故分配给了我。
我的任务是优化查询。它连接一些大表(查询比较耗时)并将结果返回到网页(可以使用分页)。
我的问题是我应该朝什么方向看?如果能提供链接就更好了。
以下是一些详细信息:
采用的优化方法
index on dayNum, timeNum, entityID columns
行数:
- time_seq: 100,000
- Indicator_set: 50,000,000
- tity amount: 1,000
查询:
select *
from time_seq seq,
(
select entityID,
dayNum,
timeNum,
sum(ind1) sum_ind1,
avg(ind2) avg_ind2,
max(ind3) max_ind3
from indicator_set
group by entityID, dayNum, timeNum
-- the dayNum timeNum are grouped to the time sequence standard
-- in time_seq table
) sum
where seq.entityID=sum.entityID
and seq.dayNum=sum.dayNum
and seq.timeNum=sum.timeNum
表 time_seq
entityID, dayNum, timeNum
--------------------------------
object1, 20110818, 220000
object1, 20110818, 223000
object1, 20100818, 230000
object1, 20110819, 220000
object1, 20110819, 223000
object1, 20100819, 230000
object2, 20110818, 220000
object2, 20110818, 223000
object2, 20100818, 230000
object2, 20110819, 220000
object2, 20110819, 223000
object2, 20100819, 230000
表格indicator_set
entityID, dayNum, timeNUm, ind1, ind2, ind3
--------------------------------------------
object1, 20110818, 220000, 23,34,23
object1, 20110818, 220500, 23,54,543
object1, 20110818, 220530, 23,54,543
object1, 20110818, 220610, 23,54,543
object1, 20110818, 222900, 23,54,543
...
PS:抱歉,我没有提供表格和查询的确切格式/详细信息。就是太复杂了。
干杯~
I am new to Oracle (know some basic SQL), but the task is somehow assigned to me.
My task is to optimize a query. It joins some large tables (query is time consuming) and returns the result to a webpage (paging can be used).
My question is what direction should I look into? It would be better if links can be provided.
Here is some details:
Optimized methods taken
index on dayNum, timeNum, entityID columns
Number of rows:
- time_seq: 100,000
- indicator_set: 50,000,000
- tity amount: 1,000
Query:
select *
from time_seq seq,
(
select entityID,
dayNum,
timeNum,
sum(ind1) sum_ind1,
avg(ind2) avg_ind2,
max(ind3) max_ind3
from indicator_set
group by entityID, dayNum, timeNum
-- the dayNum timeNum are grouped to the time sequence standard
-- in time_seq table
) sum
where seq.entityID=sum.entityID
and seq.dayNum=sum.dayNum
and seq.timeNum=sum.timeNum
Table time_seq
entityID, dayNum, timeNum
--------------------------------
object1, 20110818, 220000
object1, 20110818, 223000
object1, 20100818, 230000
object1, 20110819, 220000
object1, 20110819, 223000
object1, 20100819, 230000
object2, 20110818, 220000
object2, 20110818, 223000
object2, 20100818, 230000
object2, 20110819, 220000
object2, 20110819, 223000
object2, 20100819, 230000
Table indicator_set
entityID, dayNum, timeNUm, ind1, ind2, ind3
--------------------------------------------
object1, 20110818, 220000, 23,34,23
object1, 20110818, 220500, 23,54,543
object1, 20110818, 220530, 23,54,543
object1, 20110818, 220610, 23,54,543
object1, 20110818, 222900, 23,54,543
...
PS: sorry I didn't provide exact format/detail of the table and query. Just too complicated.
cheers~
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您正在如何处理此查询的结果?假设查询返回大约 100,000 行(我假设联接不会消除 TIME_SEQ 中的大量行),则该查询没有多大意义。您通过网页向用户返回 100,000 个无序行 - 人类不可能翻阅 100,000 行,并且以无序方式呈现这些行(或在中间层)。
为了填充网页而尝试动态聚合 5000 万行数据几乎没有意义。这本质上是一个缓慢的操作——如果用户期望在一两秒内得到响应,那么您就没有时间读取和聚合 5000 万行。您可以在
INDICATOR_SET
上创建一个物化视图,用于预聚合数据。这将使插入和更新速度变慢,但会加快查询速度。当然,即使您将 5000 万行预先聚合为 100,000 行,您仍然尝试连接两个 100,000 行表并向客户端发送 100,000 个无序行,这不太合理。您可以添加一个 ORDER BY 子句,以便行按合理的顺序排列,但这通常会增加查询时间,因为您引入了额外的排序。根据您想要排序的内容,可能有一些方法可以优化它。然而,我仍然会回到一个根本问题,即向人类客户端返回 100,000 行从根本上来说不是正确的方法。
What are you doing with the results of this query? Assuming that the query returns on the order of 100,000 rows (I assume that the join does not eliminate a substantial number of the rows in
TIME_SEQ
), the query doesn't make a lot of sense. You're returning 100,000 unordered rows to a user via a web page-- there is no way that a human would ever page through 100,000 rows and there is no way that it makes sense to present those rows unordered (or to order them in the middle tier).It would very rarely make sense to try to aggregate 50 million rows of data on the fly in order to populate a web page. That's going to be an inherently slow operation-- if users are expecting a response in a second or two, you don't have time to read and aggregate 50 million rows. You could potentially create a materialized view on
INDICATOR_SET
that could be used to pre-aggregate the data. That will make inserts and updates slower but should speed up your query.Of course, even if you pre-aggregate the 50 million rows down to just 100,000 rows, you're still trying to join two 100,000 row tables and send 100,000 unordered rows to the client which is unlikely to be reasonable. You could potentially add an
ORDER BY
clause so that the rows come in a reasonable order but that is generally going to add time to the query since you're introducing an extra sort. There may be ways to optimize that depending on what you intend to sort by. However, I'd still come back to the fundamental problem that returning 100,000 rows to a human client is fundamentally not the right approach.在子查询中链接indicator_set 和time_seq 表是否会限制在大型indicator_set 表中分组的行数?
不过,外部查询仍将确保只从 time_seq 表中返回记录。
这有帮助吗?
编辑:顺便说一下,我会使用比 SUM 更好的表别名,因为它是一个 Oracle 函数。
Would linking the indicator_set and time_seq tables within the subquery limit the amount of rows to be grouped in the large indicator_set table?
The outher query would still ensure that only the records from the time_seq table would be returned from the query though.
Is this helpful?
EDIT: By the way, i'd use a better table alias than SUM as it is an oracle function.
您可以根据查询创建物化视图:
该表会更小。您也可以在此物化视图中创建索引。您的选择将是这样的:
您的物化视图可以在快速模式、正常模式等下刷新...如果您想要更多信息,请看一下:
http://download.oracle.com/docs/cd /B13789_01/server.101/b10759/statements_6002.htm
http://orafaq.com/wiki/Oracle_Materialized_Views
You can create a materialized view based on your query:
This table would be smaller. You can create your indexes also in this materialized view. Your select would be something like this:
Your materialized view could be refresehed in fast mode, normal, etc... If you want some more intormation, take a look:
http://download.oracle.com/docs/cd/B13789_01/server.101/b10759/statements_6002.htm
http://orafaq.com/wiki/Oracle_Materialized_Views