History records, missing records, filling in the blanks
I have a table that contains a history of costs by location. These are updated on a monthly basis.
For example
Location1, $500, 01-JAN-2009
Location1, $650, 01-FEB-2009
Location1, $2000, 01-APR-2009
If I query for March 1, I want to return the value for Feb 1, since March 1 does not exist.
I've written a query using an Oracle analytic function, but it takes too much time. It would be fine for a report, but we are using this to let the user see the data visually through the front end and switch dates, and requerying takes too long because the table has something like 1 million rows.
So, the next thought I had was to simply update the table with the missing data. In the case above, I'd simply add a record identical to 01-FEB-2009, except with the date set to 01-MAR-2009.
I was wondering if you all had thoughts on how best to do this.
My plan had been to simply create a cursor for a location, fetch the first record, then fetch the next, and if the next record was not for the following month, insert a record for the missing month.
A little more information:
CREATE TABLE MAXIMO.FCIHIST_BY_MONTH
(
LOCATION VARCHAR2(8 BYTE),
PARKALPHA VARCHAR2(4 BYTE),
LO2 VARCHAR2(6 BYTE),
FLO3 VARCHAR2(1 BYTE),
REGION VARCHAR2(4 BYTE),
AVG_DEFCOST NUMBER,
AVG_CRV NUMBER,
FCIDATE DATE
)
And then the query I'm using (the system will pass in the date and the parkalpha). The table is approximately 1 million rows and, again, while the query takes a reasonable amount of time for a report, it takes way too long for an interactive display:
-- The DATE literal avoids depending on the session's NLS_DATE_FORMAT.
select location, avg_defcost, avg_crv, fcimonth, fciyear, fcidate
from (select location, avg_defcost, avg_crv, fcimonth, fciyear, fcidate,
             max(fcidate) over (partition by location) my_max_date
      from FCIHIST_BY_MONTH
      where fcidate <= DATE '2008-12-01'
        and parkalpha = 'SAAN'
     )
where fcidate = my_max_date;
The best way to do this is to create a PL/SQL stored procedure that works backwards from the present, querying for each month in turn. For each month where the query returns no data, it inserts a row for the missing month.
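As a rough illustration of the fill step, here is a set-based sketch rather than the row-by-row procedure described above. It is only a sketch under assumptions that are not stated in the question: FCIDATE is always the first day of a month, (LOCATION, FCIDATE) identifies a row, and no gap is longer than 120 months (an arbitrary cap chosen for the example).

```sql
-- Sketch only: copy each row forward into the months that have no row
-- of their own, up to (but not including) the next existing row.
-- Assumptions: FCIDATE is the first of a month; (LOCATION, FCIDATE) is unique.
INSERT INTO fcihist_by_month
  (location, parkalpha, lo2, flo3, region, avg_defcost, avg_crv, fcidate)
WITH gaps AS (
  SELECT h.*,
         -- date of the next existing row for the same location
         LEAD(fcidate) OVER (PARTITION BY location ORDER BY fcidate) AS next_date
  FROM   fcihist_by_month h
)
SELECT g.location, g.parkalpha, g.lo2, g.flo3, g.region,
       g.avg_defcost, g.avg_crv,
       ADD_MONTHS(g.fcidate, m.n) AS fcidate
FROM   gaps g
JOIN   (SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 120) m  -- 120 = assumed cap
  ON   m.n < MONTHS_BETWEEN(g.next_date, g.fcidate);
```

With the sample data from the question, the February row is followed by April, so MONTHS_BETWEEN is 2 and only n = 1 qualifies, which produces a March row carrying February's values. The latest row for each location has no next_date, so nothing is generated after it; carrying values forward to the current month would need extra handling.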
Filling in the missing data will (if you are careful) make the queries much simpler and run faster.
I would also add a flag to the table to indicate which rows are filled-in data rather than real data, so that you can remove them (or create a view without them) later if you need to.
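A minimal sketch of such a flag, assuming a hypothetical column name IS_FILLED and a hypothetical view name FCIHIST_BY_MONTH_ACTUAL (neither comes from the question). Whatever process inserts the filler rows would need to set the flag to 'Y'.

```sql
-- Hypothetical flag column: 'N' for real rows, 'Y' for filled-in rows.
-- Existing rows pick up the default 'N'.
ALTER TABLE fcihist_by_month
  ADD (is_filled CHAR(1) DEFAULT 'N' NOT NULL);

-- View that hides the filled-in rows.
CREATE OR REPLACE VIEW fcihist_by_month_actual AS
SELECT location, parkalpha, lo2, flo3, region,
       avg_defcost, avg_crv, fcidate
FROM   fcihist_by_month
WHERE  is_filled = 'N';

-- Removing the filler rows later is then a single statement.
DELETE FROM fcihist_by_month
WHERE  is_filled = 'Y';
```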
A number of times I have filled in missing data, and also filled in dummy data so that outer joins were not necessary, in order to improve query performance. It is not "clean" and "perfect", but I follow Leflar's #1 Law: "always go with what works."
You can create a job in Oracle that will automatically run at off-peak times to fill in the missing data. Take a look at this question on Stack Overflow about creating jobs.
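For illustration, a scheduled job could look roughly like this with DBMS_SCHEDULER. The job name, the schedule, and the stored procedure FILL_FCIHIST_GAPS are all hypothetical; the procedure is assumed to exist already and to perform the fill.

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'FILL_FCIHIST_GAPS_JOB',   -- hypothetical name
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'FILL_FCIHIST_GAPS',       -- hypothetical procedure
    start_date      => SYSTIMESTAMP,
    -- assumed off-peak slot: 02:00 on the 2nd of every month
    repeat_interval => 'FREQ=MONTHLY; BYMONTHDAY=2; BYHOUR=2; BYMINUTE=0; BYSECOND=0',
    enabled         => TRUE,
    comments        => 'Fill missing months in FCIHIST_BY_MONTH');
END;
/
```

The monthly schedule is just a guess based on the data being updated monthly; pick whatever window is actually quiet on your system.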
What is your precise use case underlying this request?
In every system I have worked on, if there is supposed to be a record for MARCH and there isn't one, the users would like to know that fact. Apart from anything else, they might want to investigate why the MARCH record is missing.
Now, if this is basically a performance issue, then you ought to tune the query. Or if it is a presentation issue (you want to generate a matrix of twelve rows, and that is difficult if a month doesn't have a record for some reason), then that is a different matter, with a variety of possible solutions.
But seriously, I think it is bad practice for the database to invent replacements for missing records.
Edit
I see from your recent comment on your question that it did turn out to be a performance issue: indexes fixed the problem. So I feel vindicated.
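The comment doesn't say which indexes were added, so the following is only a guess at what might help: a composite index matching the query's filter and partition columns, plus an aggregate form of the same "latest row per location" lookup. The index name and the bind variable names are made up for the example.

```sql
-- Hypothetical index: equality column first, then the partition and date columns.
CREATE INDEX fcihist_park_loc_date_ix
  ON fcihist_by_month (parkalpha, location, fcidate);

-- Latest row on or before the requested date, per location, without a
-- second pass over an inline view. :parkalpha and :as_of_date are
-- assumed bind variables supplied by the front end.
SELECT location,
       MAX(avg_defcost) KEEP (DENSE_RANK LAST ORDER BY fcidate) AS avg_defcost,
       MAX(avg_crv)     KEEP (DENSE_RANK LAST ORDER BY fcidate) AS avg_crv,
       MAX(fcidate)                                             AS fcidate
FROM   fcihist_by_month
WHERE  parkalpha = :parkalpha
AND    fcidate  <= :as_of_date
GROUP  BY location;
```

Whether the optimizer actually uses the index depends on the data distribution and statistics, so check the execution plan before relying on it.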