CREATE TABLE statement for a very large table in Oracle
I am planning to store around 5 TB of data in a table. Following is the default script I have generated to create the table:
CREATE TABLE measurements
(
  measurementtime DATE NOT NULL,
  height          NUMBER,
  offset          NUMBER
)
PCTFREE 10
PCTUSED 40
INITRANS 1
MAXTRANS 255
TABLESPACE mytablespace
STORAGE (
  INITIAL 262144
  NEXT 262144
  PCTINCREASE 0
  MINEXTENTS 1
  MAXEXTENTS 2147483645
  FREELIST GROUPS 0
  FREELISTS 0
)
Can you please tell me if I need to modify any of the parameters?
It's difficult to give good advice without knowing the entire system. The ideas below are guesses based on your script and on your previous question, "How do I design a table which will store very large data?" But don't believe everything you read: you'll need to do a lot of testing, and you can easily spend many days trying to optimize a table like this.
PCTFREE 0: If you have 5 TB and no indexes, I assume you won't be doing any updates. If that's true, you can change PCTFREE to 0 and save 10% of your space.
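If the table already exists, this is a one-line change; note that a lower PCTFREE only affects how future inserts fill blocks:

ALTER TABLE measurements PCTFREE 0;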
NOLOGGING: If you don't need your data to be recoverable, and if your database is in archivelog mode, then you may want to add NOLOGGING. NOLOGGING with APPEND (direct-path) inserts will not generate redo. (This may be unnecessary if your tablespace is already set to NOLOGGING.)
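A minimal sketch of such a load, assuming your data arrives through a staging table (staging_measurements is a made-up name for whatever your actual source is):

ALTER TABLE measurements NOLOGGING;

-- The APPEND hint requests a direct-path insert; combined with
-- NOLOGGING it avoids most redo generation.
INSERT /*+ APPEND */ INTO measurements
SELECT measurementtime, height, offset
FROM staging_measurements;

COMMIT;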
PARALLEL: You'll definitely want to use parallelism with this table. Since this table is likely unusual for your system, it's probably better to define parallelism at the statement or session level. But if you won't have control over all of the statements run against this table, you may want to consider defining the degree here to make sure that statements don't run in serial.
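For example, here is what each level looks like; the degree of 8 is an arbitrary placeholder, not a recommendation:

-- Statement level: a hint on a single query.
SELECT /*+ PARALLEL(m, 8) */ COUNT(*)
FROM measurements m;

-- Session level: enable parallel DML before a big load.
ALTER SESSION ENABLE PARALLEL DML;

-- Table level: a default degree for all statements against the table.
ALTER TABLE measurements PARALLEL 8;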
Remove unknown options: I think you should remove all of the options that you didn't specifically set, or don't understand. If you used a tool like TOAD or DBMS_METADATA to generate the script, it will list every option, but usually you should leave most of that stuff out and let Oracle use whatever it likes.
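Putting the points above together, a trimmed script could be as short as this (PCTFREE 0 and NOLOGGING only apply if the earlier assumptions about your workload hold):

CREATE TABLE measurements
(
  measurementtime DATE NOT NULL,
  height          NUMBER,
  offset          NUMBER
)
PCTFREE 0
NOLOGGING
TABLESPACE mytablespace;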
Compression/Partitioning: Like Gary mentioned, partitioning may be very useful. But in your previous question you mentioned storing up to 200 TB in total and only 5 TB per day. Are you using a poor man's partitioning, renaming and recreating the table every day? If this is just a day's worth of data, then compression should be very helpful with measurementtime.
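As a sketch, a compressed, range-partitioned version might look like this on 11g (the interval-partitioning clause and the initial bound are illustrative, and basic COMPRESS only benefits direct-path loads):

CREATE TABLE measurements
(
  measurementtime DATE NOT NULL,
  height          NUMBER,
  offset          NUMBER
)
COMPRESS
TABLESPACE mytablespace
PARTITION BY RANGE (measurementtime)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))  -- one partition per day
(
  PARTITION p_first VALUES LESS THAN (DATE '2011-01-01')
);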
Block size: I'm not sure what the settings of "mytablespace" are, but since you didn't mention anything about it, I'm guessing you're using the standard block size. It might be worth using a larger block size for this table so you can get even better compression (since compression is done per block, the more data in a block, the more Oracle can compress). I'm reluctant to suggest this, because 99% of the time when people change the block size it doesn't do what they think it should. But with such a large amount of data it might be worth considering.
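If you do test it, the rough shape is below; the 32K size, cache size, and datafile path are placeholders, and a buffer cache for the nonstandard block size must exist before the tablespace can be created:

-- Reserve a buffer cache for 32K blocks (size is a placeholder).
ALTER SYSTEM SET db_32k_cache_size = 256M;

-- Create a tablespace using the nonstandard block size.
CREATE TABLESPACE mytablespace_32k
DATAFILE '/u01/oradata/measurements01.dbf' SIZE 10G
BLOCKSIZE 32K;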