UniData UniQuery - 两个WITH

发布于 2024-12-09 22:09:12 字数 334 浏览 3 评论 0原文

好吧,我对 SQL 语言知之甚少,我想知道 unidata 中两个WITH 与一个WITH 缓慢的可能原因是什么。

数据库大约有 100 万行。

即/

选择某处WITH Column1 =“str”和WITH Column2 =“Int” 5<分钟

SELECT某处WITH Column1 =“str”相比〜1秒

某处被索引(据我所知)

所以我做错了什么吗?

如果需要更多信息,请询问,但不确定要提供什么。

还有WITH和WHERE有什么区别?

Alright I have little to no knowledge of SQL language, and am wondering what are the possible reasons for the slowness of two WITH vs one WITH in unidata.

Database has around ~1 million rows.

Ie/

SELECT somewhere WITH Column1 = "str" AND WITH Column2 = "Int" 5< minutes

Compared to

SELECT somewhere WITH Column1 = "str" ~1 second

somewhere is indexed (from my knowledge)

so is there anything I'm doing wrong?

If more information is required just ask, not sure what to supply.

Also whats the difference between WITH and WHERE?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

奢望 2024-12-16 22:09:12

这不是 SQL,而是 UniQuery。

为了向您澄清这一点,您无法索引该文件(在本例中为某处),只能索引该文件的列文件。您可能会发现 Column1 已建立索引,而 Column2 未建立索引。在某处输入LIST.INDEX以找出哪些列已被索引。

对于您的问题,您仅将 Column1 上的选择与 Column1 和 Column1 上的选择进行了比较。 Column2 并假设响应速度慢得多纯粹是因为您选择了 2 列。您的下一个文本应该是仅在 Column2 上进行选择,然后看看它有多慢。

除了索引之外,还有许多可能的原因可以解释响应的差异。在 UniData 中,列被定义为“字典项”,有不同类型字典项目。最基本的是 D 类型字典项,它只是对记录中字段的直接引用。另一种类型是 I 或 V 类型,它是派生字段。派生字段可以像返回常量一样简单,也可以像执行与另一个文件和/或某种形式的复杂计算的 JOIN 等效操作一样复杂。这应该很容易看出,不同的列可能需要截然不同的处理量来处理。

其他原因包括列在记录中的深度(第一个字段引用将比记录中后面的字段更快)以及可能影响 SELECT 时间的潜在查询缓存。

有关详细信息,请查看 Rocket Software 上的数据库手册。

This isn't SQL, it is UniQuery.

To clarify it for you, you can't index the file (somewhere, in this case), only the columns of the file. You might find Column1 is indexed and Column2 is not. Type in LIST.INDEX somewhere to find out what columns have been indexed.

For your question, you have only compared selecting on Column1 against selecting on Column1 & Column2 and assumed the vastly slower response is purely because you selected on 2 columns. Your next text should have been to select only on Column2 and seen how slow that was.

There are are many possible reasons to explain the difference in response, aside from indexing. In UniData columns are defined as 'dictionary items' There are different types of dictionary items. The most basic is a D-type dictionary item which is just a direct reference to a field in the record. Another type is the I or V-type, which is a derived field. The derived field can be as simple as returning a constant or as complex as performing an equivalent performing a JOIN with another file and/or some form of complex calculation. This this is should be simple to see that different columns can take vastly different amounts of processing to handle.

Other reasons are how deep in the record the column is (first field references will be faster than fields later in the record) as well as potential query caching that can affect the timings of your SELECTs.

For more information, check out the database's manuals at Rocket Software.

再见回来 2024-12-16 22:09:12

索引字段上的单列 SELECT 甚至不需要读取任何数据文件记录。如果你仔细观察,你会发现索引文件是一个普通的哈希文件,单列 SELECT 仅意味着读取带有“str”键的索引文件记录。这可以在不到一秒的时间内返回成千上万个密钥。

添加第二列后,您可能会强制系统读取所有成千上万的记录,即使第二列已建立索引。这将需要相当多的时间。

一般来说,具有少量唯一值的字段上的索引的用途是可疑的。如果第二列包含具有大量可能值的数据,导致每个特定索引值的记录数量较少,则最好安排 SELECT 以使所使用的索引位于第二列上。我不确定,但可能可以简单地颠倒 SELECT 语句中列的顺序来执行此操作。否则,您可能需要连续运行两个 SELECT 语句。

例如,假设文件有 600,000 条记录(Column1 = "str")和 2,000 条记录(Column2 = "int"):

>SELECT somewhere WITH Column2 = "int"
>>SELECT somewhere with Column1 = "str"

将读取 2,000 条记录,并且几乎立即返回。

如果 Column1 和 Column2 的组合是您经常选择的内容,那么您可能需要创建一个组合两者的新字典项,并在其上构建索引。

话虽如此,U2 系统不应该花费 5 分钟来运行包含一百万条记录的文件。文件很有可能已经严重溢出,需要使用更大的模数来调整大小以提高性能。

A single column SELECT on an indexed field will not even require that any data file records are read. If you look under the hood, you'll see that the index file is a normal hash file, and the single column SELECT will simply mean that the index file record with the key "str" is read. This could return thousands and thousands of keys in less than a second.

Once you add the second column, you are probably forcing the system to read all of those thousands and thousands of records, EVEN IF THE SECOND COLUMN IS INDEXED. This is going to take a measurable amount of more time.

In general, an index on a field with a small number of unique values is of dubious use. If the second column contains data that has a large number of possible values, leading to a smaller number of records with each particular index value, then it would be best to arrange the SELECT such that the index used is on the second column. I'm not sure, but it might be possible to simply reverse the order of the columns in the SELECT statement to do this. Otherwise you might need to run two SELECT statements back to back.

As an example, assume that the file has 600,000 records with Column1 = "str", and 2,000 records with Column2 = "int":

>SELECT somewhere WITH Column2 = "int"
>>SELECT somewhere with Column1 = "str"

Will read 2,000 records and should return almost instantly.

If the combination of Column1 and Column2 is something that you'll be SELECTing on frequently, then you might want to create a new dictionary item that combines the two, and build an index on that.

That being said, it shouldn't take a U2 system 5 minutes to run through a file of a million records. There's a very good chance that the file has become badly overflowed, and needs to be resized with a larger modulo to improve performance.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文