数据库索引
我需要开发一个“简单”的数据库索引实现,以便在分布式环境中使用。 我对这个学科几乎一无所知,而且时间有点压力。
我很想听听关于这个主题的一些意见、例子和算法。 我希望能够对我需要实施的内容有一个心理表征。
编辑:我指的是聚集索引
I need to develop a "naive" implementation of database indexes for use in a distributed environment. I know almost nothing about the subject, and I'm a bit pressured by time.
I would love to hear some opinions, examples and algorithms on the subject.
I'd like to be able to have a mental representation of what I need to implement.
EDIT: I'm referring to clustered indexes
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
索引基本上有两种主要类型:
聚集(即数据是物理组织的,如果需要,您可以在每次插入时重新排序)
典型用例:物理组织通常与插入顺序相同,因此重新排序开销不是问题。 例如,连续 UID(数据库上下文中所谓的“IDENTITY”字段)就是这种情况
聚集索引的一个明显缺点是您的数据只能有一个这样的索引。
如果插入顺序恰好是排序顺序,则简单实现:使用列表。
非聚集(即您保留指针数据上,如哈希表中)
典型用例:集群是不合适的,因为它会导致很大的插入开销。
根据您的需要,您最终可能会使用这两种数据结构。
可以使用与索引相关的信息的广泛存储库 这里
There are basically two main types of indexes :
Clustered (ie the data is physically organized, and you re-sort it at each insertion if needed)
Typical use-case : the physical organization is usually the same as the insertion order, so the re-sorting overhead is not a problem. This is for example the case with sequential UID's (the so called "IDENTITY" fields in a database context)
An obvious drawback of clustered indexing is that you can have only one such index on your data.
Naive implementation if the insertion order is exactly the sorting order : use a List.
Unclustered (ie you keep pointers on the data, as in a Hashtable)
Typical use-case : The clustering is not appropriate because it would induce to big an insertion overhead.
Depending on your needs, you'll probably end up using on those two datastructures
An extensive repository of Index-related information is available here
一个非常快速且易于实现、非常简单的索引实现,最适合任何具有本机 关联数组 格式是一种散列,其键是要索引的列的现有值,其值是具有该值的行的行 ID 数组。
A really fast-and-easy-to-implement, really naive index implementation, most appropriate to any language that has a native associative array format, is a hash whose keys are extant values for the column you're indexing and whose values are arrays of row IDs for the rows with that value.