如何用Python实现数据库风格的表

发布于 2024-10-02 18:32:54 字数 707 浏览 1 评论 0原文

我正在实现一个类似于典型数据库表的类：

具有命名列和未命名行，
具有主键，我可以通过主键引用行，
支持按主键检索和分配，并且
可以要求列标题添加唯一或非唯一任何列的索引，允许快速检索在该列中具有给定值的行（或行集），
删除行的速度很快，并且实现为“软删除”：该行在物理上保留，但被标记为删除并且不会出现在任何后续检索操作中
添加列速度很快行
很少添加
列很少被删除

我决定直接实现该类，而不是使用 sqlite 的包装器。

什么是一个好的数据结构？

举个例子，我正在考虑的一种方法是字典。它的键是表的主键列中的值；它的值是通过以下方式之一实现的行：

作为列表。列号映射到列标题（一个方向使用列表，另一个方向使用地图）。这里，检索操作首先将列标题转换为列号，然后在列表中查找相应的元素。
作为字典。列标题是该字典的键。

不确定两者的优缺点。

我想编写自己的代码的原因是：

我需要跟踪行删除。也就是说，我希望能够随时报告哪些行被删除以及出于什么“原因”（“原因”被传递给我的删除方法）。
我在索引期间需要一些报告（例如，在构建非唯一索引时，我想检查某些条件并报告它们是否被违反）

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

素罗衫 2024-10-09 18:32:54

您可能需要考虑创建一个在底层使用内存中 sqlite 表的类：

import sqlite3

class MyTable(object):
    def __init__(self):
        self.conn=sqlite3.connect(':memory:')
        self.cursor=self.conn.cursor()
        sql='''\
            CREATE TABLE foo ...
        '''
        self.execute(sql)
    def execute(self,sql,args):
        self.cursor.execute(sql,args)
    def delete(self,id,reason):
        sql='UPDATE table SET softdelete = 1, reason = %s where tableid = %s'
        self.cursor.execute(sql,(reason,id,))
    def verify(self):
        # Check that certain conditions are true
        # Report (or raise exception?) if violated
    def build_index(self):
        self.verify()
        ...

可以通过使用 softdelete 列（bool 类型）来实现软删除。
同样，您可以有一列来存储删除原因。
取消删除只涉及更新行并更改 softdelete 值。
选择尚未删除的行可以通过 SQL 条件 WHERE softdelete != 1 来实现。

您可以编写一个 verify 方法来验证数据的条件是否得到满足。您可以从 build_index 方法中调用该方法。

另一种选择是使用 numpy 结构化掩码数组。

很难说什么最快。也许唯一可靠的判断方法是为每个数据编写代码，并使用 timeit 对实际数据进行基准测试。

You might want to consider creating a class which uses an in-memory sqlite table under the hood:

import sqlite3

class MyTable(object):
    def __init__(self):
        self.conn=sqlite3.connect(':memory:')
        self.cursor=self.conn.cursor()
        sql='''\
            CREATE TABLE foo ...
        '''
        self.execute(sql)
    def execute(self,sql,args):
        self.cursor.execute(sql,args)
    def delete(self,id,reason):
        sql='UPDATE table SET softdelete = 1, reason = %s where tableid = %s'
        self.cursor.execute(sql,(reason,id,))
    def verify(self):
        # Check that certain conditions are true
        # Report (or raise exception?) if violated
    def build_index(self):
        self.verify()
        ...

Soft-delete can be implemented by having a softdelete column (of bool type).
Similarly, you can have a column to store reason for deletion.
Undeleting would simply involve updating the row and changing the softdelete value.
Selecting rows that have not been deleted could be achieved with the SQL condition WHERE softdelete != 1.

You could write a verify method to verify conditions on your data are satisfied. And you could call that method from within your build_index method.

Another alternative is to use a numpy structured masked array.

It's hard to say what would be fastest. Perhaps the only sure way to tell would be to write code for each and benchmark on real-world data with timeit.

回复收藏 0 原文

心安伴我暖 2024-10-09 18:32:54

我会考虑构建一个带有元组或列表键的字典。例如： my_dict(("col_2", "row_24")) 会得到这个元素。从那里开始，编写“get_col”和“get_row”方法以及前面两个方法中的“get_row_slice”和“get_col_slice”来访问您的数据库将非常容易（如果对于非常大的数据库来说不是非常快）方法。

使用这样的完整字典有两个优点。 1）获取单个元素会比你提出的两种方法更快； 2）如果您想在列中具有不同数量的元素（或缺少的元素），这将使其变得非常简单且内存高效。

只是一个想法:)我很好奇人们会建议什么包！

干杯

回复收藏 0 原文