如何在Python中设计一个类似sqlite类的字典，它可以使用不同的字段作为“键”？

发布于 2024-09-14 04:03:49 字数 1236 浏览 3 评论 0原文

我有一个这样的数据结构，

"ID  NAME  BIRTH     AGE    SEX"
=================================
1   Joe    01011980  30     M
2   Rose   12111986  24     F
3   Tom    31121965  35     M
4   Joe    15091990  20     M

我想使用python + sqlite以简单的方式存储和查询数据。我正在尝试设计一个类似字典的对象来存储和检索这些信息，数据库也可以以简单的方式与其他应用程序共享。（只是其他应用程序的普通数据库表，那么 pickle 和 ySerial 之类的对象不适合它。）

例如：

d = mysqlitedict.open('student_table')  
 d['1'] = ["Joe","01011980","30","M"]    
 d['2'] = ["Rose","12111986","24","F"]

这可能是合理的，因为我可以使用 __setitem__() 来搭车如果“ID”作为键，其余部分作为该字典类对象的值。

问题是，如果我想使用其他字段作为语义上的键，以“NAME”为例：

 d['Joe'] = ["1","01011980","30","M"]

这将是一个问题，因为类似字典的对象在语义上应该有一个键/值对，因为现在“ID”是键，“NAME”不能作为此处的覆盖键。

那么我的问题是，我可以设计我的课程吗？然后我可以这样做吗？

 d[key="NAME", "Joe"] = ["1","01011980","30","M"] 
 d[key="ID",'1'] = ["Joe","01011980","30","M"]  

 d.update(key = "ID", {'1':["Joe","01011980","30","M"]})

>>>d[key="NAME", 'Joe']
["1","Joe","01011980","30","M"]
["1","Joe","15091990","20","M"]

>>>d.has_key(key="NAME", 'Joe']
True

如有任何回复，我将不胜感激！

肯尼迪

原文

I have a such a data structure,

"ID  NAME  BIRTH     AGE    SEX"
=================================
1   Joe    01011980  30     M
2   Rose   12111986  24     F
3   Tom    31121965  35     M
4   Joe    15091990  20     M

I want to use python + sqlite to store and query data in a easy way. I am in trying to design a dict like object to store and retrieve those information, also the database can be shared with other application in an easy way.(just a plain database table for other application, then the pickle and ySerial like object should not fit for it.)

For example:

d = mysqlitedict.open('student_table')  
 d['1'] = ["Joe","01011980","30","M"]    
 d['2'] = ["Rose","12111986","24","F"]

This can be reasonable because I can use __setitem__() to get ride of that if "ID" as the key and rest part as the value of that dict like object.

The problem is if I want to use other field either as key semantically, takes "NAME" for example:

 d['Joe'] = ["1","01011980","30","M"]

That will be a problem, because a dict like object should have a key/value pair semantically, as now "ID" is the key, "NAME" can not as overrode key here.

Then my question is, can I design my class then I may do like this?

 d[key="NAME", "Joe"] = ["1","01011980","30","M"] 
 d[key="ID",'1'] = ["Joe","01011980","30","M"]  

 d.update(key = "ID", {'1':["Joe","01011980","30","M"]})

>>>d[key="NAME", 'Joe']
["1","Joe","01011980","30","M"]
["1","Joe","15091990","20","M"]

>>>d.has_key(key="NAME", 'Joe']
True

I will be appreciated for any reply!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

冷了相思 2024-09-21 04:03:49

sqlite 是一个 SQL 数据库，在使用时效果最好（封装在 SQLAlchemy 或者其他什么，如果你真的坚持的话；-)。

像 d[key="NAME", 'Joe'] 这样的语法根本就是非法的 Python，不管你做了多少换行和吹气。围绕数据库连接的简单类包装很容易，但它永远不会为您提供该语法 - 像 d.fetch('Joe', key='Name') 这样的东西相当容易实现，但索引与函数调用的语法非常不同，即使在后者中，命名参数也必须位于位置参数之后。

如果您愿意放弃雄心勃勃的语法梦想，转而采用合理的 Python 语法，并且需要帮助设计一个类来实现后者，当然，请随时询问（我很快就要上床睡觉了，但我当然，其他睡得较晚的人也会很乐意提供帮助；-)。

编辑：鉴于OP的澄清（在评论中），看起来 set_key 方法对于维护Python可接受的语法是可以接受的（尽管语义当然仍然是一个有点关闭，因为OP想要一个“类似字典”的对象，它可能有非唯一键——Python中没有这样的东西，真的......但是，我们至少可以近似它一点）。

所以，这是第一个草图（需要 Python 2.6 或更高版本 - 只是因为我使用 collections.MutableMapping 来获取其他类似字典的方法和 .format 来格式化字符串；如果你陷入 2.5，字符串的 % 格式和 UserDict.DictMixin 将起作用）：

import collections
import sqlite3

class SqliteDict(collections.MutableMapping):
  @classmethod
  def create(cls, path, columns):
    conn = sqlite3.connect(path)
    conn.execute('DROP TABLE IF EXISTS SqliteDict')
    conn.execute('CREATE TABLE SqliteDict ({0})'.format(','.join(columns.split())))
    conn.commit()
    return cls(conn)

  @classmethod
  def open(cls, path):
    conn = sqlite3.connect(path)
    return cls(conn)

  def __init__(self, conn):
    # looks like for sime weird reason you want str, not unicode, when feasible, so...:
    conn.text_factory = sqlite3.OptimizedUnicode
    c = conn.cursor()
    c.execute('SELECT * FROM SqliteDict LIMIT 0')
    self.cols = [x[0] for x in c.description]
    self.conn = conn
    # start with a keyname (==column name) of `ID`
    self.set_key('ID')

  def set_key(self, key):
    self.i = self.cols.index(key)
    self.kn = key

  def __len__(self):
    c = self.conn.cursor()
    c.execute('SELECT COUNT(*) FROM SqliteDict')
    return c.fetchone()[0]

  def __iter__(self):
    c = self.conn.cursor()
    c.execute('SELECT * FROM SqliteDict')
    while True:
      result = c.fetchone()
      if result is None: break
      k = result.pop(self.i)
      return k, result

  def __getitem__(self, k):
    c = self.conn.cursor()
    # print 'doing:', 'SELECT * FROM SqliteDict WHERE {0}=?'.format(self.kn)
    # print ' with:', repr(k)
    c.execute('SELECT * FROM SqliteDict WHERE {0}=?'.format(self.kn), (k,))
    result = [list(r) for r in c.fetchall()]
    # print ' resu:', repr(result)
    for r in result: del r[self.i]
    return result

  def __contains__(self, k):
    c = self.conn.cursor()
    c.execute('SELECT * FROM SqliteDict WHERE {0}=?'.format(self.kn), (k,))
    return c.fetchone() is not None

  def __delitem__(self, k):
    c = self.conn.cursor()
    c.execute('DELETE FROM SqliteDict WHERE {0}=?'.format(self.kn), (k,))
    self.conn.commit()

  def __setitem__(self, k, v):
    r = list(v)
    r.insert(self.i, k)
    if len(r) != len(self.cols):
      raise ValueError, 'len({0}) is {1}, must be {2} instead'.format(r, len(r), len(self.cols))
    c = self.conn.cursor()
    # print 'doing:', 'REPLACE INTO SqliteDict VALUES({0})'.format(','.join(['?']*len(r)))
    # print ' with:', r
    c.execute('REPLACE INTO SqliteDict VALUES({0})'.format(','.join(['?']*len(r))), r)
    self.conn.commit()

  def close(self):
    self.conn.close()


def main():
  d = SqliteDict.create('student_table', 'ID NAME BIRTH AGE SEX')
  d['1'] = ["Joe", "01011980", "30", "M"]    
  d['2'] = ["Rose", "12111986", "24", "F"]
  print len(d), 'items in table created.'
  print d['2']
  print d['1']
  d.close()

  d = SqliteDict.open('student_table')
  d.set_key('NAME')
  print len(d), 'items in table opened.'
  print d['Joe']


if __name__ == '__main__':
  main()

该类并不意味着直接实例化（尽管可以通过将打开的 sqlite3 连接传递到数据库来实现）一个适当的 SqliteDict 表），但通过两个类方法 create （创建一个新的数据库或清除现有的数据库）和 open，这似乎比替代方案更符合OP的愿望（让__init__采用数据库文件路径和描述如何打开它的选项字符串，就像gdbm等模块采用 - - 'r' 打开只读，'c' 创建或擦除，'w' 打开读写 --当然很容易调整）。在传递给create的列（作为空格分隔的字符串）中，必须有一个名为ID的列（我没有太在意）为构建和使用此类实例时可能发生的许多用户错误提出“正确的”错误；错误将在所有不正确的使用中发生，但不一定对用户来说是明显的）。

一旦打开（或创建）实例，它的行为就尽可能接近字典，除了所有设置的值必须是长度正确的列表，而返回的值是列表的列表（由于奇怪的“非唯一密钥”问题）。例如，上面的代码在运行时会打印

2 items in table created.
[['Rose', '12111986', '24', 'F']]
[['Joe', '01011980', '30', 'M']]
2 items in table opened.
[['1', '01011980', '30', 'M']]

“Pythonically荒谬”的行为是d[x] = d[x]will失败——因为右手side 是一个列表，例如具有单个项目（它是列值的列表），而项目分配绝对需要一个具有例如四个项目（列值）的列表。这种荒谬存在于OP请求的语义中，并且只能通过再次彻底改变这种荒谬的所需语义来改变（例如，强制项目分配在RHS上具有列表列表，并使用executemany代替普通执行）。

键的非唯一性也使得无法猜测是否 d[x] = v，对于对应于某个数字 n 的键 k表条目的数量，旨在替换一个（如果是的话，哪个一个？！）或所有这些条目，或者添加另一个新条目。在上面的代码中，我采用了“添加另一个条目”解释，但使用 SQL 语句 REPLACE 时，如果更改 CREATE TABLE 来指定一些唯一性约束，如果违反唯一性约束，则会将某些语义从“添加条目”更改为“替换条目”。

我会让你们所有人都玩一下这段代码，并反映出 Python 映射和关系表之间的语义差距有多大，OP 迫切希望弥合这一差距（显然这是他“使用更好的语法”的冲动的副作用）比 SQL 提供的更多——我想知道他是否已经按照我推荐的方式查看了 SqlAlchemy）。

我认为，最后，重要的教训是我在开头所说的，在我昨天写的答案部分的第一段中，我自引......：

sqlite 是一个 SQL 数据库并且可以工作
到目前为止，这样使用时效果最好（包装
在 SQLAlchemy 或其他任何地方，如果您
真的坚持;-)。

sqlite is a SQL database and works by far best when used as such (wrapped in SQLAlchemy or whatever if you really insist;-).

Syntax such as d[key="NAME", 'Joe'] is simply illegal Python, no matter how much wrapping and huffing and puffing you may do. A simple class wrapper around the DB connection is easy, but it will never give you that syntax -- something like d.fetch('Joe', key='Name') is reasonably easy to achieve, but indexing has very different syntax from function calls, and even in the latter named arguments must come after positional ones.

If you're willing to renounce your ambitious syntax dreams in favor of sensible Python syntax, and need help designing a class to implement the latter, feel free to ask, of course (I'm off to bed pretty soon, but I'm sure other, later-sleepers will be eager to help;-).

Edit: given the OP's clarifications (in a comment), it looks like a set_key method is acceptable to maintain Python-acceptable syntax (though the semantics of course will still be a tad off, since the OP wants a "dict-like" object which may have non unique keys -- no such thing in Python, really... but, we can approximate it a bit, at least).

So, here's a very first sketch (requires Python 2.6 or better -- just because I've used collections.MutableMapping to get other dict-like methods and .format to format strings; if you're stuck in 2.5, %-formatting of strings and UserDict.DictMixin will work instead):

import collections
import sqlite3

class SqliteDict(collections.MutableMapping):
  @classmethod
  def create(cls, path, columns):
    conn = sqlite3.connect(path)
    conn.execute('DROP TABLE IF EXISTS SqliteDict')
    conn.execute('CREATE TABLE SqliteDict ({0})'.format(','.join(columns.split())))
    conn.commit()
    return cls(conn)

  @classmethod
  def open(cls, path):
    conn = sqlite3.connect(path)
    return cls(conn)

  def __init__(self, conn):
    # looks like for sime weird reason you want str, not unicode, when feasible, so...:
    conn.text_factory = sqlite3.OptimizedUnicode
    c = conn.cursor()
    c.execute('SELECT * FROM SqliteDict LIMIT 0')
    self.cols = [x[0] for x in c.description]
    self.conn = conn
    # start with a keyname (==column name) of `ID`
    self.set_key('ID')

  def set_key(self, key):
    self.i = self.cols.index(key)
    self.kn = key

  def __len__(self):
    c = self.conn.cursor()
    c.execute('SELECT COUNT(*) FROM SqliteDict')
    return c.fetchone()[0]

  def __iter__(self):
    c = self.conn.cursor()
    c.execute('SELECT * FROM SqliteDict')
    while True:
      result = c.fetchone()
      if result is None: break
      k = result.pop(self.i)
      return k, result

  def __getitem__(self, k):
    c = self.conn.cursor()
    # print 'doing:', 'SELECT * FROM SqliteDict WHERE {0}=?'.format(self.kn)
    # print ' with:', repr(k)
    c.execute('SELECT * FROM SqliteDict WHERE {0}=?'.format(self.kn), (k,))
    result = [list(r) for r in c.fetchall()]
    # print ' resu:', repr(result)
    for r in result: del r[self.i]
    return result

  def __contains__(self, k):
    c = self.conn.cursor()
    c.execute('SELECT * FROM SqliteDict WHERE {0}=?'.format(self.kn), (k,))
    return c.fetchone() is not None

  def __delitem__(self, k):
    c = self.conn.cursor()
    c.execute('DELETE FROM SqliteDict WHERE {0}=?'.format(self.kn), (k,))
    self.conn.commit()

  def __setitem__(self, k, v):
    r = list(v)
    r.insert(self.i, k)
    if len(r) != len(self.cols):
      raise ValueError, 'len({0}) is {1}, must be {2} instead'.format(r, len(r), len(self.cols))
    c = self.conn.cursor()
    # print 'doing:', 'REPLACE INTO SqliteDict VALUES({0})'.format(','.join(['?']*len(r)))
    # print ' with:', r
    c.execute('REPLACE INTO SqliteDict VALUES({0})'.format(','.join(['?']*len(r))), r)
    self.conn.commit()

  def close(self):
    self.conn.close()


def main():
  d = SqliteDict.create('student_table', 'ID NAME BIRTH AGE SEX')
  d['1'] = ["Joe", "01011980", "30", "M"]    
  d['2'] = ["Rose", "12111986", "24", "F"]
  print len(d), 'items in table created.'
  print d['2']
  print d['1']
  d.close()

  d = SqliteDict.open('student_table')
  d.set_key('NAME')
  print len(d), 'items in table opened.'
  print d['Joe']


if __name__ == '__main__':
  main()

The class is not meant to be instantiated directly (though it's OK to do so by passing an open sqlite3 connection to a DB with an appropriate SqliteDict table) but through the two class methods create (to make a new DB or wipe out an existing one) and open, which seems to match the OP's desires better than the alternative (have __init__ take a DB file path an an option string describing how to open it, just like modules such as gdbm take -- 'r' to open read-only, 'c' to create or wipe out, 'w' to open read-write -- easy to adjust of course). Among the columns passed (as a whitespace-separated string) to create, there must be one named ID (I haven't given much care to raising "the right" errors for any of the many, many user errors that can occur on building and using instances of this class; errors will occur on all incorrect usage, but not necessarily ones obvious to the user).

Once an instance is opened (or created), it behaves as closely to a dict as possible, except that all values set must be lists of exactly the right length, while the values returned are lists of lists (due to the weird "non-unique key" issue). For example, the above code, when run, prints

2 items in table created.
[['Rose', '12111986', '24', 'F']]
[['Joe', '01011980', '30', 'M']]
2 items in table opened.
[['1', '01011980', '30', 'M']]

The "Pythonically absurd" behavior is that d[x] = d[x] will fail -- because the right hand side is a list e.g. with a single item (which is a list of the column values) while the item assignment absolutely requires a list with e.g. four items (the column values). This absurdity is in the OP's requested semantics, and could be altered only by drastically changing such absurd required semantics again (e.g., forcing item assignment to have a list of lists on the RHS, and using executemany in lieu of plain execute).

Non-uniqueness of keys also makes it impossible to guess if d[x] = v, for a key k which corresponds to some number n of table entries, is meant to replace one (and if so, which one?!) or all of those entries, or add another new entry instead. In the code above I've taken the "add another entry" interpretation, but with a SQL statement REPLACE that, should the CREATE TABLE be changed to specify some uniqueness constraints, will change some semantics from "add entry" to "replace entries" if and when uniqueness constraints would otherwise be violated.

I'll let you all to play with this code, and reflect how huge the semantic gap is between Python mappings and relational tables, that the OP is desperately keen to bridge (apparently as a side effect of his urge to "use nicer syntax" than SQL affords -- I wonder if he has looked at SqlAlchemy as I recommended).

I think, in the end, the important lesson is what I stated right at the start, in the first paragraph of the part of the answer I wrote yesterday, and I self-quote...: