Small tables in Python?

Posted on 2024-08-05 11:43:25

Let's say I don't have more than one or two dozen objects with different properties, such as the following:

UID, Name, Value, Color, Type, Location

I want to be able to call up all objects with Location = "Boston", or Type = "Primary". Classic database query type stuff.

Most table solutions (pytables, *sql) are really overkill for such a small set of data. Should I simply iterate over all the objects and create a separate dictionary for each data column (adding values to dictionaries as I add new objects)?

This would create dicts like this:

{'Boston' : [234, 654, 234], 'Chicago' : [324, 765, 342] } - where the three-digit entries represent things like UIDs.

As you can see, querying this would be a bit of a pain.

Any thoughts on an alternative?

Comments (3)

So要识趣 2024-08-12 11:43:25

For small relational problems I love using Python's built-in sets.

For the example of location = 'Boston' OR type = 'Primary', if you had this data:

users = {
   1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   #...
}

You can do the WHERE ... OR ... query like this:

set1 = set(u for u in users if users[u]['Location'] == 'Boston')
set2 = set(u for u in users if users[u]['Type'] == 'Primary')
result = set1.union(set2)

Or with just one expression:

result = set(u for u in users if users[u]['Location'] == 'Boston'
                              or users[u]['Type'] == 'Primary')

You can also use filter (itertools.ifilter on Python 2) and the functions in itertools to build fairly efficient queries of the data. For example, if you want to do something similar to a GROUP BY city:

cities = ('Boston', 'New York', 'Chicago')
# Map each city to the list of user IDs located there (filter is lazy on Python 3, so wrap it in list).
cities_users = {city: list(filter(lambda u: users[u]['Location'] == city, users)) for city in cities}

You could also build indexes manually (build a dict mapping Location to User ID) to speed things up. If this becomes too slow or unwieldy then I would probably switch to sqlite, which has been included in the Python standard library (as the sqlite3 module) since 2.5.
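
As a rough sketch of the manual index idea, assuming the same users dict as above: the helper name build_index is made up for illustration and is not something from this answer. Each index maps a column value to a set of user IDs, so OR and AND queries become set union and intersection.

```python
from collections import defaultdict

users = {
    1: dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
    2: dict(Name="Mr. Bar", Location="New York", Type="Primary"),
    3: dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
}

def build_index(records, field):
    """Hypothetical helper: map each value of `field` to the set of IDs having it."""
    index = defaultdict(set)
    for uid, record in records.items():
        index[record[field]].add(uid)
    return index

by_location = build_index(users, "Location")
by_type = build_index(users, "Type")

# WHERE Location = 'Boston' OR Type = 'Primary'
either = by_location["Boston"] | by_type["Primary"]
# WHERE Location = 'Boston' AND Type = 'Secondary'
both = by_location["Boston"] & by_type["Secondary"]
print(sorted(either), sorted(both))
```

The indexes have to be rebuilt or updated whenever a record changes, which is probably the main cost of this approach compared with scanning a couple of dozen records each time.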

坐在坟头思考人生 2024-08-12 11:43:25

I do not think sqlite would be "overkill" -- it comes with standard Python since 2.5, so no need to install stuff, and it can make and handle databases in either memory or local disk files. Really, how could it be simpler...? If you want everything in-memory including the initial values, and want to use dicts to express those initial values, for example...:

import sqlite3

db = sqlite3.connect(':memory:')
db.execute('Create table Users (Name, Location, Type)')
db.executemany('Insert into Users values(:Name, :Location, :Type)', [
   dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   ])
db.commit()
db.row_factory = sqlite3.Row

and now your in-memory tiny "db" is ready to go. It's no harder to make a DB in a disk file and/or read the initial values from a text file, a CSV, and so forth, of course.
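
For the CSV case, something like the following might work. It is only a sketch: the CSV text is an in-memory stand-in for a real file, and the names csv_text and csv_db are invented here. csv.DictReader yields one mapping per row, which lines up with the :Name style placeholders.

```python
import csv
import io
import sqlite3

# Stand-in for a real file; with a file on disk you would pass
# open(path, newline="") to DictReader instead of io.StringIO(...).
csv_text = """Name,Location,Type
Mr. Foo,Boston,Secondary
Mr. Bar,New York,Primary
Mr. Quux,Chicago,Secondary
"""

csv_db = sqlite3.connect(":memory:")
csv_db.row_factory = sqlite3.Row
csv_db.execute("CREATE TABLE Users (Name, Location, Type)")
# Each row dict supplies the values for the named placeholders.
csv_db.executemany(
    "INSERT INTO Users VALUES (:Name, :Location, :Type)",
    csv.DictReader(io.StringIO(csv_text)),
)
csv_db.commit()

rows = csv_db.execute(
    "SELECT Name FROM Users WHERE Location = ?", ("Boston",)
).fetchall()
names = [r["Name"] for r in rows]
print(names)
```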

Querying is especially flexible, easy and sweet, e.g., you can mix string insertion and parameter substitution at will...:

def where(w, *a):
  c = db.cursor()
  c.execute('Select * From Users where %s' % w, *a)
  return c.fetchall()

print([r["Name"] for r in where("Type='Secondary'")])

emits ['Mr. Foo', 'Mr. Quux'], just like the more elegant but equivalent

print([r["Name"] for r in where('Type=?', ["Secondary"])])

and your desired query's just:

print([r["Name"] for r in where("Location='Boston' or Type='Primary'")])

etc. Seriously -- what's not to like?
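
One caveat: inserting strings straight into SQL is only safe when the text comes from trusted code. If the conditions come from elsewhere, a variant along these lines might be safer. This is a sketch under assumptions, not part of the answer above: where_any and COLUMNS are hypothetical names, column names are checked against a fixed set, and every value goes through ? substitution.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE Users (Name, Location, Type)")
db.executemany("INSERT INTO Users VALUES (?, ?, ?)", [
    ("Mr. Foo", "Boston", "Secondary"),
    ("Mr. Bar", "New York", "Primary"),
    ("Mr. Quux", "Chicago", "Secondary"),
])

COLUMNS = {"Name", "Location", "Type"}  # hypothetical whitelist of column names

def where_any(**conditions):
    """Return rows matching ANY column=value condition (joined with OR)."""
    if not conditions:
        raise ValueError("at least one condition is required")
    unknown = set(conditions) - COLUMNS
    if unknown:
        raise ValueError("unknown columns: %s" % sorted(unknown))
    # Only whitelisted column names reach the SQL text; values are bound as parameters.
    clause = " OR ".join("%s = ?" % col for col in conditions)
    sql = "SELECT * FROM Users WHERE " + clause
    return db.execute(sql, list(conditions.values())).fetchall()

names = [r["Name"] for r in where_any(Location="Boston", Type="Primary")]
print(names)
```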

┾廆蒐ゝ 2024-08-12 11:43:25

If it's really a small amount of data, I'd not bother with an index and probably just write a helper function:

users = [
   dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
   dict(Name="Mr. Bar", Location="New York", Type="Primary"),
   dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
   ]

def search(dictlist, **kwargs):
   def match(d):
      # A record matches only if every requested key is present with the requested value.
      for k, v in kwargs.items():
         try:
            if d[k] != v:
               return False
         except KeyError:
            return False
      return True

   return [d for d in dictlist if match(d)]

Which will allow nice looking queries like this:

result = search(users, Type="Secondary")
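
The helper above combines its conditions with AND. For the Location = "Boston" OR Type = "Primary" query from the question, an OR variant could look like this. It is a sketch only, and search_any is a name invented here rather than anything from the answer.

```python
users = [
    dict(Name="Mr. Foo", Location="Boston", Type="Secondary"),
    dict(Name="Mr. Bar", Location="New York", Type="Primary"),
    dict(Name="Mr. Quux", Location="Chicago", Type="Secondary"),
]

def search_any(dictlist, **kwargs):
    """Hypothetical OR variant: keep dicts where at least one key=value pair matches."""
    return [
        d for d in dictlist
        # A missing key simply counts as "no match" for that condition.
        if any(k in d and d[k] == v for k, v in kwargs.items())
    ]

result = search_any(users, Location="Boston", Type="Primary")
names = [d["Name"] for d in result]
print(names)
```

Results come back in the order of the input list, since this is a plain linear scan.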