Python: search a list of objects that contain objects, with partial matches

Posted 2024-10-15 02:04:30

I'm trying to build a simple search engine for a small website. My initial thought is to avoid larger packages such as Solr or Haystack, because my search needs are so simple.

My hope is that, with some guidance, I can make my code more Pythonic, more efficient, and, most importantly, work properly.

Intended functionality: return product results based on full or partial matches of item_number, product name, or category name (category matching is not implemented yet).

Some code:


import pymssql
import utils #My utilities  

class Product(object):  
    def __init__(self, item_number, name, description, category, msds):
        self.item_number = str(item_number).strip()
        self.name = name
        self.description = description
        self.category = category
        self.msds = str(msds).strip()

class Category(object):  
    def __init__(self, name, categories):
        self.name = name
        self.categories = categories
        self.slug = utils.slugify(name)
        self.products = []

categories = (
    Category('Food', ('123', '12A')),
    Category('Tables', ('354', '35A', '310', '31G')),
    Category('Chemicals', ('845', '85A', '404', '325'))
)

products = []

conn = pymssql.connect(...)
curr = conn.cursor()

for category in categories:  # lowercase loop name, so the Category class is not shadowed
    for c in category.categories:
        curr.execute('SELECT item_number, name, CAST(description as text), category, msds from tblProducts WHERE category=%s', (c,))
        for row in curr:
            product = Product(row[0], row[1], row[2], row[3], row[4])
            products.append(product)
            category.products.append(product)

conn.close()

def product_search(*params):
    results = []
    for product in products:
        for param in params:
            name = str(product.name)
            if (name.find(param.capitalize())) != -1:
                results.append(product)
            item_number = str(product.item_number)
            if item_number.find(param.upper()) != -1:
                results.append(product)
    print(results)

product_search('something')


The data lives in an MS SQL database whose tables and fields I cannot change.
At most I will pull in about 200 products.

A couple of things jump out at me: the nested for loops, and the two separate if statements in product_search, which could add the same product to the results twice.

My thought was that if I had the products in memory (the products will rarely change) I could cache them, reducing database dependence and possibly providing an efficient search.

...posting for now... will come back and add more thoughts

Edit:
The reason I have a Category object holding a list of Products is that I want to show HTML pages of Products organized by Category. Also, the actual category numbers may change in the future, and holding a tuple seemed like a simple, painless solution. On top of that, I only have read-only access to the database.

The reason for a separate list of products was somewhat of a cheat. I have a page that shows all products with the ability to view MSDS (safety sheets). Also it provided one less level to traverse while searching.

Edit 2:


def product_search(*params):
    results = []
    lowerParams = [ param.lower() for param in params ]

    for product in products:
        item_number = (str(product.item_number)).lower()
        name = (str(product.name)).lower()
        for param in lowerParams:
            if param in item_number or param in name:
                results.append(product)
    print(results)
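
The category-name matching mentioned under "Intended functionality" is not implemented anywhere in the post. One possible way to add it is sketched below, assuming the Category objects already hold their products as described in the first edit. The trimmed-down Product and Category classes, the function name search_with_categories, and the demo data are hypothetical stand-ins, not the original code.

```python
class Product(object):
    """Trimmed-down stand-in for the Product class in the question."""
    def __init__(self, item_number, name):
        self.item_number = item_number
        self.name = name


class Category(object):
    """Trimmed-down stand-in that only keeps a name and its products."""
    def __init__(self, name, products):
        self.name = name
        self.products = products


def search_with_categories(categories, *params):
    """Return products whose item number, name, or category name contains any term."""
    terms = [p.lower() for p in params]
    results = []
    seen = set()  # ids of products already collected, so none is added twice
    for category in categories:
        # A hit on the category name selects every product in that category.
        category_hit = any(t in category.name.lower() for t in terms)
        for product in category.products:
            number = str(product.item_number).lower()
            name = str(product.name).lower()
            product_hit = any(t in number or t in name for t in terms)
            if (category_hit or product_hit) and id(product) not in seen:
                seen.add(id(product))
                results.append(product)
    return results


# Hypothetical demo data
demo_categories = [
    Category('Food', [Product('12A-001', 'Rice'), Product('123-002', 'Table salt')]),
    Category('Tables', [Product('354-010', 'Folding table')]),
]
found = [p.name for p in search_with_categories(demo_categories, 'table')]
print(found)
```

Walking the categories instead of the flat products list costs one extra loop level, but with roughly 200 products that should not matter in practice.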

Comments (2)

给妤﹃绝世温柔 2024-10-22 02:04:30

Prepare all variables outside of the loops and use in instead of .find if you don't need the position of the substring:

def product_search(*params):
    results = []
    upperParams = [ param.upper() for param in params ]

    for product in products:
        name = str(product.name).upper()
        item_number = str(product.item_number).upper()
        for upperParam in upperParams:
            if upperParam in name or upperParam in item_number:
                results.append(product)
    print(results)
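
The loop above still appends a product once per matching parameter, so a product that matches two search terms shows up twice. A small variation that may avoid this is sketched here: any() stops at the first matching term, so each product is added at most once. Passing products in as an argument, returning the list instead of printing it, and the namedtuple stand-in are assumptions made to keep the example self-contained.

```python
from collections import namedtuple

# Hypothetical stand-in for the Product class in the question
Product = namedtuple('Product', ['item_number', 'name'])


def product_search(products, *params):
    """Return each product that matches at least one term, at most once."""
    terms = [p.upper() for p in params]
    results = []
    for product in products:
        name = str(product.name).upper()
        item_number = str(product.item_number).upper()
        # any() short-circuits on the first matching term.
        if any(t in name or t in item_number for t in terms):
            results.append(product)
    return results


sample = [
    Product('85A-100', 'Bleach'),
    Product('123-200', 'Beans'),   # matches both 'be' and '123'
    Product('354-300', 'Desk'),
]
hits = product_search(sample, 'be', '123')
print([p.name for p in hits])
```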
笨死的猪 2024-10-22 02:04:30

If both the name and the item number match the search parameters, the product will appear twice in the result list.

Since the product count is small, I recommend constructing a SELECT query like:

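def search(*args):
    import operator
    from functools import reduce  # reduce is a builtin only on Python 2
    # Flatten every Category's codes into one list for the IN (...) clause.
    cats = reduce(operator.add, [list(c.categories) for c in categories], [])
    in_marks = ','.join(['%s'] * len(cats))  # pymssql placeholders are %s, not ?
    # One name/item_number test per term; the % wildcards belong in the parameter values.
    like = ' OR '.join(["name LIKE %s OR CAST(item_number AS TEXT) LIKE %s"] * len(args))
    query = "SELECT * FROM tblProducts WHERE category IN (" + in_marks + ") AND (" + like + ")"
    patterns = ['%' + a + '%' for a in args for _ in range(2)]
    curr.execute(query, tuple(cats + patterns))  # sketch only: needs an open cursor and at least one term
    return list(curr)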
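
The query-building idea can be tried out without a SQL Server instance. The sketch below uses the standard-library sqlite3 module as a stand-in, which is an assumption made purely for demonstration: sqlite3 uses ? placeholders where pymssql uses %s, and its LIKE is case-insensitive for ASCII by default, which may not match the collation of the real database. The table layout and rows are invented, and the function takes the connection and category codes as arguments rather than reading globals.

```python
import sqlite3


def search(conn, category_codes, *terms):
    """Return (item_number, name) rows in the given categories matching any term."""
    if not terms or not category_codes:
        return []
    in_marks = ','.join('?' * len(category_codes))
    # One pair of LIKE tests per term; wildcards are added to the values, not the SQL.
    like = ' OR '.join(['name LIKE ? OR item_number LIKE ?'] * len(terms))
    query = ('SELECT item_number, name FROM tblProducts '
             'WHERE category IN (' + in_marks + ') AND (' + like + ') '
             'ORDER BY item_number')
    params = list(category_codes)
    for term in terms:
        pattern = '%' + term + '%'
        params.extend([pattern, pattern])
    return conn.execute(query, params).fetchall()


# Invented demo table and rows
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE tblProducts (item_number TEXT, name TEXT, category TEXT)')
conn.executemany('INSERT INTO tblProducts VALUES (?, ?, ?)', [
    ('A100', 'Folding table', '354'),
    ('B200', 'Table salt', '123'),
    ('C300', 'Bleach', '845'),
    ('D400', 'Tablecloth', '999'),  # category is not in the list searched below
])
rows = search(conn, ['123', '354', '845'], 'table')
print(rows)
conn.close()
```

Because each row is tested once by the WHERE clause, a product whose name and item number both match is still returned only once. Note that % or _ typed by a user would act as wildcards here, since the sketch does not escape them.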