Python：扩展复杂的树数据结构

发布于 2024-12-14 04:02:20 字数 5113 浏览 3 评论 0原文

我正在探索一种数据结构，它可以扩展为子元素并解析为最终元素。但我只想存储前两层。

示例：假设我从纽约开始，它分为布朗克斯、国王、纽约、皇后区和里士满作为县，但最后不知何故他们决定到美国。

我不确定这是否是一个很好的例子，但只是为了弄清楚这里是对问题的更清晰的解释。

A (expands to) B,C,D -> B (expands to) K,L,M -> K resolves to Z

我最初以一系列 for 循环编写它，然后使用递归，但在递归中我丢失了一些扩展的元素，因此我不会深入研究每个扩展的元素。我已经放置了递归版本和非递归版本。我正在寻找有关构建此数据结构的一些建议，以及最好的方法是什么。

我对扩展版本中的每个元素调用数据库查询，该查询返回项目列表。直到它解析为单个元素。如果没有递归，我不会一直松懈钻探，直到其他人解决的最后一个元素。但对于递归来说就不一样了。我也是 python 新手，所以希望在这样的网站上问这不是一个坏问题。

returnCategoryQuery 是一种通过调用数据库查询返回项目列表的方法。

不带递归

#Dictionary to save initial category with the rest of cl_to
baseCategoryTree = {};
#categoryResults = [];

# query get all the categories a category is linked to
categoryQuery = "select cl_to from categorylinks cl left join page p on cl.cl_from = p.page_id where p.page_namespace=14 and p.page_title ='";
cursor = db.cursor(cursors.SSDictCursor);

    for key, value in idTitleDictionary.iteritems():
        for startCategory in value[0]:
            #print startCategory + "End of Query";
            categoryResults = [];
            try:
                categoryRow = "";
                baseCategoryTree[startCategory] = [];
                print categoryQuery + startCategory + "'";
                cursor.execute(categoryQuery + startCategory + "'");
                done = False;
                while not done:
                    categoryRow = cursor.fetchone();
                    if not categoryRow:
                        done = True;
                        continue;
                    categoryResults.append(categoryRow['cl_to']);
                for subCategoryResult in categoryResults:
                    print startCategory.encode('ascii') + " - " +  subCategoryResult;
                    for item in returnCategoryQuery(categoryQuery + subCategoryResult + "'"):
                        print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item;
                        for subItem in returnCategoryQuery(categoryQuery + item + "'"):
                            print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem;
                            for subOfSubItem in returnCategoryQuery(categoryQuery + subItem + "'"):
                                 print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem + " - " + subOfSubItem;
                                 for sub_1_subOfSubItem in returnCategoryQuery(categoryQuery + subOfSubItem + "'"):
                                      print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem + " - " + subOfSubItem + " - " + sub_1_subOfSubItem;
                                      for sub_2_subOfSubItem in returnCategoryQuery(categoryQuery + sub_1_subOfSubItem + "'"):
                                          print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem + " - " + subOfSubItem + " - " + sub_1_subOfSubItem + " - " + sub_2_subOfSubItem;
            except Exception, e:
                traceback.print_exc();

使用递归

def crawlSubCategory(subCategoryList):
    level = 1;
    expandedList = [];
    for eachCategory in subCategoryList:
        level = level + 1
        print "Level  " + str(level) + " " + eachCategory;
        #crawlSubCategory(returnCategoryQuery(categoryQuery + eachCategory + "'"));
        for subOfEachCategory in returnCategoryQuery(categoryQuery + eachCategory + "'"):
            level = level + 1
            print "Level  " + str(level) + " " + subOfEachCategory;
            expandedList.append(crawlSubCategory(returnCategoryQuery(categoryQuery + subOfEachCategory + "'")));
    return expandedList;


#Dictionary to save initial category with the rest of cl_to
baseCategoryTree = {};
#categoryResults = [];

# query get all the categories a category is linked to
categoryQuery = "select cl_to from categorylinks cl left join page p on cl.cl_from = p.page_id where p.page_namespace=14 and p.page_title ='";
cursor = db.cursor(cursors.SSDictCursor);

for key, value in idTitleDictionary.iteritems():
    for startCategory in value[0]:
        #print startCategory + "End of Query";
        categoryResults = [];
        try:
            categoryRow = "";
            baseCategoryTree[startCategory] = [];
            print categoryQuery + startCategory + "'";
            cursor.execute(categoryQuery + startCategory + "'");
            done = False;
            while not done:
                categoryRow = cursor.fetchone();
                if not categoryRow:
                    done = True;
                    continue;
                categoryResults.append(categoryRow['cl_to']);
            #crawlSubCategory(categoryResults);
        except Exception, e:
            traceback.print_exc();
        #baseCategoryTree[startCategory].append(categoryResults);
        baseCategoryTree[startCategory].append(crawlSubCategory(categoryResults));

原文

I am exploring a data structure which get expands to sub-elements and resolves to a final element. But I only want to store top two levels.

Example: Lets say I start with New York which breaks into Bronx, Kings, New York, Queens, and Richmond as counties but then finally somehow they resolve to USA.

I am not sure if this is a good example but just to make it clear here is more clear explanation of the problem.

A (expands to) B,C,D -> B (expands to) K,L,M -> K resolves to Z

I initially wrote it in series of for loops and then use the recursion but in recursion I am loosing some of the elements that get expand and due to that I don't drill down each of the expanded element. I have put the both recursive version and non-recursive. I am looking for some advise on building this data structure, and what is the best way to do.

I call a data base query for every element in the expanded version which returns a list of items. Go until it resolves to single element. With out recursion I don't loose drilling all the way till the final element that others resolve to. But with recursion its not the same. I am also new to python so hopefully this is not a bad question to ask in a site like this.

returnCategoryQuery is a method that returns list of items by calling the database query.

With out recursion

#Dictionary to save initial category with the rest of cl_to
baseCategoryTree = {};
#categoryResults = [];

# query get all the categories a category is linked to
categoryQuery = "select cl_to from categorylinks cl left join page p on cl.cl_from = p.page_id where p.page_namespace=14 and p.page_title ='";
cursor = db.cursor(cursors.SSDictCursor);

    for key, value in idTitleDictionary.iteritems():
        for startCategory in value[0]:
            #print startCategory + "End of Query";
            categoryResults = [];
            try:
                categoryRow = "";
                baseCategoryTree[startCategory] = [];
                print categoryQuery + startCategory + "'";
                cursor.execute(categoryQuery + startCategory + "'");
                done = False;
                while not done:
                    categoryRow = cursor.fetchone();
                    if not categoryRow:
                        done = True;
                        continue;
                    categoryResults.append(categoryRow['cl_to']);
                for subCategoryResult in categoryResults:
                    print startCategory.encode('ascii') + " - " +  subCategoryResult;
                    for item in returnCategoryQuery(categoryQuery + subCategoryResult + "'"):
                        print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item;
                        for subItem in returnCategoryQuery(categoryQuery + item + "'"):
                            print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem;
                            for subOfSubItem in returnCategoryQuery(categoryQuery + subItem + "'"):
                                 print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem + " - " + subOfSubItem;
                                 for sub_1_subOfSubItem in returnCategoryQuery(categoryQuery + subOfSubItem + "'"):
                                      print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem + " - " + subOfSubItem + " - " + sub_1_subOfSubItem;
                                      for sub_2_subOfSubItem in returnCategoryQuery(categoryQuery + sub_1_subOfSubItem + "'"):
                                          print startCategory.encode('ascii') + " - " + subCategoryResult + " - "  + item + " - " + subItem + " - " + subOfSubItem + " - " + sub_1_subOfSubItem + " - " + sub_2_subOfSubItem;
            except Exception, e:
                traceback.print_exc();

With Recursion

def crawlSubCategory(subCategoryList):
    level = 1;
    expandedList = [];
    for eachCategory in subCategoryList:
        level = level + 1
        print "Level  " + str(level) + " " + eachCategory;
        #crawlSubCategory(returnCategoryQuery(categoryQuery + eachCategory + "'"));
        for subOfEachCategory in returnCategoryQuery(categoryQuery + eachCategory + "'"):
            level = level + 1
            print "Level  " + str(level) + " " + subOfEachCategory;
            expandedList.append(crawlSubCategory(returnCategoryQuery(categoryQuery + subOfEachCategory + "'")));
    return expandedList;


#Dictionary to save initial category with the rest of cl_to
baseCategoryTree = {};
#categoryResults = [];

# query get all the categories a category is linked to
categoryQuery = "select cl_to from categorylinks cl left join page p on cl.cl_from = p.page_id where p.page_namespace=14 and p.page_title ='";
cursor = db.cursor(cursors.SSDictCursor);

for key, value in idTitleDictionary.iteritems():
    for startCategory in value[0]:
        #print startCategory + "End of Query";
        categoryResults = [];
        try:
            categoryRow = "";
            baseCategoryTree[startCategory] = [];
            print categoryQuery + startCategory + "'";
            cursor.execute(categoryQuery + startCategory + "'");
            done = False;
            while not done:
                categoryRow = cursor.fetchone();
                if not categoryRow:
                    done = True;
                    continue;
                categoryResults.append(categoryRow['cl_to']);
            #crawlSubCategory(categoryResults);
        except Exception, e:
            traceback.print_exc();
        #baseCategoryTree[startCategory].append(categoryResults);
        baseCategoryTree[startCategory].append(crawlSubCategory(categoryResults));

分享到QQ

分享到微博