使用大量测试数据文件运行鼻子

发布于 2024-10-04 17:52:34 字数 638 浏览 6 评论 0原文

我想为网络爬虫编写一些测试。我想使用大量的测试网页，但我不确定如何让nose（或另一个单元测试框架）在没有大量重复代码的情况下完成我需要的操作。

我的问题是我想测试很多不同的页面，但我不知道如何使用鼻子来做到这一点。这大致就是我想要做的：

class TestPage(object):
    def setup(self):
        with open('test_data/page.html', 'r') as f:
            html = f.read()
        self.scraper = Scraper(html)

如果我想测试的唯一页面是“page.html”，那就没问题了。但我有数百页要测试。我可以复制该类，并且每次都更改该类的名称和路径的文件名，但这显然是荒谬的。

我想在设置中添加代码，为每个页面创建单独的 Scraper 对象，并将它们存储在测试对象的列表中。然后我可以让测试方法在每个 Scraper 对象上运行。但我认为我会遇到保持每个测试隔离并从鼻子获取单独消息的问题。

我还尝试对基测试类进行子类化并将路径传递给init，但这会给nose带来问题。

我很感激任何有关如何使用鼻子解决此问题的建议、另一种方法或任何可能有用的阅读。

原文

I want to write some tests for a web scraper. I want to use a lot of test web pages, but I'm not sure exactly how to get nose (or another unit testing framework) to do what I need without a huge amount of duplicate code.

My problem is that I want to test a lot of different pages and I'm not sure how to do this using nose. This is roughly what I want to do:

class TestPage(object):
    def setup(self):
        with open('test_data/page.html', 'r') as f:
            html = f.read()
        self.scraper = Scraper(html)

This would be fine if the only page I wanted to test were 'page.html'. But I have hundreds of pages to test. I could duplicate the class and each time change both the class's name and the filename of the path, but this would obviously be ridiculous.

I thought of putting code in setup to create separate Scraper objects for each page and store them in a list in the test object. I could then have the test methods operate on each Scraper object. But I think I'd run into problems with keeping each test isolated and getting separate messages from nose.

I also tried to subclass a base test class and pass the path to init, but this creates problems for nose.

I'd appreciate any advice on how to solve this using nose, another approach to take, or any reading that might be useful.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

风蛊 2024-10-11 17:52:34

根据您的代码示例，您只需要一个类工厂：

def make_test(page)
    class TestPage(object):
        def setup(self):
            with open(page, 'r') as f:
                html = f.read()
            self.scraper = Scraper(html)
     return TestPage

现在您可以运行一系列页面并为每个页面进行一个测试：

for page in list_of_pages:
    Test = make_test(page)
    Test().run()

我不确定这是否是您运行鼻子测试的方式，但这将是一个完整的测试成熟的课程，这样你就可以做任何你通常会做的事情。

您可以将所有测试页面存储在一个目录中，这样您就可以循环遍历该目录中的文件并以这种方式获取页面列表。创建新测试所需要做的就是将 html 保存在给定目录中。这就是您正在寻找的东西吗？

based on your code sample, you just need a class factory:

def make_test(page)
    class TestPage(object):
        def setup(self):
            with open(page, 'r') as f:
                html = f.read()
            self.scraper = Scraper(html)
     return TestPage

Now you can just run over a list of pages and make one test for each one:

for page in list_of_pages:
    Test = make_test(page)
    Test().run()

I'm not sure if that's how you run a nose test but It would be a full fledged class so you could do whatever you would normally do with it.

You could store all of your tests pages in one directory so that you could just loop over the files in the directory and get your list of pages that way. All you would have to do to create a new test is save the html in the given directory. Is that about what you were looking for?

回复收藏 0 原文