Dictionary or If statements, Jython

Posted 2024-07-08 19:59:11, 816 characters, 4 views, 0 comments

I am writing a script at the moment that will grab certain information from HTML using dom4j.

Since Python/Jython does not have a native switch statement I decided to use a whole bunch of if statements that call the appropriate method, like below:

if type == 'extractTitle':
    extractTitle(dom)
if type == 'extractMetaTags':
    extractMetaTags(dom)

I will be adding more depending on what information I want to extract from the HTML and thought about taking the dictionary approach which I found elsewhere on this site, example below:

{
    'extractTitle':    extractTitle,
    'extractMetaTags': extractMetaTags
}[type](dom)

I know that each time I run the script the dictionary will be built, but at the same time, if I were to use the if statements, the script would have to check through all of them until it hits the correct one. What I am really wondering is: which one performs better, or is generally better practice to use?

Update: @Brian - Thanks for the great reply. I have a question: what if any of the extract methods requires more than one object, e.g.

def handle_extractTag(self, dom, anotherObject):
    # Do something
    pass

How would you make the appropriate changes to the handle method to implement this? Hope you know what I mean :)

Cheers

Comments (5)

街角迷惘 2024-07-15 19:59:11

To avoid specifying the tag and handler in the dict, you could just use a handler class with methods named to match the type. For example:

class MyHandler(object):
    def handle_extractTitle(self, dom):
        # do something
        pass

    def handle_extractMetaTags(self, dom):
        # do something
        pass

    def handle(self, type, dom):
        # Look up the method whose name matches the requested type
        func = getattr(self, 'handle_%s' % type, None)
        if func is None:
            raise Exception("No handler for type %r" % type)
        return func(dom)

Usage:

handler = MyHandler()
handler.handle('extractTitle', dom)

Update:

When you have multiple arguments, just change the handle function to take those arguments and pass them through to the function. If you want to make it more generic (so you don't have to change both the handler functions and the handle method when you change the argument signature), you can use the *args and **kwargs syntax to pass through all received arguments. The handle method then becomes:

def handle(self, type, *args, **kwargs):
    func = getattr(self, 'handle_%s' % type, None)
    if func is None:
        raise Exception("No handler for type %r" % type)
    return func(*args, **kwargs)
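
As a rough illustration of the generic version, the sketch below wires it to a two-argument handler. The handler bodies, the `another_object` parameter name, and the string standing in for the DOM are all made up for the example; a real script would pass the dom4j document instead.

```python
class MyHandler(object):
    def handle_extractTitle(self, dom):
        # Hypothetical body: a real handler would query the DOM
        return "title of %s" % dom

    def handle_extractTag(self, dom, another_object):
        # Hypothetical handler that needs a second argument
        return "tag %s in %s" % (another_object, dom)

    def handle(self, type, *args, **kwargs):
        # Forward whatever arguments were given to the matching handler
        func = getattr(self, 'handle_%s' % type, None)
        if func is None:
            raise Exception("No handler for type %r" % type)
        return func(*args, **kwargs)


handler = MyHandler()
title = handler.handle('extractTitle', 'page.html')
tag = handler.handle('extractTag', 'page.html', another_object='meta')
```

Because `handle` never names the handler's parameters, adding a third argument to one handler only means changing that handler and its call site.
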
隔纱相望 2024-07-15 19:59:11

With your code as written, all of your functions get called. Something like this:

handlers = {
    'extractTitle': extractTitle,
    'extractMetaTags': extractMetaTags
}

handlers[type](dom)

would work like your original if code.

不忘初心 2024-07-15 19:59:11

It depends on how many if statements we're talking about; if it's a very small number, then it will be more efficient than using a dictionary.

However, as always, I strongly advise you to do whatever makes your code look cleaner until experience and profiling tell you that a specific block of code needs to be optimized.
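
If you do want numbers before deciding, a quick measurement along these lines is one way to get them. This is only a sketch: the stub functions, the `dispatch_if` and `dispatch_dict` names, and the iteration count are assumptions for the example, and the timings will differ between CPython and Jython.

```python
import timeit


def extract_title(dom):
    return "title"


def extract_meta_tags(dom):
    return "meta"


# Built once, not on every call
HANDLERS = {
    'extractTitle': extract_title,
    'extractMetaTags': extract_meta_tags,
}


def dispatch_if(type, dom):
    if type == 'extractTitle':
        return extract_title(dom)
    elif type == 'extractMetaTags':
        return extract_meta_tags(dom)
    raise KeyError(type)


def dispatch_dict(type, dom):
    return HANDLERS[type](dom)


# Time the last branch, which is the worst case for the if chain
if_seconds = timeit.timeit(
    lambda: dispatch_if('extractMetaTags', None), number=100000)
dict_seconds = timeit.timeit(
    lambda: dispatch_dict('extractMetaTags', None), number=100000)
print("if chain: %.4fs, dict: %.4fs" % (if_seconds, dict_seconds))
```

With only two branches the difference is likely to be lost in the noise, which is the point of the advice above: measure on your own handler count before optimizing.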

亢潮 2024-07-15 19:59:11

Your use of the dictionary is not quite correct. In your implementation, all of the methods will be called and the results you don't need discarded. What is usually done is something more like:

switch_dict = {'extractTitle': extractTitle, 
               'extractMetaTags': extractMetaTags}
switch_dict[type](dom)

That way is faster and more extensible if you have a large (or variable) number of items.
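
One thing a plain `switch_dict[type]` lookup does not give you is a readable error for an unknown type. A minimal sketch of one way to handle that is below; the stub extractors, the `dispatch` helper, and the choice of `ValueError` are assumptions for the example, not part of the original code.

```python
def extract_title(dom):
    # Stub: a real version would read the title from the DOM
    return "title from %s" % dom


def extract_meta_tags(dom):
    # Stub: a real version would collect the meta tags
    return "meta tags from %s" % dom


# Module-level, so it is built once when the script is loaded
SWITCH_DICT = {
    'extractTitle': extract_title,
    'extractMetaTags': extract_meta_tags,
}


def dispatch(type, dom):
    # .get returns None for a missing key instead of raising KeyError
    handler = SWITCH_DICT.get(type)
    if handler is None:
        raise ValueError("No handler for type %r" % type)
    return handler(dom)


result = dispatch('extractTitle', 'page.html')
```

Adding a new extraction then means writing one function and adding one entry to the dictionary.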

三月梨花 2024-07-15 19:59:11

The efficiency question is barely relevant. A dictionary lookup is done with a simple hashing technique, whereas the if statements have to be evaluated one at a time, so dictionaries tend to be quicker.

I suggest that you actually have polymorphic objects that do extractions from the DOM.

It's not clear how type gets set, but it sure looks like it might be a family of related objects, not a simple string.

class ExtractTitle(object):
    def process(self, dom):
        return something  # placeholder for the extracted title

class ExtractMetaTags(object):
    def process(self, dom):
        return something  # placeholder for the extracted meta tags

Instead of setting type="extractTitle", you'd do this.

type = ExtractTitle()  # or ExtractMetaTags() or ExtractWhatever()
type.process(dom)

Then, you wouldn't be building this particular dictionary or if-statement.
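
A slightly fuller sketch of the same idea follows. The `Extractor` base class is an addition for the example, and a plain dict stands in for the DOM so the code runs on its own; with dom4j the `process` bodies would query the document instead.

```python
class Extractor(object):
    """Hypothetical common interface for all extractors."""

    def process(self, dom):
        raise NotImplementedError


class ExtractTitle(Extractor):
    def process(self, dom):
        return dom.get('title')


class ExtractMetaTags(Extractor):
    def process(self, dom):
        return dom.get('meta', [])


# Stand-in for a parsed document (assumption for this example)
dom = {'title': 'Example', 'meta': ['author', 'keywords']}

# The caller picks objects, not strings, so no lookup step is needed
extractors = [ExtractTitle(), ExtractMetaTags()]
results = [extractor.process(dom) for extractor in extractors]
```

Each new kind of extraction becomes a new subclass, and the calling code does not change.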
