Calling a method in parallel in an Azure Function using Python

Posted on 2025-02-10 04:11:32

Python version: 3.7.3

Resource: Azure function

Azure plan: Consumption

Goal: Improve the speed of an Azure function

Hi everyone,

I have the following code to classify mails:

__init__.py

import json
import traceback

import azure.functions as func

from . import cfp  # assumes cfp.py sits next to __init__.py

def main(req: func.HttpRequest,
            context: func.Context) -> func.HttpResponse:
   
    try:
        req_body = req.get_json()
    except ValueError:
        # Missing or invalid JSON: fall through to the 400 response below
        req_body = None

    if req_body:
        try:
                                        
            ### Some code in between to load the variables classes, selected models, .... Those variables are JSON objects
                
            # predict() takes vectorizer_parameters as its fifth argument
            prediction = cfp.predict(req_body['text_original'],
                                     req_body['text_cleaned'],
                                     selected_models, thresholds,
                                     vectorizer_parameters)
                        
            return func.HttpResponse(json.dumps(prediction).encode('utf-8'),
                                status_code = 200,
                                mimetype = 'application/json')
        except Exception as e:
            return func.HttpResponse(json.dumps({'status': 'fehler', 'comment': str(e), 'stack_trace': traceback.format_exc()}),
                                    status_code = 400,
                                    mimetype = 'application/json')

    else:
        return func.HttpResponse(json.dumps({'status': 'fehler', 'comment': 'Die Eingabedaten wurden falsch angegeben', 'stack_trace': ''}),
                                status_code = 400,
                                mimetype = 'application/json')

cfp.py

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer


def classify_mail(model_typ, scenario_name, X, vectorizer_parameters, modelFolderPath):

    ### ... some code in between

    model = joblib.load(modelFolderPath)
    vec = TfidfVectorizer(**vectorizer_parameters)
    X_features = vec.fit_transform(X)
    result['prediction'] = model.predict(X_features)[0]

    return result

def predict(mail, mail_cleaned,
            selected_models, thresholds, vectorizer_parameters):
    model_folder_path = "model"
    
    prediction={}       
    results = []

    for m in selected_models.keys():
        for s in selected_models[m]['scenarios']:
                            
            result = {}
            result['name'] = m + '_' + s

            X = [mail_cleaned]
            
            selected_vectorizer_parameters = vectorizer_parameters[s]

            result.update(classify_mail(m,s,X, selected_vectorizer_parameters, model_folder_path))
            results.append(result)
            
    ### some code after
    return prediction

The method predict calls classify_mail 10 times (via the two nested for-loops). Each call takes 20 seconds, so I would like to know how to run these 10 calls in parallel to reduce the execution time of my Azure function. Because of the function timeout on the Consumption plan, I am getting the following error:

BadRequest. Http request failed: the server did not respond within the
timeout limit. Please see logic app limits at
https://aka.ms/logic-apps-limits-and-config#http-limits.

Update 1:

I found this resource: Async method with python. However, it is not clear to me how to implement it in my specific use case.
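
One way the async approach could look for this use case, as a hedged sketch rather than a definitive implementation: classify_mail is a blocking function, so awaiting it directly would not make the calls concurrent. Instead, each call is handed to the event loop's default thread pool with loop.run_in_executor, and the results are collected with asyncio.gather. The classify_mail below is a stand-in that only sleeps, and predict_async is a hypothetical name that does not appear in the original code. Only standard-library calls available in Python 3.7 are used.

```python
import asyncio
import time


def classify_mail(model_typ, scenario_name, X, vectorizer_parameters, modelFolderPath):
    # Stand-in for the real, blocking classify_mail (assumption: no model is loaded here).
    time.sleep(0.2)
    return {'prediction': len(X[0])}


async def predict_async(mail_cleaned, selected_models, vectorizer_parameters,
                        model_folder_path="model"):
    loop = asyncio.get_running_loop()
    names, futures = [], []
    for m in selected_models:
        for s in selected_models[m]['scenarios']:
            names.append(m + '_' + s)
            # Passing None selects the loop's default ThreadPoolExecutor.
            futures.append(loop.run_in_executor(
                None, classify_mail, m, s, [mail_cleaned],
                vectorizer_parameters[s], model_folder_path))
    # gather returns the results in the order the futures were passed in.
    outputs = await asyncio.gather(*futures)
    return [{'name': n, **out} for n, out in zip(names, outputs)]


if __name__ == "__main__":
    selected_models = {'svm': {'scenarios': ['a', 'b']}}
    print(asyncio.run(predict_async("hello", selected_models, {'a': {}, 'b': {}})))
```

Inside the function app, main could be declared as async def and await predict_async(...) directly instead of calling asyncio.run. Whether this shortens the run depends on where the 20 seconds per call are spent: threads help with I/O and with native code that releases the GIL, less so with pure-Python CPU work.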

Comments (1)

凯凯我们等你回来 2025-02-17 04:11:32

"I would like to know how to call the method classify in parallel for those 10 times and I can reduce the time of the execution of my azure function."

One workaround is to set a max degree of parallelism for the function, so that the tasks run in parallel and the execution time is shorter than usual.
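
A minimal sketch of an in-process variant of this idea, assuming the 10 calls are independent of each other: the two nested loops in predict are flattened into a list of (model, scenario) pairs and submitted to a concurrent.futures.ThreadPoolExecutor, with max_workers acting as the cap on parallelism. The classify_mail below is a stand-in that only sleeps, and predict_parallel and max_workers are hypothetical names that are not part of the original code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def classify_mail(model_typ, scenario_name, X, vectorizer_parameters, modelFolderPath):
    # Stand-in for the real classify_mail (assumption: sleeps instead of loading a model).
    time.sleep(0.2)
    return {'prediction': len(X[0])}


def predict_parallel(mail_cleaned, selected_models, vectorizer_parameters,
                     model_folder_path="model", max_workers=10):
    # Flatten the two nested loops into one list of independent jobs.
    jobs = [(m, s) for m in selected_models for s in selected_models[m]['scenarios']]

    def run(job):
        m, s = job
        result = {'name': m + '_' + s}
        result.update(classify_mail(m, s, [mail_cleaned],
                                    vectorizer_parameters[s], model_folder_path))
        return result

    # executor.map yields results in the same order as `jobs`.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(run, jobs))


if __name__ == "__main__":
    selected_models = {'svm': {'scenarios': ['a', 'b']}, 'nb': {'scenarios': ['a', 'b']}}
    vectorizer_parameters = {'a': {}, 'b': {}}
    start = time.perf_counter()
    results = predict_parallel("hello", selected_models, vectorizer_parameters)
    print([r['name'] for r in results], round(time.perf_counter() - start, 2))
```

Threads help most when the time goes into I/O (such as reading the model file) or into native code that releases the GIL. For pure-Python CPU work the gain may be small, in which case fanning the calls out to separate function executions, as in the Durable Functions references below, is the more likely route.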

REFERENCES:

  1. Bindings for Durable Functions (Azure Functions) - MSFT Docs
  2. Parallelize tasks with Azure Durable Functions by Imran