Granting user permission for bigquery.datasets

Posted 2025-01-24 14:06:34

I have a notebook in which I access data through APIs, work with the data and send the results to a BigQuery table as well as to a GCS bucket. Everything works as it should when the notebook is run manually.

However, when scheduled, it breaks with the following message:

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in to_gbq(self, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
   1938             location=location,
   1939             progress_bar=progress_bar,
-> 1940             credentials=credentials,
   1941         )
   1942 

/opt/conda/lib/python3.7/site-packages/pandas/io/gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials)
    221         location=location,
    222         progress_bar=progress_bar,
--> 223         credentials=credentials,
    224     )

/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, reauth, if_exists, auth_local_webserver, table_schema, location, progress_bar, credentials, api_method, verbose, private_key)
   1155             credentials=connector.credentials,
   1156         )
-> 1157         table_connector.create(table_id, table_schema)
   1158     else:
   1159         original_schema = pandas_gbq.schema.to_pandas_gbq(table.schema)

/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in create(self, table_id, schema)
   1311             _Dataset(
   1312                 self.project_id, credentials=self.credentials, location=self.location,
-> 1313             ).create(self.dataset_id)
   1314 
   1315         table_ref = TableReference(

/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in create(self, dataset_id)
   1413             self.client.create_dataset(dataset)
   1414         except self.http_error as ex:
-> 1415             self.process_http_error(ex)

/opt/conda/lib/python3.7/site-packages/pandas_gbq/gbq.py in process_http_error(ex)
    384             raise QueryTimeout("Reason: {0}".format(ex))
    385 
--> 386         raise GenericGBQException("Reason: {0}".format(ex))
    387 
    388     def download_table(

GenericGBQException: Reason: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/****************/datasets?prettyPrint=false: Access Denied: Project ******************: User does not have bigquery.datasets.create permission in project *****************.

(The asterisks are the project ID. Incidentally, it is not a project ID I chose myself.)

The error is raised by the following part of the script:

Table_grouped_to_bg.to_gbq('retailer_accuracy.testbin', Context.default().project_id, chunksize=10000, if_exists='append')
Table_grouped_to_bg.to_csv('gs://retailer-sectionalized-labels-csv/' + 'Table_grouped_' + str(Store_name) + '_' + str(Date) + '.csv', sep=';', encoding='utf-8-sig')
ax1 = Table_grouped.plot.bar(x="distance_bin", y="percentage", rot=70, title="Percentage in each bin,{}".format(Date), figsize=(15, 5))
x_offset = -0.01
y_offset = 0.02
for p in ax1.patches:
    b = p.get_bbox()
    val = "{:+.2f}".format(b.y1 + b.y0)
    ax1.annotate(val, ((b.x0 + b.x1) / 2 + x_offset, b.y1 + y_offset))

fig = ax1.get_figure()


def saving_figure(path_logdir):
    fig = ax1.get_figure()
    fig_to_upload = plt.gcf()

    # Save figure image to a bytes buffer
    buf = io.BytesIO()
    fig_to_upload.savefig(buf, format='png')
    buf.seek(0)
    image_as_a_string = base64.b64encode(buf.read())

    # init GCS client and upload buffer contents
    client = Storage.Client()
    bucket = client.get_bucket('retailer-sectionalized-statistics-plot-png')
    blob = bucket.blob('Accuracy_barplot_' + str(Store_name) + '_' + str(Date) + '.png')  # path of the file within the bucket
    your_file_contents = blob.upload_from_string(image_as_a_string, content_type='image/png')

plt.show(block=True)
plt.show(block=True)
saving_figure('Accuracy_barplot_' + str(Store_name) + '_' + str(Date) + '.png')

Again, the asterisks are the project ID.

I understand that when the notebook is scheduled, it is no longer I who runs it. The question, then, is: how do I grant the necessary permissions so the notebook can be run by a GCP instance?


Comments (1)

べ繥欢鉨o。 2025-01-31 14:06:34


The issue is a permission error. For your requirement, you can create a service account in the project. Service accounts are a special type of account used to call APIs and to act as the identity of an application. You can assign your user ID to the service account and grant the BigQuery Data Editor and Notebook Admin roles to the service account. For more information, you can check this documentation.
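A minimal sketch of the credentials side, assuming the scheduled run can read a key file for that service account (the key path and "your-project-id" below are placeholders, not values from the question):

from google.cloud import storage
from google.oauth2 import service_account
import pandas_gbq

# Build credentials from the service account's key file (hypothetical path).
credentials = service_account.Credentials.from_service_account_file(
    "/path/to/service-account-key.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Pass the credentials explicitly so the scheduled run does not fall back
# to an identity that lacks bigquery.datasets.create / data editor rights.
pandas_gbq.to_gbq(
    Table_grouped_to_bg,
    "retailer_accuracy.testbin",
    project_id="your-project-id",  # placeholder
    credentials=credentials,
    chunksize=10000,
    if_exists="append",
)

# Reuse the same credentials for the GCS upload.
client = storage.Client(project="your-project-id", credentials=credentials)

If the scheduler lets you choose which service account the notebook instance runs as, attaching the service account there instead of shipping a key file is usually simpler; the default credentials inside the runtime then already carry the granted roles.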
