Which GCP component should be used to fetch data from an API?

Posted on 2025-01-23 22:13:12

I'm a little confused about which GCP component to use. Here is my use case:
Every day, I need to fetch data from an external API (the API returns JSON data), store it in GCS, and then load it into BigQuery.
I have already written the Python script that fetches the data and stores it in GCS, but I'm not sure which component to use for deployment:

  • Cloud Run: according to the docs it is used for deploying services, so I think it's a bad choice.
  • Cloud Functions: I think it would work, but it is meant for event-based processing (through single-purpose functions...).
  • Composer: I'll use Composer to orchestrate tasks (such as preprocessing files in GCS, loading them into BQ, and transferring them to an archive bucket), so through KubernetesPodOperator I could create a task that triggers the script to get the data.
  • Compute Engine: I don't think it's the best choice, since there are better ones.
  • App Engine: I also don't think it's a good idea, since it is used to deploy and scale web apps...

(Correct me if anything I said is wrong.)
So my question is: which GCP component should be used for this kind of task?
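
For reference, the shape of the daily job described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `raw/api` object prefix, and the injected `fetch` and `store` callables are all hypothetical and stand in for the real API call and the real GCS upload. The records are written as newline-delimited JSON because that is the layout BigQuery load jobs expect for JSON files.

```python
import json
from datetime import date


def to_ndjson(records):
    """Serialize a list of dicts as newline-delimited JSON (one object per line)."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def object_name_for(day, prefix="raw/api"):
    """Build a date-based object name; the prefix is a hypothetical choice."""
    return f"{prefix}/{day.isoformat()}.json"


def run_daily_job(fetch, store, day):
    """Fetch records, convert them to NDJSON, and hand them to `store`.

    `fetch` and `store` are injected callables (assumptions for this sketch):
    in a real deployment `fetch` would call the external API and `store`
    would upload the text to a GCS bucket under the given object name.
    """
    records = fetch()
    name = object_name_for(day)
    store(name, to_ndjson(records))
    return name


if __name__ == "__main__":
    # Demo with in-memory stand-ins instead of a real API and a real bucket.
    bucket = {}
    written = run_daily_job(
        fetch=lambda: [{"id": 1}, {"id": 2}],
        store=lambda name, body: bucket.__setitem__(name, body),
        day=date.today(),
    )
    print(written)
    print(bucket[written], end="")
```

Keeping the API call and the upload behind plain callables means the same function can be wrapped by whichever component ends up running it.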

Comments (1)

︶ ̄淡然 2025-01-30 22:13:12
  • Cloud Run: from the docs it is used for deploying services
  • App Engine: I also don't think it's a good idea, since it is used to deploy and scale web apps...

I think you've misunderstood. Both Cloud Run and Google App Engine (GAE) are serverless offerings from Google Cloud. You deploy your code to either of them and then invoke its URL, which in turn causes your code to execute and do things such as fetching data from somewhere and saving it somewhere else.

With automatic scaling, Google App Engine has a shorter request timeout than Cloud Run (Cloud Run requests also time out, but the limit is configurable). So if your code takes a long time to run, you don't want to use Google App Engine (unless you make it a background task), and if you don't need a UI, then you don't need GAE.

For your specific scenario, you can deploy your code to Cloud Run and use Cloud Scheduler to invoke it at specific times. We have that architecture running in a similar scenario: we have a task that runs once daily; it's deployed to Cloud Run; Cloud Scheduler invokes the endpoint; the task runs and saves data to a Datastore linked to an App Engine app. We wrote a blog article on deploying to Cloud Run and another on securing your Cloud Run service (based on our experience with the scenario described above).
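
As a rough illustration of that setup, here is a minimal sketch of an HTTP endpoint that Cloud Scheduler could call. It uses only the Python standard library and is not our production code. Assumptions: `run_job` is a hypothetical placeholder for your existing fetch-and-store script, and the Scheduler job is assumed to be configured to send a POST request. The server reads the port from the `PORT` environment variable, which is how Cloud Run tells a container where to listen.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_job():
    """Hypothetical placeholder for the existing fetch-and-store script."""
    return "ok"


class JobHandler(BaseHTTPRequestHandler):
    """Runs the job once for every POST request it receives."""

    def do_POST(self):
        try:
            body = run_job().encode("utf-8")
            status = 200
        except Exception as exc:
            # A non-2xx status lets the caller see that the run failed.
            body = str(exc).encode("utf-8")
            status = 500
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=None):
    """Create the server; the port defaults to $PORT, falling back to 8080."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), JobHandler)


# In the container, start the server with: make_server().serve_forever()
```

A web framework would work just as well here; the standard library is used only to keep the sketch free of dependencies.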

GAE Timeout:

Every request to Google App Engine (Standard) must complete within 1 to 10 minutes under automatic scaling, or within up to 24 hours under basic scaling (see documentation). For Google App Engine Flexible, the timeout is 60 minutes (documentation).
