How can I deduplicate GCP logs in the Logs Explorer?

Posted on 2025-01-24 23:16:33


I am using GCP Logs explorer to store logging messages from my pipeline.
I need to debug an issue by looking at logs from a specific event. The message of this error is identical except for an event ID at the end.

So for example, the error message is

event ID does not exist: foo

I know that I can use the following syntax to construct a query that will return the logs with this particular message structure

resource.type="some_resource"
resource.labels.project_id="some_project"
resource.labels.job_id="some_id"
severity=WARNING
jsonPayload.message:"Event ID does not exist:"

The last line in that query will then return every log where the message has that string.

I end up with a result like this

Event ID does not exist: 1A
Event ID does not exist: 2A
Event ID does not exist: 2A
Event ID does not exist: 3A

so I wish to deduplicate that to end up with only

Event ID does not exist: 1A
Event ID does not exist: 2A
Event ID does not exist: 3A

But I don't see support for this type of deduplication in the language docs

Due to the number of rows, I also cannot download a delimited log file.
Is it possible to deduplicate the rows?


Answered by 烛影斜 on 2025-01-31 23:16:33


To deduplicate records with BigQuery, follow these steps:

  • Identify whether your dataset contains duplicates.
  • Create a SELECT query that aggregates the desired column using a
    GROUP BY clause.
  • Materialize the result to a new table using CREATE OR REPLACE TABLE [tablename] AS [SELECT STATEMENT].

You can review the full tutorial in this link.
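As a rough illustration of those three steps, here is a minimal BigQuery Standard SQL sketch. The project, dataset, table, and column names (my_project.my_dataset.raw_logs, message, timestamp) are placeholders for this example, not names taken from the question:

-- Step 1: check whether the dataset contains duplicates (placeholder names throughout).
SELECT
  message,
  COUNT(*) AS occurrences
FROM `my_project.my_dataset.raw_logs`
GROUP BY message
HAVING COUNT(*) > 1;

-- Steps 2-3: aggregate with GROUP BY and materialize the result as a new table.
CREATE OR REPLACE TABLE `my_project.my_dataset.deduplicated_logs` AS
SELECT
  message,
  MIN(timestamp) AS first_seen  -- keep the earliest occurrence of each message
FROM `my_project.my_dataset.raw_logs`
GROUP BY message;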

To analyze a large volume of logs, you could route them to BigQuery using Fluentd and analyze them there.

Fluentd has an output plugin that can use BigQuery as a destination for storing the collected logs. Using the plugin, you can directly load logs into BigQuery in near real time from many servers.

In this link, you can find a complete tutorial on how to Analyze logs using Fluentd and BigQuery.

To route your logs to BigQuery, you first need to create a sink with BigQuery as its destination.

Sinks control how Cloud Logging routes logs. Using sinks, you can
route some or all of your logs to supported destinations.

Sinks belong to a given Google Cloud resource: Cloud projects, billing
accounts, folders, and organizations. When the resource receives a log
entry, it routes the log entry according to the sinks contained by
that resource. The log entry is sent to the destination associated
with each matching sink.

You can route log entries from Cloud Logging to BigQuery using sinks.
When you create a sink, you define a BigQuery dataset as the
destination. Logging sends log entries that match the sink's rules to
partitioned tables that are created for you in that BigQuery dataset.

1) In the Cloud console, go to the Logs Router page:

2) Select an existing Cloud project.

3) Select Create sink.

4) In the Sink details panel, enter the following details:

Sink name: Provide an identifier for the sink; note that after you create the sink, you can't rename the sink but you can delete it and
create a new sink.

Sink description (optional): Describe the purpose or use case for the sink.

5) In the Sink destination panel, select the sink service and destination:

Select sink service: Select the service where you want your logs routed. Based on the service that you select, you can select from the
following destinations:

BigQuery table: Select or create the particular dataset to receive the routed logs. You also have the option to use partitioned tables.

For example, if your sink destination is a BigQuery dataset, the sink
destination would be the following:

bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID

Note that if you are routing logs between Cloud projects, you still
need the appropriate destination permissions.

6) In the Choose logs to include in sink panel, do the following:

In the Build inclusion filter field, enter a filter expression that
matches the log entries you want to include. If you don't set a
filter, all logs from your selected resource are routed to the
destination.

To verify you entered the correct filter, select Preview logs. This
opens the Logs Explorer in a new tab with the filter prepopulated.

7) (Optional) In the Choose logs to filter out of sink panel, do the following:

In the Exclusion filter name field, enter a name.

In the Build an exclusion filter field, enter a filter expression that
matches the log entries you want to exclude. You can also use the
sample function to select a portion of the log entries to exclude. You
can create up to 50 exclusion filters per sink. Note that the length
of a filter can't exceed 20,000 characters.

8) Select Create sink.

More information about Configuring and managing sinks here.

To review details, the formatting, and rules that apply when routing log entries from Cloud Logging to BigQuery, please follow this link.
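Tying this back to the question: once the sink has exported the logs, the duplicate messages can be collapsed with a DISTINCT query over the exported table. This is only a sketch; the table name is an assumption (the actual name is derived from the log name and sink settings), and jsonPayload.message mirrors the field used in the question's filter:

-- Sketch only: replace the table name with the one the sink created for your log.
SELECT DISTINCT
  jsonPayload.message AS message  -- same field the question's filter matches on
FROM `my_project.my_dataset.my_exported_log_table`
WHERE severity = 'WARNING'
  AND jsonPayload.message LIKE 'Event ID does not exist:%'
ORDER BY message;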
