Windows Azure - Cleaning up the WADLogsTable

Published 2024-11-27 12:41:15

I've read conflicting information as to whether or not the WADLogsTable table used by the DiagnosticMonitor in Windows Azure will automatically prune old log entries.

I'm guessing it doesn't, and will instead grow forever - costing me money. :)

If that's the case, does anybody have a good code sample as to how to clear out old log entries from this table manually? Perhaps based on timestamp? I'd run this code from a worker role periodically.


Comments (7)

青柠芒果 2024-12-04 12:41:16

The data in tables created by Windows Azure Diagnostics isn't deleted automatically.

However, Windows Azure PowerShell Cmdlets contain cmdlets specifically for this case.

PS D:\> help Clear-WindowsAzureLog

NAME
Clear-WindowsAzureLog

SYNOPSIS
Removes Windows Azure trace log data from a storage account.

SYNTAX
Clear-WindowsAzureLog [-DeploymentId <String>] [-From <DateTime>] [-To <DateTime>]
[-StorageAccountName <String>] [-StorageAccountKey <String>] [-UseDevelopmentStorage]
[-StorageAccountCredentials <StorageCredentialsAccountAndKey>] [<CommonParameters>]

Clear-WindowsAzureLog [-DeploymentId <String>] [-FromUtc <DateTime>] [-ToUtc <DateTime>]
[-StorageAccountName <String>] [-StorageAccountKey <String>] [-UseDevelopmentStorage]
[-StorageAccountCredentials <StorageCredentialsAccountAndKey>] [<CommonParameters>]

You need to specify the -ToUtc parameter; all logs before that date will be deleted.
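A typical invocation might look like the following (the account name and key are placeholders; -ToUtc sets the cutoff, and everything logged before it is removed):

```powershell
# Delete all trace log entries older than 7 days (placeholder credentials)
Clear-WindowsAzureLog -StorageAccountName "mystorageaccount" `
                      -StorageAccountKey "<storage-account-key>" `
                      -ToUtc ([DateTime]::UtcNow.AddDays(-7))
```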

If the cleanup task needs to be performed on Azure within the worker role, the cmdlets' C# code can be reused. The PowerShell Cmdlets are published under the permissive MS Public License.

Basically, only 3 files are needed, with no other external dependencies: DiagnosticsOperationException.cs, WadTableExtensions.cs, WadTableServiceEntity.cs.

埖埖迣鎅 2024-12-04 12:41:16

Updated version of Chriseyre2000's function. It performs much better when you need to delete many thousands of records: it searches by PartitionKey and processes the deletions step by step in chunks. And remember that the best option is to run it close to the storage (in a cloud service).

public static void TruncateDiagnostics(CloudStorageAccount storageAccount, 
    DateTime startDateTime, DateTime finishDateTime, Func<DateTime,DateTime> stepFunction)
{
        var cloudTable = storageAccount.CreateCloudTableClient().GetTableReference("WADLogsTable");

        var query = new TableQuery();
        var dt = startDateTime;
        while (true)
        {
            dt = stepFunction(dt);
            if (dt>finishDateTime)
                break;
            var l = dt.Ticks;
            string partitionKey =  "0" + l;
            query.FilterString = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.LessThan, partitionKey);
            // Select() returns a new query rather than mutating this one,
            // so the result must be assigned back for the projection to apply
            query = query.Select(new string[] {});
            var items = cloudTable.ExecuteQuery(query).ToList();
            const int chunkSize = 200;
            var chunkedList = new List<List<DynamicTableEntity>>();
            int index = 0;
            while (index < items.Count)
            {
                var count = items.Count - index > chunkSize ? chunkSize : items.Count - index;
                chunkedList.Add(items.GetRange(index, count));
                index += chunkSize;
            }
            foreach (var chunk in chunkedList)
            {
                var batches = new Dictionary<string, TableBatchOperation>();
                foreach (var entity in chunk)
                {
                    var tableOperation = TableOperation.Delete(entity);
                    if (batches.ContainsKey(entity.PartitionKey))
                        batches[entity.PartitionKey].Add(tableOperation);
                    else
                        batches.Add(entity.PartitionKey, new TableBatchOperation {tableOperation});
                }

                foreach (var batch in batches.Values)
                    cloudTable.ExecuteBatch(batch);
            }
        }
}
jJeQQOZ5 2024-12-04 12:41:16

You could just do it based on the timestamp but that would be very inefficient since the whole table would need to be scanned. Here is a code sample that might help where the partition key is generated to prevent a "full" table scan. http://blogs.msdn.com/b/avkashchauhan/archive/2011/06/24/linq-code-to-query-windows-azure-wadlogstable-to-get-rows-which-are-stored-after-a-specific-datetime.aspx
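For reference, the trick from that post is that the WAD tables encode the event time in the partition key as "0" followed by the UTC tick count, so a lexicographic PartitionKey comparison doubles as a time comparison. A minimal sketch, assuming the same storage client SDK as the other answers:

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Table;

static class WadPartitionKeys
{
    // WAD partition keys look like "0" + DateTime.Ticks (UTC),
    // so string ordering matches chronological ordering.
    public static string FromUtc(DateTime utc)
    {
        return "0" + utc.Ticks;
    }

    // Filter matching all rows older than the given UTC time,
    // evaluated server-side against the PartitionKey index
    // instead of scanning the whole table on Timestamp.
    public static string OlderThanFilter(DateTime utc)
    {
        return TableQuery.GenerateFilterCondition(
            "PartitionKey", QueryComparisons.LessThan, FromUtc(utc));
    }
}
```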

余罪 2024-12-04 12:41:16

Here is a solution that truncates based upon a timestamp. (Tested against SDK 2.0)

It does use a table scan to get the data, but if run, say, once per day it would not be too painful:

    /// <summary>
    /// TruncateDiagnostics(storageAccount, DateTime.Now.AddHours(-1));
    /// </summary>
    /// <param name="storageAccount"></param>
    /// <param name="keepThreshold"></param>
    public void TruncateDiagnostics(CloudStorageAccount storageAccount, DateTime keepThreshold)
    {
        try
        {

            CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

            CloudTable cloudTable = tableClient.GetTableReference("WADLogsTable");

            TableQuery query = new TableQuery();
            query.FilterString = string.Format("Timestamp lt datetime'{0:yyyy-MM-ddTHH:mm:ss}'", keepThreshold);
            var items = cloudTable.ExecuteQuery(query).ToList();

            Dictionary<string, TableBatchOperation> batches = new Dictionary<string, TableBatchOperation>();
            foreach (var entity in items)
            {
                TableOperation tableOperation = TableOperation.Delete(entity);

                if (!batches.ContainsKey(entity.PartitionKey))
                {
                    batches.Add(entity.PartitionKey, new TableBatchOperation());
                }

                batches[entity.PartitionKey].Add(tableOperation);

                // An Azure Table batch may contain at most 100 operations,
                // so flush a partition's batch as soon as it fills up
                if (batches[entity.PartitionKey].Count == 100)
                {
                    cloudTable.ExecuteBatch(batches[entity.PartitionKey]);
                    batches[entity.PartitionKey] = new TableBatchOperation();
                }
            }

            foreach (var batch in batches.Values)
            {
                if (batch.Count > 0)
                {
                    cloudTable.ExecuteBatch(batch);
                }
            }

        }
        catch (Exception ex)
        {
            Trace.TraceError(string.Format("Truncate WADLogsTable exception {0}", ex), "Error");
        }
    }
走过海棠暮 2024-12-04 12:41:16

Here's my slightly different version of @Chriseyre2000's solution, using asynchronous operations and PartitionKey querying. In my case it's designed to run continuously within a Worker Role. This one may be a bit easier on memory if you have a lot of entries to clean up.

static class LogHelper
{
    /// <summary>
    /// Periodically run a cleanup task for log data, asynchronously
    /// </summary>
    public static async void TruncateDiagnosticsAsync()
    {
        while ( true )
        {
            try
            {
                // Retrieve storage account from connection-string
                CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
                    CloudConfigurationManager.GetSetting( "CloudStorageConnectionString" ) );

                CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

                CloudTable cloudTable = tableClient.GetTableReference( "WADLogsTable" );

                // keep a weeks worth of logs
                DateTime keepThreshold = DateTime.UtcNow.AddDays( -7 );

                // do this until we run out of items
                while ( true )
                {
                    TableQuery query = new TableQuery();
                    query.FilterString = string.Format( "PartitionKey lt '0{0}'", keepThreshold.Ticks );
                    // materialize the results so the query isn't executed twice
                    // (once for Count and once for the foreach below)
                    var items = cloudTable.ExecuteQuery( query ).Take( 1000 ).ToList();

                    if ( items.Count == 0 )
                        break;

                    Dictionary<string, TableBatchOperation> batches = new Dictionary<string, TableBatchOperation>();
                    foreach ( var entity in items )
                    {
                        TableOperation tableOperation = TableOperation.Delete( entity );

                        // need a new batch?
                        if ( !batches.ContainsKey( entity.PartitionKey ) )
                            batches.Add( entity.PartitionKey, new TableBatchOperation() );

                        // can have only 100 per batch
                        if ( batches[entity.PartitionKey].Count < 100)
                            batches[entity.PartitionKey].Add( tableOperation );
                    }

                    // execute!
                    foreach ( var batch in batches.Values )
                        await cloudTable.ExecuteBatchAsync( batch );

                    Trace.TraceInformation( "WADLogsTable truncated: " + query.FilterString );
                }
            }
            catch ( Exception ex )
            {
                Trace.TraceError( "Truncate WADLogsTable exception {0}", ex.Message );
            }

            // run this once per day
            await Task.Delay( TimeSpan.FromDays( 1 ) );
        }
    }
}

To start the process, just call this from the OnStart method in your worker role.

// start the periodic cleanup
LogHelper.TruncateDiagnosticsAsync();
甜是你 2024-12-04 12:41:16

If you don't care about any of the contents, just delete the table. Azure Diagnostics will just recreate it.
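As a sketch (assuming the same storage client as the other answers), that is a one-liner. Note that Azure Storage can take a little while to actually drop a table, so recreation by the diagnostics agent, or any write to the table, may fail for a short period afterwards:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Drop the whole WADLogsTable; Azure Diagnostics recreates it
// on its next scheduled transfer.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);
CloudTable cloudTable = storageAccount.CreateCloudTableClient().GetTableReference("WADLogsTable");
cloudTable.DeleteIfExists();
```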

耶耶耶 2024-12-04 12:41:16

Slightly updated Chriseyre2000's code:

  • using ExecuteQuerySegmented instead of ExecuteQuery

  • observing TableBatchOperation limit of 100 operations

  • purging all Azure tables

    public static void TruncateAllAzureTables(CloudStorageAccount storageAccount, DateTime keepThreshold)
    {
       TruncateAzureTable(storageAccount, "WADLogsTable", keepThreshold);
       TruncateAzureTable(storageAccount, "WADCrashDump", keepThreshold);
       TruncateAzureTable(storageAccount, "WADDiagnosticInfrastructureLogsTable", keepThreshold);
       TruncateAzureTable(storageAccount, "WADPerformanceCountersTable", keepThreshold);
       TruncateAzureTable(storageAccount, "WADWindowsEventLogsTable", keepThreshold);
    }
    
    public static void TruncateAzureTable(CloudStorageAccount storageAccount, string aTableName, DateTime keepThreshold)
    {
       const int maxOperationsInBatch = 100;
       var tableClient = storageAccount.CreateCloudTableClient();
    
       var cloudTable = tableClient.GetTableReference(aTableName);
    
       var query = new TableQuery { FilterString = $"Timestamp lt datetime'{keepThreshold:yyyy-MM-ddTHH:mm:ss}'" };
       TableContinuationToken continuationToken = null;
       do
       {
          var queryResult = cloudTable.ExecuteQuerySegmented(query, continuationToken);
          continuationToken = queryResult.ContinuationToken;
    
          var items = queryResult.ToList();
          var batches = new Dictionary<string, List<TableBatchOperation>>();
          foreach (var entity in items)
          {
             var tableOperation = TableOperation.Delete(entity);
    
             if (!batches.TryGetValue(entity.PartitionKey, out var batchOperationList))
             {
                batchOperationList = new List<TableBatchOperation>();
                batches.Add(entity.PartitionKey, batchOperationList);
             }
    
             var batchOperation = batchOperationList.FirstOrDefault(bo => bo.Count < maxOperationsInBatch);
             if (batchOperation == null)
             {
                batchOperation = new TableBatchOperation();
                batchOperationList.Add(batchOperation);
             }
             batchOperation.Add(tableOperation);
          }
    
          foreach (var batch in batches.Values.SelectMany(l => l))
          {
             cloudTable.ExecuteBatch(batch);
          }
       } while (continuationToken != null);
    }
    