如何通过网络快速读取日志文件?

发布于 2024-08-07 15:59:12 字数 5409 浏览 15 评论 0原文

我正在使用 Delphi 2007,并且有一个应用程序可以通过内部网络从多个位置读取日志文件并显示异常。这些目录有时包含数千个日志文件。该应用程序可以选择仅读取最近 n 天的日志文件,也可以在任何日期时间范围内。

问题是第一次读取日志目录可能会非常慢(几分钟)。第二次速度要快得多。

我想知道如何优化我的代码以尽可能快地读取日志文件? 我正在使用 vCurrentFile: TStringList 将文件存储在内存中。 这是从 FileStream 更新的,因为我认为这更快。

下面是一些代码:

刷新:读取日志文件的主循环

// In this function the logfiles are searched for exceptions. If a exception is found it is stored in a object.
// The exceptions are then shown in the grid
procedure TfrmMain.Refresh;
var
  FileData : TSearchRec;  // Used for the file searching. Contains data of the file
  vPos, i, PathIndex : Integer;
  vCurrentFile: TStringList;
  vDate: TDateTime;
  vFileStream: TFileStream;
begin
  tvMain.DataController.RecordCount := 0;
  vCurrentFile := TStringList.Create;
  memCallStack.Clear;

  try
    for PathIndex := 0 to fPathList.Count - 1 do                      // Loop 0. This loops until all directories are searched through
    begin
      if (FindFirst (fPathList[PathIndex] + '\*.log', faAnyFile, FileData) = 0) then
      repeat                                                      // Loop 1. This loops while there are .log files in Folder (CurrentPath)
        vDate := FileDateToDateTime(FileData.Time);

        if chkLogNames.Items[PathIndex].Checked and FileDateInInterval(vDate) then
        begin
          tvMain.BeginUpdate;       // To speed up the grid - delays the guichange until EndUpdate

          fPathPlusFile := fPathList[PathIndex] + '\' + FileData.Name;
          vFileStream := TFileStream.Create(fPathPlusFile, fmShareDenyNone);
          vCurrentFile.LoadFromStream(vFileStream);

          fUser := FindDataInRow(vCurrentFile[0], 'User');          // FindData Returns the string after 'User' until ' '
          fComputer := FindDataInRow(vCurrentFile[0], 'Computer');  // FindData Returns the string after 'Computer' until ' '

          Application.ProcessMessages;                  // Give some priority to the User Interface

          if not CancelForm.IsCanceled then
          begin
            if rdException.Checked then
              for i := 0 to vCurrentFile.Count - 1 do
              begin
                vPos := AnsiPos(MainExceptionToFind, vCurrentFile[i]);
                if vPos > 0 then
                  UpdateView(vCurrentFile[i], i, MainException);

                vPos := AnsiPos(ExceptionHintToFind, vCurrentFile[i]);
                if vPos > 0 then
                  UpdateView(vCurrentFile[i], i, HintException);
              end
            else if rdOtherText.Checked then
              for i := 0 to vCurrentFile.Count - 1 do
              begin
                vPos := AnsiPos(txtTextToSearch.Text, vCurrentFile[i]);
                if vPos > 0 then
                  UpdateView(vCurrentFile[i], i, TextSearch)
              end
          end;

          vFileStream.Destroy;
          tvMain.EndUpdate;     // Now the Gui can be updated
        end;
      until(FindNext(FileData) <> 0) or (CancelForm.IsCanceled);     // End Loop 1
    end;                                                          // End Loop 0
  finally
    FreeAndNil(vCurrentFile);
  end;
end;

UpdateView 方法:向显示网格添加一行

{: Update the grid with one exception}
procedure TfrmMain.UpdateView(aLine: string; const aIndex, aType: Integer);
var
  vExceptionText: String;
  vDate: TDateTime;
begin
  if ExceptionDateInInterval(aLine, vDate) then     // Parse the date from the beginning of date
  begin
    if aType = MainException then
      vExceptionText := 'Exception'
    else if aType = HintException then
      vExceptionText := 'Exception Hint'
    else if aType = TextSearch then
      vExceptionText := 'Text Search';

    SetRow(aIndex, vDate, ExtractFilePath(fPathPlusFile), ExtractFileName(fPathPlusFile), fUser, fComputer, aLine, vExceptionText);
  end;
end;

确定行是否在日期范围内的方法:

{: This compare exact exception time against the filters
@desc 2 cases:    1. Last n days
                  2. From - to range}
function TfrmMain.ExceptionDateInInterval(var aTestLine: String; out aDateTime: TDateTime): Boolean;
var
  vtmpDate, vTmpTime: String;
  vDate, vTime: TDateTime;
  vIndex: Integer;
begin
  aDateTime := 0;
  vtmpDate := Copy(aTestLine, 0, 8);
  vTmpTime := Copy(aTestLine, 10, 9);

  Insert(DateSeparator, vtmpDate, 5);
  Insert(DateSeparator, vtmpDate, 8);

  if TryStrToDate(vtmpDate, vDate, fFormatSetting) and TryStrToTime(vTmpTime, vTime) then
    aDateTime := vDate + vTime;

  Result := (rdLatest.Checked and (aDateTime >= (Now - spnDateLast.Value))) or
            (rdInterval.Checked and (aDateTime>= dtpDateFrom.Date) and (aDateTime <= dtpDateTo.Date));

  if Result then
  begin
    vIndex := AnsiPos(']', aTestLine);

    if vIndex > 0 then
      Delete(aTestLine, 1, vIndex + 1);
  end;
end;

测试是否文件日期在范围内:

{: Returns true if the logtime is within filters range
@desc Purpose is to sort out logfiles that are no idea to parse (wrong day)}
function TfrmMain.FileDateInInterval(aFileTimeStamp: TDate): Boolean;
begin
  Result := (rdLatest.Checked and (Int(aFileTimeStamp) >= Int(Now - spnDateLast.Value))) or
            (rdInterval.Checked and (Int(aFileTimeStamp) >= Int(dtpDateFrom.Date)) and (Int(aFileTimeStamp) <= Int(dtpDateTo.Date)));
end;

I'm using Delphi 2007 and have an application that read log-files from several places over an internal network and display exceptions. Those directories sometimes contains thousands of log-files. The application have an option to read only log-files from the n latest days bit it can also be in any datetime range.

The problem is that the first time the log directory is read it can be very slow (several minutes). The second time it is considerably faster.

I wonder how I can optimize my code to read the log-files as fast as possible ?
I'm using a vCurrentFile: TStringList to store the file in memory.
That is updated from a FileStream as I think this is faster.

Here is some code:

Refresh: The main loop to read log-files

// In this function the logfiles are searched for exceptions. If a exception is found it is stored in a object.
// The exceptions are then shown in the grid
procedure TfrmMain.Refresh;
var
  FileData : TSearchRec;  // Used for the file searching. Contains data of the file
  vPos, i, PathIndex : Integer;
  vCurrentFile: TStringList;
  vDate: TDateTime;
  vFileStream: TFileStream;
begin
  tvMain.DataController.RecordCount := 0;
  vCurrentFile := TStringList.Create;
  memCallStack.Clear;

  try
    for PathIndex := 0 to fPathList.Count - 1 do                      // Loop 0. This loops until all directories are searched through
    begin
      if (FindFirst (fPathList[PathIndex] + '\*.log', faAnyFile, FileData) = 0) then
      repeat                                                      // Loop 1. This loops while there are .log files in Folder (CurrentPath)
        vDate := FileDateToDateTime(FileData.Time);

        if chkLogNames.Items[PathIndex].Checked and FileDateInInterval(vDate) then
        begin
          tvMain.BeginUpdate;       // To speed up the grid - delays the guichange until EndUpdate

          fPathPlusFile := fPathList[PathIndex] + '\' + FileData.Name;
          vFileStream := TFileStream.Create(fPathPlusFile, fmShareDenyNone);
          vCurrentFile.LoadFromStream(vFileStream);

          fUser := FindDataInRow(vCurrentFile[0], 'User');          // FindData Returns the string after 'User' until ' '
          fComputer := FindDataInRow(vCurrentFile[0], 'Computer');  // FindData Returns the string after 'Computer' until ' '

          Application.ProcessMessages;                  // Give some priority to the User Interface

          if not CancelForm.IsCanceled then
          begin
            if rdException.Checked then
              for i := 0 to vCurrentFile.Count - 1 do
              begin
                vPos := AnsiPos(MainExceptionToFind, vCurrentFile[i]);
                if vPos > 0 then
                  UpdateView(vCurrentFile[i], i, MainException);

                vPos := AnsiPos(ExceptionHintToFind, vCurrentFile[i]);
                if vPos > 0 then
                  UpdateView(vCurrentFile[i], i, HintException);
              end
            else if rdOtherText.Checked then
              for i := 0 to vCurrentFile.Count - 1 do
              begin
                vPos := AnsiPos(txtTextToSearch.Text, vCurrentFile[i]);
                if vPos > 0 then
                  UpdateView(vCurrentFile[i], i, TextSearch)
              end
          end;

          vFileStream.Destroy;
          tvMain.EndUpdate;     // Now the Gui can be updated
        end;
      until(FindNext(FileData) <> 0) or (CancelForm.IsCanceled);     // End Loop 1
    end;                                                          // End Loop 0
  finally
    FreeAndNil(vCurrentFile);
  end;
end;

UpdateView method: Add one row to the displaygrid

{: Update the grid with one exception}
procedure TfrmMain.UpdateView(aLine: string; const aIndex, aType: Integer);
var
  vExceptionText: String;
  vDate: TDateTime;
begin
  if ExceptionDateInInterval(aLine, vDate) then     // Parse the date from the beginning of date
  begin
    if aType = MainException then
      vExceptionText := 'Exception'
    else if aType = HintException then
      vExceptionText := 'Exception Hint'
    else if aType = TextSearch then
      vExceptionText := 'Text Search';

    SetRow(aIndex, vDate, ExtractFilePath(fPathPlusFile), ExtractFileName(fPathPlusFile), fUser, fComputer, aLine, vExceptionText);
  end;
end;

Method to decide if row is in daterange:

{: This compare exact exception time against the filters
@desc 2 cases:    1. Last n days
                  2. From - to range}
function TfrmMain.ExceptionDateInInterval(var aTestLine: String; out aDateTime: TDateTime): Boolean;
var
  vtmpDate, vTmpTime: String;
  vDate, vTime: TDateTime;
  vIndex: Integer;
begin
  aDateTime := 0;
  vtmpDate := Copy(aTestLine, 0, 8);
  vTmpTime := Copy(aTestLine, 10, 9);

  Insert(DateSeparator, vtmpDate, 5);
  Insert(DateSeparator, vtmpDate, 8);

  if TryStrToDate(vtmpDate, vDate, fFormatSetting) and TryStrToTime(vTmpTime, vTime) then
    aDateTime := vDate + vTime;

  Result := (rdLatest.Checked and (aDateTime >= (Now - spnDateLast.Value))) or
            (rdInterval.Checked and (aDateTime>= dtpDateFrom.Date) and (aDateTime <= dtpDateTo.Date));

  if Result then
  begin
    vIndex := AnsiPos(']', aTestLine);

    if vIndex > 0 then
      Delete(aTestLine, 1, vIndex + 1);
  end;
end;

Test if the filedate is inside range:

{: Returns true if the logtime is within filters range
@desc Purpose is to sort out logfiles that are no idea to parse (wrong day)}
function TfrmMain.FileDateInInterval(aFileTimeStamp: TDate): Boolean;
begin
  Result := (rdLatest.Checked and (Int(aFileTimeStamp) >= Int(Now - spnDateLast.Value))) or
            (rdInterval.Checked and (Int(aFileTimeStamp) >= Int(dtpDateFrom.Date)) and (Int(aFileTimeStamp) <= Int(dtpDateTo.Date)));
end;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

笙痞 2024-08-14 15:59:12

问题不在于你的逻辑,而在于底层的文件系统。

当您将许多文件放入目录中时,大多数文件系统都会变得非常慢。这对 FAT 来说是非常糟糕的,但 NTFS 也会受到这个问题的影响,特别是当一个目录中有数千个文件时。

您能做的最好的事情就是重新组织这些目录结构,例如按年龄。

然后每个目录中最多有 100 个文件。

——杰罗恩

The problem is not your logic, but the underlying file system.

Most file systems get very slow when you put many files in a directory. This is very bad with FAT, but NTFS also suffers from it, especially if you have thousands of files in a directory.

The best you can do is reorganize those directory structures, for instance by age.

Then have at most a couple of 100 files in each directory.

--jeroen

花之痕靓丽 2024-08-14 15:59:12

你想要多快?如果你想要非常快,那么你需要使用 Windows 网络之外的东西来读取文件。原因是如果您想读取日志文件的最后一行(或自上次读取以来的所有行),那么您需要再次读取整个文件。

在您的问题中,您说问题是枚举目录列表的速度很慢。这是你的第一个瓶颈。如果您想要真正快速,那么您需要切换到 HTTP 或向存储日志文件的计算机添加某种日志服务器。

使用 HTTP 的优点是您可以执行范围请求,并仅获取自上次请求以来添加的日志文件的新行。这确实会提高性能,因为您传输的数据更少(特别是如果启用 HTTP 压缩),并且客户端需要处理的数据也更少。

如果您添加某种类型的日志服务器,那么该服务器可以在服务器端进行处理,它可以对数据进行本机访问,并且仅返回日期范围内的行。一种简单的方法可能是将日志放入某种 SQL 数据库中,然后对其运行查询。

那么,你想走多快?

How fast do you want to be? If you want to be really fast, then you need to use something besides windows networking to read the files. The reason is if you want to read the last line of a log file (or all the lines since the last time you read it) then you need to read the whole file again.

In your question you said the problem is that it is slow to enumerate your directory listing. That is your first bottleneck. If you want to be real fast then you need to either switch to HTTP or add some sort of log server to the machine where the log files are stored.

The advantage of using HTTP is you can do a range request and just get the new lines of the log file that were added since you last requested it. That will really improve performance since you are transferring less data (especially if you enable HTTP compression) and you also have less data to process on the client side.

If you add a log server of some sort, then that server can do the processing on the server side, where it has native access to the data, and only return the rows that are in the date range. A simple way of doing that may be to just put your logs into a SQL database of some sort, and then run queries against it.

So, how fast do you want to go?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文