How to get inner HTML using Indy

Posted on 2024-11-19 03:38:56

I am stuck on this problem: I need to get the time and date from the page presnycas.eu (to sync). The date is fine, but I cannot get the time. When I call the IdHTTP.Get(..) method, I get the HTML of the page, but the time is missing from it, like this:

<div class="boxik"> 
<table style="text-align: left; width: 700px; height: 116px;" border="0" cellpadding="2" cellspacing="0"> 
  <tbody> 
    <tr> 
      <td style="width: 400px;" colspan="1" rowspan="5"> 
            <div class="hodinyhlavni"> 

            <span id="servertime"></span> 
              // This is where the time should be - when viewed with 
              // developer tools in Chrome, it does show the time
              // (picture here http://img684.imageshack.us/img684/166/pagem.png)
            </div> 
      </td> 
      <td style="width: 0px;">        
           07.07.2011
      </td> 

For now I am using an awkward approach: I load the page in a TWebBrowser and then call

Time:=StrToTime(WebBrowser1.OleObject.Document.GetElementByID('servertime').innerhtml);

but it is rather slow, and I would rather not use TWebBrowser at all.

So, how can I get the innerHTML of an element with a plain function call?

Thanks in advance

Comments (3)

不可一世的女人 2024-11-26 03:38:56

The most important part of this answer is: you need to understand HTML and JavaScript and figure out how the site works. Open the web site, right-click and choose "Show Source". You'll notice this near the top:

<script type="text/javascript">var currenttime = 'July 07, 2011 11:51:14'</script>

That looks like the time, and in my case the time is correct but not adjusted to MY time zone. You can easily grab the plain HTML using Indy, and apparently that's enough. This quick code sample shows how to grab the HTML and parse the date and time with a small regular expression. If you're on Delphi XE, you'll have to replace the TPerlRegEx class name and the PerlRegEx unit name with whatever XE uses. If you're on an older Delphi, that's no excuse NOT to use RegEx! Download TPerlRegEx; it's free and compatible with the XE stuff.

program Project29;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP, PerlRegEx, SysConst;

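// Downloads the page and extracts the timestamp embedded in its
// "var currenttime = '...'" script line.
function ExtractDayTime: TDateTime;
var
  H: TIdHTTP;
  Response: string;
  RegEx: TPerlRegEx;
  s: string;
  Month, Year, Day, Hour, Minute, Second: Word;
begin
  H := TIdHTTP.Create(nil);
  try
    // The plain page source is enough: the timestamp is in a script line
    Response := H.Get('http://presnycas.eu/');
    RegEx := TPerlRegEx.Create;
    try
      // Groups: 1 = month name, 2 = day, 3 = year, 4..6 = hour, minute, second.
      // A doubled '' is a literal single quote inside a Delphi string.
      RegEx.RegEx := 'var\ currenttime\ \=\ \''(\w+)\ (\d{1,2})\,\ (\d{4})\ (\d{1,2})\:(\d{1,2})\:(\d{1,2})\''';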
      RegEx.Subject := Response;
      if RegEx.Match then
        begin

          // Translate month
          s := RegEx.Groups[1];
          if s = SShortMonthNameJan then Month := 1

          else if s = SShortMonthNameFeb then Month := 2
          else if s = SShortMonthNameMar then Month := 3
          else if s = SShortMonthNameApr then Month := 4
          else if s = SShortMonthNameMay then Month := 5
          else if s = SShortMonthNameJun then Month := 6
          else if s = SShortMonthNameJul then Month := 7
          else if s = SShortMonthNameAug then Month := 8
          else if s = SShortMonthNameSep then Month := 9
          else if s = SShortMonthNameOct then Month := 10
          else if s = SShortMonthNameNov then Month := 11
          else if s = SShortMonthNameDec then Month := 12

          else if s = SLongMonthNameJan then Month := 1
          else if s = SLongMonthNameFeb then Month := 2
          else if s = SLongMonthNameMar then Month := 3
          else if s = SLongMonthNameApr then Month := 4
          else if s = SLongMonthNameMay then Month := 5
          else if s = SLongMonthNameJun then Month := 6
          else if s = SLongMonthNameJul then Month := 7
          else if s = SLongMonthNameAug then Month := 8
          else if s = SLongMonthNameSep then Month := 9
          else if s = SLongMonthNameOct then Month := 10
          else if s = SLongMonthNameNov then Month := 11
          else if s = SLongMonthNameDec then Month := 12

          else
            raise Exception.CreateFmt('Don''t know what month is: %s', [s]);

          // Day, Year, Hour, Minute, Second
          Day := StrToInt(RegEx.Groups[2]);
          Year := StrToInt(RegEx.Groups[3]);
          Hour := StrToInt(RegEx.Groups[4]);
          Minute := StrToInt(RegEx.Groups[5]);
          Second := StrToInt(RegEx.Groups[6]);

          Result := EncodeDate(Year, Month, Day) + EncodeTime(Hour, Minute, Second, 0);

        end
      else
        raise Exception.Create('Can''t get time!');
    finally
      RegEx.Free;
    end;
  finally
    H.Free;
  end;
end;

begin
  WriteLn(DateTimeToStr(ExtractDayTime));
  ReadLn;
end.
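
For Delphi XE and later, the same parsing can be sketched with the regex support that ships with the RTL. As far as I know, the RegularExpressions unit (System.RegularExpressions from XE2 on) provides the TRegEx record used below. The helper name ParseCurrentTime and the MonthNames table are made up for this illustration, and the sketch parses a hard-coded sample string instead of calling Indy, so it can be tried without network access.

```pascal
program CurrentTimeDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, RegularExpressions; // System.SysUtils, System.RegularExpressions in XE2+

const
  // English month names, as they appear in the page's script line
  MonthNames: array[1..12] of string = (
    'January', 'February', 'March', 'April', 'May', 'June',
    'July', 'August', 'September', 'October', 'November', 'December');

// Hypothetical helper: finds "var currenttime = 'July 07, 2011 11:51:14'"
// in Html and converts it to a TDateTime.
function ParseCurrentTime(const Html: string): TDateTime;
var
  M: TMatch;
  MonthText: string;
  Month, I: Integer;
begin
  M := TRegEx.Match(Html,
    'var currenttime = ''(\w+) (\d{1,2}), (\d{4}) (\d{1,2}):(\d{1,2}):(\d{1,2})''');
  if not M.Success then
    raise Exception.Create('currenttime not found');

  // Translate the month name; accept full names and 3-letter abbreviations
  MonthText := M.Groups[1].Value;
  Month := 0;
  for I := Low(MonthNames) to High(MonthNames) do
    if SameText(MonthText, MonthNames[I]) or
       SameText(MonthText, Copy(MonthNames[I], 1, 3)) then
    begin
      Month := I;
      Break;
    end;
  if Month = 0 then
    raise Exception.CreateFmt('Unknown month: %s', [MonthText]);

  Result :=
    EncodeDate(StrToInt(M.Groups[3].Value), Month, StrToInt(M.Groups[2].Value)) +
    EncodeTime(StrToInt(M.Groups[4].Value), StrToInt(M.Groups[5].Value),
               StrToInt(M.Groups[6].Value), 0);
end;

begin
  // Sample input copied from the page source quoted above
  WriteLn(FormatDateTime('yyyy-mm-dd hh:nn:ss',
    ParseCurrentTime('<script>var currenttime = ''July 07, 2011 11:51:14''</script>')));
end.
```

Matching against a fixed English month table instead of the SysConst resource strings keeps the result independent of the RTL's language. In a real program the Html argument would be the string returned by H.Get.
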
笑饮青盏花 2024-11-26 03:38:56

I tried the link you specified (http://presnycas.eu/). From the HTML I can see that the actual time is returned at another location in the page and is then incremented locally by JavaScript, so you need to fetch the new time periodically if you want to stay in sync.

Look for this in the HTML (inside the HEAD element):

<head>
...
<script type="text/javascript">var currenttime = 'July 07, 2011 12:01:26'</script>
...
</head>
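
One way to avoid fetching on every tick is to do what the page's script appears to do: remember how far the server clock is from the local clock at fetch time, then derive the server time locally. The following is only a sketch under that assumption. ServerOffset, UpdateServerOffset and EstimatedServerTime are hypothetical names, and network latency and time zones are ignored.

```pascal
program ServerClockDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  // Server clock minus local clock, in days (TDateTime units).
  // Hypothetical global for this sketch.
  ServerOffset: TDateTime = 0;

// Call right after each fetch with the timestamp parsed from the page.
procedure UpdateServerOffset(const FetchedServerTime: TDateTime);
begin
  ServerOffset := FetchedServerTime - Now;
end;

// Estimates the server's current time without another HTTP request.
function EstimatedServerTime: TDateTime;
begin
  Result := Now + ServerOffset;
end;

begin
  // Stand-in for a value parsed from "var currenttime = '...'"
  UpdateServerOffset(EncodeDate(2011, 7, 7) + EncodeTime(12, 1, 26, 0));
  WriteLn(DateTimeToStr(EstimatedServerTime));
end.
```

Calling UpdateServerOffset again after each periodic fetch corrects any drift that has built up between the two clocks.
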
柳絮泡泡 2024-11-26 03:38:56

How to get the inner HTML using Indy's TIdHTTP

var
  Form2: TForm2;

implementation

{$R *.fmx}

procedure TForm2.Button1Click(Sender: TObject);
var
  Stream: TMemoryStream;
  Html: string;
begin
  Stream := TMemoryStream.Create;
  try
    // Download the raw response body into the stream
    IdHTTP1.Get('http://google.com', Stream);
    // Copy the downloaded bytes into a string
    SetString(Html, PAnsiChar(Stream.Memory), Stream.Size);
    Memo1.Lines.Add(Html);
  finally
    Stream.Free;
  end;
end;

// For Android FireMonkey, replace PAnsiChar with MarshaledAString
