Regex help for Akamai access logs - PHP

Posted on 2024-11-29 17:55:37

Can someone help me create a regular expression in PHP to parse the different fields in an Akamai access log? The first line below specifies the field names. Thanks!

#Fields: date time cs-ip cs-method cs-uri sc-status sc-bytes time-taken cs(Referer) cs(User-Agent) cs(Cookie) x-custom
2011-08-08  23:59:52    63.555.254.85   GET /somedomain/images/banner_320x50.jpg    200 10801   0   "http://somerefered.com"    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_1 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Mobile/8G4" "-" "-"

Comments (2)

淡淡離愁欲言轉身 2024-12-06 17:55:37

Here is a quick little test program I just wrote:

<?php
// Fields: date time cs-ip cs-method cs-uri sc-status sc-bytes time-taken cs(Referer) cs(User-Agent) cs(Cookie) x-custom
$logLine = '2011-08-08  23:59:52    63.555.254.85   GET /somedomain/images/banner_320x50.jpg    200 10801   0   "http://somerefered.com"    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_1 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Mobile/8G4" "-" "-"';
$regex = '/^(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(\d{1,3}(?:\.\d{1,3}){3})\s+([A-Za-z]+)\s+(\S+)\s+(\d{3})\s+(\d+)\s+(\d+)\s+"([^"]*)"\s+"([^"]*)"\s+"([^"]*)"\s+"([^"]*)"$/';

$matches = array();
if (preg_match($regex, $logLine, $matches)) {
    $logParts = array(
        'date' => $matches[1],
        'time' => $matches[2],
        'cs-ip' => $matches[3],
        'cs-method' => $matches[4],
        'cs-uri' => $matches[5],
        'sc-status' => $matches[6],
        'sc-bytes' => $matches[7],
        'time-taken' => $matches[8],
        'cs(Referer)' => $matches[9],
        'cs(User-Agent)' => $matches[10],
        'cs(Cookie)' => $matches[11],
        'x-custom' => $matches[12]
    );
    print_r($logParts);
}
?>

This outputs:

Array
(
    [date] => 2011-08-08
    [time] => 23:59:52
    [cs-ip] => 63.555.254.85
    [cs-method] => GET
    [cs-uri] => /somedomain/images/banner_320x50.jpg
    [sc-status] => 200
    [sc-bytes] => 10801
    [time-taken] => 0
    [cs(Referer)] => http://somerefered.com
    [cs(User-Agent)] => Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_1 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Mobile/8G4
    [cs(Cookie)] => -
    [x-custom] => -
)
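
The same pattern can also be written with named capture groups, so the field names live in the regex and no separate index-to-name mapping is needed. This is only a sketch: `parseAkamaiLine` is a hypothetical helper name, the group names are my own choice, and it assumes every data line carries all twelve fields in the order given by the `#Fields:` header.

```php
<?php
// Sketch: the pattern above rewritten with named groups.
// parseAkamaiLine() is a hypothetical helper, not part of any Akamai tooling.
function parseAkamaiLine(string $line): ?array
{
    // Header and comment lines start with '#'; skip them and empty lines.
    if ($line === '' || $line[0] === '#') {
        return null;
    }

    $regex = '/^(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})\s+'
           . '(?<ip>\d{1,3}(?:\.\d{1,3}){3})\s+(?<method>[A-Za-z]+)\s+(?<uri>\S+)\s+'
           . '(?<status>\d{3})\s+(?<bytes>\d+)\s+(?<time_taken>\d+)\s+'
           . '"(?<referer>[^"]*)"\s+"(?<user_agent>[^"]*)"\s+'
           . '"(?<cookie>[^"]*)"\s+"(?<custom>[^"]*)"$/';

    if (!preg_match($regex, rtrim($line, "\r\n"), $m)) {
        return null; // line does not fit the assumed format
    }

    // preg_match returns numeric keys as well; keep only the named ones.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}
```

Calling `parseAkamaiLine($logLine)` on the sample line returns an array keyed by `date`, `time`, `ip`, and so on, and returns `null` for the `#Fields:` header or for any line that does not match.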

假装爱人 2024-12-06 17:55:37

It looks like the fields are tab-delimited. If so, you don't need a regex and can just do:

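$fieldnames = array('date', 'time', 'cs-ip', 'cs-method', 'cs-uri', 'sc-status', 'sc-bytes', 'time-taken', 'cs(Referer)', 'cs(User-Agent)', 'cs(Cookie)', 'x-custom');

// $lines is assumed to hold the log's data lines,
// e.g. from file($path, FILE_IGNORE_NEW_LINES).
$parsed = array();
foreach ($lines as $line) {
    $fields = explode("\t", $line);
    // Reset once per line; resetting inside the inner loop would keep only the last field.
    $tmp = array();
    foreach ($fields as $index => $field) {
        $tmp[$fieldnames[$index]] = $field;
    }

    $parsed[] = $tmp;
}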

Now you will have a nice array with the fieldnames as keys.
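
A slightly more defensive variant of this loop, as a sketch under stated assumptions: it assumes the log really is tab-delimited (the sample in the question may have lost its tabs when pasted) and that a `#Fields:` line precedes the data it describes. It reads the field names from that header instead of hard-coding them, strips the double quotes around values, and skips lines whose field count does not match. `parseTabDelimitedLog` is a hypothetical name.

```php
<?php
// Sketch only: parseTabDelimitedLog() is a hypothetical helper.
// Assumes single-tab separators and a "#Fields:" header before the data lines.
function parseTabDelimitedLog(array $lines): array
{
    $fieldnames = array();
    $parsed = array();

    foreach ($lines as $line) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue;
        }
        if (strpos($line, '#Fields:') === 0) {
            // Field names in the header are separated by whitespace.
            $fieldnames = preg_split('/\s+/', trim(substr($line, 8)));
            continue;
        }
        if ($line[0] === '#') {
            continue; // other comment lines
        }

        $fields = explode("\t", $line);
        if (count($fields) !== count($fieldnames)) {
            continue; // skip lines that do not match the header
        }

        // Remove the double quotes wrapped around some values.
        $fields = array_map(function ($f) {
            return trim($f, '"');
        }, $fields);
        $parsed[] = array_combine($fieldnames, $fields);
    }

    return $parsed;
}
```

With a log file on disk, this could be called as `parseTabDelimitedLog(file($path))`, where `$path` is whatever location your logs live at. Unlike the regex approach, it does not validate the individual values, only the number of fields per line.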
