How do PHP opcodes relate to the binary code that is actually executed?

Posted 2024-12-19 03:36:16 · 4311 words · 2 views · 0 comments

test.php as plain text:

<?php
$x = "a";
echo $x;

test.php as opcode:

debian:~ php -d vld.active=1 -d vld.execute=0 -f test.php

Finding entry points
Branch analysis from position: 0
Return found
filename:       /root/test.php
function name:  (null)
number of ops:  5
compiled vars:  !0 = $x
line     # *  op                           fetch          ext  return  operands
---------------------------------------------------------------------------------
   2     0  >   EXT_STMT
         1      ASSIGN                                                   !0, 'a'
   3     2      EXT_STMT
         3      ECHO                                                     !0
   4     4    > RETURN                                                   1

branch: #  0; line:     2-    4; sop:     0; eop:     4
path #1: 0,

test.php as binary representation:

debian:~ php -d apc.stat=0 -r "
  require '/root/test.php'; 
  echo PHP_EOL; 
  echo chunk_split(bin2hex(
    apc_bin_dump(array('/root/test.php'))
  ),64);
"

(skipping test.php's echo-output)

    b110000001000000325dedaa64d801bca2f73027abf0d5ab67f3023901000000
    2c0000000a000000871000000300000000000000000000004c0000005b000000
    8a0200008a020000650000002f726f6f742f746573742e7068700002070f9c00
    00000000000000000000000000000000000000000000000000000000000100fa
    000000fe00000005000000050000007c02000001000000100000000100000000
    00000000000000ffffffff0000000000000000000000000000000000000000ff
    ffffffeb00000000000000000000000000000000000000ffffffff0000000000
    00000001000000000000002f726f6f742f746573742e7068700001000000204a
    3308080000000000000000000000000000000000000008000000000000000000
    0000000000000000000008000000000000000000000000000000000000000000
    00000200000065000000204a3308040000000000000001000000000000000000
    00001000000000000000100000000100000006000000010000007a0200000100
    00000100000006000000000000000200000026000000204a3308080000000000
    0000000000000000000000000000080000000000000000000000000000000000
    0000080000000000000000000000000000000000000000000000030000006500
    0000900f34080800000000000000000000000000000000000000100000000000
    0000100000000100000006000000080000000000000000000000000000000000
    0000000000000300000028000000204a33080800000000000000000000000000
    00000000000001000000010000002c70d7b6010000000100d7b6080000000000
    000000000000000000000000000000000000040000003e000000610088020000
    01000000bd795900780000000000000000000000000000000000000000000000
[ ... a lot of lines just containing 0s ... ]
    0000000000000038000000c30000007f0000007a010000830000007c0200008f
    0000003c000000400000004400000008

Now I want to find out more about how the opcode translates to the binary representation.

The edited and clarified question:

How is the opcode translated into the binary version?
Can you see the ASSIGN of 'a' to !0 in there?
Is the ECHO statement, and what it outputs, in there somewhere?

I found a few patterns in the binary version that hint at a line-by-line representation of the opcodes.

("2f726f6f742f746573742e706870" is the hexadecimal representation of "/root/test.php")

EDIT:

The hexadecimal representation reveals patterns when the line length is set to 4 bytes and the output is compared between different programs.

...
00000002  // 2 seems to be something like the "line number"
00000065  // seems to increase by 1 for every subsequent statement.
00000040  // 
06330808  // seems to mark the START of a statement
00000000
00000000
00000000
00000000
00000001  //
00000012  // In a program with three echo statements,
03000007  // this block was present three times. With mild
00000001  // changes that seem to represent the spot where
00000006  // the output-string is located.
00000008  //
00000000
00000000
00000000
00000000
00000000
00000002  // 2 seems to be something like the "line number"
00000028  //
00000020  //
4a330808  // seems to mark the END of a statement
00000000
00000000
00000000
00000000
00000008  // repeating between (echo-)statements
00000000
00000000
00000000
00000000
00000008  // repeating between (echo-)statements
...

But my knowledge of how virtual machines work at such a level is too weak to analyze that properly and link it to the C code.
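The 4-byte regrouping described above can be sketched in C. This is only an illustration, not code from PHP or APC: the helper names `hex_digit` and `hex_to_le_words` are made up for this example, and it assumes the dump came from a little-endian machine, so each group of four bytes is read with the least significant byte first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode a hex string (as printed by bin2hex) into 32-bit words, reading
 * every group of 4 bytes as little-endian (an assumption about the host).
 * Returns the number of words written to out, or 0 on malformed input.
 * Trailing hex digits that do not fill a whole word are ignored.
 */
size_t hex_to_le_words(const char *hex, uint32_t *out, size_t max_words)
{
    size_t len = strlen(hex);
    size_t n = 0;
    for (size_t i = 0; i + 8 <= len && n < max_words; i += 8) {
        uint32_t word = 0;
        for (int b = 0; b < 4; b++) {
            int hi = hex_digit(hex[i + 2 * b]);
            int lo = hex_digit(hex[i + 2 * b + 1]);
            if (hi < 0 || lo < 0) return 0;
            /* Byte b of the group becomes bits 8*b .. 8*b+7 of the word. */
            word |= (uint32_t)(hi * 16 + lo) << (8 * b);
        }
        out[n++] = word;
    }
    return n;
}
```

Feeding it the fragment `0200000026000000204a3308` from the dump yields the words 2, 0x26 and 0x08334a20. The first matches the source line number, and the second (38 in decimal) is, if I recall Zend/zend_vm_opcodes.h correctly, the number of ZEND_ASSIGN; that mapping is worth checking against the header of the PHP version in use.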

EDIT:

Does PHP have a virtual machine like Java?

Is the Zend engine embeddable outside of PHP?

Comments (2)

野鹿林 2024-12-26 03:36:16

Great question...

UPDATE: opcodes are executed directly by the PHP virtual machine (the Zend Engine). They appear to be executed by individual handler functions defined in ./Zend/zend_vm_execute.h.

See the architecture of the Zend Engine for more info on how Zend opcodes are executed.

These resources might help a bit:

http://php.net/manual/en/internals2.opcodes.list.php

http://www.php.net/manual/en/internals2.opcodes.ops.php

Also, I'm going to check out the PECL VLD source for more clues...

http://pecl.php.net/package/vld

http://derickrethans.nl/projects.html#vld

Also, writing to the authors of the VLD PECL extension may help:
Derick Rethans, Andrei Zmievski or Marcus Börger

Their email addresses are at the top of srm_oparray.c in the extension source.

UPDATE: found some more clues

In PHP 5.3.8, I found three leads for where the opcodes are executed:

./Zend/zend_execute.c:1270 
ZEND_API void execute_internal

./Zend/zend.c:1214:ZEND_API int zend_execute_scripts(int type TSRMLS_DC, zval **retval, int file_count, ...)
./Zend/zend.c:1236:                  zend_execute(EG(active_op_array) TSRMLS_CC);

./Zend/zend_vm_gen.php

I couldn't find the definition of zend_execute(), but I'm guessing it might be generated by ./Zend/zend_vm_gen.php

I think I found it...

./Zend/zend_vm_execute.h:42
ZEND_API void execute(zend_op_array *op_array TSRMLS_DC)

I could be wrong, but it looks like all of the opcode handlers are defined in ./Zend/zend_vm_execute.h too.

See ./Zend/zend_vm_execute.h:2413 for an example of what looks to be the "integer addition" opcode handler.
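To make the idea of "one handler function per opcode" concrete, here is a toy dispatch loop in C. It is a sketch under stated assumptions, not the Zend Engine's code: every `toy_*` and `handle_*` name is invented for this example, operands are reduced to one variable index and one string literal, and the opcode numbers 38, 40 and 62 are the values of ZEND_ASSIGN, ZEND_ECHO and ZEND_RETURN as I recall them from Zend/zend_vm_opcodes.h. As far as I can tell the real engine keeps a handler pointer in each op instead of looking it up in a table on every step, so the table here is a simplification.

```c
#include <stddef.h>
#include <string.h>

/* Opcode numbers as recalled from Zend/zend_vm_opcodes.h (assumption). */
enum { OP_ASSIGN = 38, OP_ECHO = 40, OP_RETURN = 62, OP_MAX = 64 };

typedef struct toy_vm toy_vm;
typedef struct toy_op toy_op;

/* A handler executes one op; it returns 0 to continue and 1 to stop. */
typedef int (*toy_handler)(toy_vm *vm, const toy_op *op);

struct toy_op {
    unsigned char opcode;   /* which instruction to run */
    int var;                /* index of a compiled variable, like !0 */
    const char *literal;    /* constant operand, like 'a' */
};

struct toy_vm {
    const char *vars[4];    /* compiled variables */
    char out[64];           /* output is captured here instead of stdout */
};

int handle_assign(toy_vm *vm, const toy_op *op)
{
    vm->vars[op->var] = op->literal;
    return 0;
}

int handle_echo(toy_vm *vm, const toy_op *op)
{
    const char *s = vm->vars[op->var];
    if (s != NULL) {
        /* Append without overflowing the fixed-size output buffer. */
        strncat(vm->out, s, sizeof vm->out - strlen(vm->out) - 1);
    }
    return 0;
}

int handle_return(toy_vm *vm, const toy_op *op)
{
    (void)vm;
    (void)op;
    return 1;
}

/* Run ops until a handler asks to stop; an unknown opcode aborts with -1. */
int toy_execute(toy_vm *vm, const toy_op *ops, size_t count)
{
    static const toy_handler handlers[OP_MAX] = {
        [OP_ASSIGN] = handle_assign,
        [OP_ECHO]   = handle_echo,
        [OP_RETURN] = handle_return,
    };
    for (size_t i = 0; i < count; i++) {
        toy_handler h = ops[i].opcode < OP_MAX ? handlers[ops[i].opcode] : NULL;
        if (h == NULL) return -1;
        if (h(vm, &ops[i])) return 0;
    }
    return 0;
}
```

Running the three ops `{OP_ASSIGN, 0, "a"}`, `{OP_ECHO, 0, NULL}`, `{OP_RETURN, 0, NULL}` on a zero-initialised `toy_vm` leaves "a" in `out`, which mirrors the ASSIGN, ECHO, RETURN sequence in the VLD listing (the EXT_STMT ops are left out). The point is that no machine code is generated from the opcodes: the loop simply calls a C function for each one.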

忆离笙 2024-12-26 03:36:16

apc_bin_dump() returns the raw representation of an in-memory cache entry.

It returns the content of an apc_bd_t struct.

This struct is an array of apc_bd_entry_t with some checksums for error detection.

apc_bd_entry_t contains an apc_cache_entry_value_t.

You can look at the apc_bin_dump and apc_bin_load internal functions (in apc_bin.c, http://svn.php.net/viewvc/pecl/apc/trunk/apc_bin.c?revision=307048&view=markup#l648) to see how dumping and loading are done.
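As a rough illustration of the struct described above, the start of the dump can be decoded in C like this. Everything about the layout is an assumption to be checked against apc_bin.h: the sketch supposes a 32-bit little-endian build in which apc_bd_t begins with a size, a "swizzled" flag, a 16-byte MD5, a CRC, an entry count and a pointer to the entries that has been rewritten as an offset. The names `bd_header`, `read_le32` and `parse_bd_header` are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED layout of the start of apc_bd_t on a 32-bit little-endian build. */
typedef struct {
    uint32_t size;            /* size field at the start of the dump */
    uint32_t swizzled;        /* flag stored after the size */
    unsigned char md5[16];    /* MD5 checksum bytes */
    uint32_t crc;             /* CRC checksum */
    uint32_t num_entries;     /* number of apc_bd_entry_t records */
    uint32_t entries_offset;  /* entries pointer, stored as an offset */
} bd_header;

/* Read 4 bytes as a little-endian unsigned 32-bit integer. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Parse the first 36 bytes of a dump; returns 0 on success, -1 if too short. */
int parse_bd_header(const unsigned char *buf, size_t len, bd_header *h)
{
    if (len < 36) return -1;
    h->size = read_le32(buf);
    h->swizzled = read_le32(buf + 4);
    memcpy(h->md5, buf + 8, 16);
    h->crc = read_le32(buf + 24);
    h->num_entries = read_le32(buf + 28);
    h->entries_offset = read_le32(buf + 32);
    return 0;
}
```

Applied to the first 36 bytes of the dump in the question (`b1100000 01000000 325dedaa ... 67f30239 01000000 2c000000`), this gives size 0x10b1, swizzled 1, num_entries 1 and entries_offset 0x2c. A single entry fits the single file that was dumped, which makes the assumed layout look plausible, but it is not a substitute for reading apc_bin.h.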
