Python bytecode disassembler#
Consider the following example function:
def myfunc(alist):
    return len(alist)
The following commands display the disassembly of myfunc():
import dis
dis.dis(myfunc)
  2           0 LOAD_GLOBAL              0 (len)
              2 LOAD_FAST                0 (alist)
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
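The 2 in the top-left corner is the source line number of the instructions that follow. The opcode names shown match CPython 3.10 and earlier; CPython 3.11 and later no longer have CALL_FUNCTION, so the listing differs there.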
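The line-number column can also be read programmatically. The following is a minimal sketch that assumes only the standard-library helper dis.findlinestarts; the names line_starts and body_line are illustrative. Line numbers are absolute within the source file, so a value of 2 presumes that myfunc is defined on line 1.

```python
import dis


def myfunc(alist):
    return len(alist)


code = myfunc.__code__

# findlinestarts yields (bytecode offset, source line) pairs;
# some interpreter versions may report None for a line, so filter those out.
line_starts = [
    (offset, line)
    for offset, line in dis.findlinestarts(code)
    if line is not None
]

# The function body sits one line below the "def" line.
body_line = code.co_firstlineno + 1

for offset, line in line_starts:
    print(f"offset {offset:3d} starts source line {line}")
```

Comparing against co_firstlineno keeps the check independent of where the function sits in the file.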
Bytecode analysis#
The bytecode analysis API wraps a piece of Python code in a Bytecode object, which gives easy access to details of the compiled code.
bytecode = dis.Bytecode(myfunc)
for instr in bytecode:
    print(instr.opname)
LOAD_GLOBAL
LOAD_FAST
CALL_FUNCTION
RETURN_VALUE
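Each item produced by a dis.Bytecode object is a dis.Instruction, which carries more than the opcode name. As a small sketch using only the standard library (the variable names opname_counts and globals_used are illustrative), the instructions can be tallied and their resolved arguments inspected:

```python
import dis
from collections import Counter


def myfunc(alist):
    return len(alist)


bytecode = dis.Bytecode(myfunc)

# Count how often each opcode name occurs.
opname_counts = Counter(instr.opname for instr in bytecode)

# argval is the resolved argument, here the name of the global being loaded
# rather than its numeric index.
globals_used = [
    instr.argval for instr in bytecode if instr.opname == "LOAD_GLOBAL"
]

print(opname_counts)
print(globals_used)
print(bytecode.info())  # multi-line summary of the underlying code object
```

Iterating the Bytecode object more than once is fine, since each iteration disassembles the code object again.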
Bytecode#
This section uses the third-party bytecode library.
Installation:
pip install bytecode
Abstract bytecode#
The following example uses abstract bytecode to execute print('Hello World!'):
from bytecode import Instr, Bytecode
bytecode = Bytecode([Instr("LOAD_NAME", 'print'),
                     Instr("LOAD_CONST", 'Hello World!'),
                     Instr("CALL_FUNCTION", 1),
                     Instr("POP_TOP"),
                     Instr("LOAD_CONST", None),
                     Instr("RETURN_VALUE")])
code = bytecode.to_code()
exec(code)
Hello World!
Concrete bytecode#
The same print('Hello World!') call, written as concrete bytecode:
from bytecode import ConcreteInstr, ConcreteBytecode
bytecode = ConcreteBytecode()
bytecode.names = ['print']
bytecode.consts = ['Hello World!', None]
bytecode.extend([ConcreteInstr("LOAD_NAME", 0),
                 ConcreteInstr("LOAD_CONST", 0),
                 ConcreteInstr("CALL_FUNCTION", 1),
                 ConcreteInstr("POP_TOP"),
                 ConcreteInstr("LOAD_CONST", 1),
                 ConcreteInstr("RETURN_VALUE")])
code = bytecode.to_code()
exec(code)
Hello World!
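The difference between the two forms is that abstract instructions carry the object itself (a name or a constant), while concrete instructions carry an index into a table such as names or consts. The sketch below illustrates that relationship with the standard-library dis module rather than the bytecode library, so it only assumes CPython's own code objects; name_loads and const_loads are illustrative names.

```python
import dis

code = compile("print('Hello World!')", "<demo>", "exec")
instructions = list(dis.get_instructions(code))

# Concrete view: instr.arg is an index into a table on the code object.
# Abstract view: instr.argval is the object that index refers to.
name_loads = [
    (i.arg, i.argval) for i in instructions if i.opname == "LOAD_NAME"
]
const_loads = [
    (i.arg, i.argval) for i in instructions if i.opname == "LOAD_CONST"
]

for index, value in name_loads:
    print("LOAD_NAME ", index, "->", code.co_names[index], "==", value)
for index, value in const_loads:
    print("LOAD_CONST", index, "->", repr(code.co_consts[index]))
```

The exact instruction sequence depends on the interpreter version, but the index-to-table mapping for LOAD_NAME and LOAD_CONST is what the concrete form makes explicit.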
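Setting compiler flags#
Bytecode, ConcreteBytecode, and ControlFlowGraph instances all have a flags attribute, which is an instance of the CompilerFlags enum. The value can be manipulated like any other bit-flag value.
Setting the OPTIMIZED flag: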
from bytecode import Bytecode, CompilerFlags
bytecode = Bytecode()
bytecode.flags |= CompilerFlags.OPTIMIZED
Clearing the OPTIMIZED flag:
from bytecode import Bytecode, CompilerFlags
bytecode = Bytecode()
# AND with the complement clears the bit; ^= would only toggle it
bytecode.flags &= ~CompilerFlags.OPTIMIZED
The update_flags method updates the flags based on the instructions stored in the code object.
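Bit-flag manipulation is easy to get subtly wrong, so here is a small sketch of the three common operations. DemoFlags is a hypothetical stand-in enum built on the standard library's enum.IntFlag and the CO_* constants from inspect; it is not the bytecode library's CompilerFlags.

```python
import inspect
from enum import IntFlag


class DemoFlags(IntFlag):
    """Hypothetical stand-in for a compiler-flag enum."""
    OPTIMIZED = inspect.CO_OPTIMIZED
    NEWLOCALS = inspect.CO_NEWLOCALS


# Set bits with OR.
flags = DemoFlags.OPTIMIZED | DemoFlags.NEWLOCALS

# Clear one bit: AND with the complement leaves the other bits untouched.
flags &= ~DemoFlags.OPTIMIZED
cleared = flags

# XOR toggles: applied to a bit that is currently unset, it sets that bit.
flags ^= DemoFlags.OPTIMIZED
toggled = flags


def sample():
    return None


# Code objects of ordinary functions normally carry CO_OPTIMIZED.
function_is_optimized = bool(sample.__code__.co_flags & inspect.CO_OPTIMIZED)

print(cleared, toggled, function_is_optimized)
```

XOR only clears a flag when that flag is already set, which is why AND with the complement is the safer way to clear.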
Simple loop#
Bytecode for for x in (1, 2, 3): print(x):
from bytecode import Label, Instr, Bytecode
loop_start = Label()
loop_exit = Label()
code = Bytecode(
    [
        # Python 3.8 removed SETUP_LOOP
        Instr("LOAD_CONST", (1, 2, 3)),
        Instr("GET_ITER"),
        loop_start,
        Instr("FOR_ITER", loop_exit),
        Instr("STORE_NAME", "x"),
        Instr("LOAD_NAME", "print"),
        Instr("LOAD_NAME", "x"),
        Instr("CALL_FUNCTION", 1),
        Instr("POP_TOP"),
        Instr("JUMP_ABSOLUTE", loop_start),
        # Python 3.8 removed the need to manually manage blocks in loops
        # This is now handled internally by the interpreter
        loop_exit,
        Instr("LOAD_CONST", None),
        Instr("RETURN_VALUE"),
    ]
)
# The conversion to a Python code object resolves jump targets:
# abstract labels are replaced with concrete offsets
code = code.to_code()
exec(code)
1
2
3
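To see which instructions CPython itself emits for this loop, and where the labels end up, the standard-library dis module can be used. This is a sketch under the assumption that the loop is compiled from source rather than assembled by hand; opcode names other than FOR_ITER differ between interpreter versions, so only FOR_ITER is looked up by name.

```python
import contextlib
import dis
import io

code = compile("for x in (1, 2, 3): print(x)", "<demo>", "exec")
instructions = list(dis.get_instructions(code))

# For FOR_ITER, the resolved argument is the offset where execution
# continues once the iterator is exhausted, i.e. the loop exit.
for_iter = next(i for i in instructions if i.opname == "FOR_ITER")
exit_offset = for_iter.argval

# Instructions that some jump lands on play the role of labels.
label_offsets = [i.offset for i in instructions if i.is_jump_target]

# Run the loop in a fresh namespace and capture what it prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(code, {})
output = buffer.getvalue()

print("FOR_ITER at", for_iter.offset, "exits to", exit_offset)
print("jump targets:", label_offsets)
print(output, end="")
```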
Conditional jump#
Bytecode for print('yes' if test else 'no'):
from bytecode import Label, Instr, Bytecode
label_else = Label()
label_print = Label()
bytecode = Bytecode([Instr('LOAD_NAME', 'print'),
                     Instr('LOAD_NAME', 'test'),
                     Instr('POP_JUMP_IF_FALSE', label_else),
                     Instr('LOAD_CONST', 'yes'),
                     Instr('JUMP_FORWARD', label_print),
                     label_else,
                     Instr('LOAD_CONST', 'no'),
                     label_print,
                     Instr('CALL_FUNCTION', 1),
                     Instr('LOAD_CONST', None),
                     Instr('RETURN_VALUE')])
code = bytecode.to_code()
test = 0
exec(code)
test = 1
exec(code)
no
yes
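For comparison, the same expression can be compiled from source and its jump instructions listed with the standard-library dis module. This sketch assumes nothing about which jump opcodes the running interpreter uses, because their names have changed across CPython releases; it simply matches on "JUMP" in the opcode name.

```python
import contextlib
import dis
import io

code = compile("print('yes' if test else 'no')", "<demo>", "exec")

# Collect (opcode name, target offset) for every jump instruction.
jumps = [
    (i.opname, i.argval)
    for i in dis.get_instructions(code)
    if "JUMP" in i.opname
]

# Execute the code object once per value of test, capturing the output.
outputs = []
for value in (0, 1):
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {"test": value})
    outputs.append(buffer.getvalue().strip())

print(jumps)
print(outputs)
```

Passing a fresh dictionary to exec keeps each run's test variable separate from the surrounding module.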
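Control Flow Graph (CFG)#
To analyze or optimize existing code, bytecode provides the ControlFlowGraph class, which represents a control flow graph (CFG).
The control flow graph is used for stack depth analysis when converting to a code object. Because it is better than CPython at identifying dead code, it can produce a smaller stack size.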
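To make "stack depth analysis" concrete, here is a minimal sketch for straight-line code only, using the standard library's dis.stack_effect. It is not the bytecode library's algorithm: code with branches needs the depth tracked along every path of the control flow graph, which is exactly what the graph is for.

```python
import dis

# A branch-free expression, so a single linear pass is enough.
code = compile("a + b * c", "<demo>", "eval")

depth = 0
max_depth = 0
for instr in dis.get_instructions(code):
    # stack_effect reports the net number of items pushed (+) or popped (-).
    depth += dis.stack_effect(instr.opcode, instr.arg)
    max_depth = max(max_depth, depth)

print("computed maximum depth:", max_depth)
print("co_stacksize reported by CPython:", code.co_stacksize)
```

The three operands are all on the stack before the multiplication runs, which is where the maximum is reached.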
Dump the control flow graph of the conditional jump example:
from bytecode import Label, Instr, Bytecode, ControlFlowGraph, dump_bytecode
label_else = Label()
label_print = Label()
bytecode = Bytecode([Instr('LOAD_NAME', 'print'),
                     Instr('LOAD_NAME', 'test'),
                     Instr('POP_JUMP_IF_FALSE', label_else),
                     Instr('LOAD_CONST', 'yes'),
                     Instr('JUMP_FORWARD', label_print),
                     label_else,
                     Instr('LOAD_CONST', 'no'),
                     label_print,
                     Instr('CALL_FUNCTION', 1),
                     Instr('LOAD_CONST', None),
                     Instr('RETURN_VALUE')])
blocks = ControlFlowGraph.from_bytecode(bytecode)
dump_bytecode(blocks)
block1:
    LOAD_NAME 'print'
    LOAD_NAME 'test'
    POP_JUMP_IF_FALSE <block3>
    -> block2
block2:
    LOAD_CONST 'yes'
    JUMP_FORWARD <block4>
block3:
    LOAD_CONST 'no'
    -> block4
block4:
    CALL_FUNCTION 1
    LOAD_CONST None
    RETURN_VALUE
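Note
block #1 is the start block; it ends with the POP_JUMP_IF_FALSE conditional jump and is followed by block #2.
block #2 ends with the JUMP_FORWARD unconditional jump.
block #3 contains no jump and is followed by block #4.
block #4 is the final block.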
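The block boundaries follow a simple rule: a new block starts at the first instruction, at every jump target, and right after every instruction that can transfer control. The sketch below applies that rule with the standard-library dis module. basic_blocks is a hypothetical helper written for illustration, not part of the bytecode library, and it ignores exception handling.

```python
import dis


def basic_blocks(code):
    """Split a code object's instructions into lists of opcode names."""
    instructions = list(dis.get_instructions(code))

    # Leaders are the offsets at which a new block begins.
    leaders = {instructions[0].offset}
    for index, instr in enumerate(instructions):
        if instr.is_jump_target:
            leaders.add(instr.offset)
        ends_block = (
            "JUMP" in instr.opname
            or instr.opname == "FOR_ITER"
            or instr.opname.startswith("RETURN")
        )
        if ends_block and index + 1 < len(instructions):
            leaders.add(instructions[index + 1].offset)

    blocks = []
    for instr in instructions:
        if instr.offset in leaders:
            blocks.append([])
        blocks[-1].append(instr.opname)
    return blocks


code = compile("print('yes' if test else 'no')", "<demo>", "exec")
blocks = basic_blocks(code)
for number, block in enumerate(blocks, start=1):
    print(f"block{number}:", ", ".join(block))
```

The number of blocks and the opcode names depend on the interpreter version, so the printed result will not necessarily match the four blocks in the dump.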