
Programmatically reverse engineering dynamically dispatched methods

reverse-engineering x86 c++ static-analysis qt

2021-07-01 21:36:53

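Background: I want to convert Qt RTTI into symbols so that large executables are easier to navigate.

In case you don't know it, Qt is a C++ application framework built around a message-passing system. Because C++ offers very little introspection or reflection, both of which matter for an expressive message-passing system, Qt ships a tool called moc (the meta-object compiler) that parses your source files and builds an index of the classes, methods, properties and so on that must be enumerated or resolved at run time. Unfortunately, the metadata moc generates is optimized for run-time access (partly because of limitations of the C++ language) and is mildly hostile to static analysis. In particular, you would expect a method table like this (pseudocode):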

methods = {
    "frob" -> &frob,
    "fuzz" -> &fuzz
}

But for boring reasons, moc generates a dispatch method instead (pseudocode again):

dispatch(method, args) {
    switch(method) {
        case "frob": return frob(args)
        case "fuzz": return fuzz(args)
        default: return -1
    }
}

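Needless to say, decompiling code, even simple machine-generated code, is much harder than parsing static data.

Conceptually, it's simple: find all the dispatch functions and run each one in an emulator until it executes a call or a return; extract the address of the method implementation from the call instruction, and write it to a symbol/map file under the method's name. I already have a quick hack to identify the dispatch functions (a patched QtCore4.dll that hooks all objects and dumps their metadata), but I don't know what to use to decompile them.

Question: which (free) tools would you recommend for doing this programmatically? At a minimum I need a PE loader and an x86 emulator, preferably in Python.

I was pointed to angr, which is impressive. Among other things, it translates code into a platform-independent IR, which improves the chances that I could release my code as a general-purpose tool. But angr seems designed to do the exact opposite of what I need. Its documentation is sparse and hard to follow, and since it is designed to extract data from code (and I already have the data!), it seems over-engineered for my use case and, even if I figure it out, may be unbearably slow.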
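The "run until call or return" idea can be sketched on a toy machine. This is only an illustration: the instruction set (`cmp_arg`, `je`, `jmp`, `call`, `ret`), the `find_call_target` helper, and the addresses in `DISPATCHER` are all invented for the example and are far simpler than real x86 code.

```python
from typing import List, Optional, Tuple

# Toy instruction: (opcode, operand...). Invented for illustration, not x86.
Instr = Tuple


def find_call_target(code: List[Instr], arg: int,
                     max_steps: int = 1000) -> Optional[int]:
    """Run `code` with `arg` as its only argument until the first call or return.

    Returns the target address of the first call, or None if the
    function returns without calling anything.
    """
    pc = 0              # index of the next instruction to execute
    flag_equal = False  # result of the most recent comparison
    for _ in range(max_steps):
        op = code[pc]
        if op[0] == "cmp_arg":   # compare the argument with an immediate value
            flag_equal = (arg == op[1])
            pc += 1
        elif op[0] == "je":      # jump to an instruction index if the compare matched
            pc = op[1] if flag_equal else pc + 1
        elif op[0] == "jmp":     # unconditional jump
            pc = op[1]
        elif op[0] == "call":    # first call reached: this is the method
            return op[1]
        elif op[0] == "ret":     # returned without calling: no such method
            return None
        else:
            raise ValueError("unknown opcode: %r" % (op[0],))
    raise RuntimeError("step limit exceeded")


# Hypothetical dispatcher with three cases and a default that just returns.
DISPATCHER = [
    ("cmp_arg", 1), ("je", 7),
    ("cmp_arg", 2), ("je", 8),
    ("cmp_arg", 3), ("je", 9),
    ("ret",),               # default case
    ("call", 0x401000),     # case 1
    ("call", 0x401020),     # case 2
    ("call", 0x401040),     # case 3
]

targets = {n: find_call_target(DISPATCHER, n) for n in range(1, 5)}
```

Because the argument is concrete, every branch has exactly one outcome, so the emulator never has to explore more than one path per method index.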

2 Answers

Wow, I actually figured out enough angr to do it. Consider this C program (an extremely simplified scale model of qt_metacall):

#include <stdio.h>

int foo(void) {
    return puts("foo");
}

int bar(void) {
    return puts("bar");
}

int baz(void) {
    return puts("baz");
}

int frob(int n) {
    switch (n) {
    case 1:
        return foo();
    case 2:
        return bar();
    case 3:
        return baz();
    default:
        return -1;
    }
}

int main(int argc, char **argv) {
    return frob(argc) < 0;
}

We compile it and use angr to determine the address of frob, like this:

import angr
b = angr.Project('a.out', load_options={'auto_load_libs':False})
b.analyses.CFG()

for addr in [f.addr for f in b.kb.functions.values() if f.name == 'frob']:
    print(hex(addr))

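(In my use case, my hacked QtCore4.dll will provide the list of qt_metacall methods to disassemble.)

Then this script, invoked as script.py [executable] [address of the dispatcher function] [method index], prints the address of the method with the given index: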

import angr

def main(argv):
    executable = argv[1]
    dispatcher = int(argv[2], 0)
    method_index = int(argv[3], 0)

    # Load the executable
    b = angr.Project(executable, load_options={'auto_load_libs': False})

    # Prepare a call to the dispatcher function, with the method index as its argument
    state = b.factory.call_state(dispatcher, method_index)

    # Isn't there an easier way to make a closure in Python?!
    class CallAddr:
        value = None

        def on_exit(self, state):
            # When the code performs a call, we've found the method that corresponds to the index
            if state.inspect.exit_jumpkind == 'Ijk_Call':
                # Resolve the address of the exit target, that's our method
                self.value = state.se.any_int(state.inspect.exit_target)

    method_addr = CallAddr()

    # Install breakpoint to analyze "exits" (i.e. jumps)
    state.inspect.b('exit', action=method_addr.on_exit)

    # Step through the dispatcher function
    p = b.factory.path(state)
    p.step()

    # Keep running until either a conditional or a call
    while len(p.successors) == 1 and method_addr.value is None:
        p = p.successors[0]
        p.step()

    # No call was performed, method not found
    if method_addr.value is None:
        return 1

    print(hex(method_addr.value))
    return 0

if __name__ == "__main__":
    import sys
    sys.exit(main(sys.argv))

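Considering how powerful angr is (and how little of it I'm using), this feels like killing an ant with a nuke, but it works.

Daniel Pistelli's article "Qt Internals & Reversing" describes how Qt works under the hood and how to reconstruct slots and methods from the (static) metadata. It may be a bit dated as far as Qt5 is concerned, but it should be a good starting point.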
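To get from single addresses to the symbol file mentioned in the question, the per-index lookup has to be repeated for every method name. A minimal sketch follows, assuming a hypothetical `resolve(index)` callable that wraps the script above and returns an address or `None`. The `build_symbols` and `format_symbols` helpers, the `Class::method` naming, the "address name" line format, and the assumption that indices start at 0 in the order of the name list are all illustrative choices, not something Qt or angr prescribes.

```python
from typing import Callable, Dict, Iterable, Optional


def build_symbols(method_names: Iterable[str],
                  resolve: Callable[[int], Optional[int]],
                  class_name: str) -> Dict[int, str]:
    """Map implementation addresses to 'Class::method' names.

    `resolve(index)` is assumed to return the address the dispatcher
    calls for that method index, or None when no call is found.
    """
    symbols = {}
    for index, name in enumerate(method_names):
        addr = resolve(index)
        if addr is not None:  # skip indices for which no call was found
            symbols[addr] = "%s::%s" % (class_name, name)
    return symbols


def format_symbols(symbols: Dict[int, str]) -> str:
    """Render one 'address name' line per symbol, sorted by address."""
    return "".join("%08x %s\n" % (addr, name)
                   for addr, name in sorted(symbols.items()))


# Stand-in resolver with made-up addresses; index 2 has no known target.
fake_addresses = {0: 0x401000, 1: 0x401020}
symbols = build_symbols(["foo", "bar", "baz"], fake_addresses.get, "Frob")
listing = format_symbols(symbols)
```

Keeping the resolver behind a plain callable means the angr-based lookup can be swapped for any other emulator without touching the symbol-writing code.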
