Why does the darwin kernel crash when calling a function via lldb while debugging a process?

reverse-engineering debugging operating-systems kernel-mode callstack database
2021-06-18 08:29:51

I'd like to debug a hanging ruby process, but my kernel crashes every time I perform the following steps:

  1. I'm re-running the hanging ruby script (vagrant):

    $ vagrant destroy -f && VAGRANT_LOG=info vagrant up
    ...
    INFO ssh: Setting SSH_AUTH_SOCK remotely: /tmp/ssh-FP9viQfD9J/agent.1425
    (process hanging)
    load: 2.48  cmd: ruby 1384 waiting 2.90u 0.62s
    
  2. I'm attaching lldb to the running process (in a different terminal):

    $ lldb -p $(pgrep -fn ruby)
    

    A popup appears: Developer Tools Access needs to take control of another process for debugging to continue. Enter your password. Continue.

    (lldb) process attach --pid 1442
    Process 1442 stopped
    Executable module set to "/opt/vagrant/bin/../embedded/bin/ruby".
    Architecture set to: x86_64-apple-macosx.
    
    (lldb) thread list
    Process 1442 stopped
    * thread #1: tid = 0x2427, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
      thread #2: tid = 0x245d, 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
      thread #3: tid = 0x2a8d, 0x00007fff84f4094a libsystem_kernel.dylib`poll + 10
      thread #4: tid = 0x2a9a, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10
    
  3. I'm calling rb_backtrace() on thread #1:

    (lldb) call (void)rb_backtrace()
    

    As expected, it prints the ruby backtrace (in the foreground of the process):

    from /opt/vagrant/embedded/gems/gems/vagrant-1.7.2/plugins/communicators/ssh/communicator.rb:566:in `block in shell_execute'
    from /opt/vagrant/embedded/gems/gems/vagrant-1.7.2/plugins/communicators/ssh/communicator.rb:566:in `loop'
    
  4. Since ruby uses several threads, I want to do the same for thread #2 (this fails):

    (lldb) thread select 2
    * thread #2: tid = 0x245d, 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
        frame #0: 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
    libsystem_kernel.dylib`__select + 10:
    -> 0x7fff84f3f9aa:  jae    0x7fff84f3f9b4            ; __select + 20
       0x7fff84f3f9ac:  movq   %rax, %rdi
       0x7fff84f3f9af:  jmp    0x7fff84f3c19a            ; cerror
       0x7fff84f3f9b4:  retq   
    
    (lldb) thread list
    Process 1442 stopped
      thread #1: tid = 0x2427, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    * thread #2: tid = 0x245d, 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
      thread #3: tid = 0x2a8d, 0x00007fff84f4094a libsystem_kernel.dylib`poll + 10
      thread #4: tid = 0x2a9a, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10
    
    (lldb) call (void)rb_backtrace()
    

    Kernel panic screen: You need to restart your computer.

  5. At this point the kernel has crashed, and I found the following backtrace in the logs:

    panic(cpu 2 caller 0xffffff80022dcc1d): Kernel trap at 0xffffff80025f69e4, type 14=page fault, registers:
    Fault CR2: 0x0000000001f6d695, Error code: 0x0000000000000000, Fault CPU: 0x2
    Backtrace (CPU 2), Frame : Return Address
    0xffffff81f0ca3aa0 : 0xffffff8002223139 
    0xffffff81f0ca3b20 : 0xffffff80022dcc1d 
    ...
    BSD process name corresponding to current thread: ruby
    
    Mac OS version:
    13F1077
    
    Kernel version:
    Darwin Kernel Version 13.4.0: Wed Mar 18 16:20:14 PDT 2015; root:xnu-2422.115.14~1/RELEASE_X86_64
    last loaded kext at 244023643290: com.apple.filesystems.msdosfs 1.9 (addr 0xffffff7f8308c000, size 65536)
    loaded kexts:
    foo.tap 1.0
    foo.tun 1.0
    org.virtualbox.kext.VBoxNetAdp  4.3.12
    org.virtualbox.kext.VBoxNetFlt  4.3.12
    org.virtualbox.kext.VBoxUSB 4.3.12
    org.virtualbox.kext.VBoxDrv 4.3.12
    com.apple.filesystems.msdosfs   1.9
    ...
    

This has happened twice.
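
For what it's worth, the trap line in the panic log can be decoded using the standard x86 page-fault error-code bits. The following is only a minimal sketch in Python: the helper name `decode_page_fault` and the table `PF_FLAGS` are made up for illustration, and the two values are copied from the log above. It suggests, but does not prove, that the kernel tried to read an unmapped low address while running in kernel mode.

```python
# Decode the x86 page-fault error code reported in the panic log.
# Each entry: (bit mask, meaning if the bit is set, meaning if it is clear).
# The names PF_FLAGS and decode_page_fault are hypothetical helpers.
PF_FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor (kernel) mode"),
    (0x08, "reserved bit set in a paging entry", None),
    (0x10, "instruction fetch", None),
]


def decode_page_fault(error_code):
    """Return readable descriptions for the low five bits of the error code."""
    descriptions = []
    for mask, if_set, if_clear in PF_FLAGS:
        text = if_set if error_code & mask else if_clear
        if text is not None:
            descriptions.append(text)
    return descriptions


fault_address = 0x0000000001F6D695  # "Fault CR2" value from the log
error_code = 0x0000000000000000     # "Error code" value from the log

print(hex(fault_address), decode_page_fault(error_code))
```

The fault address is far below the 0xffffff80... range of the kernel return addresses in the same log, which is why I suspect a bad pointer rather than a fault inside kernel data.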

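What is the cause of the kernel crash? Is it an lldb/kernel bug, or is it expected behavior because I did something wrong? If the latter, what is the correct, safe way to call rb_backtrace() on a different thread (without crashing the kernel)?

To clarify, the ruby script (vagrant) is run without sudo, and so is lldb.

I'm using lldb-320.4.160.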

0 answers