I want to debug a hanging Ruby process, but my kernel panics every time I perform the following steps:
I re-run the Ruby script that hangs (vagrant):
    $ vagrant destroy -f && VAGRANT_LOG=info vagrant up
    ...
    INFO ssh: Setting SSH_AUTH_SOCK remotely: /tmp/ssh-FP9viQfD9J/agent.1425
    (process hanging)
    load: 2.48 cmd: ruby 1384 waiting 2.90u 0.62s
I attach lldb to the running process (from a different terminal):
    $ lldb -p $(pgrep -fn ruby)
Popup: "Developer Tools Access needs to take control of another process for debugging to continue. Type your password to allow this." Continue.
    (lldb) process attach --pid 1442
    Process 1442 stopped
    Executable module set to "/opt/vagrant/bin/../embedded/bin/ruby".
    Architecture set to: x86_64-apple-macosx.
    (lldb) thread list
    Process 1442 stopped
    * thread #1: tid = 0x2427, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
      thread #2: tid = 0x245d, 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
      thread #3: tid = 0x2a8d, 0x00007fff84f4094a libsystem_kernel.dylib`poll + 10
      thread #4: tid = 0x2a9a, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10
I call rb_backtrace() on thread #1:
    (lldb) call (void)rb_backtrace()
As expected, this prints the Ruby backtrace (in the terminal where the process runs in the foreground):
    from /opt/vagrant/embedded/gems/gems/vagrant-1.7.2/plugins/communicators/ssh/communicator.rb:566:in `block in shell_execute'
    from /opt/vagrant/embedded/gems/gems/vagrant-1.7.2/plugins/communicators/ssh/communicator.rb:566:in `loop'
Since Ruby uses several threads, I want to do the same for thread #2 (this is the step that fails):
    (lldb) thread select 2
    * thread #2: tid = 0x245d, 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
        frame #0: 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
    libsystem_kernel.dylib`__select + 10:
    -> 0x7fff84f3f9aa: jae 0x7fff84f3f9b4 ; __select + 20
       0x7fff84f3f9ac: movq %rax, %rdi
       0x7fff84f3f9af: jmp 0x7fff84f3c19a ; cerror
       0x7fff84f3f9b4: retq
    (lldb) thread list
    Process 1442 stopped
      thread #1: tid = 0x2427, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    * thread #2: tid = 0x245d, 0x00007fff84f3f9aa libsystem_kernel.dylib`__select + 10
      thread #3: tid = 0x2a8d, 0x00007fff84f4094a libsystem_kernel.dylib`poll + 10
      thread #4: tid = 0x2a9a, 0x00007fff84f3f716 libsystem_kernel.dylib`__psynch_cvwait + 10
    (lldb) call (void)rb_backtrace()
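Kernel panic screen: "You need to restart your computer."
The kernel panics at this point, and I found the following backtrace in the logs: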
    panic(cpu 2 caller 0xffffff80022dcc1d): Kernel trap at 0xffffff80025f69e4, type 14=page fault, registers:
    Fault CR2: 0x0000000001f6d695, Error code: 0x0000000000000000, Fault CPU: 0x2
    Backtrace (CPU 2), Frame : Return Address
    0xffffff81f0ca3aa0 : 0xffffff8002223139
    0xffffff81f0ca3b20 : 0xffffff80022dcc1d
    ...
    BSD process name corresponding to current thread: ruby
    Mac OS version: 13F1077
    Kernel version: Darwin Kernel Version 13.4.0: Wed Mar 18 16:20:14 PDT 2015; root:xnu-2422.115.14~1/RELEASE_X86_64
    last loaded kext at 244023643290: com.apple.filesystems.msdosfs 1.9 (addr 0xffffff7f8308c000, size 65536)
    loaded kexts:
    foo.tap 1.0
    foo.tun 1.0
    org.virtualbox.kext.VBoxNetAdp 4.3.12
    org.virtualbox.kext.VBoxNetFlt 4.3.12
    org.virtualbox.kext.VBoxUSB 4.3.12
    org.virtualbox.kext.VBoxDrv 4.3.12
    com.apple.filesystems.msdosfs 1.9
    ...
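This has happened twice.
What causes the kernel panic? Is this an lldb/kernel bug, or is it expected behavior because I am doing something wrong? If the latter, what is the correct, safe way to call rb_backtrace() on a different thread without crashing the kernel?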
To clarify, the Ruby script (vagrant) is run without sudo, and so is lldb.
I am using lldb-320.4.160.