Re: uprobe speedups?


Yonghong Song
 

Frederik,

The prohibitive uprobe overhead you are seeing (even for sampling)
has been discussed on this list before. The idea is to adopt a
mechanism similar to dyninst or kerninst. We (me, Brenden, and maybe
others) are starting to explore this space.

Yonghong

On Mon, Jul 17, 2017 at 4:37 PM, Frederik Deweerdt via iovisor-dev
<iovisor-dev@...> wrote:
Hello,

I'm trying to use uprobes to keep track of how long it takes to
execute a given function in a shared library used by tens of
thousands of threads on a 32-core machine. Unfortunately, I'm seeing
a 3x slowdown when setting the uprobes (even ones that only do a
`return 0;` in the body).

Looking at a perf record, it looks like lock contention is the
culprit: queued_spin_lock_slowpath accounts for more than 20% of the
workload.

I'm wondering if there's a way around that. For my use case, sampling
would be an option, but I haven't found a way to do so without
actually entering the probe (which has a prohibitive cost by itself).

Any thoughts?
Frederik
