Re: Unexpectedly high latencies are showed by self-written script (bpf_ktime_get_ns issue?)


Yonghong Song
 

On Mon, Nov 26, 2018 at 4:29 AM Aleksei Zakharov <zakharov.a.g@...> wrote:

Hi, everyone!
I've written a small script using bcc python lib for qemu block io latency tracing.
I attach two uprobes to block_acct_start and block_acct_done.

On some servers i can see unexpectedly high latencies:
~# ./qemuioslower 40000
Tracing for qemu I/O... Ctrl-C to end
TIME LATENCY SIZE(KB) IO TYPE
0.0 18446744073709551 8192 write
9.7294584 18446744073709551 0 flush
24.815983 18446744073709551 4096 write
25.422378 18446744073709551 0 flush

For me it looks like bpf_ktime_get_ns in stop() function returned time that is less, than bpf_ktime_get_ns() in start(), but why does this happen?
May someone help me with understanding this behavior?
Maybe you want to check with bcc issue
https://github.com/iovisor/bcc/issues/728.
Basically there is a kernel bug related to bpf_ktime_get_ns() and it
is fixed in 4.9. Not sure about your kernel version.


BPF code:
#include <uapi/linux/ptrace.h>
BPF_HASH(req, u64);
enum BlockAcctType {
BLOCK_ACCT_READ,
BLOCK_ACCT_WRITE,
BLOCK_ACCT_FLUSH,
BLOCK_MAX_IOTYPE,
};
struct data_t {
u64 ts;
u64 lat;
u64 bytes;
enum BlockAcctType type;
};
typedef struct BlockAcctCookie {
int64_t bytes;
int64_t start_time_ns;
enum BlockAcctType type;
} BlockAcctCookie;
BPF_PERF_OUTPUT(events);

void start(struct pt_regs *ctx) {
u32 pid = bpf_get_current_pid_tgid();
if (FILTER_PID)
return;
u64 key = 0;
bpf_probe_read(&key, sizeof(key), (void *)PT_REGS_PARM1(ctx));
u64 ts = bpf_ktime_get_ns();
req.update(&key, &ts);
}

void stop(struct pt_regs *ctx,void *first, struct BlockAcctCookie *cookie) {
struct data_t data = {};
u64 key = 0;
bpf_probe_read(&key, sizeof(key), (void *)PT_REGS_PARM1(ctx));
data.ts = bpf_ktime_get_ns();
data.type = cookie->type;
data.bytes = cookie->bytes;
u64 *s_time = req.lookup(&key);
if (s_time != 0) {
data.lat = (data.ts - *s_time);
data.lat /= 1000;
if (data.lat > MIN_US) {
events.perf_submit(ctx, &data, sizeof(data));
}
req.delete(&key);
}
}

Kernel: Linux hostname 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
bcc packages versions:
bcc-tools 0.7.0-1
libbcc 0.7.0-1
libcc1-0:amd64 5.4.0-6ubuntu1~16.04.10
python-bcc 0.7.0-1


--
Regards,
Aleksei Zakharov



Join iovisor-dev@lists.iovisor.org to automatically receive all group messages.