Hi,
I'm trying to read perf counters using BPF. However, adding BPF_PERF_ARRAY reports an error:
bpf: Invalid argument
unrecognized bpf_ld_imm64 inns
Is there an example/sample to read perf counters that I can follow? The code below is what I'm trying to execute.
Thanks, Riya
# load BPF program
bpf_text = """
#include <uapi/linux/ptrace.h>

BPF_PERF_ARRAY(my_map, 32);

int start_counting(struct pt_regs *ctx) {
    if (!PT_REGS_PARM1(ctx))
        return 0;
    u64 count;
    u32 key = bpf_get_smp_processor_id();
    count = bpf_perf_event_read(&my_map, key);
    bpf_trace_printk("CPU-%d %llu", key, count);
    return 0;
}
"""
|
So I fixed the error above by using "count = my_map.perf_read(key);" as opposed to "count = bpf_perf_event_read(&my_map, key);". However, how do I selectively enable counters (e.g. instructions, cache misses, etc.)?
Thanks, Riya
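
For reference, the change described in this message, applied to the original program, might look like the sketch below. Only the BPF C text is shown as a Python string; loading it requires bcc and root privileges and is left out here. Everything except the replaced read call is taken from the original post.

```python
# Sketch of the original program with the one-line fix from this message:
# the read goes through the table method my_map.perf_read(key) instead of
# calling the helper directly. Assumption: nothing else needs to change.
bpf_text = """
#include <uapi/linux/ptrace.h>

BPF_PERF_ARRAY(my_map, 32);

int start_counting(struct pt_regs *ctx) {
    if (!PT_REGS_PARM1(ctx))
        return 0;
    u32 key = bpf_get_smp_processor_id();
    u64 count = my_map.perf_read(key);
    bpf_trace_printk("CPU-%d %llu", key, count);
    return 0;
}
"""
```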
|
Brenden Blanco <bblanco@...>
This needs support in bcc. I had a patch lying around that I never finished; you can find the partial support here: https://github.com/iovisor/bcc/tree/perf-counter
It shouldn't be too hard to finalize that, let me see what I can do.
_______________________________________________
iovisor-dev mailing list
iovisor-dev@...
https://lists.iovisor.org/mailman/listinfo/iovisor-dev
|
Thanks Brenden!
I will try with your changes. Meanwhile, please let me know if you add the missing functionality.
On Mon, Jul 25, 2016 at 8:14 PM, Brenden Blanco <bblanco@...> wrote:
> This needs support in bcc.
> I had a patch laying around that I never finished, you can find the partial support here: https://github.com/iovisor/bcc/tree/perf-counter
> It shouldn't be too hard to finalize that, let me see what I can do.
|
From your patches I see that perf support is enabled per-cpu. Could this be extended to enabling all or a group of perf counters on all CPU cores similar to what perf_event_open provides (with args -1)?
|
I'm testing perf counters on an 8-core machine.
Since BPF_PERF_ARRAY.perf_read(cpu) reads from the local CPU, I'm aggregating counters across all CPUs by doing:
BPF_PERF_ARRAY(counter, 32);
for (key = 0; key < 8; key++) counter.perf_read(key);
However, this reports an error:
bpf: Invalid argument
back-edge from insn 69 to 17
If I loop over only 4 entries, it works. The code below works:
for (key = 0; key < 4; key++) counter.perf_read(key);
What could be wrong here?
|
Brenden Blanco <bblanco@...>
|
Thanks Brenden!
I'm working with your branch for now. Additionally, I'm unable to create software events (see exception below). Just wanted to bring this to your attention.
Traceback (most recent call last):
  File "./test_bpf.py", line 176, in <module>
    sw_clock.open_perf_event(1, 0)
  File "/usr/lib/python2.7/dist-packages/bcc/table.py", line 410, in open_perf_event
    fd = self._open_perf_event(typ, config, i)
  File "/usr/lib/python2.7/dist-packages/bcc/table.py", line 416, in _open_perf_event
    self[self.Key(cpu)] = self.Leaf(fd)
  File "/usr/lib/python2.7/dist-packages/bcc/table.py", line 320, in __setitem__
    super(ArrayBase, self).__setitem__(key, leaf)
  File "/usr/lib/python2.7/dist-packages/bcc/table.py", line 169, in __setitem__
    raise Exception("Could not update table")
Exception: Could not update table
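
The traceback shows open_perf_event being called with (1, 0), and its parameters are named typ and config. Assuming those map directly onto the type and config fields of perf_event_attr, the numbers can be given names. The sketch below copies a few constants from include/uapi/linux/perf_event.h; the EVENTS table and the describe() helper are hypothetical and exist only for illustration.

```python
# Constants copied from include/uapi/linux/perf_event.h.
PERF_TYPE_HARDWARE = 0
PERF_TYPE_SOFTWARE = 1

PERF_COUNT_HW_CPU_CYCLES = 0
PERF_COUNT_HW_INSTRUCTIONS = 1
PERF_COUNT_HW_CACHE_REFERENCES = 2
PERF_COUNT_HW_CACHE_MISSES = 3
PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4
PERF_COUNT_HW_BRANCH_MISSES = 5

PERF_COUNT_SW_CPU_CLOCK = 0
PERF_COUNT_SW_TASK_CLOCK = 1

# Hypothetical lookup table: readable name -> (type, config) pair.
# Assumption: open_perf_event(typ, config) takes exactly this pair.
EVENTS = {
    "cycles": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES),
    "instructions": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS),
    "cache-references": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES),
    "cache-misses": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES),
    "branches": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_INSTRUCTIONS),
    "branch-misses": (PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES),
    "cpu-clock": (PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK),
    "task-clock": (PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK),
}


def describe(typ, config):
    """Return the readable name for a (type, config) pair, or 'unknown'."""
    for name, pair in EVENTS.items():
        if pair == (typ, config):
            return name
    return "unknown"
```

Under that assumption, the failing call open_perf_event(1, 0) asks for the software cpu-clock event, while a cache-miss counter would be requested with (0, 3).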
On Fri, Jul 29, 2016 at 1:34 PM, Brenden Blanco <bblanco@...> wrote:
> On Fri, Jul 29, 2016 at 10:21 AM, riya khanna <riyakhanna1983@...> wrote:
>> I'm testing perf counters on a 8-core machine.
>> since BPF_PERF_ARRAY.perf_read(cpu) reads from local CPU, I'm aggregating counters across all cpus by doing:
>> BPF_PERF_ARRAY(counter, 32);
>> for (key = 0; key < 8; key++) counter.perf_read(key);
>
> I think it would make more sense to only read the counter on the cpu where the event is taking place. So:
> u64 key = cycles.perf_read(bpf_get_smp_processor_id());
> And then aggregate counters in userspace.
> I have spent some time over the past couple days cleaning up the code in that private branch, but have been distracted a bit so haven't finalized it. Hopefully a PR will come soon.
>
>> However, this reports error:
>> bpf: Invalid argument
>> back-edge from insn 69 to 17
>> If I loop from 0-4, it works. The code below works:
>> for (key = 0; key < 4; key++) counter.perf_read(key);
>> What could be wrong here?
>
> The kernel verifier won't allow loops (i.e. back edges), and depending on the loop unroll optimization decision made by llvm, this short loop may have been automatically unrolled. Still, the solution should be to remove the loop and just read the local cpu's perf counter as mentioned above.
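
The suggestion above (read only the local CPU in the BPF program, then aggregate in userspace) could be sketched roughly as follows. It assumes the program emits lines in the "CPU-%d %llu" format used in the first message; the function names latest_per_cpu and total are hypothetical, and how the lines are collected from the trace pipe is not shown.

```python
import re

# Matches the "CPU-%d %llu" format printed by the program in this thread.
LINE_RE = re.compile(r"CPU-(\d+) (\d+)")


def latest_per_cpu(lines):
    """Keep the most recent counter value reported by each CPU."""
    latest = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            latest[int(match.group(1))] = int(match.group(2))
    return latest


def total(latest):
    """Aggregate across CPUs by summing the latest per-CPU readings."""
    return sum(latest.values())


readings = latest_per_cpu(["CPU-0 100", "CPU-1 250", "CPU-0 180"])
print(readings, total(readings))
```

Because each probe invocation reads only one map entry, the BPF side needs no loop, which sidesteps the verifier's back-edge check entirely.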
|
Hi Brenden,
I saw test_perf_event.py in your branch. It creates and enables perf counters once at startup. Is it also possible to enable/disable/reset counters on the fly? Perhaps we need a kernel patch for this?
Thanks, Riya
|
Brenden Blanco <bblanco@...>
|
On Tue, Aug 9, 2016 at 12:05 PM, Brenden Blanco <bblanco@...> wrote:
> On Tue, Aug 9, 2016 at 8:54 AM, riya khanna <riyakhanna1983@...> wrote:
>> Hi Brenden,
>> I saw test_perf_event.py in your branch. It creates and enables perf counters once at startup. Is it also possible to enable/disable/reset counters on the fly? Perhaps we need a kernel patch for this?
>
> It doesn't "create" counters, it just attaches to the already available counters provided by the hardware or OS. Yes, it enables monitoring when attached. Any type of "reset" infrastructure would adversely impact other users of those same counters (perf). I consider it the job of userspace or the program to compute deltas or other types of history.

Well, there are limited counters. How do I multiplex from userspace on the fly (e.g. monitoring one set of events first, followed by a different set)? Also, is it possible to handle counter overflow?

>> Thanks, Riya
|
Brenden Blanco <bblanco@...>
|
On Tue, Aug 9, 2016 at 12:24 PM, Brenden Blanco <bblanco@...> wrote:
> On Tue, Aug 9, 2016 at 9:16 AM, riya khanna <riyakhanna1983@...> wrote:
>> Well, there are limited counters. How to multiplex from userspace on the fly (e.g. monitoring a set of events first, followed by a different set)?
>
> I would just create a different BPF_PERF_ARRAY for each different one.

Yes, but if you create more than the available number of hardware counters (i.e. try to monitor more events concurrently than the hardware allows), the counter value is reported as '0'. I verified this with a test program (perf.c, attached) that uses the perf_event_open() syscall. Depending on the number of counters available on your platform, change NUM_REQ_HW_CNTRS to verify the behavior.

A way to enable/disable events at runtime would help userspace multiplex over the available hardware counters and monitor more events (similar to the perf stat tool).

>> Also, is it possible to handle counter overflow?
>
> What does it mean to "handle"? If computing deltas, for instance, the subtraction will just underflow and wrap around to the correct value, assuming the values are both unsigned.
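
The wrap-around point can be illustrated with a small sketch. Python integers are unbounded, so the 64-bit unsigned subtraction that C performs implicitly has to be emulated with a mask; counter_delta is a hypothetical helper name, and the 64-bit width is an assumption based on the u64 values read in the BPF program.

```python
# Assumption: counters are 64-bit unsigned values, as with u64 in the BPF code.
U64_MASK = (1 << 64) - 1


def counter_delta(prev, cur):
    """Difference between two unsigned 64-bit readings, modulo 2**64.

    If the counter wrapped between the two readings, cur < prev and the
    plain subtraction is negative; masking folds it back to the true delta.
    """
    return (cur - prev) & U64_MASK


# Normal case: no wrap between readings.
print(counter_delta(100, 250))
# Wrapped case: prev was 10 ticks below 2**64, cur is 5 past the wrap.
print(counter_delta(U64_MASK - 9, 5))
```

This only gives the right answer if the counter wraps at most once between two consecutive readings.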
|
Brenden Blanco <bblanco@...>
|
How do I do that at runtime conditionally (e.g. upon execution of particular kernel/user code)?
Maybe this example will help present my case:
I'm trying to monitor different sets of events (e.g. set 1 consists of CPU cycles, cache misses, and cache references, and set 2 consists of instruction count, branch references, and branch misses) at different times (e.g. set 1 should be enabled upon execution of a kernel function, i.e. a kprobe, and set 2 should be enabled for a particular user function, i.e. a uprobe).
I can call table.open_perf_event() for each event I'm trying to monitor. This will call ioctl(PERF_EVENT_IOC_ENABLE) for each event. However, if the number of events exceeds the number of available counters, there will be a problem, as demonstrated by my perf.c example (attached to my last email).
Let me know if I'm missing something.
On Wed, Aug 10, 2016 at 7:36 PM, Brenden Blanco <bblanco@...> wrote:
> On Tue, Aug 9, 2016 at 9:59 AM, riya khanna <riyakhanna1983@...> wrote:
>> A way to enable/disable events at runtime will help userspace multiplex over available hardware counters and monitor more more events (similar to perf stat tool)
>
> If you just call table.open_perf_event(NEW_TYPE) multiple times, it should bind a new counter to the same table entry, allowing you to change the monitored event over time. Have you tried that?
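
If events are rebound over time as suggested, userspace has to keep track of which set was active in each time slice and extrapolate the counts, roughly the way perf stat scales a multiplexed counter by time_enabled / time_running. The sketch below shows only that bookkeeping; the set names, event labels, and function names are hypothetical, and the actual rebinding through open_perf_event is not shown.

```python
from itertools import cycle

# Hypothetical labels for the two event sets described in this thread.
EVENT_SETS = {
    "set1": ["cpu-cycles", "cache-misses", "cache-references"],
    "set2": ["instructions", "branch-instructions", "branch-misses"],
}


def rotation(names, slices):
    """Return which set to monitor in each time slice, round-robin."""
    order = cycle(names)
    return [next(order) for _ in range(slices)]


def scale_count(raw, time_enabled, time_running):
    """Extrapolate a count from a counter that ran only part of the time.

    Mirrors the time_enabled / time_running scaling used for multiplexed
    perf events; returns 0 if the counter never ran.
    """
    if time_running == 0:
        return 0
    return raw * time_enabled // time_running


print(rotation(sorted(EVENT_SETS), 5))
print(scale_count(1000, 200, 100))
```

The scaled value is an estimate: it assumes the event rate during the slices when the counter was not running matches the rate when it was.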
|