Failing bcc tests on xenial with 4.13 kernel


Mike Percy <mpercy@...>
 

Hi bcc devs,

I am trying to get the bcc tests to pass on the master branch on top of Ubuntu 16.04 running the HWE kernel, which is version 4.13.0-36-generic. I'm compiling bcc using LLVM 6.0 (yes, it's now available) from the llvm.org APT repo. Has anyone gotten the bcc tests to pass on this kernel? If not, is there a recommended platform? Is it possible that the tests could be currently broken on master? (I'm not sure if bcc is currently set up to use a CI system.)

The following tests are failing for me:

* test_libbcc: test listing all USDT probes in Ruby/MRI
* py_test_brb2: TestBPFSocket
* py_test_tools_smoke: test_bpflist
* py_test_tools_smoke: test_ucalls
* py_test_tools_smoke: test_wakeuptime

Since this is kernel version 4.13, I would expect wakeuptime to work. I am particularly interested in that and related tools for some performance analysis I want to do.

Below, I've added some relevant details about the kernel I'm running and the output of the specific test failures.

Thanks in advance for any advice or pointers!

Best regards,
Mike


Some relevant info on my kernel:

$ uname -a
Linux mpercy-T460p 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

It looks like all the required options are enabled:

$ egrep 'BPF|NET_SCH_SFQ|NET_ACT|CONFIG_DUMMY|CONFIG_VXLAN' /boot/config-4.13.0-36-generic | sort
CONFIG_BPF_EVENTS=y
CONFIG_BPF_JIT=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF=y
CONFIG_DUMMY=m
CONFIG_HAVE_EBPF_JIT=y
CONFIG_NET_ACT_BPF=m
CONFIG_NET_ACT_GACT=m
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_SCH_SFQ=m
CONFIG_VXLAN=m
# n.b.: options not appearing in INSTALLING.md were elided


Test failure output:

3/35 Testing: test_libbcc
3/35 Test: test_libbcc
Command: "/home/mpercy/src/bcc/build/tests/wrapper.sh" "c_test_all" "sudo" "/home/mpercy/src/bcc/build/tests/cc/test_libbcc"
Directory: /home/mpercy/src/bcc/build/tests/cc
"test_libbcc" start time: Mar 06 22:11 PST
Output:
----------------------------------------------------------
Parse error:
    4@i%ra+1r
-------^

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_libbcc is a Catch v1.4.0 host application.
Run with -? for options

-------------------------------------------------------------------------------
test listing all USDT probes in Ruby/MRI
  without a running Ruby process
-------------------------------------------------------------------------------
../tests/cc/test_usdt_probes.cc:118
...............................................................................

../tests/cc/test_usdt_probes.cc:127: FAILED:
  REQUIRE( ctx.num_probes() > 10 )
with expansion:
  0 > 10

===============================================================================
test cases:  22 |  21 passed | 1 failed
assertions: 433 | 432 passed | 1 failed

Failed
<end of output>
Test time =   9.37 sec
----------------------------------------------------------
Test Failed.
"test_libbcc" end time: Mar 06 22:11 PST
"test_libbcc" time elapsed: 00:00:09
----------------------------------------------------------

16/35 Testing: py_test_brb2
16/35 Test: py_test_brb2
Command: "/home/mpercy/src/bcc/build/tests/wrapper.sh" "py_brb2_c" "sudo" "/home/mpercy/src/bcc/tests/python/test_brb2.py" "test_brb2.c"
Directory: /home/mpercy/src/bcc/tests/python
"py_test_brb2" start time: Mar 06 22:11 PST
Output:
----------------------------------------------------------
net.ipv4.ip_forward = 1
PING 200.1.1.1 (200.1.1.1) 56(84) bytes of data.

--- 200.1.1.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1026ms

F
======================================================================
FAIL: test_brb2 (__main__.TestBPFSocket)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mpercy/src/bcc/tests/python/test_brb2.py", line 176, in test_brb2
    self.assertEqual(self.pem_stats[c_uint(0)].value, 12)
AssertionError: 4L != 12

----------------------------------------------------------------------
Ran 1 test in 12.734s

FAILED (failures=1)
Failed
<end of output>
Test time =  12.90 sec
----------------------------------------------------------
Test Failed.
"py_test_brb2" end time: Mar 06 22:12 PST
"py_test_brb2" time elapsed: 00:00:12
----------------------------------------------------------

27/35 Testing: py_test_tools_smoke
27/35 Test: py_test_tools_smoke
Command: "/home/mpercy/src/bcc/build/tests/wrapper.sh" "py_test_tools_smoke" "sudo" "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py"
Directory: /home/mpercy/src/bcc/tests/python
"py_test_tools_smoke" start time: Mar 06 22:12 PST
Output:
----------------------------------------------------------
......Traceback (most recent call last):
  File "../../tools/bpflist.py", line 75, in <module>
    find_bpf_fds(int(pdir))
  File "../../tools/bpflist.py", line 63, in find_bpf_fds
    for fd in os.listdir(root):
OSError: [Errno 2] No such file or directory: '/proc/18787/fd'
F.......................Killed
....................Traceback (most recent call last):
  File "../../tools/stackcount.py", line 370, in <module>
    Tool().run()
  File "../../tools/stackcount.py", line 352, in run
    for addr in kernel_stack:
KeyboardInterrupt
.../bin/sh: 1: ausyscall: not found
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from enumeration type 'enum memcg_stat_item' to different enumeration type 'enum vm_event_item' [-Wenum-conversion]
                count_memcg_events(page->mem_cgroup, idx, 1);
                ~~~~~~~~~~~~~~~~~~                   ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from enumeration type 'enum memcg_stat_item' to different enumeration type 'enum vm_event_item' [-Wenum-conversion]
                count_memcg_events(page->mem_cgroup, idx, 1);
                ~~~~~~~~~~~~~~~~~~                   ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from enumeration type 'enum memcg_stat_item' to different enumeration type 'enum vm_event_item' [-Wenum-conversion]
                count_memcg_events(page->mem_cgroup, idx, 1);
                ~~~~~~~~~~~~~~~~~~                   ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from enumeration type 'enum memcg_stat_item' to different enumeration type 'enum vm_event_item' [-Wenum-conversion]
                count_memcg_events(page->mem_cgroup, idx, 1);
                ~~~~~~~~~~~~~~~~~~                   ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from enumeration type 'enum memcg_stat_item' to different enumeration type 'enum vm_event_item' [-Wenum-conversion]
                count_memcg_events(page->mem_cgroup, idx, 1);
                ~~~~~~~~~~~~~~~~~~                   ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(pn->memcg, idx, val);
        ~~~~~~~~~~~~~~~            ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        __mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from enumeration type 'enum node_stat_item' to different enumeration type 'enum memcg_stat_item' [-Wenum-conversion]
        mod_memcg_state(page->mem_cgroup, idx, val);
        ~~~~~~~~~~~~~~~                   ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from enumeration type 'enum memcg_stat_item' to different enumeration type 'enum vm_event_item' [-Wenum-conversion]
                count_memcg_events(page->mem_cgroup, idx, 1);
                ~~~~~~~~~~~~~~~~~~                   ^~~
5 warnings generated.
....Killed
F.....check dmesg output for possible cause
check dmesg output for possible cause
..Traceback (most recent call last):
  File "../../tools/wakeuptime.py", line 216, in <module>
    print("    %-16s %s" % ("target:", k.target.decode()))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)
F....
======================================================================
FAIL: test_bpflist (__main__.SmokeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 89, in test_bpflist
    self.run_with_duration("bpflist.py")
  File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 32, in run_with_duration
    (timeout, full_command), shell=True))
AssertionError: 0 != 1

======================================================================
FAIL: test_ucalls (__main__.SmokeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 336, in test_ucalls
    timeout=60, kill_timeout=60)
  File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 52, in run_with_int
    or (rc == 137 and kill), "rc was %d" % rc)
AssertionError: rc was 137

======================================================================
FAIL: test_wakeuptime (__main__.SmokeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 371, in test_wakeuptime
    self.run_with_duration("wakeuptime.py 1")
  File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 32, in run_with_duration
    (timeout, full_command), shell=True))
AssertionError: 0 != 1

----------------------------------------------------------------------
Ran 76 tests in 368.071s

FAILED (failures=3)
Failed
<end of output>
Test time = 368.11 sec
----------------------------------------------------------
Test Failed.
"py_test_tools_smoke" end time: Mar 06 22:18 PST
"py_test_tools_smoke" time elapsed: 00:06:08
----------------------------------------------------------


Yonghong Song
 

On Tue, Mar 6, 2018 at 11:12 PM, Mike Percy via iovisor-dev
<iovisor-dev@...> wrote:
Hi bcc devs,

I am trying to get the bcc tests to pass on the master branch on top of
Ubuntu 16.04 running the HWE kernel, which is version 4.13.0-36-generic. I'm
compiling bcc using LLVM 6.0 (yes, it's now available) from the llvm.org APT
repo. Has anyone gotten the bcc tests to pass on this kernel? If not, is
there a recommended platform? Is it possible that the tests could be
currently broken on master? (I'm not sure if bcc is currently set up to use
a CI system.)
Maybe some dependencies are not satisfied. I will put some comments
below and you
may try to install additional packages and run again.


The following tests are failing for me:

* test_libbcc: test listing all USDT probes in Ruby/MRI
* py_test_brb2: TestBPFSocket
* py_test_tools_smoke: test_bpflist
* py_test_tools_smoke: test_ucalls
* py_test_tools_smoke: test_wakeuptime

Since this is kernel version 4.13, I would expect wakeuptime to work. I am
particularly interested in that and related tools for some performance
analysis I want to do.

Below, I've added some relevant details about the kernel I'm running and the
output of the specific test failures.

Thanks in advance for any advice or pointers!

Best regards,
Mike


Some relevant info on my kernel:

$ uname -a
Linux mpercy-T460p 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16
23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

It looks like all the required options are enabled:

$ egrep 'BPF|NET_SCH_SFQ|NET_ACT|CONFIG_DUMMY|CONFIG_VXLAN'
/boot/config-4.13.0-36-generic | sort
CONFIG_BPF_EVENTS=y
CONFIG_BPF_JIT=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF=y
CONFIG_DUMMY=m
CONFIG_HAVE_EBPF_JIT=y
CONFIG_NET_ACT_BPF=m
CONFIG_NET_ACT_GACT=m
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_SCH_SFQ=m
CONFIG_VXLAN=m
# n.b.: options not appearing in INSTALLING.md were elided


Test failure output:

3/35 Testing: test_libbcc
3/35 Test: test_libbcc
Command: "/home/mpercy/src/bcc/build/tests/wrapper.sh" "c_test_all" "sudo"
"/home/mpercy/src/bcc/build/tests/cc/test_libbcc"
Directory: /home/mpercy/src/bcc/build/tests/cc
"test_libbcc" start time: Mar 06 22:11 PST
Output:
----------------------------------------------------------
Parse error:
4@i%ra+1r
-------^

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_libbcc is a Catch v1.4.0 host application.
Run with -? for options

-------------------------------------------------------------------------------
test listing all USDT probes in Ruby/MRI
without a running Ruby process
-------------------------------------------------------------------------------
../tests/cc/test_usdt_probes.cc:118
...............................................................................

../tests/cc/test_usdt_probes.cc:127: FAILED:
REQUIRE( ctx.num_probes() > 10 )
with expansion:
0 > 10
Maybe 'ruby' is not installed?


===============================================================================
test cases: 22 | 21 passed | 1 failed
assertions: 433 | 432 passed | 1 failed

Failed
<end of output>
Test time = 9.37 sec
----------------------------------------------------------
Test Failed.
"test_libbcc" end time: Mar 06 22:11 PST
"test_libbcc" time elapsed: 00:00:09
----------------------------------------------------------

16/35 Testing: py_test_brb2
16/35 Test: py_test_brb2
Command: "/home/mpercy/src/bcc/build/tests/wrapper.sh" "py_brb2_c" "sudo"
"/home/mpercy/src/bcc/tests/python/test_brb2.py" "test_brb2.c"
Directory: /home/mpercy/src/bcc/tests/python
"py_test_brb2" start time: Mar 06 22:11 PST
Output:
----------------------------------------------------------
net.ipv4.ip_forward = 1
PING 200.1.1.1 (200.1.1.1) 56(84) bytes of data.

--- 200.1.1.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1026ms

F
======================================================================
FAIL: test_brb2 (__main__.TestBPFSocket)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/mpercy/src/bcc/tests/python/test_brb2.py", line 176, in
test_brb2
self.assertEqual(self.pem_stats[c_uint(0)].value, 12)
AssertionError: 4L != 12

----------------------------------------------------------------------
Ran 1 test in 12.734s

FAILED (failures=1)
Failed
<end of output>
Test time = 12.90 sec
----------------------------------------------------------
Test Failed.
"py_test_brb2" end time: Mar 06 22:12 PST
"py_test_brb2" time elapsed: 00:00:12
----------------------------------------------------------
Not sure about this one.


27/35 Testing: py_test_tools_smoke
27/35 Test: py_test_tools_smoke
Command: "/home/mpercy/src/bcc/build/tests/wrapper.sh" "py_test_tools_smoke"
"sudo" "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py"
Directory: /home/mpercy/src/bcc/tests/python
"py_test_tools_smoke" start time: Mar 06 22:12 PST
Output:
----------------------------------------------------------
......Traceback (most recent call last):
File "../../tools/bpflist.py", line 75, in <module>
find_bpf_fds(int(pdir))
File "../../tools/bpflist.py", line 63, in find_bpf_fds
for fd in os.listdir(root):
OSError: [Errno 2] No such file or directory: '/proc/18787/fd'
F.......................Killed
....................Traceback (most recent call last):
bpflist.py cannot find one process which just went disappearing after it ID
is recorded. This is a known issue. We probably need to add
try/except block here.


File "../../tools/stackcount.py", line 370, in <module>
Tool().run()
File "../../tools/stackcount.py", line 352, in run
for addr in kernel_stack:
KeyboardInterrupt
.../bin/sh: 1: ausyscall: not found
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from
enumeration type 'enum memcg_stat_item' to different enumeration type 'enum
vm_event_item' [-Wenum-conversion]
count_memcg_events(page->mem_cgroup, idx, 1);
~~~~~~~~~~~~~~~~~~ ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from
enumeration type 'enum memcg_stat_item' to different enumeration type 'enum
vm_event_item' [-Wenum-conversion]
count_memcg_events(page->mem_cgroup, idx, 1);
~~~~~~~~~~~~~~~~~~ ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from
enumeration type 'enum memcg_stat_item' to different enumeration type 'enum
vm_event_item' [-Wenum-conversion]
count_memcg_events(page->mem_cgroup, idx, 1);
~~~~~~~~~~~~~~~~~~ ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from
enumeration type 'enum memcg_stat_item' to different enumeration type 'enum
vm_event_item' [-Wenum-conversion]
count_memcg_events(page->mem_cgroup, idx, 1);
~~~~~~~~~~~~~~~~~~ ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from
enumeration type 'enum memcg_stat_item' to different enumeration type 'enum
vm_event_item' [-Wenum-conversion]
count_memcg_events(page->mem_cgroup, idx, 1);
~~~~~~~~~~~~~~~~~~ ^~~
5 warnings generated.
.In file included from /virtual/main.c:3:
In file included from include/net/sock.h:58:
include/linux/memcontrol.h:580:31: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:593:29: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(pn->memcg, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:605:38: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
__mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:618:36: warning: implicit conversion from
enumeration type 'enum node_stat_item' to different enumeration type 'enum
memcg_stat_item' [-Wenum-conversion]
mod_memcg_state(page->mem_cgroup, idx, val);
~~~~~~~~~~~~~~~ ^~~
include/linux/memcontrol.h:639:40: warning: implicit conversion from
enumeration type 'enum memcg_stat_item' to different enumeration type 'enum
vm_event_item' [-Wenum-conversion]
count_memcg_events(page->mem_cgroup, idx, 1);
~~~~~~~~~~~~~~~~~~ ^~~
5 warnings generated.
....Killed
F.....check dmesg output for possible cause
check dmesg output for possible cause
..Traceback (most recent call last):
File "../../tools/wakeuptime.py", line 216, in <module>
print(" %-16s %s" % ("target:", k.target.decode()))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0:
ordinal not in range(128)
Maybe this is a real issue related to python 2/3 compatibility, could
you help take a look?


F....
======================================================================
FAIL: test_bpflist (__main__.SmokeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 89, in
test_bpflist
self.run_with_duration("bpflist.py")
File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 32, in
run_with_duration
(timeout, full_command), shell=True))
AssertionError: 0 != 1

======================================================================
FAIL: test_ucalls (__main__.SmokeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 336, in
test_ucalls
timeout=60, kill_timeout=60)
File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 52, in
run_with_int
or (rc == 137 and kill), "rc was %d" % rc)
AssertionError: rc was 137
timeout handling issue?
43 # timeout returns 124 if the program did not terminate prematurely,
44 # and returns 137 if we used KILL instead of INT. So there are three
45 # sensible scenarios:
46 # 1. The script is allowed to return early, and it did, with a
47 # success return code.
48 # 2. The script timed out and was killed by the SIGINT signal.
49 # 3. The script timed out and was killed by the SIGKILL
signal, and
50 # this was what we asked for using kill=True.
51 self.assertTrue((rc == 0 and allow_early) or rc == 124
52 or (rc == 137 and kill), "rc was %d" % rc)


======================================================================
FAIL: test_wakeuptime (__main__.SmokeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 371, in
test_wakeuptime
self.run_with_duration("wakeuptime.py 1")
File "/home/mpercy/src/bcc/tests/python/test_tools_smoke.py", line 32, in
run_with_duration
(timeout, full_command), shell=True))
AssertionError: 0 != 1

----------------------------------------------------------------------
Ran 76 tests in 368.071s

FAILED (failures=3)
Failed
<end of output>
Test time = 368.11 sec
----------------------------------------------------------
Test Failed.
"py_test_tools_smoke" end time: Mar 06 22:18 PST
"py_test_tools_smoke" time elapsed: 00:06:08
----------------------------------------------------------


_______________________________________________
iovisor-dev mailing list
iovisor-dev@...
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Mike Percy <mpercy@...>
 

Y Song, thank you very much for your detailed feedback.

Let me focus on the specific wakeuptime.py failure since that seems to be my biggest blocker at the moment:

>   File "../../tools/wakeuptime.py", line 216, in <module>
>     print("    %-16s %s" % ("target:", k.target.decode()))
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0:
> ordinal not in range(128)

Maybe this is a real issue related to python 2/3 compatibility, could
you help take a look?


With Linux 4.13, due to issues with field randomization, you may need two #define directives before the includes:
#define randomized_struct_fields_start  struct {
#define randomized_struct_fields_end    };
#include <linux/sched.h>
int do_trace(void *ctx) {
    struct task_struct *t = (struct task_struct *)bpf_get_current_task();
[...]

And indeed that fixed the wakeuptime tool. For now, this patch seems to do the trick on this kernel (the undefs just get rid of some warnings):

diff --git a/tools/wakeuptime.py b/tools/wakeuptime.py
index cf0ca7d..3f6fd67 100755
--- a/tools/wakeuptime.py
+++ b/tools/wakeuptime.py
@@ -86,6 +86,10 @@ def signal_ignore(signal, frame):

 # define BPF program
 bpf_text = """
+#undef randomized_struct_fields_start
+#undef randomized_struct_fields_end
+#define randomized_struct_fields_start  struct {
+#define randomized_struct_fields_end    };
 #include <uapi/linux/ptrace.h>
 #include <linux/sched.h>

It appears that same workaround is also needed for many other tools in the repo to work correctly on this kernel.

I was looking around at the kernel 4.13 release notes and I found the RANDSTRUCT patch <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=313dd1b629219db50cad532dba6a3b3b22ffe622> but I'm not sure that is what I'm hitting here since I didn't find a reference to it in my kernel config. Any ideas about the root cause for this workaround and if this is a permanent issue with kernels after 4.13? If so, then I suppose we should find a more maintainable way of injecting that workaround automatically in the BPF class, or find a more permanent workaround for bcc.

Thoughts?

Thanks again, 
Mike


Yonghong Song
 

Your workaround looks good. Basically, you can put the workaround in
for 4.13 for any tools which references
task_struct construct.

The bcc issue https://github.com/iovisor/bcc/issues/1492 has more details.
It is an issue for 4.13.
it is an issue for 4.14 original release, but the patch is ported to
stable release, so later 4.14 update should contain the fix.
4.15 and later contains the fix.

On Wed, Mar 7, 2018 at 10:34 PM, Mike Percy <mpercy@...> wrote:
Y Song, thank you very much for your detailed feedback.

Let me focus on the specific wakeuptime.py failure since that seems to be my
biggest blocker at the moment:

File "../../tools/wakeuptime.py", line 216, in <module>
print(" %-16s %s" % ("target:", k.target.decode()))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0:
ordinal not in range(128)
Maybe this is a real issue related to python 2/3 compatibility, could
you help take a look?

It looks like this is a known issue. According to
https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md#7-bpf_get_current_task
:

With Linux 4.13, due to issues with field randomization, you may need two
#define directives before the includes:
#define randomized_struct_fields_start struct {
#define randomized_struct_fields_end };
#include <linux/sched.h>
int do_trace(void *ctx) {
struct task_struct *t = (struct task_struct *)bpf_get_current_task();
[...]

And indeed that fixed the wakeuptime tool. For now, this patch seems to do
the trick on this kernel (the undefs just get rid of some warnings):

diff --git a/tools/wakeuptime.py b/tools/wakeuptime.py
index cf0ca7d..3f6fd67 100755
--- a/tools/wakeuptime.py
+++ b/tools/wakeuptime.py
@@ -86,6 +86,10 @@ def signal_ignore(signal, frame):

# define BPF program
bpf_text = """
+#undef randomized_struct_fields_start
+#undef randomized_struct_fields_end
+#define randomized_struct_fields_start struct {
+#define randomized_struct_fields_end };
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

It appears that same workaround is also needed for many other tools in the
repo to work correctly on this kernel.

I was looking around at the kernel 4.13 release notes and I found the
RANDSTRUCT patch
<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=313dd1b629219db50cad532dba6a3b3b22ffe622>
but I'm not sure that is what I'm hitting here since I didn't find a
reference to it in my kernel config. Any ideas about the root cause for this
workaround and if this is a permanent issue with kernels after 4.13? If so,
then I suppose we should find a more maintainable way of injecting that
workaround automatically in the BPF class, or find a more permanent
workaround for bcc.

Thoughts?

Thanks again,
Mike


Mike Percy <mpercy@...>
 

Great! Thanks for the info!

Mike

On Wed, Mar 7, 2018 at 10:47 PM, Y Song <ys114321@...> wrote:
Your workaround looks good. Basically, you can put the workaround in
for 4.13 for any tools which references
task_struct construct.

The bcc issue https://github.com/iovisor/bcc/issues/1492 has more details.
It is an issue for 4.13.
it is an issue for 4.14 original release, but the patch is ported to
stable release, so later 4.14 update should contain the fix.
4.15 and later contains the fix.

On Wed, Mar 7, 2018 at 10:34 PM, Mike Percy <mpercy@...> wrote:
> Y Song, thank you very much for your detailed feedback.
>
> Let me focus on the specific wakeuptime.py failure since that seems to be my
> biggest blocker at the moment:
>
>> >   File "../../tools/wakeuptime.py", line 216, in <module>
>> >     print("    %-16s %s" % ("target:", k.target.decode()))
>> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0:
>> > ordinal not in range(128)
>>
>> Maybe this is a real issue related to python 2/3 compatibility, could
>> you help take a look?
>
>
> It looks like this is a known issue. According to
> https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md#7-bpf_get_current_task
> :
>
>> With Linux 4.13, due to issues with field randomization, you may need two
>> #define directives before the includes:
>> #define randomized_struct_fields_start  struct {
>> #define randomized_struct_fields_end    };
>> #include <linux/sched.h>
>> int do_trace(void *ctx) {
>>     struct task_struct *t = (struct task_struct *)bpf_get_current_task();
>> [...]
>
>
> And indeed that fixed the wakeuptime tool. For now, this patch seems to do
> the trick on this kernel (the undefs just get rid of some warnings):
>
> diff --git a/tools/wakeuptime.py b/tools/wakeuptime.py
> index cf0ca7d..3f6fd67 100755
> --- a/tools/wakeuptime.py
> +++ b/tools/wakeuptime.py
> @@ -86,6 +86,10 @@ def signal_ignore(signal, frame):
>
>  # define BPF program
>  bpf_text = """
> +#undef randomized_struct_fields_start
> +#undef randomized_struct_fields_end
> +#define randomized_struct_fields_start  struct {
> +#define randomized_struct_fields_end    };
>  #include <uapi/linux/ptrace.h>
>  #include <linux/sched.h>
>
> It appears that same workaround is also needed for many other tools in the
> repo to work correctly on this kernel.
>
> I was looking around at the kernel 4.13 release notes and I found the
> RANDSTRUCT patch
> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=313dd1b629219db50cad532dba6a3b3b22ffe622>
> but I'm not sure that is what I'm hitting here since I didn't find a
> reference to it in my kernel config. Any ideas about the root cause for this
> workaround and if this is a permanent issue with kernels after 4.13? If so,
> then I suppose we should find a more maintainable way of injecting that
> workaround automatically in the BPF class, or find a more permanent
> workaround for bcc.
>
> Thoughts?
>
> Thanks again,
> Mike
>