Re: [PATCHv5 RFC 1/3] BPF: New helper to obtainnamespace data from current task.


neirac
 

Hi Yonghong,

 

I had time yesterday to finish the code to avoid using getname_kernel , self-tests run ok with expected results.

Here is the current code to avoid the getname_kernel call.

 

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c

index 5e28718928ca..bfb2bbf37a58 100644

--- a/kernel/bpf/helpers.c

+++ b/kernel/bpf/helpers.c

@@ -11,6 +11,12 @@

#include <linux/uidgid.h>

#include <linux/filter.h>

#include <linux/ctype.h>

+#include <linux/pid_namespace.h>

+#include <linux/major.h>

+#include <linux/stat.h>

+#include <linux/namei.h>

+#include <linux/version.h>

+

 

 #include "../../lib/kstrtox.h"

@@ -312,6 +318,81 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,

               preempt_enable();

}

+BPF_CALL_2(bpf_get_current_pidns_info, struct bpf_pidns_info *, pidns_info, u32,

+             size)

+{

+             const char* name = "/proc/self/ns/pid";

+             struct pid_namespace *pidns = NULL;

+             int len = strlen(name) + 1;

+             struct filename *tmp = NULL;

+             struct inode *inode;

+             struct path kp;

+             pid_t tgid = 0;

+             pid_t pid = 0;

+             int ret;

+            

+

+             if (unlikely(size != sizeof(struct bpf_pidns_info)))

+                             goto clear;

+

+             pidns = task_active_pid_ns(current);

+

+             if (unlikely(!pidns))

+                             goto clear;

+

+             pidns_info->nsid =  pidns->ns.inum;

+             pid = task_pid_nr_ns(current, pidns);

+

+             if (unlikely(!pid))

+                             goto clear;

+

+             tgid = task_tgid_nr_ns(current, pidns);

+

+             if (unlikely(!tgid))

+                             goto clear;

+

+             pidns_info->tgid = (u32) tgid;

+             pidns_info->pid = (u32) pid;

+

+             if(!names_cachep)

+                             goto clear;

+

+             tmp = kmem_cache_alloc(names_cachep, GFP_ATOMIC);

+             if (unlikely(!tmp)) {

+                             goto clear;

+             }

+

+             memcpy((char *)tmp->name, name, len);

+             tmp->uptr = NULL;

+             tmp->aname = NULL;

+             tmp->refcnt = 1;

+

+        ret = filename_lookup(AT_FDCWD, tmp, 0, &kp, NULL);

+

+             if (ret)

+                             goto clear;

+            

+

+             inode = d_backing_inode(kp.dentry);

+             pidns_info->dev = inode->i_sb->s_dev;

+

+             return 0;

+

+       clear:

+             if (pidns_info)

+                             memset((void *)pidns, 0, (size_t) size);

+

+             return -EINVAL;

+}

+

+const struct bpf_func_proto bpf_get_current_pidns_info_proto = {

+             .func      = bpf_get_current_pidns_info,

+             .gpl_only             = false,

+             .ret_type             = RET_INTEGER,

+             .arg1_type          = ARG_PTR_TO_UNINIT_MEM,

+             .arg2_type          = ARG_CONST_SIZE,

+};

+

#ifdef CONFIG_CGROUPS

BPF_CALL_0(bpf_get_current_cgroup_id)

{

 

 

Now the problem I’m facing is that after running the sample code to call the helper, I’m seeing this once but then it’s never repeated running in later runs of the same sample code.

 

[ 75.097213] CPU: 0 PID: 1382 Comm: bash Not tainted 5.3.0-rc1patch12-00120-g5d8525007580-dirty #7

[   75.097213] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018

[   75.097215] RIP: 0010:idr_alloc_u32+0xa8/0xc0

[   75.097216] Code: 31 d2 48 89 e6 48 89 ef e8 e5 54 00 00 31 c0 48 8b 7c 24 20 65 48 33 3c 25 28 00 00 00 75 1a 48 83 c4 28 5b 5d 41 5c 41 5d c3 <0f> 0b 0d 04 00 80 00 89 47 04 e9 78 ff ff ff e8 14 ed 93 ff 0f 1f

[   75.097216] RSP: 0018:ffffac39007c7d08 EFLAGS: 00010046

[   75.097217] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000001ffff

[   75.097218] RDX: ffffac39007c7d5c RSI: 0000000000000d99 RDI: ffffffffbea46b88

[   75.097218] RBP: ffffffffbea46b88 R08: 0000000000000a20 R09: 0000000000000000

[   75.097218] R10: 0000000001200000 R11: 0000000000000000 R12: 0000000000000000

[   75.097219] R13: ffffac39007c7d5c R14: 000000000000012c R15: ffffffffbea46b80

[   75.097219] FS:  00007f7ba7691b40(0000) GS:ffffa0fef5c00000(0000) knlGS:0000000000000000

[   75.097220] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033

[   75.097220] CR2: 00000000022eea64 CR3: 000000022f498004 CR4: 00000000003606f0

[   75.097246] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

[   75.097246] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

[   75.097247] Call Trace:

[   75.097251]  ? kmem_cache_alloc+0x161/0x600

[   75.097252]  idr_alloc_cyclic+0x53/0xb0

[   75.097254]  alloc_pid+0xc1/0x2a0

[   75.097255]  copy_process+0xdc4/0x1c20

[   75.097256]  _do_fork+0x74/0x360

[   75.097257]  ? recalc_sigpending+0x17/0x50

[   75.097258]  __x64_sys_clone+0x7f/0xa0

[   75.097260]  do_syscall_64+0x55/0x130

[   75.097261]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   75.097263] RIP: 0033:0x7f7ba6d6538b

[   75.097264] Code: db 45 85 f6 0f 85 95 01 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 de 00 00 00 85 c0 41 89 c5 0f 85 e5 00 00

[   75.097265] RSP: 002b:00007ffc9a6cf180 EFLAGS: 00000246 ORIG_RAX: 0000000000000038

[   75.097265] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7ba6d6538b

[   75.097266] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011

[   75.097266] RBP: 00007ffc9a6cf1b0 R08: 00007f7ba7691b40 R09: 0000000000000002

[   75.097266] R10: 00007f7ba7691e10 R11: 0000000000000246 R12: 0000000000000000

[   75.097267] R13: 00000000004cd15e R14: 0000000000000000 R15: 0000000000000000

[   75.097268] ---[ end trace 2937ef4d051e39cd ]---

 

I need a couple of days to debug this and then submit the patch if nobody finds anything odd in the code.

I hope tomorrow I’ll have good news on this if nobody finds anything odd in the code first.

 

Bests

 

From: Y Song
Sent: 05 August 2019 16:16
To: carlos antonio neira bustos
Cc: Alexei Starovoitov; Tom Herbert via iovisor-dev; Eric Biederman
Subject: Re: [iovisor-dev][PATCHv5 RFC 1/3] BPF: New helper to obtainnamespace data from current task.

 

Carlos,

 

Any update on this? It would be good to submit the patch earlier for a

release cycle

so we got a chance it may be accepted in the current release cycle.

 

On Fri, Jul 26, 2019 at 3:41 AM carlos antonio neira bustos

<cneirabustos@...> wrote:

> Thanks Yonghong!.

> Yes, I was looking at filename_lookup today on how to do it, I'll work on that.

> Thank you very much.

> El vie., 26 de jul. de 2019 01:18, Yonghong Song <ys114321@...> escribió:

>> 

>> On Thu, Jul 25, 2019 at 12:11 PM Alexei Starovoitov

>> <alexei.starovoitov@...> wrote:

>> >

>> > On Thu, Jul 25, 2019 at 11:53 AM neirac <cneirabustos@...> wrote:

>> > >

>> > > Hey Yonghong,

>> > > I have changed getname_kernel interface to specify the allocation type,

>> > > all previous callers preserve GFP_KERNEL as the allocation type, after this change

>> > > getname_kernel could be used with GFP_ATOMIC in ebpf helpers.

>> > > If this change goes in, I could complete bpf_get_current_pidns_info helper.

>> > > Let me know your thoughts.

>> >

>> > please figure out how to get pidns_info w/o calling into getname_kernel().

>> 

>> Looks like we can call filename_lookup() directly. We do not need to

>> call getname_kernel().

>> We know the exact path name. We can do a atomic allocation inside the helper.

>> Carlos, could you take a look along this line? In this way, we do not

>> need to change

>> any files under linux/fs.

>> 

>>

>> 

 

Join iovisor-dev@lists.iovisor.org to automatically receive all group messages.