Re: Access packet payload in TC egress programs

Yonghong Song

On Fri, Oct 22, 2021 at 12:31 AM Federico Parola
<federico.parola@...> wrote:

Thanks for the answer, I wasn't aware of the existence of that helper.
I have two additional comments:

1. The documentation of the helper says that passing a length of zero
should pull the whole length of the packet [1], however with that
parameter the length of direct accessible data stays unchanged. I think
there is a mismatch in the behavior and the documentation.
The source code is
BPF_CALL_2(bpf_skb_pull_data, struct sk_buff *, skb, u32, len)
/* Idea is the following: should the needed direct read/write
* test fail during runtime, we can pull in more data and redo
* again, since implicitly, we invalidate previous checks here.
* Or, since we know how much we need to make read/writeable,
* this can be done once at the program beginning for direct
* access case. By this we overcome limitations of only current
* headroom being accessible.
return bpf_try_make_writable(skb, len ? : skb_headlen(skb));

So if len is 0, it will only try to make *existing* linear data to be
writable, so
you are right. It seems we are not not trying to pull more data in. I will
check with other kernel developers later.

2. I'd like to avoid re-parsing all the headers after I have pulled new
data. To do so I save the offset I just reached (the end of the TCP
header), pull data, get the new data and data_end pointers and add the
offset to data. However the verifier does not accept my accesses to the
packet from this point on. Here is some example code:
The current behavior is after data pull, you will need to reparse the packet.
There are a lot of helpers fitting in this case:

bool bpf_helper_changes_pkt_data(void *func)
if (func == bpf_skb_vlan_push ||
func == bpf_skb_vlan_pop ||
func == bpf_skb_store_bytes ||
func == bpf_skb_change_proto ||
func == bpf_skb_change_head ||
func == sk_skb_change_head ||
func == bpf_skb_change_tail ||
func == sk_skb_change_tail ||
func == bpf_skb_adjust_room ||
func == sk_skb_adjust_room ||
func == bpf_skb_pull_data ||
func == sk_skb_pull_data ||
func == bpf_clone_redirect ||
func == bpf_l3_csum_replace ||
func == bpf_l4_csum_replace ||
func == bpf_xdp_adjust_head ||
func == bpf_xdp_adjust_meta ||
func == bpf_msg_pull_data ||
func == bpf_msg_push_data ||
func == bpf_msg_pop_data ||
func == bpf_xdp_adjust_tail ||
func == bpf_lwt_seg6_store_bytes ||
func == bpf_lwt_seg6_adjust_srh ||
func == bpf_lwt_seg6_action ||
func == bpf_sock_ops_store_hdr_opt ||
func == bpf_lwt_in_push_encap ||
func == bpf_lwt_xmit_push_encap)
return true;

return false;

It is possible that we could fine tune this behavior as some helpers
like bpf_skb_pull_data() may not need to start over again. But I
could miss some conditions.

Could you post your questions at bpf@...?
Networking people in the mailing list may give you a better
answer why this behavior for bpf_skb_pull_data() and whether
it can be improved.

unsigned payload_offset = (void *)tcph + (tcph->doff << 2) - data;
bpf_skb_pull_data(ctx, ctx->len);
data = (void *)(long)ctx->data;
data_end = (void *)(long)ctx->data_end;

struct tls_record_hdr *rech = data + payload_offset;
if ((void *)(rech + 1) > data_end)
return TC_ACT_OK;

bpf_trace_printk("It's a handshake");

Running this code gives me the error "R1 offset is outside of the
packet" even if I performed the correct check on packet boundaries. If I
re-parse all header the code is accepted. Is there a way to solve the


On 20/10/21 08:11, Y Song wrote:
On Tue, Oct 19, 2021 at 8:13 AM Federico Parola
<federico.parola@...> wrote:

Dear all,
how can I access the payload of the packet in a program attached to the
TC egress hook (SCHED_CLS attached to clsact qdisc)?
ctx->data_end points to the end of the L4 header, while on the ingress
hook it points to the end of the packet (tested on kernel v5.14).
This could be the case that linear data only covers up to the end of
L4 header. In such cases, you can use bpf_skb_pull_data() helper
to get more data into linear region and after that your ctx->data_end
will point to much later packet data.

Best regards,
Federico Parola

Join to automatically receive all group messages.