Questions about runqlen


Abel Wu
 

Hi, when I looked into the runqlen script yesterday, I found that,
sadly, I had misunderstood "queue length" all along: not only the
"length" part but also the "queue" part.

Queue
=====
Only CFS runqueues are taken into account. This makes sense when the
main workloads all run under the CFS scheduler, which is common in
cloud scenarios. But what I don't quite follow is that the selected
queue is task->se.cfs_rq, which is from a task's view, rather than the
top-level cfs_rq from a CPU's view. I suppose the task view is not
enough to draw the whole picture of saturation?

Length
======
Within this scope, length means the number of schedulable entities,
that is, cfs_rq->nr_running. From a time-sharing point of view this is
OK, because it represents how many units are involved in the scheduling
of this cfs_rq. But what about the execution point of view, in which
the number of tasks (cfs_rq->h_nr_running) would be used?
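
To make the two counts concrete, here is a minimal toy model in Python.
It is only a sketch of the semantics described above, not kernel or
runqlen code: the CfsRq class and the example hierarchy (a root queue
with two groups, A and B) are made up for illustration. It assumes that
nr_running counts the entities queued directly on a cfs_rq (tasks plus
one entity per non-empty child group) and that h_nr_running counts the
tasks anywhere below it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CfsRq:
    """Toy stand-in for a kernel cfs_rq (hypothetical, for illustration)."""
    name: str
    tasks: int = 0  # runnable tasks queued directly on this queue
    children: List["CfsRq"] = field(default_factory=list)  # child group queues

    @property
    def nr_running(self) -> int:
        # Entities queued directly here: the tasks themselves, plus one
        # group entity for every child group that has something runnable.
        return self.tasks + sum(1 for c in self.children if c.h_nr_running > 0)

    @property
    def h_nr_running(self) -> int:
        # Tasks anywhere in the hierarchy rooted at this queue.
        return self.tasks + sum(c.h_nr_running for c in self.children)


# One CPU: the root queue holds one task directly, plus groups A and B.
group_a = CfsRq("A", tasks=3)
group_b = CfsRq("B", tasks=1)
root = CfsRq("root", tasks=1, children=[group_a, group_b])

for q in (root, group_a, group_b):
    print(f"{q.name}: nr_running={q.nr_running} h_nr_running={q.h_nr_running}")
```

In this model a sample taken from a task in group B sees a queue of one
entity, and a task in group A sees three, while the CPU as a whole has
five runnable tasks. The value read through the current task's own
queue therefore depends on which group that task happens to be in.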

Besides the above, without the shares information of each entity, how
could runqlen help us optimize performance? Maybe we should always
focus on occupancy rather than length?

It would be very much appreciated if someone could shed some light on this.

Thanks & Best regards,
Abel


Yonghong Song
 

On Tue, Mar 16, 2021 at 4:00 AM Abel Wu <wuyun.abel@...> wrote:
> Hi, when I looked into the runqlen script yesterday, I found that,
> sadly, I had misunderstood "queue length" all along: not only the
> "length" part but also the "queue" part.

Could you file an "issue" for the question? With an issue, the
questions and answers can be easily tracked.


There are some answers in this issue:
https://github.com/iovisor/bcc/issues/3093

To be accurate in cgroup/task-group environments, you may need to
traverse up to the root. Could you experiment to check whether this
solves your issue? If it does, we may need to enhance runqlen.py.
Maybe you could help by providing a pull request? Thanks!







Abel Wu
 

Hi Y Song,

On 3/21/21 1:38 AM, Y Song wrote:
> There are some answers in this issue:
> https://github.com/iovisor/bcc/issues/3093
> To be accurate in cgroup/task-group environments, you may need to
> traverse up to the root. Could you experiment to check whether this
> solves your issue? If it does, we may need to enhance runqlen.py.
> Maybe you could help by providing a pull request? Thanks!

Loops are forbidden in BPF programs (bounded loops are supported
since Linux 5.3, but walking up until se->parent is NULL is
unbounded). Maybe it's worth trying to get the definition of
struct rq? I will send a PR if I make some progress.
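
One way to picture a bounded version of the walk is to cap the number
of parent hops and give up if the root has not been reached. The sketch
below is a plain Python model of that idea, not BPF code and not a
proposed patch: Entity, MAX_DEPTH, and top_level_entity are hypothetical
names, and the cap of 8 is an arbitrary assumption about how deeply
task groups are nested. A real BPF program would have to express the
same idea in C with a constant loop bound.

```python
from typing import Optional

MAX_DEPTH = 8  # hypothetical cap on task-group nesting, chosen arbitrarily


class Entity:
    """Toy stand-in for sched_entity; only the parent link is modelled."""

    def __init__(self, name: str, parent: Optional["Entity"] = None):
        self.name = name
        self.parent = parent


def top_level_entity(se: Entity) -> Optional[Entity]:
    """Follow se.parent at most MAX_DEPTH times.

    Returns the entity whose parent is None (the one queued on the
    top-level queue), or None if it was not reached within the bound.
    """
    for _ in range(MAX_DEPTH):
        if se.parent is None:
            return se
        se = se.parent
    # All hops used up: succeed only if the last hop landed on the top.
    return se if se.parent is None else None


# A task nested two groups deep: task -> inner group -> outer group.
outer = Entity("outer")
inner = Entity("inner", parent=outer)
task = Entity("task", parent=inner)
print(top_level_entity(task).name)
```

The trade-off is that a hierarchy deeper than the cap yields no answer
at all, so the caller has to decide what to report in that case.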

Thanks,
Abel

