Hi, when I looked into the runqlen script yesterday, I found that,
sadly, I misunderstood the "queue length" all the time not only the
"length" part but also the "queue" part.
Only CFS runqueues are taken into account. This makes sense when
main workloads are all under CFS scheduler, which is common in
cloud scenario. But what I don't quite follow is that the selected
queue is task->se.cfs_rq which is from a task view, rather than the
top level cfs_rq from a cpu view. I suppose the task view is not
enough to draw the whole picture of saturation?
Within this scope length means the number of schedulable entities,
that is cfs_rq->nr_running. From time sharing point of view, it is
OK because it represents how many units involved in scheduling of
this cfs_rq. But what about from execution point of view in which
the number of tasks (cfs_rq->h_nr_running) will be used?
And besides the above, without the shares information of each entity,
how could runqlen help us optimizing the performance? Maybe we should
always focus on occupancy rather than length?
It would be very much appreciated if someone can shed some light.
Thanks & Best regards,