On 02/14/2011 03:15 PM, Robin Gareus wrote:
Normal processes that are not caught specially are grouped using the
pgrp (process group) value. (To work around bad UI behaviour like KDE
& GNOME, this value can be overridden by rules. Usually every program
started gets a group and all of its children end up in the same group.)
I guess you guys need something different.
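
To illustrate the default grouping described above, here is a minimal Python sketch that buckets PIDs by their process group ID. It is not ulatencyd's actual code: the helper name `group_by_pgrp` is made up for this example, and only the standard `os.getpgid` call is used.

```python
import os
from collections import defaultdict


def group_by_pgrp(pids):
    """Map process group ID -> list of PIDs (hypothetical helper)."""
    groups = defaultdict(list)
    for pid in pids:
        try:
            groups[os.getpgid(pid)].append(pid)
        except (ProcessLookupError, PermissionError):
            # The process exited or is not visible to us; skip it.
            continue
    return dict(groups)


if __name__ == "__main__":
    # Group this process and its parent; a shell usually puts each job
    # it starts into its own process group.
    print(group_by_pgrp([os.getpid(), os.getppid()]))
```

A rule-based override, as mentioned for KDE and GNOME, would replace the `os.getpgid(pid)` key with whatever group the rule assigns.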
• do you have a lot of rt tasks besides pulseaudio and jackd ?
All the audio-interface related IRQ tasks (on PREEMPT RT - e.g.
sirq-hrtimer, sirq-timer and irq/18-uhci_hcd, irq/17-ohci1394,
irq/28-hda_inte,.. etc - but they're inside the kernel and may not need
special attention from ulatencyd)
yes. what can be done is to bind those to certain cpu cores, don't know
if this can help reducing latency, tho.
All processes/threads that inherit the RT privileges from JACK.
Are these processes parent of jack or do they just connect to it ?
If these are only connected, how can i get a list of all pids in the graph ?
do you have some simple command I can just run that causes non-trivial
load on jackd and its children, to test ?
There may be a few ALSA-midi clients (no JACK) that try to acquire RT
scheduling but I can't think of any just now. Oh, and cdrdao (mastering
audio to CD) likes to have RT privileges.
ok.
AFAICT, the majority of professional LA users are not using pulseaudio
at all.
of course. PA is for desktop usage, not pro audio :-)
• do they need a lot of cpu time ? is 45% not enough for jackd/pulseaudio ?
What is the reason behind 45% ? Wouldn't 99% make more sense? just use
the bare minimum to prevent the system from locking up. Am I missing
something there?
I have chosen 45% rt because on a normal desktop the user does not
expect that some rt task started for whatever reason will eat up all cpu
power. But of course, for pro audio this is ok. Switching configurations
in ulatencyd is just two clicks and a password away. You can change
the default configuration, too.
I simply think that "one configuration fits all" does not work. You can
make a configuration that does not suck, but will be far from perfect. My
goal is perfection for different workloads :-)
For a typical desktop system, it's pretty close.
Why would I buy a fast multi-core CPU to have 55% of it just idling on
stage, during a performance or in a studio? Some headroom is fine, but
why restrict the box?
I don't restrict to idle, at least that's not the intention. The only
thing the desktop config does restrict is the RT amount, because I don't
think that more than 45% cpu is a normal workload for RT tasks on a desktop.
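
To make the 45% figure concrete, here is a small Python sketch that converts an RT percentage into a runtime budget in microseconds. It assumes the limit is expressed through the cgroup v1 cpu controller's `cpu.rt_runtime_us` relative to `cpu.rt_period_us`, with a period of 1,000,000 µs (the kernel's default for the global `sched_rt_period_us`). The function name is hypothetical and this is not taken from ulatencyd's source.

```python
def rt_runtime_us(percent, period_us=1_000_000):
    """RT runtime budget in microseconds for a given percentage of the period.

    Assumption: the budget is written to cpu.rt_runtime_us and the period
    matches cpu.rt_period_us for the same group.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return int(period_us * percent // 100)


print(rt_runtime_us(45))  # desktop default discussed above
print(rt_runtime_us(99))  # what a pro-audio setup might ask for
```

With these assumptions, 45% corresponds to 450000 µs of RT runtime per one-second period.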
FWIW: I sometimes bump into 50-90 %CPU load (actually ~160% on a dual
core or ~300% on quad core) by running jammin, multiple jconvolvers or
occasional foo-yc20.
• would it be better to just put everything into one cgroup instead of
many smaller groups ? maybe just make exceptions for ui apps and the
rest into one group ?
In order to come to a conclusion about "what is better", could you please
outline the advantages/disadvantages of each approach?
each group has a cpu.shares value and a rt_runtime_us (along with other
values).
cgroup_a cpu.shares 10:
  task 1
  task 2
cgroup_b cpu.shares 5:
  task 3
  task 4
imagine all 4 tasks want 100% cpu time, then in this setting task 1 & 2
together get 66% of the cpu time and task 3 & 4 together get 33%.
if you put all in one group: every task gets the same cpu time.
shares that are not used will always be given to other slots, so there
is no disadvantage in giving a group a higher share.
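
The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model of proportional sharing on a single contended CPU, not the scheduler's real algorithm; `group_fractions` and the `active` parameter are names invented for this example.

```python
def group_fractions(shares, active=None):
    """Fraction of contended CPU time each group gets.

    shares: dict of group name -> cpu.shares value.
    active: groups that currently want CPU time (default: all of them).
    Idle groups get nothing and their share is split among the others.
    """
    if active is None:
        active = set(shares)
    total = sum(shares[g] for g in active)
    if total == 0:
        return {g: 0.0 for g in shares}
    return {g: (shares[g] / total if g in active else 0.0) for g in shares}


shares = {"cgroup_a": 10, "cgroup_b": 5}
# Both groups busy: a gets 10/15, b gets 5/15.
print(group_fractions(shares))
# Only cgroup_b busy: it gets everything despite the lower share.
print(group_fractions(shares, active={"cgroup_b"}))
```

The second call shows why a high share costs nothing when the group is idle: the weight only matters while groups actually compete.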
finding good groups of processes is of course not an easy task to get
right. the current grouping works very well for desktops.
There are other neat things that could be done:
• move everything away from one processor and move jackd, for example,
onto it, giving it the lowest latency possible. ok, not perfect as the
kernel may still use this cpu. But interrupts can also be masked on the core.
That would void the parallel execution of jack2 and tschack graphs,
wouldn't it? The main JACK-process is not CPU heavy: it's the JACK
client's threads that cause the CPU load.
I meant of course the complete graph. If you have, say, 4 cores, you move
all other processes to 1 core and use the 3 remaining cores to do audio only.
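
A minimal Python sketch of that core split, using the standard `os.sched_setaffinity` call (Linux only). The function names and the PIDs in the demo are hypothetical, and a cpuset cgroup may well be the more natural mechanism for a daemon like ulatencyd; this only illustrates the idea.

```python
import os


def plan_affinity(n_cores, audio_pids, other_pids, housekeeping_cores=1):
    """Return pid -> set of allowed CPUs.

    The first `housekeeping_cores` CPUs take everything that is not audio;
    the remaining CPUs are reserved for the audio graph.
    """
    if not 0 < housekeeping_cores < n_cores:
        raise ValueError("need at least one housekeeping and one audio core")
    house = set(range(housekeeping_cores))
    audio = set(range(housekeeping_cores, n_cores))
    plan = {pid: house for pid in other_pids}
    plan.update({pid: audio for pid in audio_pids})
    return plan


def apply_affinity(plan):
    """Apply the plan; processes that vanished or are off-limits are skipped."""
    for pid, cpus in plan.items():
        try:
            os.sched_setaffinity(pid, cpus)
        except (ProcessLookupError, PermissionError):
            continue


if __name__ == "__main__":
    # Made-up PIDs: 1234 stands for a JACK client, 1 and 42 for the rest.
    print(plan_affinity(4, audio_pids=[1234], other_pids=[1, 42]))
```

Note that `sched_setaffinity` acts on a single thread, so a real implementation would have to walk every thread of each process in the graph.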
what about a configuration like this:
- rt_audio: cpu.shares 3000 rt: 99%
- desktop ui: cpu.shares 500 rt: 0
- audio ui: cpu.shares 2000 rt: 0
- the rest: cpu.shares 100 rt: 0
?
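
For illustration, the proposal above could be rendered into cgroup v1 attribute values roughly like this. The group directory names (`rt_audio`, `desktop_ui`, `audio_ui`, `rest`), the `render` helper and the 1,000,000 µs RT period are assumptions made for this sketch, not ulatencyd configuration syntax.

```python
# The proposed configuration, written as plain data.
CONFIG = {
    "rt_audio":   {"shares": 3000, "rt_percent": 99},
    "desktop_ui": {"shares": 500,  "rt_percent": 0},
    "audio_ui":   {"shares": 2000, "rt_percent": 0},
    "rest":       {"shares": 100,  "rt_percent": 0},
}


def render(config, period_us=1_000_000):
    """Map 'group/attribute' -> value for the cpu controller."""
    out = {}
    for group, cfg in config.items():
        out[f"{group}/cpu.shares"] = cfg["shares"]
        # RT budget as a slice of the assumed period.
        out[f"{group}/cpu.rt_runtime_us"] = period_us * cfg["rt_percent"] // 100
    return out


for path, value in render(CONFIG).items():
    print(path, value)
```

One caveat: the kernel's global `sched_rt_runtime_us` defaults to 950000, so a 99% budget would presumably require raising that limit first.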
kind regards
daniel