What I want to do is to use the resources I have to run multiple signal
generation and processing chains asynchronously, in parallel, and then
use the final audio-hardware-synchronized chain to resample them all
into one, perhaps using the Zita tools.
I think you may be confusing two different concepts: running in parallel,
and running asynchronous processing.
No, I know the difference very well.  My architecture currently has seven synchronous parallel chains (see http://lsn.ponderworthy.com/doku.php/concurrent_patch_management), in which I call upon whichever simultaneous chain sets I need by switching MIDI input port.  Each chain is controlled and modulated by Calf plugin sets and a single big (but beautifully multithreaded, thank God for you sir) Non-Mixer.  Chain #1 has two Yoshimis running two patches each, and it builds from there.

I am not going to raise my latency from the ~2ms it currently uses beyond 4ms at the very most; I play this box live with very rapid notes.  I am not sure why I cannot cram more into the JACK cycle given my low CPU usage for the JACK process, but I am thinking that this is probably bound to the audio sample rate, and I am already at 96kHz for that very reason.  So I could go to 192kHz overall audio sampling -- not something I have intended, for many reasons -- or desynchronize something.
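
For the record, here is the arithmetic that ties the cycle to the sample rate, as a rough sketch -- the frames-per-period and period counts below are illustrative values, not my actual settings:

    # Every client in a synchronous JACK chain must finish its process()
    # callback within one period, and a period's wall-clock length is
    # frames_per_period / sample_rate.
    def period_ms(frames_per_period, sample_rate):
        return 1000.0 * frames_per_period / sample_rate

    for rate in (48000, 96000, 192000):
        for frames in (64, 128, 256):
            print(f"{rate} Hz, {frames} frames/period -> "
                  f"{period_ms(frames, rate):.2f} ms cycle deadline")

    # Nominal round-trip latency is roughly periods * frames / rate,
    # e.g. 2 periods of 128 frames at 96 kHz is about 2.67 ms.
    print(period_ms(2 * 128, 96000), "ms nominal at 2x128 @ 96 kHz")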

You can run multiple independent synchronous processes in parallel, which
is probably what you want.  Using jack2 should allow that, but you have to
check the connection graph to make sure it is happening.  The original
description you gave was not very detailed: are the Yoshimi instances just
connected to the JACK output and that is all?  Is there any stage where
both Yoshimi instances connect that would become the processing limit for
the period?
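
One quick way to dump the graph, as a minimal sketch -- it assumes the
JACK-Client Python module is installed; jack_lsp -c at the command line
shows the same information:

    # Print every audio output port and what it is connected to, so you
    # can confirm each Yoshimi feeds its own Non-Mixer strip and that
    # nothing funnels all the chains through one intermediate client.
    # Assumes "pip install JACK-Client" and a running JACK server.
    import jack

    client = jack.Client("graph-check", no_start_server=True)
    for port in client.get_ports(is_audio=True, is_output=True):
        for dest in client.get_all_connections(port):
            print(f"{port.name} -> {dest.name}")
    client.close()
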
Each Yoshimi instance hits a separate input (thread) in Non-Mixer; this is one of the things that has made things work as well as they do.
use the final audio-hardware-synchronized chain to resample them all
into one, perhaps using the Zita tools.
Resampling will always add additional latency.  Having unsynchronized
streams that have to be synchronized will always add additional latency.
The zita-n2j (network to JACK) tool describes this very well in the man
page excerpt below.
That helped!  I now have to wonder what is best to do.  At least in theory, I could run the pre-hardware chains at 192kHz :-)  That could certainly be set up to add very little latency to the resampling phase, if it worked!  We will see :-)  I almost want to go to non-digital electronic audio for the resample phase; it could tend to simplify things, though it would mean a lot more output hardware...
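
Back-of-the-envelope it might look like this -- a sketch only, where the
one-period sync buffer, the 48-tap filter delay, and the 128-frame period
are assumed figures rather than measurements of any zita tool:

    # Rough budget for gluing free-running 192 kHz chains onto the
    # 96 kHz hardware-synced graph.  Assumed: one extra period of
    # buffering on the receiving side to absorb clock drift, plus a
    # resampler filter delay of ~48 taps at the higher rate.
    hw_rate = 96_000          # hardware-synced side
    src_rate = 192_000        # free-running generation chains
    frames_per_period = 128   # illustrative period size, receiving side

    sync_buffer_ms = 1000.0 * frames_per_period / hw_rate    # ~1.33 ms
    filter_delay_ms = 1000.0 * 48 / src_rate                 # ~0.25 ms

    print(f"sync buffer adds:       {sync_buffer_ms:.2f} ms")
    print(f"resampler filter adds:  {filter_delay_ms:.2f} ms")
    print(f"estimated extra total:  {sync_buffer_ms + filter_delay_ms:.2f} ms")

Under those assumed numbers, the extra ~1.6 ms on top of my existing ~2 ms
path would land just inside the 4 ms ceiling, so the receiving-side period
size would matter more than the resampling filter itself.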

--
Jonathan E. Brickman   jeb@ponderworthy.com   (785)233-9977
Hear us at http://ponderworthy.com -- CDs and MP3 now available!
Music of compassion; fire, and life!!!