Hello,
 
thank you,
your answers helped me make up my mind,
and I am currently reading up on your suggestions in depth.
 

>> What is the use case for many sound cards ?

There are two use cases:

- The MADI card should drive HQ mastering-grade converters (around 16 i/o).

- The Expert Sleepers converters will interface with a Eurorack modular synthesizer.
 They are the only ones that have DC-coupled i/o with 20 Vpp,
for control voltages and audio.

( Any DIY solution is out of the question for now,
as they have a decade of experience and their gear runs smoothly )
 

>> Use 48000 and be happy, 96000 is
>> only good for recording bats


This could lead to an endless debate, which I am not up for here,
but some on-topic thoughts:

I have a love-hate relationship with digital audio processing;
I have never heard a converter in person that is comparable to an all-analog chain,
especially in the very highs and the 3D quality.

Now I have decided on a no-compromise chain (that's why I chose Linux),
and I am aware that there are differing scientific positions on 48k vs. 96k.

( some:
 hearing range, overhead, commercials, other quality factors such as jitter, the whole chain, the room, etc. ...
vs.
 latency, plugin quality and precision, non-ideal Nyquist filters, downsampling to 44.1/16, ... )
 
And I guess there is no mastering-studio running 48k in 2020, no offence intended.

I am a PD-programmer and do have a background in electronics and acoustics,
so I guess that I know what I am talking about (as everyone does ;)
 

But
if I used 48k only,
I could run the Expert Sleepers converters via ADAT,
and yes, indeed, things would be easier.
 
 
>> It looks like those ES-8's have ADAT I/O.  Could you sync their internal clocks by daisy chaining them via ADAT I/O off of the RME's ADAT output?
 
Not at 96k (and yes, I know about S/MUX, but that's not possible here).


Now I'd like to explore the possibilities for running at least the MADI chain at 96k.

>> you will be using USB hubs which have been known
>> to cause trouble with audio devices. so be prepared for xruns at any
>> buffer size less than 1024 (maybe even there).
 
Yes, I have read about those troubles. But I have now found positive reports, too (including on the JACK website).

It is not clear why there have been issues with multiple USB devices
(might just be conflicts in scheduling, clocks, bandwidth?),

but if I used more than one USB 3.2 PCIe card in a desktop computer,
the bandwidth should be more than sufficient?
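
As a rough sanity check on the bandwidth question, here is a small sketch that computes the raw PCM payload for one direction. The channel count, the 32-bit sample container and the comparison against the nominal USB 2.0 high-speed rate are my own assumptions for illustration, and protocol overhead is ignored, so this only hints that raw bandwidth is unlikely to be the limiting factor.

```python
def pcm_bandwidth_mbit(channels: int, rate_hz: int, bytes_per_sample: int = 4) -> float:
    """Raw PCM payload in Mbit/s for one direction (no USB protocol overhead)."""
    return channels * rate_hz * bytes_per_sample * 8 / 1e6


# Assumed load: 16 channels at 96 kHz, 24-bit samples carried in 32-bit containers.
load = pcm_bandwidth_mbit(16, 96_000)

# Nominal USB 2.0 high-speed signalling rate; usable throughput is lower.
USB2_HIGH_SPEED_MBIT = 480

print(f"{load:.1f} Mbit/s payload vs. {USB2_HIGH_SPEED_MBIT} Mbit/s nominal")
```

If the numbers look like this, scheduling and clocking seem the more likely suspects than bandwidth, but that is a guess, not a measurement.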
 
Of course, a buffer size smaller than 1024 would be cool.
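
To put the 1024 figure in perspective, a quick calculation of the time one period takes at 48k and 96k. This is only the duration of a single period; the total round-trip latency also depends on the number of periods and on the converters, which I leave out here.

```python
def period_latency_ms(frames: int, rate_hz: int) -> float:
    """Duration of one period of `frames` samples at `rate_hz`, in milliseconds."""
    return frames / rate_hz * 1000.0


for rate_hz in (48_000, 96_000):
    for frames in (256, 1024):
        ms = period_latency_ms(frames, rate_hz)
        print(f"{frames:5d} frames @ {rate_hz} Hz: {ms:.2f} ms per period")
```

So the same buffer size costs half the time at 96k compared with 48k.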

>> Each card have its own hardware clock

>> You would use an instance of zita_a2j to connect each "secondary" card to the JACK server which is using the "master" card. zita_a2j will resample as needed to keep things in sync.

https://kokkinizita.linuxaudio.org/linuxaudio/

Very cool!
I am reading this closely now.

It seems like this takes care of the drifting clocks with a buffer and alignment?
The RME as master? Does that mean the hardware clock of the RME would define the whole DSP chain? Somewhere I read that RME cards can only run as slave in Linux, but maybe this is outdated?
Might there be other downsides? Phase issues etc.? ... I will read more ...
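
To get a feeling for why free-running clocks need this kind of correction, here is a small sketch of how fast two cards drift apart. The 50 ppm offset is a made-up example value, not a measured figure for any of the devices mentioned.

```python
def drift_samples(rate_hz: float, ppm: float, seconds: float) -> float:
    """Samples by which two clocks diverge after `seconds` at a `ppm` offset."""
    return rate_hz * ppm * 1e-6 * seconds


def seconds_until_slip(rate_hz: float, ppm: float, buffer_frames: int) -> float:
    """Time until the accumulated drift equals one buffer of `buffer_frames`."""
    return buffer_frames / (rate_hz * ppm * 1e-6)


RATE_HZ = 96_000
PPM = 50  # hypothetical clock offset between two cards

print(f"drift: {drift_samples(RATE_HZ, PPM, 1.0):.1f} samples per second")
print(f"1024 frames used up after {seconds_until_slip(RATE_HZ, PPM, 1024):.0f} s")
```

Without resampling, a fixed buffer between the two cards would sooner or later overrun or underrun, so an adaptive resampler that follows the rate difference appears to be the point of the bridge.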

Zita seems to open up many possibilities ...
 
 

For now I see the following routes:
 
- 96k for everything, RME-HDSPe-Madi and multiple USB-Expert Sleepers
 via zita-ajbridge ?
 
https://www.expert-sleepers.co.uk/moduleoverview.html
https://www.rme-audio.de/de_hdspe-madi.html
 
 
- 48k for everything, run the Expert Sleepers i/o with ADAT via RME MADI FX (with Adrian Knoth's ALSA driver) <> RME ADI-648 or multiple RME RayDAT

https://www.expert-sleepers.co.uk/moduleoverview.html
https://www.rme-audio.de/de_hdspe-madi-fx.html
https://github.com/adiknoth/madifx
https://www.rme-audio.de/de_adi-648.html
https://www.rme-audio.de/de_hdspe-raydat.html




- 96k for the Madi-Chain and 48k for the ADAT Expert Sleepers chain.

 Maybe this could be achieved
 - with zita-ajbridge by re-sampling?
 - Or with zita-njbridge over a network, with one Raspberry Pi for each USB connection, and resampling?
 With this I might get rid of USB conflicts, too. Or would I run into more possible failures?
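
For the zita-ajbridge variant, this is roughly how I imagine starting one bridge per secondary card while JACK runs on the RME. It is only a sketch: the option letters (-d, -r, -p, -n, -c, -j) are as I understand them from the zita-a2j / zita-j2a man page and should be checked against `zita-a2j -h`, and the device name, channel count and client name are placeholders of mine.

```python
import shlex


def bridge_command(tool: str, device: str, rate_hz: int, period: int,
                   channels: int, client_name: str, nfrags: int = 2) -> list[str]:
    """Build an argv list for one zita-ajbridge instance (flags to be verified)."""
    if tool not in ("zita-a2j", "zita-j2a"):
        raise ValueError(f"unknown bridge tool: {tool}")
    return [
        tool,
        "-d", device,            # ALSA device of the secondary card
        "-r", str(rate_hz),      # rate the secondary card runs at
        "-p", str(period),       # period size in frames
        "-n", str(nfrags),       # number of periods
        "-c", str(channels),     # number of channels to bridge
        "-j", client_name,       # name of the resulting JACK client
    ]


# Placeholder values: a secondary card at 48k bridged into a JACK server at 96k.
cmd = bridge_command("zita-a2j", "hw:ES8", 48_000, 256, 4, "es8_in")
print(shlex.join(cmd))
```

The list could be passed to `subprocess.Popen` once per card; the bridge would then resample from the card's rate to whatever rate the JACK server uses.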

Is running different sample rates a good idea?
 
 
Bests, Manuel