On Wed, Mar 11, 2009 at 11:35:15AM +0100, Bengt Gördén wrote:
Does anyone have an idea
as to what is happening here,
and how it could be cured?
I can't say that I can help you, but 19 years as a network engineer can come in
handy.
The most obvious explanation is that the network is overloaded in some way,
not necessarily in terms of bandwidth. It can be things like spanning tree
going haywire in the switches. Redirects are another thing that comes to mind.
A sophisticated DoS might also be the cause (for example, slow pings of IPv6
addresses that fill up the table in a switch).
If we assume it's the network that causes the problem, the obvious suspects
are the switches and the routers, not to mention firewalls.
What network setup (equipment and such) do you have?
Is the network protected in some way?
I'll make a complete list of all the equipment
next Monday (I won't be back there before then).
The network is completely isolated, there are just
the four WFS computers (and a fifth which was not
being used), and my Thinkpad R51.
Apart from the ssh -X connection, the only traffic
is the messages I mentioned before (around 50/s)
and the status reports from the slave computers,
around 60/s for all three combined. All messages
are smaller than one MTU.
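
As a rough sanity check on these figures, the worst-case load can be estimated. This is only a sketch: it assumes a standard 1500-byte Ethernet MTU and that every message fills it, neither of which is stated above (the real messages are smaller).

```python
# Upper bound on the offered load. MTU_BYTES is an assumption (standard
# Ethernet); the message rates are the approximate ones quoted above.
MTU_BYTES = 1500          # assumed MTU, not given in the thread
MASTER_MSGS_PER_S = 50    # messages mentioned before
SLAVE_MSGS_PER_S = 60     # status reports, all three slaves combined


def worst_case_mbit_per_s(msgs_per_s, mtu_bytes=MTU_BYTES):
    """Return the load in Mbit/s if every message were a full MTU."""
    return msgs_per_s * mtu_bytes * 8 / 1e6


total = worst_case_mbit_per_s(MASTER_MSGS_PER_S + SLAVE_MSGS_PER_S)
print(f"worst case: {total:.2f} Mbit/s")
```

Even under this pessimistic assumption the load stays below 2 Mbit/s, so raw bandwidth alone looks like an unlikely culprit.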
Have you run any performance tests? iperf is handy.
TCP:
server:> iperf -s
client:> iperf -c server
UDP:
server:> iperf -su
client:> iperf -c server -u
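
iperf mainly reports throughput. As a possible complement, here is a minimal sketch of a probe that sends small UDP datagrams at a fixed rate and records round-trip times and losses. The function names, payload size and defaults are all made up for illustration, and it needs the echo side running on the other machine.

```python
import socket
import time


def echo_server(sock, count):
    """Echo `count` datagrams back to their sender, then return."""
    for _ in range(count):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def probe(host, port, count=50, interval=0.02, size=200, timeout=1.0):
    """Send `count` datagrams; return (lost, list of RTTs in seconds)."""
    rtts = []
    lost = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            # 4-byte sequence number, zero-padded to `size` bytes.
            payload = seq.to_bytes(4, "big").ljust(size, b"\0")
            t0 = time.perf_counter()
            s.sendto(payload, (host, port))
            try:
                data, _ = s.recvfrom(2048)
            except socket.timeout:
                lost += 1
            else:
                if data[:4] == payload[:4]:
                    rtts.append(time.perf_counter() - t0)
                else:
                    # A late reply to an earlier probe: count as lost.
                    lost += 1
            time.sleep(interval)
    return lost, rtts
```

On the far end, bind a UDP socket to a port and call echo_server(sock, count); on the near end, call probe(host, port) and look at max(rtts) and the loss count rather than only the average.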
Good tip, will do.
While you're here, you may be able to comment on
another problem: the network adapter in the
master does not receive the multicast messages
it is sending (even though the TX socket has the
IP_MULTICAST_LOOP option set). The chip is an Intel
85something, driver e1000e.
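
For reference, this is the kind of setup meant here, as a minimal sketch only. The group address, port and function names are hypothetical and not the real application's. On Linux the loop option is set on the sending socket, and the receiving socket still has to join the group on the interface used for sending.

```python
import socket
import struct

GROUP = "239.255.0.1"   # hypothetical multicast group
PORT = 50000            # hypothetical port


def make_sender():
    """UDP socket whose multicast output is looped back to local listeners."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    return s


def membership_request(group, iface_ip="0.0.0.0"):
    """Pack the group address and local interface address (struct ip_mreq)."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton(iface_ip))


def make_receiver(group=GROUP, port=PORT, iface_ip="0.0.0.0"):
    """UDP socket bound to `port` that has joined `group` on `iface_ip`."""
    r = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    r.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    r.bind(("", port))
    r.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                 membership_request(group, iface_ip))
    return r
```

Passing the master's own interface address as iface_ip, instead of 0.0.0.0, rules out the join landing on a different interface than the one used for sending.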
Many thanks,
--
FA
Laboratorio di Acustica ed Elettroacustica
Parma, Italia
Be quiet, Master Land; and you, Professor,
will you be so good as to listen to me ?