I'm really late on this one, but I finally looked at the linked
waveforms. Here is what I see.
Ignoring push-pull crossover distortion, there are basically two kinds
of distortion happening, one for each of the two amplification stages:
a pre-amp and an output amp. The pre-amp soft-clips the input signal,
which generates harmonics of the input. As the input level is
increased, the pre-amp output looks more and more like a square wave
with rounded edges, because for part of each cycle the pre-amp is
sitting at its maximum output voltage (saturation). The output amp sees
each edge of this square wave as a step to a new DC level.
The output circuit is underdamped, meaning that its step response
includes ringing. The frequency of the ringing is set by the design of
the output circuitry and has nothing to do with the frequency of the
input signal, since between edges the output amp sees its input as DC.
The degree of clipping in the pre-amp determines how long the output
amp sees a flat input after each step, and therefore how long the
ringing shows up in the output signal. The output amp is not operating
at its maximum output voltage, which is what allows the overshoot and
undershoot spikes in the ringing to pass unclipped. For even more
distortion, the output amp could be biased to clip the overshoot
portion of the ringing.
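
For anyone who wants to poke at this, here is a rough Python sketch of
the two-stage picture described above. It is only a toy model, and
everything in it is an assumption rather than a measurement of the real
amp: the pre-amp is stood in for by a tanh soft clipper, the output amp
by a generic underdamped second-order stage, and the names (pre_amp,
output_amp, f_ring, zeta, drive) and all the numbers are made up for
illustration.

```python
import math


def pre_amp(x, drive):
    # Hypothetical soft clipper: tanh saturates smoothly at +/-1.
    # Larger drive pushes the output closer to a rounded square wave.
    return math.tanh(drive * x)


def output_amp(u, dt, f_ring, zeta):
    # Assumed second-order stage: y'' + 2*zeta*w*y' + w^2*y = w^2*u.
    # It is underdamped (rings on a step) when zeta < 1.
    # Integrated with a simple semi-implicit Euler step.
    w = 2.0 * math.pi * f_ring
    y = v = 0.0
    out = []
    for u_k in u:
        v += (w * w * (u_k - y) - 2.0 * zeta * w * v) * dt
        y += v * dt
        out.append(y)
    return out


def simulate(f_in, drive, f_ring=2000.0, zeta=0.2, fs=1_000_000, cycles=2):
    # Drive a sine at f_in through the clipper, then through the output stage.
    # f_ring, zeta and fs are illustrative values, not taken from any real amp.
    dt = 1.0 / fs
    n = int(cycles * fs / f_in)
    pre = [pre_amp(math.sin(2.0 * math.pi * f_in * k * dt), drive)
           for k in range(n)]
    return pre, output_amp(pre, dt, f_ring, zeta), dt


def flat_fraction(pre, level=0.99):
    # Fraction of samples where the pre-amp output is essentially saturated.
    return sum(1 for p in pre if abs(p) > level) / len(pre)


def ringing_period(out, dt):
    # Time between the first two local maxima of the output,
    # i.e. one period of the ringing after the first edge.
    peaks = [i for i in range(1, len(out) - 1)
             if out[i - 1] < out[i] >= out[i + 1]]
    return (peaks[1] - peaks[0]) * dt


if __name__ == "__main__":
    for f_in in (50.0, 100.0):
        pre, out, dt = simulate(f_in, drive=100.0)
        print("f_in:", f_in,
              "saturated fraction:", flat_fraction(pre),
              "output peak:", max(out),
              "ringing Hz:", 1.0 / ringing_period(out, dt))
```

With a heavily driven pre-amp, the output peak in this model goes above
the pre-amp's saturated level (the overshoot), and the measured ringing
frequency should come out about the same for both input frequencies,
close to f_ring * sqrt(1 - zeta**2). Hard-limiting the returned output
samples would be one way to mimic an output amp biased to clip the
overshoot.
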
This pseudo-knowledge comes from a distant dream of having studied
impulse, step, and ramp responses of analog circuits 20+ years ago.
Tom