I agree entirely. If each VVID = a voice, then we should just call
 them Voice IDs, and let the event sender make decisions about
 voice reappropriation. 
 Actually, they're still virtual, unless we have zero latency feedback
 from the synths. (Which is not possible, unless everything is
 function call based, and processing is blockless.) The sender never
 knows when a VVID loses its voice, and can't even be sure a VVID
 *gets* a voice in the first place. Thus, it can't rely on anything
 that has a fixed relation to physical synth voices. 
 
<Arguing my model>
I think it is fair to say that, for the duration of a block, the sender
can assume a voice allocation succeeds. The only time a VID is ever virtual
is during the creation block. The sender can assume that the negative VID
exists for that block, and at the end of the block's run() it will know
whether it can send any further events to that VID.
I think this protocol is not so insane.  At least no more insane than the
VVID allocation scheme.
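
To make the negative-VID protocol above concrete, here is a rough sketch in C. None of these names come from any real API: Synth, synth_create(), synth_run(), synth_resolve() and the tiny MAX_VOICES limit are all made up for illustration. It also assumes that a successful allocation turns the negative VID into a positive one, which is only one way of letting the sender "know" after run().

```c
#include <assert.h>
#include <string.h>

/* All names and limits here are hypothetical, for illustration only. */
enum { MAX_VOICES = 2, MAX_PENDING = 8, VID_NONE = 0 };

typedef struct {
    int voice_busy[MAX_VOICES];  /* real VID n maps to voice_busy[n - 1] */
    int temp_vid[MAX_PENDING];   /* negative VIDs created in this block */
    int real_vid[MAX_PENDING];   /* filled in by synth_run() */
    int n_pending;
} Synth;

void synth_init(Synth *s)
{
    memset(s, 0, sizeof *s);
}

/* Sender: queue a voice-creation event under a negative, sender-chosen
 * VID. For the rest of this block the sender treats that VID as valid.
 * Returns 0 on success, -1 if the creation queue is full. */
int synth_create(Synth *s, int temp_vid)
{
    assert(temp_vid < 0);
    if (s->n_pending == MAX_PENDING)
        return -1;
    s->temp_vid[s->n_pending] = temp_vid;
    s->real_vid[s->n_pending] = VID_NONE;
    s->n_pending++;
    return 0;
}

/* Synth: process one block. Each pending creation gets the first free
 * voice, if any; otherwise it stays at VID_NONE. */
void synth_run(Synth *s)
{
    for (int i = 0; i < s->n_pending; i++) {
        for (int v = 0; v < MAX_VOICES; v++) {
            if (!s->voice_busy[v]) {
                s->voice_busy[v] = 1;
                s->real_vid[i] = v + 1;
                break;
            }
        }
    }
}

/* Sender: after run() returns, find out what a negative VID became.
 * Returns a positive real VID, or VID_NONE if allocation failed. */
int synth_resolve(const Synth *s, int temp_vid)
{
    for (int i = 0; i < s->n_pending; i++)
        if (s->temp_vid[i] == temp_vid)
            return s->real_vid[i];
    return VID_NONE;
}

/* Start of the next block: negative VIDs from the last one expire. */
void synth_end_block(Synth *s)
{
    s->n_pending = 0;
}
```

With two voices and three creations in one block, the sender addresses all three negative VIDs during the block and learns after run() that only the first two survived. It never has to guess about a VID beyond the block in which it was created.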