Again, this is something I sell so take this with a grain of salt.
HaiVision encoders and decoders work great for this application. HaiVision guarantees 70 ms encode-to-decode latency, so you can place an encoder and decoder at each campus and do responsive singing/reading. A pair will run about 13k for the HD version. It's a very simple setup. They also have a full range of recording and playback options through their Video Furnace line.
The catch is that you really need to buy a private network connection between campuses if you are going across town, something like an MPLS circuit from your cable company or telephone carrier.
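To put rough numbers on why the link matters, here is a back-of-the-envelope sketch in Python. The 70 ms is the encode-to-decode figure quoted above; the network delays are made-up illustrative values, not measurements from any real circuit, and the function names are just for this example.

```python
# Rough latency budget for responsive singing/reading between two campuses.
CODEC_MS = 70  # encode-to-decode figure quoted above


def one_way_ms(network_ms: float) -> float:
    """Delay from one campus to the other: codec plus one-way network delay."""
    return CODEC_MS + network_ms


def round_trip_ms(network_ms: float) -> float:
    """Call and response: campus A is heard at B, then B's reply is heard at A."""
    return 2 * one_way_ms(network_ms)


if __name__ == "__main__":
    # Hypothetical one-way network delays in milliseconds (assumed values).
    for network_ms in (5, 20, 60):
        print(
            f"network {network_ms} ms: one way {one_way_ms(network_ms)} ms, "
            f"call and response {round_trip_ms(network_ms)} ms"
        )
```

Whatever the link adds is paid twice in a call and response, which is why a connection with low, predictable delay is worth paying for.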
I'll shut up now.