You don't need W/C (word clock) to accurately sync video to audio. Except for field recorders such as Zaxcom or Sound Devices, most recorders don't have W/C anyhow. Timecode is often sent to and from cameras and recorders. That makes it easier to spot the sections you need in post and line stuff up.
Music is easy to sync up in post. Often you don't have the ability to slate the cameras and recorder, but there are so many percussive elements that it's pretty easy.
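If you'd rather let software find the offset than eyeball the waveforms, here's a rough sketch of the idea: slide one track against the other and keep the lag where the percussive hits line up best (a brute-force cross-correlation). The function name `best_offset` and the toy click track are made up for illustration; real sync tools do the same thing far more efficiently on actual audio.

```python
def best_offset(reference, other, max_lag):
    """Return the lag in samples at which `other` best matches `reference`.

    A positive result means events show up that many samples later in `other`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy "percussive" track: three clicks at uneven spacing (hypothetical data).
reference = [0.0] * 200
for pos in (20, 75, 140):
    reference[pos] = 1.0

# Camera audio: the same clicks, starting 7 samples late.
delay = 7
other = [0.0] * delay + reference[:-delay]

offset = best_offset(reference, other, max_lag=50)
print(offset)
```

Sharp transients like snare hits are what make the peak unambiguous, which is why music lines up so easily.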
The drift between devices comes from the crystal clocking: each device runs off its own crystal, and no two run at exactly the same rate. For music it's of little worry since things come in 3-6 minute chunks, unless it's orchestral. A nudge here and there, if even needed, is all it takes to fix. I ran some tests on the audio equipment I use for TV and concert production to check for clocking drift: Zoom H4n, H6, Pro Tools, Sound Devices and Roland, all running autonomously but recording the same signal at the same time. After 45 minutes the recorders were pretty much all within 2 frames of each other.
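To put that test result in perspective, here's a quick back-of-the-envelope calculation. It assumes 30 fps (I'm not pinning down the frame rate here, so swap in your own) and that drift builds up linearly over time; the function names are just for illustration.

```python
def drift_ppm(drift_frames, fps, elapsed_seconds):
    """Convert an observed drift in frames into parts per million."""
    drift_seconds = drift_frames / fps
    return drift_seconds / elapsed_seconds * 1_000_000


def drift_frames_after(ppm, fps, elapsed_seconds):
    """Frames of drift expected after `elapsed_seconds` at a given ppm error."""
    return ppm / 1_000_000 * elapsed_seconds * fps


FPS = 30  # assumed frame rate

# Worst case from the test: 2 frames apart after 45 minutes.
ppm = drift_ppm(2, FPS, 45 * 60)

# What that means for a single 5-minute song.
song_drift = drift_frames_after(ppm, FPS, 5 * 60)

print(f"{ppm:.1f} ppm, {song_drift:.2f} frames over 5 minutes")
```

Under those assumptions it works out to roughly 25 ppm, or less than a quarter of a frame across a 5-minute song, which is why a nudge is rarely needed.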
The "butt plug" transmitters can work just fine. Typically the cameras receive a "hop" from the audio mixer, with their receivers all set to the same frequency.
I've done many music/concert shoots where the cameras just used their onboard mics and the audio was replaced later. If they happen to want to edit and just use the camera sound then you'd have to nail it pretty well and hope for a clean RF.
A redundant recorder, like a Zoom, sure gives some peace of mind.
You can get some splitters and do your own thing too.
Are the cameras going to a truck or switcher? How many cameras?
How are you distributing the audio to all the cameras?
Is there no snake connecting FOH with the stage?