Since no one else seems to have addressed it: you noted the two variations on this. In L/C/R there are three output buses, and signal routing to them is via panning. As you pan a signal from left to center to right, it literally pans across the three outputs. At center the signal goes only to the center output, with nothing in left or right; at full left it goes only to the left output; and between left and center it is split between the left and center outputs.
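To make the L/C/R pan behavior concrete, here is a rough sketch in Python. The function name `lcr_pan_gains`, the -1.0 to +1.0 pan range, and the constant-power (sine/cosine) pan law are all my own assumptions for illustration; actual consoles use different pan laws and some offer a divergence control, so treat this as a model of the idea rather than how any particular desk does it.

```python
import math

def lcr_pan_gains(pan):
    """Return (left, center, right) gains for a pan position.

    pan: -1.0 = full left, 0.0 = center, +1.0 = full right (assumed range).
    Uses a constant-power law between the two adjacent outputs (an assumption).
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    # How far the pan has moved from the outer speaker toward center:
    # 0.0 = fully on the outer speaker, 1.0 = fully on center.
    toward_center = 1.0 - abs(pan)
    angle = toward_center * math.pi / 2
    outer = math.cos(angle)   # gain of the left or right output
    center = math.sin(angle)  # gain of the center output
    if pan < 0:
        return (outer, center, 0.0)  # left half: only L and C are fed
    return (0.0, center, outer)      # right half: only C and R are fed

for position in (-1.0, -0.5, 0.0, 0.5, 1.0):
    left, center, right = lcr_pan_gains(position)
    print(f"pan {position:+.1f}: L={left:.3f} C={center:.3f} R={right:.3f}")
```

Note that at no pan position does the signal feed all three outputs, and the opposite side is always silent.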
In L/R + Mono you still have three output buses, but whether something is L/R or mono is a result of an output bus assignment rather than a pan. You can assign a signal to the L/R bus, the mono bus, or both. If a signal is assigned L/R and panned center, it comes equally from both left and right and not from the mono output. A mono send is just that, and pan does not affect it.
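The same kind of sketch for L/R + Mono shows the difference: the bus assignment decides where the signal goes, and pan only acts on the L/R pair. Again, `lr_mono_gains`, the `to_lr` / `to_mono` flags, the pan range, and the constant-power pan law are hypothetical names and assumptions of mine, not any console's actual implementation.

```python
import math

def lr_mono_gains(pan, to_lr=True, to_mono=False):
    """Return (left, right, mono) gains for one channel.

    pan: -1.0 = full left, 0.0 = center, +1.0 = full right (assumed range).
    to_lr / to_mono: bus assignment switches (hypothetical names).
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    left = right = mono = 0.0
    if to_lr:
        # Constant-power stereo pan (an assumption): equal gains at center.
        angle = (pan + 1.0) * math.pi / 4
        left = math.cos(angle)
        right = math.sin(angle)
    if to_mono:
        # The mono send ignores the pan control entirely.
        mono = 1.0
    return (left, right, mono)

# Assigned to L/R only and panned center: equal left and right, no mono.
print(lr_mono_gains(0.0, to_lr=True, to_mono=False))
# Assigned to mono only: pan has no effect on the result.
print(lr_mono_gains(-1.0, to_lr=False, to_mono=True))
```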
L/R + Mono is often used to have a mono center speech speaker/cluster/array and then split left and right stereo speakers/clusters/arrays. Using a single point for the speech reinforcement can reduce the timing and localization issues that would result if the same signal is reproduced from multiple locations. At the same time, the L/R arrangement allows for greater stereo separation for stereo playback. Since the center speaker/cluster/array is typically used primarily for speech sources, the speaker array components used are often selected based on voice reproduction (maybe a 12" woofer and usually no sub) while the left and right speaker array components are typically selected based more on music reproduction.
L/C/R also uses three speakers/clusters/arrays and allows for greater flexibility in imaging. However, in many systems panning a single channel source across multiple speakers can result in comb filtering and other anomalies, so these systems have to be designed with this in mind.
In both approaches, for the system to work properly the left, center and right speakers/clusters/arrays must each properly cover the listener area, as others have already noted.
As for vox and music causing comb filtering: you get summation and cancellation at different frequencies any time you mix two signals. However, comb filtering in speaker systems is normally caused by the same signal coming from multiple sources, so that the copies arrive at the listener at different times and therefore out of phase with one another. This issue affects the original signal itself rather than how that signal combines with different signals.
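For a feel of where the cancellations land when the same signal arrives from two speakers, here is a simplified sketch. It assumes two identical, equal-level arrivals and a speed of sound of 343 m/s, and ignores room reflections and level differences; the function name `comb_notch_frequencies` is just something I made up for the example. With those assumptions, the nulls fall at odd multiples of 1 / (2 x the arrival time difference).

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate, at around 20 degrees C (assumption)

def comb_notch_frequencies(path_difference_m, count=3):
    """Return the first `count` cancellation frequencies in Hz.

    path_difference_m: how much farther one speaker is from the listener
    than the other, in meters. Assumes identical, equal-level signals.
    """
    if path_difference_m <= 0:
        raise ValueError("path difference must be greater than zero")
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    # Nulls occur where the delay equals an odd number of half periods.
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]

# A listener 0.343 m closer to one speaker sees about 1 ms of offset,
# which puts the first nulls near 500 Hz, 1500 Hz and 2500 Hz.
for freq in comb_notch_frequencies(0.343):
    print(f"{freq:.0f} Hz")
```

The point is that the notch frequencies depend only on the arrival time difference between copies of the same signal, which is why this is a different thing from mixing vox with music.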