I was recently asked to provide some insight into how Conor Maynard might have achieved the rich, upfront vocal tone on his YouTube cover of Shawn Mendes’s ‘Stitches’. Specifically, a problem with simple vocal-and-piano textures like this is that making the voice too bright or present can lead to harshness and cause difficulties getting it to blend with the piano.
Well, the first thing to point out is that we’re not hearing Maynard through the microphone shown in the video itself — his line has clearly been recorded separately. (If you need any convincing, check out 2:19, where the audio track and his mouth disagree on the first word of “I’ll be needing stitches”.) That said, the large-diaphragm condenser mic shown in the video might well have been used for the recording session, as the tone is very much what I’d expect from a microphone like that.
If you listen to the word “deeper” at around 0:31, the ‘p’ produces a bit of a ‘pop’, so I’d guess that Conor was working close enough to the microphone that the popshield was unable to sufficiently block the wind-blasting of his plosives, which suggests a distance of eight inches or less. This surmise is further supported by the richness of the vocal’s low end, which would have been strongly boosted by the proximity effect at that close range, and the audibility of extraneous mouth noises (eg. on “(be)fore” at 0:19 or “sore” at 0:26), which don’t project nearly as well as other elements of a sung performance. I suspect that the mic was placed directly in line with his mouth too, because plosive blasts tend to be pretty directional, and we probably wouldn’t have heard them had the mic been more off-axis. This kind of mic position is something of a mixed blessing in terms of a vocal’s high end, however: on the one hand it typically captures plenty of intimate, pop-tastic breathiness; but on the other hand it tends to overemphasise the sibilants (‘s’ and ‘sh’ sounds), fricatives (‘f’ and ‘th’ sounds), and noisy stop-consonants (‘t’ and ‘k’ sounds) as well. So one of the primary concerns when recording vocals of this type is how you deal with that.
One approach is to get the singer to replace these sounds with less harsh-sounding versions, for example by adjusting ‘s’, ‘f’, and ‘t’ sounds to more closely resemble ‘z’, ‘v’, and ‘d’. Although you might think this would quickly make nonsense of the lyrics, in practice listeners are remarkably tolerant of it. The exact choice of mic can make a big difference too, as most large-diaphragm condensers boost the high end of on-axis sounds, and the exact nature of this boost can vary a great deal between models.
Like almost every other mainstream pop vocal these days, this has almost certainly been buffed with judicious pitch-correction processing. If you don’t normally pitch-correct your vocal recordings, then this may be part of the reason they don’t seem to blend as well as this — pitching isn’t just a musical issue, because sounds that are in tune also sit better in the mix. It’s also clear that the level of the vocal has been extremely tightly controlled at mixdown, as likely as not by a combination of fairly heavy compression and detailed fader automation, and again this level solidity contributes to the sense that the vocal blends naturally with the piano — for example, it never seems to leap out further in front by being momentarily too loud.
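To make the compression half of that level control a little more concrete, here’s a minimal Python sketch of the static gain calculation inside a basic hard-knee compressor. The function name, threshold, and ratio are purely illustrative assumptions on my part, not settings taken from Maynard’s mix, and a real compressor also smooths its gain changes with attack and release times, which I’ve left out.

```python
def compressor_gain_db(level_db, threshold_db=-18.0, ratio=4.0):
    """Gain change in dB applied by a hard-knee downward compressor.

    threshold_db and ratio are hypothetical example settings.
    """
    if level_db <= threshold_db:
        return 0.0  # below threshold: signal passes unchanged
    overshoot = level_db - threshold_db
    # Above threshold, the overshoot is divided by the ratio,
    # so the gain change is negative (gain reduction).
    return overshoot / ratio - overshoot


# Print input level, gain change, and resulting output level.
for level in (-30.0, -18.0, -6.0):
    gain = compressor_gain_db(level)
    print(level, gain, level + gain)
```

The higher the ratio, the less the louder syllables can poke out above the threshold, and any remaining unevenness is what the fader automation is there to tidy up.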
As you’ve probably already realised, this kind of vocal sound needs additional high-frequency emphasis at mixdown, but the difficulty with that is avoiding overemphasis of the noisy consonants (especially sibilants) at the same time, so another key task is careful consonant management. If the recording’s been carefully done, a de-esser may be all that’s required to rebalance things, but in practice many mix engineers employ much more labour-intensive methods where maximum vocal brightness is required. For example, some will edit all the consonants onto a different track, and then boost the high frequencies only on the consonant-free track, while others use fader and EQ automation to adjust the character of each consonant manually.
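The split-track idea can be sketched in a few lines of Python, if that helps you picture it. This is only a rough illustration under my own assumptions: the consonant regions are marked by hand as sample ranges, the ‘EQ’ is a crude first-order filter standing in for a proper high shelf (so the real lift is smaller than the nominal boost_db figure), and the edits are hard cuts, where in a DAW you’d use short crossfades to avoid clicks. All the names and values here are hypothetical.

```python
import math


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """Crude first-order high-pass: the input minus a one-pole low-pass."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    low = 0.0
    out = []
    for x in samples:
        low = (1.0 - a) * x + a * low
        out.append(x - low)
    return out


def brighten_except_consonants(vocal, consonant_regions, boost_db=6.0,
                               cutoff_hz=8000.0, sample_rate=44100):
    """Brighten only the consonant-free part of a vocal.

    consonant_regions is a list of (start, end) sample index pairs,
    end exclusive, marked by hand (hypothetical edit points).
    """
    is_consonant = [False] * len(vocal)
    for start, end in consonant_regions:
        for i in range(max(0, start), min(end, len(vocal))):
            is_consonant[i] = True

    # "Edit" the vocal onto two tracks with hard cuts.
    sustained = [0.0 if c else x for x, c in zip(vocal, is_consonant)]
    consonants = [x if c else 0.0 for x, c in zip(vocal, is_consonant)]

    # Add extra high-frequency content to the consonant-free track only.
    extra = 10.0 ** (boost_db / 20.0) - 1.0
    highs = one_pole_highpass(sustained, cutoff_hz, sample_rate)
    bright = [x + extra * h for x, h in zip(sustained, highs)]

    # Sum the two tracks back together.
    return [b + c for b, c in zip(bright, consonants)]


# Usage: a made-up 1 kHz test tone with one region marked as a consonant.
tone = [math.sin(2.0 * math.pi * 1000.0 * n / 44100) for n in range(400)]
processed = brighten_except_consonants(tone, [(100, 200)])
print(len(processed))
```

The point of doing it this way is that the consonants never see the high-frequency boost at all, so you can push the brightness of the sung vowels much harder than a de-esser alone would comfortably allow.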
If you give all those steps enough time (and they do often take a while!), I don’t think you’ll actually find too much difficulty getting the vocal to blend in the final mix. Just a touch of feedback delay and plate reverb (as in Maynard’s case) should give you all the ‘glue’ and sustain you need. Do consider de-essing the sends to both of those effects, though, and EQ the reverb return to avoid a build-up of muddy low-mid range, given that the dry vocal has a lot of energy in that region.