Say So

by Doja Cat

This production immediately brought to mind Dua Lipa’s recent ‘Don’t Start Now’, but unfortunately I can’t say the low end fares well in direct comparison. Next to the tightly controlled sub-60Hz solidity of Dua Lipa’s kick, Doja Cat’s underpowered and slightly deflated-sounding low end makes the song come across as a bit plasticky. And the fact that (judging by the digital-download files) the master of ‘Don’t Start Now’ seems to have an integrated loudness almost 1dB lower than that of ‘Say So’ won’t help either, because it’ll likely mean that Ms Lipa’s audio will exhibit louder peaks, and consequently even greater punch, under loudness-matched streaming playback conditions.
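To make the loudness-matching arithmetic concrete, here is a minimal Python sketch. The integrated-loudness and peak figures are illustrative placeholders rather than measurements of either master, the -14 LUFS target is just an assumed normalisation level, and the function name is hypothetical. It also assumes both masters peak at about the same level before normalisation, which is common for heavily limited pop masters but is not confirmed for these two files.

```python
def normalised_peak_db(integrated_lufs, peak_dbfs, target_lufs=-14.0):
    """Peak level after a player applies loudness-matching gain (hypothetical helper)."""
    gain_db = target_lufs - integrated_lufs  # gain needed to hit the target loudness
    return peak_dbfs + gain_db


# Placeholder figures only: the quieter master is 1dB lower in integrated loudness,
# and both masters are assumed to peak at 0dBFS before normalisation.
quieter_master = normalised_peak_db(integrated_lufs=-9.0, peak_dbfs=0.0)
louder_master = normalised_peak_db(integrated_lufs=-8.0, peak_dbfs=0.0)

peak_advantage_db = quieter_master - louder_master
print(f"Peak advantage of the quieter master: {peak_advantage_db:.1f}dB")
```

Under those assumptions, the choice of target level makes no difference to the outcome: the gap in post-normalisation peak level simply equals the gap in integrated loudness.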

While this kind of thing may not concern her legion of new TikTok fans (following the viral dance challenge that launched the song into the chart stratosphere), it nonetheless feels like a bit of a miscalculation, because it’s not like the weaker low end has been traded for any significant improvement in mass-market translation, as far as I can hear. Indeed, the main vocal hook of ‘Say So’ actually projects much less well on single-speaker mobile devices than that of ‘Don’t Start Now’, on account of the lesser mono-compatibility of Doja Cat’s panned vocal layers. To be fair, though, if you check the stereo sides signal of ‘Say So’, you’ll notice that it’s mostly the upper harmony line, not the main melody, that’s being sacrificed when the mix is mono-summed.
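The mid/sides check mentioned above can be sketched in a few lines of Python. This is a toy illustration, not a model of the actual mix: the two sine tones, their levels, and the hard-left panning of the 'harmony' are all assumptions, and the function name is hypothetical. It simply shows why a centred part is absent from the sides signal, while a panned part appears there and loses level when the channels are summed to mono.

```python
import math


def mid_side(left, right):
    """Split a stereo pair into mid (mono sum) and sides (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    sides = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, sides


SAMPLE_RATE = 48000
indices = range(480)  # 10ms of audio

# Assumed toy parts: a centred 'melody' tone and a hard-left 'harmony' tone.
melody = [math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in indices]
harmony = [math.sin(2 * math.pi * 586.67 * i / SAMPLE_RATE) for i in indices]

left = [m + h for m, h in zip(melody, harmony)]  # melody plus panned harmony
right = list(melody)                             # melody only

mid, sides = mid_side(left, right)

# In mono, the centred melody keeps its full amplitude, but the hard-panned
# harmony is halved, which is a drop of about 6dB.
harmony_drop_db = 20 * math.log10(0.5)
print(f"Harmony level change in mono: {harmony_drop_db:.1f}dB")
```

In this toy case the sides signal contains nothing but the harmony tone at half amplitude, which is the same kind of evidence you'd be looking for when soloing the sides signal of a real mix.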

All of which makes me reconsider the song’s filtered intro and outro in a fresh light. You see, if this song had launched directly into its main groove from the outset, the listener would have been treated to a potentially unforgiving direct comparison between its full-texture production sonics and those of whatever track preceded it in their playlist – quite possibly ‘Don’t Start Now’, in fact, given the strong stylistic similarities. With an eight-second buffer of band-pass-filtered intro, though, the listener’s ear has time to adapt, and will treat the intro as the sonic point of reference from which to judge the sound of the first hook section that enters at 0:09. And because that intro is band-pass filtered, the track will seem super bassy and super bright by contrast. By the same token, the 17-second outro is also filtered, preventing side-by-side sonic comparison with any subsequent track in the playlist. Now it might be fanciful on my part, but I can’t help wondering whether this tactic might be symptomatic of a crisis of confidence on the part of the producers…

Further unflattering comparisons can be found in the long-term dynamics of the two tracks. Where ‘Don’t Start Now’ delivers a strong sense of build-up and momentum through its timeline, with each successive chorus offering additional arrangement interest, ‘Say So’ struggles to deliver much in the way of forward drive, not least because all six of its chorus iterations are pretty much identical! Now, as a mix engineer, I’m well used to listening to the same song for days at a time, but I got sick of this one in short order, which doesn’t bode well for its long-term appeal once its 15 seconds of 15-second-dance-choreo fame have passed.

On the other hand, perhaps this is just a hard-headed business decision. Why expend unnecessary budget developing any facet of the production that requires more than half a minute to appreciate, if you reckon that’s the upper limit of your target market’s likely attention span? In a sense, it’s just an intensification of the same reasoning that’s long encouraged pop producers to favour three-minute song durations. Or maybe the Xeroxed choruses are just about providing maximum convenience for dance-along video creators, who can bop over any of the choruses without any worries that they might have chosen the wrong one…

The main chorus melody is thought-provoking too, as my Project Studio Tea Break co-host Jon Whitten pointed out to me the other day, in that it’s doubled throughout at the interval of a perfect fourth. This isn’t something you hear every day, and it lends things a certain East Asian flavour, doubtless partly on account of the notes all forming part of an A major pentatonic scale. This scale itself is contained within the prevailing E Dorian mode, but sensibly omits the note G, so there’s no danger that any of the parallel motions will generate a less palatable dissonant tritone with C#.
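The scale relationships here are easy to check by treating note names as pitch classes (semitones above C). The following Python sketch is only an illustration, and its helper and constant names are hypothetical. It confirms that A major pentatonic sits inside E Dorian, and lists which pairs of notes in each scale lie a tritone (six semitones) apart.

```python
# Pitch classes: semitones above C.
NOTE_TO_PC = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}

E_DORIAN = ["E", "F#", "G", "A", "B", "C#", "D"]
A_MAJOR_PENTATONIC = ["A", "B", "C#", "E", "F#"]


def tritone_pairs(notes):
    """Return every pair of notes in the list that lies six semitones apart."""
    pairs = []
    for i, first in enumerate(notes):
        for second in notes[i + 1:]:
            if (NOTE_TO_PC[first] - NOTE_TO_PC[second]) % 12 == 6:
                pairs.append((first, second))
    return pairs


pentatonic_is_subset = set(A_MAJOR_PENTATONIC) <= set(E_DORIAN)
dorian_tritones = tritone_pairs(E_DORIAN)
pentatonic_tritones = tritone_pairs(A_MAJOR_PENTATONIC)

print("Pentatonic contained in E Dorian:", pentatonic_is_subset)
print("Tritones in E Dorian:", dorian_tritones)
print("Tritones in A major pentatonic:", pentatonic_tritones)
```

Under these definitions, the only tritone available in E Dorian is the G/C# pair, and dropping G (along with D) leaves a five-note scale with no tritones at all.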

Another intriguing thing about this doubling is the question of which line is the main melody. Both of them feature a lot of non-triad notes against the underlying Em-A-D(-Bm) harmonic progression, but perhaps the top line could be said to be a little more consonant. On the other hand, the lower line feels a little fuller in the mix balance. So, basically, I think your guess is as good as mine! Maybe the real answer, though, is that both the melodies are the main melody, and that we just have to resign ourselves to never being able to sing along to 100 percent of it? And does it really matter anyway? If each part of the melody appeals to different listeners, then who cares about the nomenclature? There are plenty of Simon & Garfunkel songs (‘The Boxer’ immediately springs to mind) where different people will have different opinions about which singer is carrying the ‘lead melody’ at any given point, and it certainly didn’t hurt them!