Loudness in mastering – part 2



Continued from Part 1: https://alessandrofois.com/staging-a2/loudness-nel-mastering-parte-1-dinamica/


To keep background noise and other artefacts inherent in recording media (for example, the hiss of analogue tape) from becoming audible, engineers sought:

  • to keep the maximum peak of the recording as high as possible, but below the distortion point
  • to compress the “useful dynamic space” into a relatively narrow range, suitable for various types of use yet wide enough to reproduce the dynamic expressiveness of music with dignity

In subsequent years, particularly in the pop music sector, the production industry gradually reduced the dynamic range, compressing it more and more in order to increase the volume of the lowest moments of the performance dynamics, to the point of reducing the dynamic range used to just a few dB.

As we shall see, this phenomenon has accelerated significantly with the advent of digital media.

Over the course of approximately 20 years (from the 1990s to the 2010s), the need to compress music to ensure more convenient enjoyment gradually turned into a frantic race for perceived volume.

The aim, encouraged by producers, was to use loudness to outdo the sonic impact of competing music productions, which triggered a veritable volume war, known as the “Loudness War”.

Loudness war

The term loudness war therefore refers to the tendency of the music industry, fuelled by artists and producers, to produce and release music at loudness levels that rose year after year, in a continuous attempt to exceed the volume of releases by “competing” artists and record labels.

The introduction of digital signal processors and limiters of improved quality and extreme precision allowed sound engineers to significantly increase the perceived volume of a recording. Since “louder” was generally perceived by listeners as “better”, sound engineers, under pressure from producers, tried to push the volume to the maximum possible limit, which led the recording industry into the volume war.

Many music professionals, especially sound engineers and artists, believe that this trend has sacrificed sound quality and dynamic expression in order to obtain high volume levels on audio media.

Above are the waveforms of a track released in 1980 and remastered several times in subsequent years: loudness increases in the 2001 version, peaks at the height of the loudness war in 2005, and returns to a high but more moderate level in 2011.

The analogue era

This procedure was not used before the advent of digital technology, partly due to the physical limitations inherent in the mechanical cutting of vinyl records.

In truth, even with vinyl, some records sounded louder than others, depending on the natural dynamics of the various musical genres and on the different mastering techniques then in use, but each disc was “an island unto itself”, and the only perceived need was to:

  • ensure that all tracks on the same album had a proportionate volume, staying within a dynamic range above the background noise level, never distorting on the highest peaks, and keeping this dynamic space wide enough for proper dynamic musical expression.

In the case of compilation albums (containing previously released tracks from different albums and sometimes even by different artists), the tracks were remastered to level their perceived volume and tonal balance, in order to achieve greater “homogeneity” between the contents of the album.

In the analogue era, those who wanted to listen “louder” could simply turn up the amplification of their own playback system, adjusting the listening volume each time they changed the record on the turntable to suit their listening needs at that moment.

The only limitation was determined by the amplification power and the mechanical resistance of the loudspeakers in the reproduction system.

With the advent of audio cassettes, listening habits did not change substantially: it was still left to the individual end user's volume knob to set the sound intensity according to the listener's preferences.

The digital era

For a certain period, CDs were listened to in essentially the same way, and this routine continued for most of the 1980s, a decade that nevertheless saw a significant, though gradual and moderate, increase in loudness.

The “Volume War” seems to have really begun in the 1990s, with the spread of multi-disc CD players installed in cars, which allowed users to skip from one track to another and even from one CD to another. This mode of use highlighted the differences in volume between one disc and another.

Specifically, the race for volume intensified when producers realised that users of multi-disc CD players often used a “free” playback mode, switching between tracks on different CDs, which forced them to constantly adjust the listening volume, something particularly unpleasant when driving.

If the user did not change the listening volume, tracks with greater dynamic range (which were perceived as having a lower average volume) were penalised, appearing “thinner” in comparison to others that sounded louder.

This observation was the decisive factor that prompted various producers and artists to demand a solution from sound engineers, namely extreme compression of the master, pushing it to ever higher levels.

The phenomenon continued with the advent of portable players and USB sticks, becoming unsustainable within a few years and provoking complaints from many sound engineers and artists, who called for the identification of a reference standard that would better respect sound quality, music and its dynamics.

Consequences on audio

Since the sound level of an audio file cannot exceed a certain limit (0 dB digital, i.e. full scale), the overall volume can only be increased by reducing the dynamic range and subsequently “normalising the level of the track” (thus bringing the maximum peak to the highest level that digital sampling tolerates, i.e. close to 0 dB).

This was achieved by “compressing upwards” the dynamics in an increasingly extreme manner, which increasingly compromised the peaks, caused audible distortions of various kinds and led to the almost total loss of dynamic expressive modulation.
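The mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the test signal (a quiet passage followed by a loud one) and the crude hard clipper standing in for a limiter are hypothetical, and real mastering limiters are far more refined. It shows that, once peaks are flattened and the track is normalised again, the peak stays at full scale while the average (RMS) level rises.

```python
import math

RATE = 48000  # sample rate in Hz (assumption for this sketch)


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


def peak_db(samples):
    """Sample peak level in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))


def clip_and_normalise(samples, threshold):
    """Hard-clip at +/-threshold, then rescale so the peak returns to 1.0."""
    clipped = [max(-threshold, min(threshold, s)) for s in samples]
    peak = max(abs(s) for s in clipped)
    return [s / peak for s in clipped]


# Hypothetical programme: one second of quiet 440 Hz tone, then one second of loud tone.
quiet = [0.1 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]
loud = [1.0 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]
original = quiet + loud
squashed = clip_and_normalise(original, threshold=0.25)

for name, signal in (("original", original), ("squashed", squashed)):
    print(f"{name}: peak {peak_db(signal):6.2f} dBFS, RMS {rms_db(signal):6.2f} dBFS")
```

Both versions peak at the same level, but the processed one has a higher average level and a loud passage whose waveform has been flattened, which is exactly where the distortion comes from.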

Adverse effects

  • Music with a reduced dynamic range was found to be stressful and lacking in expression.
  • Excessive peak smoothing produced many “noise points”, which became denser and more audible as compression increased. In the worst cases, it was as if a “continuous ferrous background noise” had been created, similar to “white noise”.

Concrete positive effects

  • Greater usability of audio content when listening in noisy environments

N.B.

For years, sound engineers have been forced to bend over backwards to satisfy the demands of their clients.

In order to minimise the damage caused by excessive compression, they have learned to optimise processes as much as possible, including through:

  • the use of multiband compression
  • step-by-step automation of compression values
  • analogue and valve compression techniques (or digital with analogue emulation), in order to create more harmonious “saturation walls”.

However, this has also resulted in the production of “sound monsters” which, in the opinion of many, are intolerable.

The solutions

Eager to put an end to the volume war, towards the end of the 1990s the sound engineer Bob Katz developed a criterion called the K-System.

K-System

The K-System (named after Bob Katz) is a protocol for calibrating mix levels and monitoring in an audio studio.

Although loudness standards such as EBU R128, which use a LUFS/dB scale, are more widely used today, as we shall see, the K-System, which uses an RMS/dB scale, is still a good way to set audio levels.

This system uses three different standards, known as K-20, K-14 and K-12.

These numbers express the amplitude of the song's dynamic range in dB RMS, so that at each step (from K-20 to K-12), the available dynamic range decreases and the loudness (understood as the perceived average volume) increases.

The “label” display at the top of the meter scale must indicate the maximum level expected for the target (20 dB or 14 dB or 12 dB) and, just as with normal measurement, corresponds to the digital signal at full scale.

In order to function properly, the system requires the monitor listening level to be carefully calibrated, so that a signal sitting at the 0 dB mark of the meter corresponds to 85 dB SPL.

This is, in fact, the ideal reference condition for mixing and mastering at K-20, K-14 and K-12.

The K-System meters show both the peak level and the RMS level.

The red upper part of the meters is the area of maximum intensity.

In music recording, the RMS level should only reach the red zone during the most intense passages, at climaxes, and during occasional peaks. 

In fact, tests carried out by Katz himself with samples of listeners found that if the level sits in the red zone all the time, you may feel the need to reduce the monitor gain.

Here are some details of the three measurements:

K-12

This level was designed exclusively for radio broadcasting.

With this, it follows that -12 dBFS = 0 VU = 85 dB SPL.

The headroom limited to just 12 dB explains its exclusive use for compressed audio material intended solely for broadcast transmission, although it was subsequently also used for finalising more intense musical genres, such as dance music (especially electronic dance music) and a certain type of pop music.

K-14

This was to be the standard for most commercial pop recordings, created for home and private listening in general.

Pop music mixes are examples of material suitable for K-14, where -14 dBFS = 0 VU = 85 dB SPL.

The headroom margin is 14 dB.

The K-14 scale was probably the most widely used of the three standards.

K-20

This scale offers the widest dynamic range available among the three systems.

It was designed primarily for large theatre mixes, dynamic music mixes, cinema, television broadcasting, and mixes of classical and traditional styles.

Any audio programme with a wide dynamic range should have been aligned with the K-20 standard.

This means that -20 dBFS = 0 VU = 85 dB SPL, with a headroom of 20 dB. 
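The relationship between the three scales comes down to simple arithmetic, sketched below in Python. The function names are hypothetical and the 85 dB SPL calibration is the figure given in this article; this is a reading aid, not a replacement for a real K-System meter.

```python
# Headroom, in dB, between the meter's 0 mark and digital full scale.
K_SCALES = {"K-12": 12, "K-14": 14, "K-20": 20}

# Monitor calibration at the meter's 0 mark, as stated in the text.
REFERENCE_SPL = 85.0


def k_reading(rms_dbfs, scale):
    """Reading on the chosen K scale for an RMS level given in dBFS."""
    return rms_dbfs + K_SCALES[scale]


def expected_spl(rms_dbfs, scale):
    """Approximate listening level in dB SPL on a monitor calibrated for that scale."""
    return REFERENCE_SPL + k_reading(rms_dbfs, scale)


for scale in K_SCALES:
    # The same -14 dBFS RMS passage reads differently on each scale.
    print(scale, k_reading(-14.0, scale), expected_spl(-14.0, scale))
```

A passage at -14 dBFS RMS sits exactly on 0 with K-14, in the red zone with K-20 and below 0 with K-12, which is why the choice of scale pushes the engineer towards more or less dynamic range.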


Schematic representation of the loudness scale devised by Bob Katz. A good alternative reference for countering the loudness war, later replaced by LUFS measurement following broadcasting regulations and the advent of streaming platforms. The three highest points of the red zones (shown here in dark grey) are aligned with the 0 dB level of the digital scale.

In short, the aim was to establish the reference dynamic range for various types of listening. For years, some sound engineers (few, in truth) aligned themselves with the criteria proposed by Engineer Katz, while most of them, pressured by producers, continued to operate with loudness “at full throttle”.

LUFS

Meanwhile, from 2006 onwards, the ITU and the European Broadcasting Union gradually developed a protocol aimed at limiting the loudness war, finally defining a measurement standard with its own unit of measurement, which would allow for better analysis of the audio signal, interpreting it primarily in terms of perception, in order to produce masters with standard characteristics.

The unit of measurement in question was called LKFS, and was then renamed LUFS by the European Broadcasting Union in the document EBU R128 (first published in 2010 and revised in 2014).

This current measurement system allows audio files to be analysed no longer on the basis of the RMS scale, but with a different protocol whose measurement criterion is very similar to RMS, with added variables that take into account the psychoacoustic perception of the average listener.

The acronym LUFS stands for “Loudness Units relative to Full Scale”.

It was originally a volume standard designed to enable the normalisation of audio levels for television broadcasting.

LUFS is standardised in a set of algorithms designed to measure the loudness of the audio programme and its “true peak” level (for further information, please refer to ITU-R BS.1770 and the subsequent revisions introduced between 2011 and 2015).

LUFS are measured on an absolute scale, and one loudness unit (LU) corresponds to one decibel (dB).

The new system was perfected in the following years, and the standard of:

-23 LUFS (European Broadcasting Union)

established itself in broadcasting contexts, also involving (in part) the film industry.

The level of:

-1 dBTP

instead became the standard for the maximum peak of the audio programme, thus ensuring a safety margin against any risk of clipping.

Soon the standard became a statutory provision, but one binding only on broadcasting operators.

The recording industry, however, turned a deaf ear to this suggestion, as no producer would want to release recordings that sound “quieter” than those of their competitors.

The streaming revolution

A new element was therefore needed, one so decisive that it would dissuade the recording industry from continuing with the loudness war.

The opportunity arose with the spread of streaming platforms, a phenomenon that had already spread worldwide by 2019, and therefore had enormous power in determining the “de facto” imposition of a standard.

In order to ensure highly homogeneous audio reproduction, streaming platforms must be able to reproduce every musical genre sufficiently well, from the most rarefied and delicate classical music to the densest and most intense heavy metal.

These platforms must offer:

  • a sufficiently constant average perceived listening volume for all tracks in their “catalogue”, even though these volumes are extremely heterogeneous
  • an acceptable dynamic range, for the purposes of sufficient respect for musical expressiveness
  • distortion-free sound

This has led to the “de facto” imposition of certain loudness standards, with values that are very similar but, at present, not identical across platforms.

Whatever the original loudness of the pieces hosted on a streaming platform, they will always undergo automatic checks and, if they do not comply with the criteria imposed by that platform, will be automatically processed to bring them to the required loudness values.

To this end, the platform will automatically reduce the overall volume of audio files with excessive loudness in order to achieve a satisfactory listening level for all tracks in the platform's catalogue.

It is clear that, given the above, excessively compressing an audio file will only flatten its dynamics, and with them the purity of the sound, without having any real effect on the loudness that listeners will perceive.

This is gradually discouraging producers from continuing with the senseless Volume War, directing them to produce their masters with broader and more relaxed dynamics.

N.B.

While volume reduction is guaranteed, there is no guarantee that audio files with a loudness below the standard will be enhanced. 

Consequently, it will generally be preferable to give tracks a slight excess of loudness rather than the opposite (for example, assuming that -14.0 is the standard for a specific platform, a loudness between -13.5 and -14.0 would be advisable rather than between -14.0 and -14.5).
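The behaviour just described can be summarised in a small Python sketch. The function name and the `boost_quiet` switch are hypothetical, and the -14 LUFS default is simply the example value used above; each platform applies its own rules.

```python
def playback_gain_db(track_lufs, target_lufs=-14.0, boost_quiet=False):
    """Gain (in dB) a platform would apply to a track of the given integrated loudness.

    Loud tracks are always turned down to the target. Quiet tracks are only
    turned up when boost_quiet is True, since that is not guaranteed.
    """
    gain = target_lufs - track_lufs
    if gain > 0 and not boost_quiet:
        return 0.0
    return gain


# A "loudness war" master at -8 LUFS is simply turned down by 6 dB.
print(playback_gain_db(-8.0))
# A master at -13.5 LUFS is trimmed by half a dB and lands on target.
print(playback_gain_db(-13.5))
# A master at -14.5 LUFS may be left as it is, ending up slightly quieter than the rest.
print(playback_gain_db(-14.5))
```

This is why a slight excess of loudness is the safer side to err on: the excess is removed for certain, while a deficit may never be made up.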

Streaming platforms are not yet perfectly aligned on one common standard: they currently range from -13 LUFS (e.g. YouTube, which has the most compressed dynamics) to -16.5 LUFS (e.g. Apple Music, which has the most extended dynamics).

The trend seems to be settling around a possible single standard of -14 LUFS, which is the one proposed by Spotify, currently the most important music streaming platform in the world.

For this reason, other “minor” streaming companies tend to align themselves with it, further promoting the definitive establishment of this measure, which will likely become the sole and definitive standard.

Likewise, the leading makers of plugins for the dynamic finalisation of masters set their defaults to -23 LUFS for broadcasting and -14 LUFS for music streaming, often equipping their software with a dedicated level indicator, thus contributing to the establishment of this standard.

This does not exclude the possibility of producing several dedicated masters, with different loudness levels, to better suit each streaming platform.

Reference loudness

Before finalising, it is first necessary to clarify the three types of measurement in LUFS useful for the purposes of our analysis:

Momentary loudness meter

Similar to a traditional analogue VU meter, it shows volume fluctuations in real time with a moderate reaction inertia (approximately 400 ms), ideal for convenient level reading.

Very useful for viewing peak levels in order to assess the need for more or less significant interventions in the preliminary limiting of the mix.

Short-term loudness meter

It expresses the average sound level, calculated over a short time window of approximately 3 seconds.

Very useful for easily and smoothly monitoring the overall audio levels.

It is characterised by a slowed-down response, somewhat similar to the “temporary memory” of many LED meters.

Integrated loudness meter

It expresses the actual target value, measured over the whole programme, according to the EBU and ITU reference and regulatory parameters.

N.B.

In the revision of the ITU-R BS.1770 standard, the concept of “Loudness Gated” measurement was added, which “intelligently” reduces the measurement of performance pauses and musical passages with a particularly low level.

A good integrated loudness meter will take this parameter into account in order to avoid obtaining abnormal results.
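To make the idea of gating more concrete, here is a deliberately simplified Python sketch. It is not a compliant meter: it works on a single channel and omits the K-weighting filter that ITU-R BS.1770 applies before measuring, so its figures are not true LUFS values. The function names are hypothetical. What it does follow is the overall scheme: 400 ms blocks overlapping by 75%, an absolute gate at -70 and a relative gate 10 LU below the level of the blocks that passed the first gate.

```python
import math


def block_loudness(mean_square):
    """Level of a block from its mean square (K-weighting omitted in this sketch)."""
    if mean_square <= 0:
        return float("-inf")
    return -0.691 + 10 * math.log10(mean_square)


def block_mean_squares(samples, rate, window_s=0.4, overlap=0.75):
    """Mean square of overlapping blocks; 0.4 s gives 'momentary' style blocks."""
    size = round(window_s * rate)
    step = round(size * (1 - overlap))
    return [
        sum(s * s for s in samples[i:i + size]) / size
        for i in range(0, len(samples) - size + 1, step)
    ]


def integrated_loudness(samples, rate):
    """Gated average level over the whole programme (simplified, mono, unweighted)."""
    blocks = block_mean_squares(samples, rate)
    # Absolute gate: discard blocks that are practically silent.
    kept = [ms for ms in blocks if block_loudness(ms) > -70.0]
    if not kept:
        return float("-inf")
    # Relative gate: discard blocks more than 10 LU below the remaining average.
    threshold = block_loudness(sum(kept) / len(kept)) - 10.0
    kept = [ms for ms in kept if block_loudness(ms) > threshold]
    return block_loudness(sum(kept) / len(kept))


# Hypothetical programme: two seconds of tone followed by two seconds of silence.
RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / RATE) for n in range(2 * RATE)]
programme = tone + [0.0] * (2 * RATE)
print(round(integrated_loudness(programme, RATE), 2))
```

Thanks to the gates, the two seconds of silence barely affect the result, whereas a plain average over the whole file would come out about 3 dB lower. Calling `block_mean_squares` with `window_s=3.0` gives the equivalent of the short-term windows described above.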

True Peak level

Digital audio processing, partly due to ultra-fast limiting and problematic clipping, can produce inter-sample peaks.

N.B.

After D/A conversion, the analogue equivalent of such a passage would reveal a signal higher than the actual value of the samples, as clearly shown in the figure below.


A peak like this is also called a true peak, and its level the True Peak Level.

Depending on the quality of the D/A converter used in playback, these peaks could cause audible distortion.

Of course, it will always be better to prevent or minimise inter-sample peaks and at the same time ensure that every audio peak, normal and inter-sample, remains “truly” within the maximum undistorted limit of 0 dB digital.
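A classic textbook case helps to see why inter-sample peaks exist. The Python sketch below uses a sine wave at one quarter of the sample rate, shifted by 45 degrees, so that every sample lands beside the crest rather than on it. As an assumption of this sketch, the known continuous waveform is simply evaluated on a finer time grid to stand in for D/A reconstruction; real true peak meters instead oversample the stored samples with an interpolation filter.

```python
import math


def to_db(value):
    """Convert a linear amplitude to dB relative to full scale (1.0)."""
    return 20 * math.log10(value)


def waveform(t):
    """Continuous sine at a quarter of the sample rate, 45 degree phase; t is in samples."""
    return math.sin(2 * math.pi * 0.25 * t + math.pi / 4)


OVERSAMPLE = 4   # points evaluated per sample period (assumption for this sketch)
N_SAMPLES = 64

# What a sample peak meter sees: only the values stored in the file.
sample_peak = max(abs(waveform(n)) for n in range(N_SAMPLES))

# What the reconstructed signal reaches between the samples.
true_peak = max(abs(waveform(k / OVERSAMPLE)) for k in range(N_SAMPLES * OVERSAMPLE))

print(f"sample peak: {to_db(sample_peak):.2f} dB")
print(f"true peak:   {to_db(true_peak):.2f} dB")
print(f"difference:  {to_db(true_peak) - to_db(sample_peak):.2f} dB")
```

In this extreme case the waveform rises about 3 dB above the highest sample, so a file normalised to 0 dB on a sample peak meter would overshoot full scale after conversion. Real music rarely behaves this badly, but heavily limited masters can overshoot by amounts that matter.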

A good dynamic finalisation plugin for mastering should have a function for containing True Peaks (True Peak Limiting), compliant with EBU R128 and ITU-R BS 1770 standards.

Please note that true peak (TP) measurement is not an exact science: there are many different ways to implement ITU-R BS 1770-compliant measurement, which may yield slightly different results. 

In fact, it is not unusual to find differences of a few tenths of a dB TP between different true peak meters. 

High oversampling values, “look-ahead” operation and the overall high quality of the plugin can offer greater precision and reliability in containing true peaks.

Please note that regulations and conventions require or recommend a True Peak ceiling of -1 dBTP or lower, and some streaming platforms even require a ceiling of -2 dBTP. In commercial mastering for audio CDs, on the other hand, the TP setting is usually less conservative, with values of -0.5, -0.3 or -0.2 dB, which exposes the master to the risk of transient distortion.


The meters of a complete LUFS measurement system. On the left is the classic reference meter with 0 dB digital, and next to it is the dB measurement of the level reduction determined by the limiter. On the right are the three meters of the LUFS system: short term (S), momentary (M) and integrated (I). At the bottom left is the True Peak enable button and on the right is its level adjustment (not yet set to the standard value of -1 dB).

The new standards

To draw a conclusion, we can say that nowadays the trend is to use the following reference standards in LUFS:

Audio CD

  • -9 LUFS, with True Peak at -0.3 dB – this is the most widely used standard for rock-pop music and derivatives, although unfortunately many producers still push beyond it, reaching values of -8 and -7 LUFS.

N.B.

Personally, even in these cases, I usually keep the TP at -1 dB.

Other more “expressive” musical genres, even for CDs, tend to choose solutions that allow for greater dynamic range:

  • -10 / -12 LUFS, with True Peak at -1 dB – for the most expressive modern music genres, such as fusion, modern jazz, “cultured” and alternative pop music, ethno-pop (this loudness range is gradually winning over more “alternative” producers, and I personally hope that it will become a definitive standard in rock-pop as well).
  • -15/-23 LUFS, with True Peak at -1.0 dB – for traditional popular music, traditional jazz and classical music (not strictly purist in approach)
  • Uncompressed dynamics, with True Peak at -1.0 dB – for traditional popular music, traditional jazz and classical music (strictly purist approach)


Schematic representation of useful dynamics and relative loudness levels used as standards in the most common applications. It is clear that, by normalising to levels close to 0 dB, we will have little useful dynamics and high loudness. Without any compression (on the right), the natural dynamics, with expressions ranging from 20 to 50 dB and above (depending on the case), will be fully respected. Note also that the normalisation level for pop music CDs is generally set to peak levels a few tenths of a dB below 0, without the control of the True Peak “circuit”.

Streaming

  • The most widely used standard at present, to be considered a provisional reference, is -14 LUFS, but it could settle in the future at -15 LUFS or -13 LUFS.
  • Other streaming services currently range between -13 LUFS (YouTube) and -16.5 LUFS (iTunes).
  • However, for sound quality reasons, there are numerous cases in which masters are finalised at levels similar to those used for CDs, even though platforms will automatically turn such masters down to bring them into line with the standards set by their publishing regulations.

Broadcasting and Cinema

  • The standard is -23 LUFS with True Peak at -1.0 dB, which is also a binding legal requirement for broadcasting.

For cinema, significant fluctuations are observed, between -27 and -21 LUFS (with a short-term loudness of up to -6 LUFS).
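As a recap, the reference values listed above can be collected in a small lookup table and used to plan a master. The Python sketch below is only an illustration: the dictionary keys and the `plan_master` function are hypothetical, the figures are the ones quoted in this article, and the calculation assumes a plain gain change with no further processing.

```python
# Destination: (target integrated loudness in LUFS, true peak ceiling in dBTP).
TARGETS = {
    "cd_pop_rock": (-9.0, -0.3),
    "streaming": (-14.0, -1.0),
    "broadcast": (-23.0, -1.0),
}


def plan_master(measured_lufs, measured_tp, destination):
    """Gain needed to reach the target, and whether the peaks would still fit."""
    target_lufs, ceiling = TARGETS[destination]
    gain = target_lufs - measured_lufs
    resulting_tp = measured_tp + gain
    return {
        "gain_db": gain,
        "resulting_tp": resulting_tp,
        # If plain gain pushes the peaks over the ceiling, limiting is required.
        "needs_limiting": resulting_tp > ceiling,
    }


# Hypothetical mix measured at -12 LUFS integrated with a true peak of -1.5 dBTP.
for destination in TARGETS:
    print(destination, plan_master(-12.0, -1.5, destination))
```

For the streaming and broadcast targets the mix only needs to be turned down, while reaching the CD level would push its peaks above the ceiling, which means extra limiting and therefore less dynamic range.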


For more information on Digital Audio Mastering
