Audio Guide - Audio Forge | Understanding Audio Formats, Encoding & Analysis

File Info

Compression Ratio

The compression ratio tells you how much smaller a file is compared to the original uncompressed audio. It answers the question: "How much space did we save?"

How It Works

Raw, uncompressed audio (PCM) is the baseline at 0%. When encoded to MP3, AAC, or FLAC, the file gets smaller. The ratio shows the percentage of data compressed away.

What the Numbers Mean

Higher percentage = more compression. A FLAC at 40% has shrunk the file to about 60% of its original size, losslessly. An MP3 at 77% has removed roughly three quarters of the data.

Compression ratio is only meaningful when compared to the same source. A 96 kHz/24-bit FLAC at 40% represents far more data than a 44.1 kHz/16-bit MP3 at 40%.

Example: 5 minute stereo track at 44.1 kHz / 16-bit

Uncompressed (WAV): ~50.5 MB, 0% compression

FLAC (lossless): ~30 MB, ~40% compression

MP3 320 kbps (CBR): ~11.7 MB, ~77% compression

AAC 128 kbps: ~4.7 MB, ~91% compression
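
The arithmetic behind the example can be checked with a short Python sketch. The function names are illustrative and not part of Audio Forge, and the sizes are assumed to be in MiB (1,048,576 bytes), which is the unit that makes five minutes of 44.1 kHz / 16-bit stereo PCM come out near 50.5.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size of raw PCM audio: one sample per channel per sampling tick."""
    return seconds * sample_rate * (bit_depth // 8) * channels


def compression_percent(original_size, compressed_size):
    """Percentage of the original size that was compressed away."""
    return (1 - compressed_size / original_size) * 100


wav_bytes = pcm_size_bytes(5 * 60)   # 52,920,000 bytes
wav_mib = wav_bytes / 1024 ** 2      # about 50.5 MiB

# Sizes taken from the example above (same unit on both sides of the ratio).
for name, size in [("FLAC", 30.0), ("MP3 320", 11.7), ("AAC 128", 4.7)]:
    print(f"{name}: {compression_percent(50.5, size):.0f}%")
```

Because the ratio divides one size by another, the unit does not matter as long as both sizes use the same one.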

Encoding

Bitrate Mode

Bitrate mode describes how a lossy encoder allocates bits across the audio. It is one of the most important choices when encoding to MP3 or AAC.

Constant Bit Rate (CBR)

Every second of audio gets the same number of bits, regardless of complexity. Simple and predictable, but wastes space on easy passages. Best for streaming and compatibility.

Variable Bit Rate (VBR)

The encoder adapts based on audio complexity. Complex passages get more bits; silence gets fewer. In LAME MP3, the V2 preset (averaging about 190 kbps) is a popular audiophile choice that most listeners find transparent.

Average Bit Rate (ABR)

A hybrid approach where the encoder varies the bitrate like VBR but targets a specific average over the whole file. Less common today, as modern VBR has largely replaced it.

VBR V2

~190 kbps average
Best quality per MB
Recommended

CBR 320

320 kbps constant
Maximum compatibility
Larger files
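
To get a feel for what the two settings cost in disk space, file size can be estimated from the average bitrate. This is a rough sketch: `estimated_size_mb` is a hypothetical helper, sizes are in decimal megabytes, and tags and headers are ignored.

```python
def estimated_size_mb(bitrate_kbps, seconds):
    """Approximate audio payload size in MB (10**6 bytes) for an average bitrate."""
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


five_minutes = 5 * 60
cbr_320 = estimated_size_mb(320, five_minutes)   # 12.0 MB
vbr_v2 = estimated_size_mb(190, five_minutes)    # 7.125 MB, using the ~190 kbps average
```

For VBR the result is only an estimate, since the actual average depends on how complex the audio is.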

Audio Forge shows the bitrate mode in the Media Inspector. VBR files show the average bitrate from the Xing/LAME header embedded in the file.

Audio Fundamentals

Sample Rate

The sample rate is how many times per second the audio signal is measured. It determines the highest frequency the audio can reproduce.

Nyquist Theorem

A sample rate captures frequencies up to half its value. Since human hearing tops out around 20 kHz, a 44.1 kHz sample rate is sufficient for all audible content.
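
A small Python sketch makes the "half the sample rate" rule concrete. The function names are illustrative. The second function goes slightly beyond the text above: it shows where a tone lands if it exceeds the limit and is not filtered out before sampling (aliasing).

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_hz(frequency_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling (folded into 0..Nyquist)."""
    folded = frequency_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_hz(44_100))         # 22050.0
print(alias_hz(30_000, 44_100))   # 14100
print(alias_hz(10_000, 44_100))   # 10000
```

A 10 kHz tone is below the limit and passes through unchanged, while a 30 kHz tone would fold down to 14.1 kHz at a 44.1 kHz sample rate.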

Higher Rates

Higher sample rates don't produce audible improvements for playback, but they are useful during recording and mixing to reduce artifacts from audio processing.

Common sample rates

44.1 kHz: CD quality, covers full audible range

48 kHz: Video and broadcast standard

88.2 kHz: Hi-res, 2x CD rate

96 kHz: Hi-res, studio and mastering

192 kHz: Ultra hi-res, archival quality

Bit Depth

Bit depth determines how precisely each audio sample is measured. More bits means more possible values, a quieter noise floor, and greater dynamic range.

For Listening

16-bit is more than enough. It provides 96 dB of dynamic range, which already exceeds the dynamic range of any practical listening environment.

For Recording

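24-bit provides about 144 dB of dynamic range. The extra range is valuable during recording because it lets you leave generous headroom below full scale, which avoids clipping without bringing the noise floor into play. 32-bit float is used in DAWs for internal processing.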

Lossy formats like MP3 and AAC don't have a bit depth in the traditional sense. They use psychoacoustic models rather than PCM samples. Audio Forge shows "N/A" for bit depth on lossy files.

Bit depth and dynamic range

16-bit: 96 dB dynamic range, CD quality

24-bit: 144 dB dynamic range, studio quality

32-bit float: ~1500 dB dynamic range, post-production / DAW
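
The 96 dB and 144 dB figures follow from the number of values an integer sample can take, roughly 6 dB per bit. A minimal sketch, with an illustrative function name; it applies to integer PCM only, not to 32-bit float:

```python
import math


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of integer PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bit_depth)


for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

This prints about 96.3 dB for 16-bit and 144.5 dB for 24-bit, which the guide rounds to 96 and 144.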

Quality

Lossless vs Lossy

Audio compression comes in two types, and understanding the difference is fundamental to managing your library.

Lossless (FLAC, ALAC, WAV, AIFF)

The decoded audio is bit-for-bit identical to the original. FLAC and ALAC achieve 40 to 60% compression. Ideal for archiving, mastering, and format conversion.

Lossy (MP3, AAC, Ogg Vorbis)

The encoder permanently discards audio information most listeners won't notice, using psychoacoustic models. Achieves 5 to 12x smaller files at the cost of irreversible data loss.

Never convert lossy to lossy. Each generation of lossy encoding compounds quality loss. Always start from a lossless source. Audio Forge shows the quality type (Lossless/Lossy) in the Encoding card.

Lossless

Perfect quality
40 to 60% compression
FLAC, ALAC, WAV

Lossy

Perceptually transparent*
80 to 95% compression
MP3, AAC

Encoding

Encoder Identification

Audio files often contain metadata about which software encoded them. Audio Forge detects encoder information from multiple sources.

LAME Binary Header

Audio Forge parses the LAME binary header inside the Xing/Info frame to extract VBR method, encoder version, average bitrate, and total frame count, all independent of ID3 tags.
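
As an illustration of what such parsing involves, the sketch below reads the frame count and byte count from a Xing/Info tag and derives the average bitrate. It is not Audio Forge's implementation: the function names are hypothetical, only the first two optional fields are read, the LAME extension is ignored, and the input is synthetic rather than taken from a real file.

```python
import struct


def parse_xing(frame: bytes):
    """Return (frame_count, byte_count) from a Xing/Info tag, or None if absent."""
    for tag in (b"Xing", b"Info"):
        pos = frame.find(tag)
        if pos >= 0:
            break
    else:
        return None

    (flags,) = struct.unpack_from(">I", frame, pos + 4)
    offset = pos + 8
    frames = size = None
    if flags & 0x1:  # frame count field is present
        (frames,) = struct.unpack_from(">I", frame, offset)
        offset += 4
    if flags & 0x2:  # byte count field is present
        (size,) = struct.unpack_from(">I", frame, offset)
    return frames, size


def average_kbps(frames, size, sample_rate=44_100, samples_per_frame=1152):
    """Average bitrate from total bytes and duration (MPEG-1 Layer III frames)."""
    seconds = frames * samples_per_frame / sample_rate
    return size * 8 / seconds / 1000


# Synthetic data for illustration, not bytes from a real file.
fake_frame = b"\x00" * 36 + b"Xing" + struct.pack(">III", 0x3, 11_484, 7_125_000)
frames, size = parse_xing(fake_frame)
print(round(average_kbps(frames, size)))  # 190
```

A real parser would locate the tag at its fixed offset inside the first MPEG frame instead of searching for the marker, and would handle truncated input.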

ID3 and Metadata Tags

Encoder info is stored in ID3 TSSE (encoding settings) and TENC (encoded by) frames for MP3, and in the encoding tool atom for M4A/AAC files.

Audio Forge Stamp

Files encoded by Audio Forge are stamped with the encoder version and settings, for example "AudioForge (LAME 3.100 VBR V2)", so you can always trace how a file was created.

Detection methods

TSSE Tag: ID3 "encoding settings" frame

LAME Header: Binary header in Xing/Info frame

TENC Tag: ID3 "encoded by" frame

iTunes Atom: Encoding tool in M4A/AAC

Audio Analysis

Peak Level

The peak level is the highest instantaneous amplitude in the audio signal, measured in decibels relative to full scale (dBFS, written as dB in this guide). It represents the single loudest point in the entire track.

The Digital Ceiling

0.0 dB is the maximum level digital audio can represent without clipping. Values very close to 0 dB are common in modern, loudly mastered music. Negative values indicate headroom.
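
A minimal sketch of how a peak level can be computed, assuming samples are floats normalized to the range -1.0 to 1.0 (so 1.0 is full scale). The function name is illustrative.

```python
import math


def peak_dbfs(samples):
    """Peak level in dB relative to full scale, for floats in -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no finite dB value
    return 20 * math.log10(peak)


print(peak_dbfs([0.0, 0.5, -1.0, 0.25]))       # 0.0
print(round(peak_dbfs([0.1, -0.5, 0.3]), 2))   # -6.02
```

Halving the largest sample lowers the peak by about 6 dB, which is the headroom that negative values describe.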

Loudness War Indicator

A peak of 0.0 dB combined with a high RMS level indicates heavy compression and limiting. This isn't necessarily bad, but it can mean fatiguing audio with little dynamic variation.

Peak level scale: -20 dB (lots of headroom), -10 dB, -3 dB, 0 dB (clipping risk)

RMS Level

RMS (Root Mean Square) measures the average power of the audio signal. While peak level shows the single loudest moment, RMS approximates the perceived overall loudness of a track.

Peak vs RMS Gap

RMS is never higher than the peak level, and for real music it is always lower. The gap between the two, expressed in dB, is the crest factor, which is closely tied to the dynamic range of the recording.
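
The relationship between peak, RMS, and their gap can be sketched in a few lines of Python. The function names are illustrative, samples are assumed to be floats in the range -1.0 to 1.0, and the input is assumed not to be pure silence.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (floats in -1.0..1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def crest_factor_db(samples):
    """Gap between peak level and RMS level, in dB (non-silent input only)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) - rms_db(samples)


# One second of a full-scale 441 Hz sine at 44.1 kHz (100 samples per cycle).
sine = [math.sin(2 * math.pi * 441 * n / 44_100) for n in range(44_100)]
print(round(rms_db(sine), 2))            # -3.01
print(round(crest_factor_db(sine), 2))   # 3.01
```

A pure sine has a gap of about 3 dB. Music with strong transients shows a much larger gap, while heavily limited material moves toward the small gap of a steady tone.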

What High RMS Means

A high RMS level (close to 0 dB) combined with a peak near 0 dB means the audio has been heavily compressed or limited. It will sound loud but fatiguing, with little dynamic variation.

Typical RMS levels

Classical / Jazz: -18 to -12 dB

Rock / Pop (1980s to 90s): -12 to -8 dB

Modern Pop / EDM: -8 to -5 dB

Loudness war casualties: -5 to -3 dB

Dynamic Range (DR)

Dynamic Range measures the difference between the loudest and quietest parts of a track. A higher DR score means more contrast between loud and soft passages.

The DR Algorithm

Audio Forge uses the foobar2000-compatible DR algorithm. It analyzes audio in 3-second blocks and compares the second-highest block peak with the RMS of the loudest 20% of blocks, producing a single DR number per track.
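
The sketch below shows a simplified, single-channel version of a block-based DR calculation. It follows the commonly published DR meter procedure (second-highest block peak compared with the RMS of the loudest 20% of blocks) and is an assumption about the general method, not Audio Forge's actual code. Channel averaging, rounding to a whole number, and silent input are not handled, and `dr_score` is a hypothetical name.

```python
import math


def dr_score(samples, sample_rate, block_seconds=3):
    """Rough single-channel DR estimate; samples are floats in -1.0..1.0."""
    block_len = sample_rate * block_seconds
    blocks = [samples[i:i + block_len]
              for i in range(0, len(samples) - block_len + 1, block_len)]
    if len(blocks) < 2:
        raise ValueError("need at least two full blocks")

    # Per-block peak and mean square. The factor 2 makes a full-scale sine read 0 dB.
    peaks = sorted((max(abs(s) for s in b) for b in blocks), reverse=True)
    mean_squares = sorted((2 * sum(s * s for s in b) / len(b) for b in blocks),
                          reverse=True)

    # RMS over the loudest 20% of blocks (at least one block).
    top = mean_squares[:max(1, len(mean_squares) // 5)]
    loud_rms = math.sqrt(sum(top) / len(top))

    # Using the second-highest peak makes the score robust to a single stray sample.
    return 20 * math.log10(peaks[1] / loud_rms)


# A steady full-scale sine has no dynamics, so it scores close to DR0.
# The 1 kHz sample rate is unrealistic and only keeps the example fast.
sine = [math.sin(2 * math.pi * 250 * n / 1000) for n in range(30_000)]
print(round(abs(dr_score(sine, sample_rate=1000)), 1))
```

Scaling the whole signal up or down does not change the result, which is why DR reflects the mastering and not the playback level.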

Mastering, Not Format

DR is a property of the mastering, not the file format. A FLAC and a 320 kbps MP3 from the same master will have nearly identical DR scores. Converting to a higher quality format does not improve dynamic range.

DR scale: DR3 (heavily limited), DR7, DR10, DR14+ (excellent dynamics)

DR score interpretation

DR14+: Exceptional: classical, acoustic, jazz

DR10 to 13: Good: natural, dynamic mastering

DR7 to 9: Average: moderate compression

DR4 to 6: Poor: heavily compressed

DR1 to 3: Bad: extreme limiting, "brick wall"

Loudness (LUFS)

LUFS (Loudness Units relative to Full Scale) is the industry standard loudness measurement used by streaming platforms, broadcasters, and mastering engineers.

Frequency-Weighted

Unlike RMS, LUFS applies a frequency weighting (K-weighting) that approximates human loudness perception. Our ears are more sensitive to midrange frequencies, and LUFS accounts for this.

Platform Normalization

If your track is louder than a platform's target, it will be turned down during playback. Mastering louder than -14 LUFS no longer provides a competitive advantage on modern platforms.

LUFS and dynamic range are related but measure different things. A track can be loud (-10 LUFS) with good dynamics (DR12) or quiet (-18 LUFS) with poor dynamics (DR5). LUFS tells you about overall level; DR tells you about the contrast within.

Streaming platform targets

Spotify: -14 LUFS

Apple Music: -16 LUFS

YouTube: -14 LUFS

Amazon Music: -14 LUFS

Broadcast (EBU R128): -23 LUFS
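
The effect of normalization can be sketched as a simple gain calculation. This assumes a platform only turns loud tracks down and leaves quieter ones untouched, which is a simplification. The targets are copied from the list above and may change, and `playback_gain_db` is a hypothetical helper.

```python
# Targets copied from the list above; real platforms may change them.
PLATFORM_TARGETS_LUFS = {
    "Spotify": -14.0,
    "Apple Music": -16.0,
    "YouTube": -14.0,
    "Amazon Music": -14.0,
}


def playback_gain_db(track_lufs, target_lufs):
    """Gain a platform would apply, assuming it only turns loud tracks down."""
    return min(0.0, target_lufs - track_lufs)


# A track mastered at -9 LUFS is turned down by 5 dB on a -14 LUFS platform.
for platform, target in PLATFORM_TARGETS_LUFS.items():
    print(f"{platform}: {playback_gain_db(-9.0, target):+.1f} dB")
```

The louder master ends up at the same playback level as a -14 LUFS master, but with less dynamic range left in it.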