© 1998-2001
john s. cooper



Sample Rate Conversions

Submitted by Hans Meinig
Last updated June 2, 1999

Intro:

Following a recent discussion on the Pulsar/DSP list regarding the ills of sample rate conversion I wondered whether there was a more rigorous way of comparing how well a program performed its SRC than simply listening to the end result. I was particularly interested in the conversion from 48 kHz to 44.1 kHz, as I have had many of these conversions in my line of work.

It was suggested a spectral analysis program such as SpectraLAB would be a valuable tool in making a quantitative comparison.

Procedure:

I created a test tone of 480 Hz at a 48 kHz sampling rate using Cool Edit, then downsampled that file to 44.1 kHz using each chosen program's resampling commands. I then listened to each of these files carefully to see if I could detect any artefacts from the downsampling process. Finally, I opened each file in SpectraLAB, performed an FFT (Fast Fourier Transform) and displayed the results as a spectrogram.
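As a rough illustration of the test signal described above, the following minimal Python sketch generates a 480 Hz sine at 48 kHz and can write it as a 16-bit mono WAV file. The function names, the peak amplitude of 16000, the one-second duration and the file name are assumptions made for this example; the original tone was generated in Cool Edit, and its level and length are not stated.

```python
import math
import struct
import wave

def make_tone(freq_hz=480.0, rate_hz=48000, seconds=1.0, amplitude=16000):
    """Return 16-bit integer samples of a sine tone (amplitude is an assumed level)."""
    count = int(round(rate_hz * seconds))
    step = 2.0 * math.pi * freq_hz / rate_hz   # phase advance per sample
    return [int(round(amplitude * math.sin(step * i))) for i in range(count)]

def write_wav(path, samples, rate_hz):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)            # 2 bytes per sample = 16 bits
        out.setframerate(rate_hz)
        out.writeframes(struct.pack("<%dh" % len(samples), *samples))

tone = make_tone()
# write_wav("tone_480hz_48k.wav", tone, 48000)   # hypothetical file name
```

At 48 kHz a 480 Hz tone repeats every 100 samples, which makes it a convenient reference signal: anything in the converted file that is not at 480 Hz was introduced along the way.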

I limited the scope of this research to looking at the 48 kHz to 44.1 kHz conversion process to save a bit of time. This is probably the most common conversion process given that most music needs to get to CD format and many digital recordings are done at 48 kHz.
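To make the 48 kHz to 44.1 kHz step concrete: the ratio of the two rates reduces to 147/160, so every 160 input samples become 147 output samples. The sketch below uses plain linear interpolation, which is an assumption chosen only because it is short. It is not the method used by any of the programs tested here, and it applies no anti-aliasing filter, so it would be expected to perform poorly on real material.

```python
from fractions import Fraction

ratio = Fraction(44100, 48000)   # reduces to 147/160

def resample_linear(samples, src_rate, dst_rate):
    """Naive linear-interpolation resampler (illustration only, no filtering)."""
    n_out = (len(samples) - 1) * dst_rate // src_rate + 1
    out = []
    for k in range(n_out):
        # Exact position of output sample k, measured in input samples.
        pos = Fraction(k * src_rate, dst_rate)
        i = pos.numerator // pos.denominator
        frac = float(pos - i)
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

Every 147th output sample lands exactly on an input sample (output 147 equals input 160); all the others have to be estimated, and the quality of that estimate is what separates one converter from another.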

In order to get some objectivity into the listening tests I used a parametric EQ to cut or boost at 480 Hz (the frequency of the tone used in the original file) to reveal more clearly the strength of errant harmonics introduced by the conversion process in question. To begin with I would cut 18 dB to reveal these harmonics (if they existed at a reasonable level) and then progressively reduce this cut until the harmonics were no longer audible. I then noted the amount of cut or boost required to reach this threshold.
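A numerical analogue of the EQ cut is to subtract the 480 Hz component from a file and measure what is left. The sketch below is an assumption-laden substitute, not the procedure used in the listening test: it fits a sinusoid by least squares and removes it completely, whereas a parametric EQ applies a finite cut over a band of frequencies. The function names are made up for this example, and samples are assumed to be floats scaled so that 1.0 is full scale.

```python
import math

def remove_tone(samples, rate_hz, freq_hz=480.0):
    """Subtract the best-fitting sinusoid at freq_hz; return (residual, tone_rms)."""
    w = 2.0 * math.pi * freq_hz / rate_hz
    c = [math.cos(w * i) for i in range(len(samples))]
    s = [math.sin(w * i) for i in range(len(samples))]
    # Normal equations for the fit: samples ~ a*cos + b*sin
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(p * q for p, q in zip(c, s))
    xc = sum(x * v for x, v in zip(samples, c))
    xs = sum(x * v for x, v in zip(samples, s))
    det = cc * ss - cs * cs
    a = (xc * ss - xs * cs) / det
    b = (xs * cc - xc * cs) / det
    residual = [x - a * ci - b * si for x, ci, si in zip(samples, c, s)]
    return residual, math.hypot(a, b) / math.sqrt(2.0)

def rms_db(values, ref=1.0):
    """RMS level in dB relative to ref."""
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return 20.0 * math.log10(max(rms, 1e-12) / ref)
```

Calling `rms_db` on the residual of each converted file gives a single figure for everything that is not the test tone (harmonics and noise together), which could be set alongside the listening thresholds.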

Results:

How each program performed:

Cool Edit had the most sophisticated control on the process and performed the best within the confines of the analysis. I couldn't detect any harmonics in any of the files produced using Cool Edit, even when cutting 18 dB at 480 Hz. I did however notice a level difference between the processed file and the original file and noted the threshold of cut/boost of this effect.

TripleDAT performed least well out of the programs analysed.

| Program | Quality setting | Comments | Threshold | Highest harmonic peak (dB above noise floor) | Time to process (min:sec unless noted) |
|---|---|---|---|---|---|
| Cool Edit 96 | HQ with filter | No harmonics heard at -18 dB - level diff though | | No harmonics | 2:00 |
| Cool Edit 96 | LQ with filter | No harmonics heard at -18 dB - level diff though up to -10 dB | | | 0:15 |
| Cool Edit 96 | LQ | No harmonics heard at -18 dB - level diff though up to -15 dB | | 32 dB | 0:08 |
| Cool Edit 96 | HQ | No harmonics heard at -18 dB - level diff though up to -15 dB | | | 1:50 |
| TripleDAT | NA | Harmonic artefacts clearly audible | +5 dB | 60 dB | |
| WaveLab | HQ | | -10 dB | 46 dB | 1-2 secs |
| WaveLab | LQ | | -5 dB | 46 dB | 2-3 secs |
| Sound Forge 4.5 | HQ | | -15 dB | 34 dB | 2-3 secs |
| Sound Forge 4.5 | LQ | | -7 dB | 50 dB | 28 secs |
| Qtools | NA | Harmonic artefacts audible above -10 dB | -10 dB | 42 dB | 20 secs |

 

Extract from Cool Edit's Help file:

Higher quality settings take longer to process, but at the highest setting the resultant waveform is identical to having sampled the material at the new rate to begin with.

High quality settings should be used for greater downsampling ratios. When upsampling, the Low quality setting sounds nearly the same as the high quality setting. The difference lies in a larger phase shift in the higher frequencies, but since the phase shift is completely linear, it is very difficult to notice. Downsampling at even the lowest quality setting will not have any undesired noisy artefacts. Instead, it may just sound a little more muffled because of more high end filtering.

 

Pre/Post Filter - To prevent any chance of aliasing, the pre-filter on downsampling, or post-filter on upsampling will remove all frequencies above the Nyquist thus keeping them from aliasing to lower audible frequencies. In general, for best results this option should be enabled.

I find the claim in the first paragraph hard to accept. A comparison of the spectrogram of the original 48 kHz file with that of the file converted to 44.1 kHz using Cool Edit's highest quality settings clearly shows a difference in their noise floors. Could this difference be due to dither applied during the conversion process?
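The pre/post filter point in the help extract can be made concrete with a small calculation. A 48 kHz file can hold content up to 24 kHz, but a 44.1 kHz file can only represent frequencies up to 22.05 kHz; anything above that which is not filtered out before downsampling folds back into the audible band. The function below is a minimal sketch of that folding rule (the name is made up for this example) and says nothing about how Cool Edit implements its filter.

```python
def alias_frequency(freq_hz, rate_hz):
    """Frequency at which a tone appears when sampled at rate_hz with no anti-alias filter."""
    f = freq_hz % rate_hz
    # Components above the Nyquist frequency (rate_hz / 2) mirror back below it.
    return rate_hz - f if f > rate_hz / 2 else f

nyquist_44k1 = 44100 / 2   # 22050.0 Hz
```

For example, a 23 kHz component that is legitimate in a 48 kHz file would appear at 21.1 kHz after an unfiltered conversion to 44.1 kHz, while the 480 Hz test tone itself is unaffected.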

All files created were compared aurally to the original 48 kHz file using WaveLab. Where differences were difficult to discern, Waves Paragraphic EQ was used to progressively cut at 480 Hz to highlight harmonic artefacts. See the table for the threshold levels (the point at which distortion was just perceivable).

Listening summary:

TDAT - harmonic distortion was soft but perceptible without any filtering applied. In fact a boost of about 5 dB was needed to mask the harmonics.

Cool Edit - all conversion types performed well in this test. No harmonics detected with 18 dB cut at 480 Hz! Perceived level difference between samples noted down to -15 dB cut in most types.

WaveLab and Sound Forge performed very similarly, although Sound Forge produced files with marginally lower distortion levels.

Spectrogram Analysis:

A spectrogram of the original 48 kHz file was first created. This acted as a baseline reference for subsequent conversions. Each file conversion spectrogram was generated alongside the 48 kHz spectrogram for direct comparison. The graph range was restricted to between -140 dBFS and -90 dBFS to best highlight the differences in harmonic content between files. For each analysis the FFT size was 65536 samples and the smoothing window used was Hanning. I made no assumptions concerning the actual noise floor of each file, concentrating only on the relative differences between the unprocessed and processed files.
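For readers without SpectraLAB, the analysis settings above (a Hanning window followed by an FFT, with levels in dB relative to full scale) can be approximated in a few lines. This is a minimal sketch under stated assumptions: the pure-Python radix-2 FFT and the scaling (0 dB for a full-scale sine centred on a bin) are choices made for this example and may not match SpectraLAB's internal scaling. In practice a numerical library would be much faster at 65536 points.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def hann_spectrum_db(samples, full_scale=1.0):
    """Hann-windowed magnitude spectrum, in dB relative to a full-scale sine."""
    n = len(samples)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    spec = fft([s * w for s, w in zip(samples, window)])
    gain = sum(window) / 2.0   # peak magnitude of a unit-amplitude, bin-centred sine
    return [20.0 * math.log10(max(abs(v) / (gain * full_scale), 1e-12))
            for v in spec[:n // 2]]

def bin_width_hz(rate_hz, fft_size):
    """Frequency spacing between adjacent FFT bins."""
    return rate_hz / fft_size
```

With an FFT size of 65536, the bin spacing is about 0.73 Hz at 48 kHz and about 0.67 Hz at 44.1 kHz, which is fine enough to separate closely spaced peaks such as the pairs described in the results.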

Cool Edit High Quality with filter

Using Cool Edit's highest quality settings produced a file without any significant distortion, although it was noisier than the precursor file by about 5 dB. This difference may represent dither applied to the file during processing, although it is unclear whether Cool Edit applies dither when downsampling.

 

Cool Edit Low Quality

Using Cool Edit's lowest quality setting and not applying the pre-processing anti-aliasing filter produced a spectrogram with many harmonic peaks. The noise floor was increased by about 3 dB.

 

TripleDAT (TDAT) - no setting options available

TripleDAT produced the greatest variety of harmonic distortion as well as the greatest level increase of any harmonic. The dual peaks at around 4 and 4.5 kHz, seen in most other files analysed, were about 60 dB above the noise floor of the original file.
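The "dB above noise floor" figures in this article were read from the graphs. As a hedged sketch of how a similar figure could be computed, the function below takes two spectra as lists of dB values (one per FFT bin), estimates the noise floor of the reference file as the median bin level, and reports the highest peak in the converted file outside a guard band around the test tone. The median estimate, the 50 Hz guard band and all names are assumptions made for this example.

```python
import statistics

def peak_above_floor(spectrum_db, reference_db, bin_hz,
                     fundamental_hz=480.0, guard_hz=50.0):
    """Return (height in dB above the reference noise floor, frequency in Hz)."""
    # Assumed noise-floor estimate: median level of the reference spectrum.
    floor = statistics.median(reference_db)
    # Ignore bins close to the test tone itself.
    candidates = [(level, i * bin_hz)
                  for i, level in enumerate(spectrum_db)
                  if abs(i * bin_hz - fundamental_hz) > guard_hz]
    level, freq = max(candidates)
    return level - floor, freq
```

A result such as `(60.0, 4500.0)` would correspond to a peak at 4.5 kHz standing 60 dB above the original file's noise floor, the kind of figure quoted for TripleDAT.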


WaveLab high quality:

As with TDAT, many harmonics are evident, although at lower levels.

 

WaveLab low quality:

Strangely, there is little difference between this graph and the graph of WaveLab at high quality, apart from the harmonics being, incongruously, lower in level in this graph.

 

Sound Forge high quality:

Two prominent harmonic peaks around 5 kHz with little other prominent distortion characterise this graph. These two peaks reach about 34 dB above the noise floor.

 

Sound Forge low quality:

The lowest quality setting sees the 5 kHz 'twin peaks' rise to about 50 dB above the noise floor. Other harmonic peaks are also quite evident above 5 kHz.

 

Qtools - Sound Forge plugin:

Qtools offers 3 functions - mono to stereo effect, stereo enhancer and SRC. The SRC section doesn't allow any tweaking - you set the sample rate you want to convert to and 'go'. Most of the processing time (95%) seemed to involve anti-aliasing filtering, so the actual SRC time would be minimal. This is not a good sign, as the quality of the end result seems to depend on processing time. Indeed, the listening tests and spectral analysis indicate a strong presence of harmonic artefacts. These results are very similar to those found with WaveLab at its highest quality setting - including the time taken to process the SRC (with anti-aliasing taken into account).

 

Final conclusion:

The listening test correlated quite well with the results found in the spectral analyses in terms of perceived harmonic distortion and harmonic peaks detected in the graphs. Cool Edit's SRC program was the best by far, both in terms of its performance in the listening test and the spectral analyses.

Cool Edit was also the program which took the greatest length of time to process the files - no surprises here!!!