# Presented at the 102nd Convention, 1997 March 22-25, Munich, Germany



This preprint has been reproduced from the author's advance manuscript, without editing, corrections or consideration by the Review Board. The AES takes no responsibility for the contents.

Additional preprints may be obtained by sending request and remittance to the Audio Engineering Society, 60 East 42nd St., New York, New York 10165-2520, USA.

All rights reserved. Reproduction of this preprint, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

## AN AUDIO ENGINEERING SOCIETY PREPRINT

# Digital Signal Processing in Direct Stream Digital Editing System 

Masayoshi Noguchi, Gen Ichimura, Ayataka Nishio, Shigeo Tagami*

Audio Laboratory 1Gp., Advanced Development Laboratory; *System LSI Division, Semiconductor Company; Sony Corporation, Tokyo, Japan

## ABSTRACT

The digital recording method known as "Direct Stream Digital (DSD)" provides an analog-like sound quality and superior flexibility compared to current digital audio (PCM) systems. To make the completed master sound source file of a DSD signal for package media production, an editing system in the DSD domain is necessary. This paper discusses digital signal processing in a Direct Stream Digital editing system.

## 1. Introduction

"Direct Stream Digital (DSD)", which is based on the direct recording of a $\Delta \Sigma$ modulated 1-bit digital audio signal, has been introduced as a new digital recording and archival method. In the near future, package media which contains DSD audio source material will be released. In order to make the completed DSD master sound source file for package media production, an editing system in the DSD domain is necessary. As the characteristics of the DSD signal are quite different from those of the Pulse Code Modulated (PCM) signal, a unique signal processing method for DSD is required in order to build the editing system.

In the case of PCM, the signal processing consists of a numerical calculation unit and a re-quantizer as shown in Figure 1 (a). For example, the signal level may be varied by multiplying the signal by a coefficient. The word length of the signal is expanded due to this multiplication by the coefficient and must be re-quantized into an appropriate word length for reproduction, recording or transmission.
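The PCM path described above can be illustrated with a minimal sketch. The function name `pcm_level_control`, the 16-bit output word length, the plain rounding without dither, and the saturation step are all assumptions made for this example; the paper does not specify them.

```python
def pcm_level_control(samples, k, bits=16):
    """Scale integer PCM samples by k, then re-quantize to `bits`-bit words.

    Hypothetical helper: word length, rounding and saturation are assumptions.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    out = []
    for s in samples:
        if k == 1.0:
            out.append(s)                # bypass: the sample passes through untouched
            continue
        product = s * k                  # numerical calculation: word length expands
        q = int(round(product))          # re-quantizer (plain rounding, no dither)
        out.append(max(lo, min(hi, q)))  # saturate to the output word length
    return out


pcm_in = [1000, -2000, 32767]
half_level = pcm_level_control(pcm_in, 0.5)
unity = pcm_level_control(pcm_in, 1.0)   # identical to pcm_in
```

With a coefficient of exactly 1.0 the output equals the input word for word, which is the bypass behaviour that the DSD system has to reproduce by other means.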

In the case of DSD, the signal processing consists of a numerical calculation unit and a $\Delta \Sigma$ re-modulator as shown in Figure 1 (b). For example, the signal level may be varied by multiplying the DSD signal by a coefficient. The word length of the signal is expanded to that of the coefficient by this multiplication and the multiplied signal must be $\Delta \Sigma$ re-modulated into 1-bit for reproduction, recording or transmission.
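As a rough illustration of the DSD path, the sketch below multiplies a 1-bit stream by a coefficient and re-modulates the product to 1 bit. The paper does not specify the order or structure of its re-modulator; the first-order loop used here, the mapping of bits to +1/-1, and the names `delta_sigma_first_order` and `dsd_level_control` are assumptions chosen for brevity.

```python
def delta_sigma_first_order(values, state=0.0):
    """Re-modulate multi-bit values (nominally within [-1, +1]) into 1-bit data.

    Assumed first-order error-feedback loop, not the modulator from the paper.
    """
    bits = []
    for x in values:
        v = state + x                    # previous residue plus the new input
        y = 1.0 if v >= 0.0 else -1.0    # 1-bit quantizer
        state = v - y                    # keep the quantization error for next time
        bits.append(1 if y > 0.0 else 0)
    return bits


def dsd_level_control(dsd_bits, k):
    """Multiply a DSD stream by k, then re-modulate the product to 1 bit."""
    scaled = [(1.0 if b else -1.0) * k for b in dsd_bits]  # word length expands
    return delta_sigma_first_order(scaled)


dsd_in = [1, 1, 1, 0] * 250              # bit density 0.75, mean level +0.5
dsd_out = dsd_level_control(dsd_in, 0.5)
mean_in = sum(2 * b - 1 for b in dsd_in) / len(dsd_in)
mean_out = sum(2 * b - 1 for b in dsd_out) / len(dsd_out)
```

In this sketch the low-frequency level of the output follows the scaled input (here `mean_out` tracks `0.5 * mean_in`), while the bit pattern itself is newly generated by the loop.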

As this comparison shows, signal processing for DSD is almost the same as that for PCM, except that a $\Delta \Sigma$ re-modulator takes the place of the re-quantizer of the PCM system. In the PCM system the re-quantization process may be bypassed when the fader coefficient is exactly "1.0". Because no such bypass process has existed for the $\Delta \Sigma$ re-modulator, additional signal processing must be developed to achieve a similar function: noiseless switching between the input DSD signal and the $\Delta \Sigma$ re-modulated DSD signal.

## 2. Signal Processing for DSD Signal Switching

To output the original DSD signal from the level control unit when the fader coefficient is exactly "1.0", the $\Delta \Sigma$ re-modulator should be bypassed as shown in Figure 2. It is difficult to switch directly between the original DSD signal and the $\Delta \Sigma$ re-modulated signal without a glitch because the two signals are not identical. Additional signal processing is necessary for noiseless switching.

## 2-1. Optimization of the Delay Time

To switch between two signals (the original DSD signal and the $\Delta \Sigma$ re-modulated signal) without any discontinuities, the gain and phase of both signals must match. Even though the $\Delta \Sigma$ modulator has its own frequency and phase response, its group delay is constant and its gain is $1$ ($0\ \mathrm{dB}$) at audio frequencies, which are much lower than the sampling frequency. To match the gain and phase response of the $\Delta \Sigma$ re-modulated signal to that of the original DSD signal, the group delay of the $\Delta \Sigma$ re-modulator is designed to be an integer multiple of the sampling period, and the gain of the $\Delta \Sigma$ re-modulator is designed to be $1$ ($0\ \mathrm{dB}$). The DSD input signal, in turn, is delayed by a delay line (shift register) to match the group delay of the $\Delta \Sigma$ re-modulator. The gain and phase of the delayed DSD input signal "a" then match those of the $\Delta \Sigma$ re-modulated signal "m" at audio frequencies, as shown in Figure 3.
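The delay line (shift register) mentioned above can be sketched as follows. The class name `DelayLine`, the fill value for the initial register contents, and the delay length used in the example are assumptions; the actual group delay of the re-modulator is not given in the paper.

```python
from collections import deque


class DelayLine:
    """Fixed delay of n samples, modelled as a shift register (hypothetical helper)."""

    def __init__(self, n, fill=0):
        self._n = n
        self._reg = deque([fill] * n, maxlen=n)

    def push(self, bit):
        """Shift one bit in and return the bit that entered n samples earlier."""
        if self._n == 0:
            return bit                   # zero delay: pass straight through
        oldest = self._reg[0]            # read the oldest stored bit
        self._reg.append(bit)            # maxlen drops that oldest bit automatically
        return oldest


delay = DelayLine(3)                     # 3 samples is an arbitrary example value
delayed = [delay.push(b) for b in [1, 0, 1, 1, 0]]
```

Because the delay is a whole number of samples, the delayed signal "a" is bit-for-bit the input, only shifted in time.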

## 2-2. Initialization of the $\Delta \Sigma$ Re-modulator

A suitable modulator design and delay compensation are necessary for noiseless switching but are not sufficient, because the $\Delta \Sigma$ re-modulated signal is influenced by the internal condition of the $\Delta \Sigma$ modulator. The $\Delta \Sigma$ re-modulator must therefore be initialized before the DSD signal is input.

## 2-3. Timing Control of the Switching

Moreover, matching of the high frequency components, which are mainly generated by noise shaping, is another factor involved in noiseless switching. As the correlation at high frequencies between the delayed original DSD signal "a" and the $\Delta \Sigma$ re-modulated signal "m" is quite low, it is necessary to analyze each signal and to switch when the high frequency components of the two signals are similar. In practice, switching between the two signals ("a" and "m") is possible at times when both data streams match completely for several samples. Figure 4 shows the block diagram of this system. Noiseless switching between the delayed original DSD signal "a" and the $\Delta \Sigma$ re-modulated signal "m" is achieved when all of the above factors are satisfied.
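The pattern-matching condition can be sketched as a search for a run of identical samples in both streams. The run length of 4 and the names `find_switch_index` and `switch_streams` are assumptions; the paper only states that the streams must match "for several samples".

```python
def find_switch_index(a, m, run=4, start=0):
    """Return the first index at which it is safe to switch from a to m.

    The index returned is the sample just after `run` consecutive matching
    samples, or None if no such run exists. `run` is an assumed parameter.
    """
    matched = 0
    for i in range(start, min(len(a), len(m))):
        matched = matched + 1 if a[i] == m[i] else 0
        if matched >= run:
            return i + 1
    return None


def switch_streams(a, m, run=4):
    """Output a up to the switch point, then m; stay on a if no match is found."""
    i = find_switch_index(a, m, run)
    if i is None:
        return list(a), None
    return list(a[:i]) + list(m[i:]), i


a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]       # delayed original DSD signal "a"
m = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]       # re-modulated signal "m"
out, idx = switch_streams(a, m)
```

In this example the streams agree at indices 3 to 6, so the switch takes place at index 7 and the output carries no sample that differs from both sources around the transition.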

## 2-4. How To Control the Internal Conditions of the $\Delta \Sigma$ Re-modulator

Sections 2-1 to 2-3 give the factors required for noiseless switching, but these cover only direct switching without level control. Once the fader coefficient moves from "1.0" to some other value, noiseless switching is not possible even after the fader coefficient returns to "1.0". This is because the internal condition of the $\Delta \Sigma$ re-modulator changes from its initial condition when the input signal differs from a true 1-bit signal. For noiseless switching, re-initialization of the $\Delta \Sigma$ re-modulator is required after level control has been applied. Figure 5 shows the block diagram of the switching processor with re-initialization. The re-initialization process is as follows.

While the fader coefficient "K" is not "1.0", the difference between the input DSD signal "A" and the multiplied signal "$\mathrm{A} * \mathrm{K}$" is generated and accumulated until the fader coefficient "K" returns to "1.0". The accumulated data is thus equal to the operational offset in the $\Delta \Sigma$ re-modulator. After the coefficient returns to "1.0", the accumulated data is subtracted from the input data to the $\Delta \Sigma$ re-modulator little by little. As a result, the operational offset in the $\Delta \Sigma$ re-modulator is removed and noiseless switching becomes possible.
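The accumulate-then-subtract procedure can be sketched as below. The sign convention (offset accumulated as A*K minus A), the fixed step size for the gradual subtraction, and the function name `apply_fader_with_reinit` are assumptions; the paper does not state how large each subtraction step is.

```python
def apply_fader_with_reinit(bits, gains, step=0.25):
    """Return (re-modulator input values, remaining offset).

    Hypothetical helper. While K != 1.0 the offset A*K - A is accumulated;
    once K is back at 1.0 the offset is removed in pieces of at most `step`.
    """
    acc = 0.0
    inputs = []
    for b, k in zip(bits, gains):
        a = 1.0 if b else -1.0           # DSD bit mapped to +1 / -1
        x = a * k                        # level-controlled value
        if k != 1.0:
            acc += x - a                 # accumulate the operational offset
        elif acc != 0.0:
            d = max(-step, min(step, acc))
            x -= d                       # subtract a small part of the offset
            acc -= d
        inputs.append(x)
    return inputs, acc


bits = [1, 1, 0, 1] + [1, 0] * 6
gains = [0.5] * 4 + [1.0] * 12           # fader at 0.5, then back to 1.0
inputs, residual = apply_fader_with_reinit(bits, gains)
```

Once `residual` reaches zero, the total of the values fed to the re-modulator equals the total of the original signal, which is the condition this sketch uses to stand for a removed operational offset. Note that during the subtraction the input can briefly exceed the 1-bit range, so the step has to stay small relative to what the re-modulator tolerates.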

Note that the data in the accumulator will increase to infinity if the subtraction data is a constant. However, the operational offset inside the $\Delta \Sigma$ re-modulator depends on the $\Delta \Sigma$ re-modulator output and is reset when the absolute value of the accumulated data becomes "$2 * \mathrm{fb}$" (including zero). Here, "fb" is the integer feedback value used in the $\Delta \Sigma$ re-modulator. Consequently, data in the subtraction accumulator should fold over from -fb to +fb, and vice versa, if overflow occurs. We refer to this type of accumulator as a "cyclic accumulator". With the cyclic accumulator, the operational offset in the $\Delta \Sigma$ re-modulator is calculated exactly. Figure 6 shows the value of the cyclic accumulator in the time domain.
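The fold-over behaviour can be sketched with modular arithmetic. The half-open range from -fb up to but not including +fb, the example value fb = 8, and the function name `cyclic_add` are assumptions; the paper states only that the data folds over between -fb and +fb.

```python
def cyclic_add(acc, delta, fb):
    """Add delta to acc and fold the result into the range [-fb, +fb).

    Hypothetical helper: an offset of 2*fb is treated as equivalent to zero.
    """
    span = 2 * fb
    return (acc + delta + fb) % span - fb


fb = 8                                   # example feedback value (assumed)
wrapped_up = cyclic_add(7, 2, fb)        # 9 folds over to -7
wrapped_down = cyclic_add(-7, -3, fb)    # -10 folds over to 6
full_cycle = cyclic_add(0, 2 * fb, fb)   # a full 2*fb comes back to 0
```

Because the value is always folded back into a fixed range, the accumulator stays bounded no matter how long the fader remains away from 1.0.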

## 3. Cross-Fade Processing for DSD Signal

With the above processing, cross-fade processing between two different signals "A" and "B" is possible. Figure 7 shows the block diagram of the cross-fade controller. The detailed switching procedure is as follows. The delayed DSD signal "a" is selected before switching, and the operational offset in the $\Delta \Sigma$ re-modulator is "0.0" as shown in Figure 8. At this time "$\mathrm{Ka}=1.0$" and "$\mathrm{Kb}=0.0$". Before the cross-fade starts, the output is switched from "a" to the $\Delta \Sigma$ re-modulated output "m" using pattern matching. After switching, the fader coefficient "Ka" is reduced from "1.0" to "0.0" while the fader coefficient "Kb" is simultaneously increased from "0.0" to "1.0", cross-fading from "A" to "B". Even when the cross-fade has finished, it is not yet possible to switch from "m" to the delayed DSD signal "b" because of the operational offset in the $\Delta \Sigma$ re-modulator. In this case, the operational offset is equal to the sum of "$\mathrm{A} * \mathrm{Ka}+(\mathrm{B} * \mathrm{Kb}-\mathrm{B})$" over the samples involved; this offset is accumulated during the cross-fade. After the cross-fade, the accumulated operational offset is subtracted from the $\Delta \Sigma$ re-modulator little by little. Once the operational offset has been removed from the $\Delta \Sigma$ re-modulator, it is possible to switch from the $\Delta \Sigma$ re-modulated output "m" to the delayed DSD signal "b" with pattern matching.
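The offset that builds up during a cross-fade can be sketched as follows. The linear fade ramp, the fade length, and the function name `crossfade_inputs` are assumptions; only the per-sample term A*Ka + (B*Kb - B) is taken from the text. The accumulator here is a plain sum without fold-over, to keep the example short.

```python
def crossfade_inputs(a_bits, b_bits, n_fade):
    """Return (re-modulator inputs, accumulated offset) for an A-to-B fade.

    Hypothetical helper using a linear ramp over n_fade samples.
    """
    offset = 0.0
    inputs = []
    for i, (bit_a, bit_b) in enumerate(zip(a_bits, b_bits)):
        kb = min(1.0, (i + 1) / n_fade)  # Kb ramps from 0.0 up to 1.0
        ka = 1.0 - kb                    # Ka ramps from 1.0 down to 0.0
        a = 1.0 if bit_a else -1.0
        b = 1.0 if bit_b else -1.0
        inputs.append(a * ka + b * kb)   # mixed value fed to the re-modulator
        offset += a * ka + (b * kb - b)  # per-sample offset term from the text
    return inputs, offset


a_bits = [1, 1, 0, 1, 0, 1]
b_bits = [0, 1, 1, 0, 1, 0]
inputs, offset = crossfade_inputs(a_bits, b_bits, n_fade=4)
```

Once Ka has reached 0.0 and Kb has reached 1.0, the per-sample term is zero, so the offset stops growing and what remains is the amount that has to be subtracted before switching to "b".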

## 4. Fade-In and Fade-Out Processing for a DSD Signal

As the DSD signal is 1-bit, there is no fixed mute data like the digital black 0000(h) of the PCM system. In the DSD system, a cyclic pattern of "1" and "0", for example, may be used as a mute signal. Consequently, fade-in and fade-out processing are achieved by cross-fading the DSD signal and the mute signal as shown in Figure 9. With this processing, fade-in and fade-out are possible as in the PCM system.
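One possible mute signal is sketched below. The paper gives cyclic "1" and "0" only as an example, so the strict one-by-one alternation and the helper names `mute_pattern` and `mean_level` are assumptions.

```python
def mute_pattern(n):
    """Alternating 1, 0, 1, 0, ... used here as a stand-in for DSD silence."""
    return [1 - (i % 2) for i in range(n)]


def mean_level(bits):
    """Average level of a bit stream with 1 mapped to +1 and 0 mapped to -1."""
    return sum(2 * b - 1 for b in bits) / len(bits)


mute = mute_pattern(64)
level = mean_level(mute)                 # zero for any even length
```

Because equal numbers of ones and zeros cancel, the pattern carries no low-frequency content in this sketch, and its energy sits at half the sampling frequency, far above the audio band.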

## 5. Conclusion

In this paper, level control and switch processing for DSD signals are described. With this processing, cross-fade, fade-in and fade-out of the DSD signal are possible as in the PCM system. As a result, DSD domain editing is possible using the above processing, and a custom LSI for real-time editing has been developed.


Figure 1 Level Controller: (a) PCM, (b) DSD

Figure 2 Level Controller for DSD

Figure 3

Figure 4

Figure 5

Figure 6 Data Variation of Cyclic Accumulator

Figure 7 DSD Signal Cross Fade Controller

Figure 8 Exposition of Cross Fade Control

Figure 9 Fade-In Fade-Out Controller

