Critique of buzz

From: Robin Davies (rerdavies@msn.com)
Date: Fri Jun 02 2000 - 00:02:39 EDT


My initial impression of buzz was that it was not that useful for realtime
applications. However, after following up on John Lazzaro's note about the
closed-form formula for buzz, I have changed my mind. Very efficient,
eminently realtime implementations of buzz are quite practical, with maximum
errors in the range of +/- 2^-20.

I followed up on John's advice about the closed-form equations for buzz. A
little further research led me to Moorer's 1973 paper in J. Acoustic
Engineering (sorry, don't have the reference handy) on DSF Synthesis
(presumably the original motivation for the CSOUND buzz operator).

The more I read, the more interested I became in getting a good real-time
implementation of buzz running in Sfx. Buzz really seems to be of core
utility in both subtractive and additive synthesis, and is ideal for use as a
source for bandwidth-limited saw waves, for example. Furthermore, Moorer's
idea of using DSF synthesis as a better, more controllable alternative to FM
synthesis really looks pretty practical these days on big fast machines with
poisonous amounts of memory.

Using a quick sine function purpose-written for the buzz opcode in
conjunction with the closed-form equation, the total a-rate execution time
for buzz can be brought down into the 50-80 clock range. I currently have a
fully implemented buzz with +/- 2^-20 worst-case error running on Sfx that
executes at about this speed.

However, there are a number of definition problems that I see in the buzz
opcode as currently defined that I wished to raise.

The choice of cos vs sin as the trig function for buzz.
 ---------------------------

The choice of cos instead of sin in the functions that define buzz seems to
be an unfortunate one.

The original csound buzz operator takes a table that can be filled with
either sin or cos. Moorer's original conception of the DSF is based on the
identity that John Lazzaro pointed out for the equation:

     SUM(k = 0..N) sin(phi + k*beta)

which can be expressed in a reasonably compact closed form.
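
For reference, the identity can be checked numerically. The following is a minimal sketch, not the Sfx implementation: it uses the standard trigonometric closed form for a sum of sines in arithmetic progression, and the function names are made up for illustration.

```python
import math

def sine_sum_direct(phi, beta, n):
    """Term-by-term SUM(k = 0..n) sin(phi + k*beta)."""
    return sum(math.sin(phi + k * beta) for k in range(n + 1))

def sine_sum_closed(phi, beta, n):
    """Closed form of the same sum:
    sin(phi + n*beta/2) * sin((n+1)*beta/2) / sin(beta/2)."""
    half = 0.5 * beta
    denom = math.sin(half)
    if abs(denom) < 1e-12:
        # beta is (numerically) a multiple of 2*PI, so every term is sin(phi).
        return (n + 1) * math.sin(phi)
    return math.sin(phi + n * half) * math.sin((n + 1) * half) / denom
```

The closed form costs a handful of trig evaluations regardless of n, which is what makes a realtime buzz plausible. The guard around sin(beta/2) is the one place where a real implementation needs care.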

Moorer is clearly conscious that both phi and beta should be functions of
time. Also implicit in this identity is that the sum of sines can be
converted to a sum of cosines by setting phi to (PI/2+T*tau). Thus the
identity can be used to generate both sin and cos series. Furthermore, the
Moorer form can be used to generate, for example, only odd harmonics, or
only even harmonics, by selecting appropriate substitutions for phi and beta.

Does it matter? Yes it does, as immediately becomes apparent if you look at
the output of buzz. First, the sum of cosines generates even functions that
can typically be characterised as a single upward spike in each cps
period when rolloff is positive. The shape varies according to input
parameters, but the basic shape is the same. Generally, the more harmonics
used, the narrower the spike. This is admittedly useful for things like a
buzz sound source for vocal tract modelling since it approximates the actual
sound output of a vocal fold fairly well. However,
computationally, it really sucks. For high-order harmonics, the spike tends
to get narrower and narrower, causing untold grief for audio processing
algorithms operating in discrete time domain. Typical output is a single or
double-sample spike with a little ringing around the edges. The sharp spike
is really nasty, though. Note that those sharp spikes are *particularly*
ugly in tables where it's very easy for an oscil sweep through the table to
behave irregularly around the spiked sample.

If the definition of buzz is converted to use sin instead of cos, then not
only is the resulting function odd and hence symmetrical around zero output
(not true of the cos version), but the sharp spikes are spread out in time,
making it easier to do digital processing of the output.
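
The difference in shape is easy to check numerically. This is a rough sketch that assumes equal-weight harmonics (no rolloff) and an arbitrary choice of 40 harmonics; the helper names are made up.

```python
import math

def harmonic_sum(trig, n_harm, n_points=4096):
    """One period of SUM(k = 1..n_harm) trig(k*x), sampled at n_points."""
    step = 2.0 * math.pi / n_points
    return [sum(trig(k * i * step) for k in range(1, n_harm + 1))
            for i in range(n_points)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

n_harm = 40
cos_wave = harmonic_sum(math.cos, n_harm)
sin_wave = harmonic_sum(math.sin, n_harm)

# The cosine sum reaches n_harm at x = 0, where every term is 1, and its
# negative excursions are much smaller, so it is lopsided around zero.
cos_is_lopsided = max(cos_wave) > 2.0 * abs(min(cos_wave))

# The sine sum is odd, so its positive and negative peaks mirror each other.
sin_is_symmetric = abs(max(sin_wave) + min(sin_wave)) < 1e-9
```

With these settings the cosine version has a single tall spike at the start of the period, while the sine version has a lower peak and equal excursions on both sides of zero.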

Even better than buzz would be to provide a core opcode that exposes the phi
and beta values of the Moorer identity.

Something like this would be good:

   aopcode dsf(
        ivar phase,   // 0..1. The constant phase added to phi.
        ksig cpsPhi,
        ksig nHarm,
        ksig cpsBeta,
        ksig rolloff
        );

The output of the opcode is:

    SUM(k = 0..nHarm) sin(2*PI*phase + phasor(cpsPhi)*2*PI
                          + phasor(k*cpsBeta)*2*PI);

Consider the following examples:

    dsf(0.25, cps*5, 7, cps, 0.5) // harmonics 5 to 12 of cps. cos series.
    dsf(0.5, cps, 0, cps*2, 0.5) // Odd harmonics only.
    dsf(0.0, cps, 5, cps, 0.5) // Harmonics 1 to 6. sin series.
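
To make the proposal concrete, here is a brute-force reference for the proposed output. It is only a sketch under several assumptions: the frequencies are held constant so that phasor(f) is just f*t, the rolloff argument weights partial k by rolloff^k (the formula above does not say how rolloff enters), and nHarm is positive. The name dsf_sample is made up.

```python
import math

def dsf_sample(t, phase, cps_phi, n_harm, cps_beta, rolloff):
    """Direct evaluation of the proposed dsf output at time t (seconds).

    Assumes constant frequencies and a rolloff**k weight on partial k;
    both are assumptions of this sketch, not part of the proposal text.
    """
    total = 0.0
    for k in range(n_harm + 1):
        # 2*PI*phase + 2*PI*cps_phi*t + 2*PI*k*cps_beta*t
        angle = 2.0 * math.pi * (phase + (cps_phi + k * cps_beta) * t)
        total += (rolloff ** k) * math.sin(angle)
    return total
```

With phase 0 the partials are sines. A phase of 0.25 turns each one into a cosine, since sin(x + PI/2) = cos(x). Setting cps_beta to twice cps_phi skips every other harmonic, leaving only the odd ones.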

I'm not sure what the normalization factor would be. It's not quite as
straightforward as the sum of cosines sample, but the current scale formula
does actually put an upper limit on the maximum range, and would probably
suffice.

The rate of arguments of buzz.
 ------------------------------

The argument rates of buzz pose a serious problem for the implementer.

Specifically, the behaviour of the opcode when the ksig value of nHarm is
not strictly positive causes real grief. The reason for this is that the
opcode implementation must defer calculation of intermediate parameters in
the buzz equation until a-rate because if nHarm is not positive, then the
number of harmonics must be calculated using cps, which is an a-rate
parameter. Those intermediate calculations, which would otherwise be done
once per k-cycle, are very costly for buzz.

If nHarm is not positive, then the opcode must recalculate the number of
harmonics from cps on *each* a-cycle, and check for changes. If the number
of harmonics changes, then all intermediate parameters must be recalculated
from within a-rate code.
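
The cost can be sketched as follows. This assumes the harmonic count in the bandwidth-limited case is simply the number of harmonics of cps at or below half the sampling rate (the exact rule in the opcode definition may differ), and it counts recalculations rather than performing them. The names are made up.

```python
def harmonics_below_nyquist(cps, srate):
    """Harmonics of cps that fit at or below srate/2 (assumed rule, at least 1)."""
    if cps == 0.0:
        return 1
    return max(1, int((srate / 2.0) // abs(cps)))

def count_recalculations(cps_samples, srate):
    """Count how often the derived harmonic count changes over an a-rate cps signal."""
    recalcs = 0
    current = None
    for cps in cps_samples:
        n = harmonics_below_nyquist(cps, srate)  # has to run on every sample
        if n != current:
            current = n
            recalcs += 1  # stand-in for recomputing all intermediate values
    return recalcs

# A glide from 100 Hz to 200 Hz over 1000 samples at 44100 Hz.
sweep = [100.0 + 100.0 * i / 999 for i in range(1000)]
sweep_recalcs = count_recalculations(sweep, 44100.0)
steady_recalcs = count_recalculations([100.0] * 1000, 44100.0)
```

Even when cps is steady and nothing ever changes, the division and comparison still run on every sample, which is the overhead being complained about.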

Ironically, there are several good optimizations that can be performed for
buzz. For example, if rolloff^nHarm is suitably small then the closed-form
equation can take an even more compact form. However, again, this condition
can currently only be detected in a-rate code because it depends on cps if
nHarm is non-positive. Note that rolloff^nHarm is often less than machine
epsilon for sensible choices of rolloff where nHarm is not greater than zero.
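
A sketch of that optimization, assuming buzz is a rolloff-weighted cosine sum starting at k = 0 with |rolloff| < 1. The closed form below is the real part of a finite geometric series, and the example values (100 Hz, 44100 Hz sampling rate, rolloff 0.8) are arbitrary.

```python
import math
import sys

def rolloff_cos_sum_finite(theta, a, n):
    """Closed form of SUM(k = 0..n) a^k * cos(k*theta), for |a| < 1."""
    denom = 1.0 - 2.0 * a * math.cos(theta) + a * a
    numer = (1.0 - a * math.cos(theta)
             - a ** (n + 1) * math.cos((n + 1) * theta)
             + a ** (n + 2) * math.cos(n * theta))
    return numer / denom

def rolloff_cos_sum_limit(theta, a):
    """Same sum with the a^(n+1) and a^(n+2) terms dropped as negligible."""
    return (1.0 - a * math.cos(theta)) / (1.0 - 2.0 * a * math.cos(theta) + a * a)

n = int(22050 // 100)   # 220 harmonics of 100 Hz below a 22050 Hz Nyquist limit
a = 0.8
tail_is_negligible = a ** n < sys.float_info.epsilon
```

When the tail is negligible, the compact form needs one cosine per sample instead of three and no powers at all. The catch, as noted, is that n here is derived from cps.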

The real problem is this: the a-rate cps parameter is fine. The conditional
behavior of the opcode based on whether nHarm is strictly positive is fine.
However, the two in combination are lethal.

There's no easy way to optimize the opcode for the condition where nHarm is
strictly positive. A conditional check must be done at a-rate to determine
whether a-rate recalculation of the opcode internal values needs to take
place.

If we ever get another crack at buzz, *one* of the following steps should be
taken.

(1) Separate opcodes for fixed and bandwidth limited harmonics.

(2) A version taking a k-rate cps argument should be provided.

Problems with Bandwidth-limited buzz signals and scale.
 ----------------------------------------------

Consider what happens in the following scenario: cps is a-rate; nHarm is 0
or negative, triggering the bandwidth limiting behaviour; the number of
calculated harmonics changes at a-rate.

In this case, the scale will be recalculated, resulting in a sharp step (a
click) in the output signal. I think it would make more sense to have scale
calculated as if the signal were not bandwidth limited when nHarm is not
strictly positive.
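
The size of the step can be estimated with a small sketch. It assumes the scale factor is the reciprocal of SUM(k = 0..nHarm) |rolloff|^k, which normalizes the peak of the cosine sum; that formula is an assumption here, as are the function names, and |rolloff| < 1 is required.

```python
def scale_for(rolloff, n_harm):
    """Assumed normalization: reciprocal of SUM(k = 0..n_harm) |rolloff|^k."""
    r = abs(rolloff)
    return (1.0 - r) / (1.0 - r ** (n_harm + 1))

def scale_unlimited(rolloff):
    """Limit of scale_for as n_harm grows; independent of the harmonic count."""
    return 1.0 - abs(rolloff)

def scale_step(rolloff, n_before, n_after):
    """Relative jump in the scale factor when the harmonic count changes."""
    before = scale_for(rolloff, n_before)
    return abs(scale_for(rolloff, n_after) - before) / before
```

With a rolloff close to 1 and a modest harmonic count, dropping a single harmonic moves the scale by a few percent in one sample, which is audible as a click. scale_unlimited does not depend on the harmonic count at all, so it cannot step.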

Comments, opinions, anyone?


