At 09:33 AM 9/1/99 -0400, Eric Scheirer wrote:
>"All known" bottlenecks would be a very long list.  But I can make
>some comments on yours and add a few more.
>
And actually we decided to rewrite the whole decoder from scratch...
>
>Yes, there's a huge amount of overhead in interpreting the parse
>tree in saolc.  Creating a virtual machine to do this efficiently
>would help a lot, I think Giorgio has done some work on this, but
>maybe it's currently proprietary.  Also, it might be easier to apply
>static optimization to VM code than to a parse tree in many cases.
>
At the moment we do not have a full virtual machine; I call what we have a
virtual ALU. A true virtual machine would require a completely independent
execution block that receives a program to execute in p-code and also
"interrupts" for interaction updates from the stream.
Whether it stays proprietary is an issue I am trying to resolve these days. We
have a meeting of our ThreeDSPACE project on the 10th of this month, at
which a decision should be taken. My intention is to release the
lexer/parser/virtual compiler in the near future, then the execution engine,
and to try to organize a kind of "Linux-like" effort to find out whether
anybody is interested in helping to complete a freeware version. Stay tuned :-)))
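
To illustrate the difference between interpreting a parse tree and running
flat p-code, here is a minimal sketch in Python. Everything in it (the
tuple-based tree, the opcode names PUSH/LOAD/ADD/MUL, the function names) is
hypothetical and invented for this illustration; it is not the saolc code nor
our decoder.

```python
# Hypothetical sketch: the same expression evaluated by a recursive tree
# walker and by a tiny stack machine. None of these names come from saolc.

def eval_tree(node, env):
    """Interpret a parse tree of nested tuples, recursing on every visit."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return env[node[1]]
    left = eval_tree(node[1], env)
    right = eval_tree(node[2], env)
    if kind == "add":
        return left + right
    if kind == "mul":
        return left * right
    raise ValueError("unknown node kind: %r" % (kind,))


def compile_tree(node, code=None):
    """Flatten the tree once into a linear list of (opcode, operand) pairs."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "num":
        code.append(("PUSH", node[1]))
    elif kind == "var":
        code.append(("LOAD", node[1]))
    elif kind in ("add", "mul"):
        compile_tree(node[1], code)
        compile_tree(node[2], code)
        code.append((kind.upper(), None))
    else:
        raise ValueError("unknown node kind: %r" % (kind,))
    return code


def run_code(code, env):
    """Execute the flat p-code with a simple operand stack, no recursion."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(env[arg])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
    return stack.pop()


# amp * 2.0 + 1.0
tree = ("add", ("mul", ("var", "amp"), ("num", 2.0)), ("num", 1.0))
code = compile_tree(tree)
```

The tree is compiled once and the resulting list can then be executed for
every sample. A flat list is also a convenient place to apply static
optimizations such as constant folding, although this sketch does not do so.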
 
>>3. Opcodes are currently implemented as a single function with (i/k/a) rate
>>switching inside opcodes where necessary. ( There should be separate
>>functions for each rate primitive for rate dependent opcodes. A compiled
>>instrument would have a separate subprogram for each rate(?). )
>
>I'm not sure that this is necessarily much faster.
>
Our results say that it is not, at least in our implementation.
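
For readers who want to see the two dispatch styles side by side, here is a
minimal Python sketch. The `Smooth` opcode, its methods and the smoothing
coefficient are all hypothetical, chosen only to give the i-rate, k-rate and
a-rate passes different behaviour; this is not an opcode from saolc or from
our decoder.

```python
# Hypothetical sketch of two ways to dispatch a rate-dependent opcode.
# "i", "k" and "a" stand for the SAOL i-rate, k-rate and a-rate passes.

class Smooth:
    """Invented one-pole smoother used only to illustrate rate dispatch."""

    def __init__(self):
        self.state = 0.0
        self.target = 0.0

    def run_i(self, value):
        # i-rate: initialise the state once per note
        self.state = self.target = value
        return self.state

    def run_k(self, value):
        # k-rate: latch a new control target
        self.target = value
        return self.state

    def run_a(self):
        # a-rate: move halfway towards the target on every sample
        self.state += 0.5 * (self.target - self.state)
        return self.state

    def run(self, rate, value=None):
        # Style 1: a single entry point that tests the rate on every call
        if rate == "i":
            return self.run_i(value)
        if rate == "k":
            return self.run_k(value)
        return self.run_a()


def render_switching(op, count):
    """Rate is re-tested inside the opcode for every sample."""
    return [op.run("a") for _ in range(count)]


def render_prebound(op, count):
    """Style 2: the rate-specific function is selected once, before the loop."""
    step = op.run_a
    return [step() for _ in range(count)]
```

Both styles compute the same samples; the second only removes one branch per
call, which is small next to the rest of the interpretation overhead.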
>
>A major one you didn't mention is the possibility of block-processing
>the SAOL code much of the time.  Since the semantics of SAOL are sample-
>by-sample, you can't do this every time, but many instruments obviously
>give the same answer whether you do the processing in blocks or in
>samples:
>
>instr foo() {
>  asig a;
>  table t(...);
>
>  a = 120;
>  output(oscil(t,a));
>}
>
>If you were to write block-processing versions of the ugens and implement
>a feedback detector to figure out when you could use them, I bet many
>useful things would run in real-time in saolc.  This combined with (1)
>could easily give you a ten-fold speedup, I think.
>
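
The block-processing idea quoted above can be sketched in Python as follows.
This is a minimal illustration under strong assumptions: an instrument is
reduced to a hypothetical list of (target, sources) assignments, the feedback
test is a deliberately conservative simplification, the oscillator uses a
truncating table lookup, and the 44100 Hz sampling rate in the usage note is
assumed. None of the names come from saolc.

```python
import math

# Hypothetical sketch: choose between per-sample and per-block rendering.


def has_feedback(statements):
    """Return True if a statement reads a variable that is assigned in this
    pass but has not been assigned yet, so that the read would see the value
    left over from the previous sample."""
    written = set()
    targets = {target for target, _ in statements}
    for target, sources in statements:
        if any(s in targets and s not in written for s in sources):
            return True
        written.add(target)
    return False


def make_sine_table(size=256):
    return [math.sin(2.0 * math.pi * i / size) for i in range(size)]


def oscil_sample(table, freq, phase, srate):
    """Produce one sample; returns (value, new_phase)."""
    size = len(table)
    value = table[int(phase * size) % size]
    return value, (phase + freq / srate) % 1.0


def oscil_block(table, freq, phase, srate, count):
    """Produce `count` samples in one call; returns (samples, new_phase)."""
    size = len(table)
    step = freq / srate  # hoisted out of the per-sample loop
    out = []
    for _ in range(count):
        out.append(table[int(phase * size) % size])
        phase = (phase + step) % 1.0
    return out, phase


def render(statements, table, freq, srate, count):
    """Use the block version only when the detector finds no feedback.
    The oscillator is the same on both paths; only the call pattern differs."""
    if has_feedback(statements):
        out, phase = [], 0.0
        for _ in range(count):
            value, phase = oscil_sample(table, freq, phase, srate)
            out.append(value)
        return out
    out, _ = oscil_block(table, freq, 0.0, srate, count)
    return out


# The quoted instrument: a = 120; output(oscil(t, a));
FOO = [("a", []), ("out", ["a"])]
# An invented recursive case: y = x + y reads y before writing it.
FEEDBACK = [("y", ["x", "y"])]
```

For example, `render(FOO, make_sine_table(), 120.0, 44100.0, 128)` takes the
block path, while the same call with `FEEDBACK` falls back to one call per
sample. In Python the block version only saves call overhead; in a compiled
decoder it would also open the door to vectorized inner loops.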
We have some exact figures for these improvements, as you know. Here is
a small example. Consider an FM clarinet like the attached one. On my platform
(an "old" 200 MHz Pentium MMX, 64 MB RAM), for a short monophonic example of
nearly 4 seconds, the reference software takes 80 seconds (58 on a Solaris
UltraSPARC II).
The virtual machine plus clean coding takes 27 seconds (20 on the UltraSPARC).
If we add block-by-block processing (here everything is "blockable") we go
down to 11 seconds (8 on the UltraSPARC; speedup ratio is 8).
Using optimized vector libraries for interpolation on the Pentium (an
oscil-based instrument!), it takes slightly less than 4 seconds. Allpass and
comb are not optimized yet, but the structure in the end is still an
interpreted VM.
In any case, the speedup depends a lot on the algorithm, and I can't say that
our software is fully optimized yet. Claudio and I still have "some" work to do.
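
As a worked check of the figures above, the speedup ratios can be computed
directly from the quoted timings. The short Python sketch below uses only the
numbers from this message; the dictionary keys are labels invented for the
sketch, and the 4-second entry is an upper bound, since the measured time was
"slightly less than 4 seconds".

```python
# Timings in seconds, copied from the message above.
pentium = {
    "reference": 80,
    "vm": 27,
    "vm+block": 11,
    "vm+block+vector": 4,  # upper bound: "slightly less than 4 seconds"
}
ultrasparc = {
    "reference": 58,
    "vm": 20,
    "vm+block": 8,
}


def speedups(timings):
    """Speedup of every configuration relative to the reference software."""
    base = timings["reference"]
    return {name: round(base / seconds, 2) for name, seconds in timings.items()}


pentium_ratios = speedups(pentium)
ultrasparc_ratios = speedups(ultrasparc)
```

On the Pentium this gives roughly 3x for the VM alone, a bit over 7x with
block processing, and at least 20x with the vector libraries. Since the
example lasts nearly 4 seconds, that last configuration runs at about real
time.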
Regards, 
        Giorgio
__________________________________________________________________
Giorgio ZOIA
Integrated Systems Laboratory - DE/LSI - EPFL
CH-1015 Lausanne - SWITZERLAND
Phone: + 41 21 693 69 79      E-mail: Giorgio.Zoia@epfl.ch
Fax: +41 21 693 46 63
__________________________________________________________________
This archive was generated by hypermail 2b29 : Wed May 10 2000 - 12:15:34 EDT