#1 |
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Using accum...
Is 'accum' treated differently in a 'macs' instruction and a 'macints' instruction, and if so, how?
The reason I ask is that I am getting different results from two seemingly equivalent instructions:
Code:
    ; the following three sets of instructions work as expected and give the same result
    macs tmp, 0, x, y;
    macs r, tmp, x, 0x80000000;
    ;
    macs tmp, 0, x, y;
    macints r, tmp, x, 0xFFFFFFFF;
    ;
    macs 0, 0, x, y;
    macs r, accum, x, 0x80000000;
    ; this one gives a different result (saturates)
    macs 0, 0, x, y;
    macints r, accum, x, 0xFFFFFFFF;
It looks as though 'accum' should not be used with a 'macints' instruction, but I do not recall this being documented anywhere (I thought the only restriction was that it could only be used as an 'A' operand).
Also, are there any restrictions on using 'accum' in consecutive instructions? I ask because (again) I am getting unexpected results:
Code:
    ; r1 and r2 have different results (when I expect them to be equal)
    ; r1 has the expected result, while r2 does not
    macs 0, 0, x0, y0;
    macs 0, accum, x1, y1;
    macs r1, accum, x2, y2;
    macs 0, 0, x0, y0;
    macs 0, accum, x1, y1;
    macs r2, accum, x2, y2;
Additionally, if I add an instruction in between the two sequences, 'r1' and 'r2' then appear to have the correct result, but the result of the instruction in between appears to be wrong (even for an instruction like 'macs r3, 0, 0, 0', where you would expect 'r3' to equal 0). i.e. (to be clear):
Code:
    ; r1 and r2 have the expected result, while r3 does not
    macs 0, 0, x0, y0;
    macs 0, accum, x1, y1;
    macs r1, accum, x2, y2;
    macs r3, 0, 0, 0;
    macs 0, 0, x0, y0;
    macs 0, accum, x1, y1;
    macs r2, accum, x2, y2;
BTW: Testing was done on a 10k1 card (in case this is another 10k1-only anomaly).
-Russ
Last edited by Russ; Feb 20, 2007 at 04:36 AM.
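For anyone wondering why the first two forms in the post above look equivalent on paper, the arithmetic can be checked in a few lines. This is a rough sketch in Python rather than Dane. It ignores saturation and the accumulator entirely, the helper names (`as_signed`, `frac_mac`, `int_mac`) are made up for illustration, and it only assumes that 'macs' scales the product by 2^-31 while 'macints' does not.

```python
def as_signed(word):
    """Read a 32-bit register word as a two's-complement integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word

def frac_mac(a, x, y):
    """Idealised 'macs': A + X*Y with the product scaled by 2**-31 (no saturation)."""
    return as_signed(a) + ((as_signed(x) * as_signed(y)) >> 31)

def int_mac(a, x, y):
    """Idealised 'macints': A + X*Y as plain integers (no saturation)."""
    return as_signed(a) + as_signed(x) * as_signed(y)

tmp, x = 1000, 300
print(as_signed(0x80000000) / 2**31)   # fractional reading of 0x80000000 -> -1.0
print(as_signed(0xFFFFFFFF))           # integer reading of 0xFFFFFFFF   -> -1
print(frac_mac(tmp, x, 0x80000000), int_mac(tmp, x, 0xFFFFFFFF))  # 700 700
```

So with a plain register as the 'A' operand both forms compute tmp - x. The sketch says nothing about what happens when 'A' is 'accum', which is the case that misbehaves.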
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Thanks Tril,
I thought I had read somewhere that there was another restriction on using 'accum', but could not remember where (and did not recall it being associated with the 'macints' instruction). So that just leaves the second part of the post...
BTW: Max, eyagos, Tiger, whoever is currently maintaining the beginner's DSP guide, can you add that info to the guide?
Last edited by Russ; Feb 20, 2007 at 04:29 AM.
|
|
|
|
|
#4 |
|
kX user
Join Date: Apr 2004
Posts: 851
Rep Power: 0 ![]() |
Good idea, that guide needs a lot of corrections and additions.
I'll edit it as soon as I find time.
__________________
Miss you, Steve... |
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Thanks Tiger,
It is good to hear from you
|
|
|
|
|
|
#6 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
Well, actually it's not that we "may not" use accum in macints - we "may" - it just won't do what we expect it to do, so really we "don't want" to use it.
The trick is that the ALU differentiates fractional and integer MAC instructions not at (or right after) the multiplication stage, as we're accustomed to think (since it makes things easier). The real difference seems to be in the way the ALU flushes the accumulator's value to the R register, and in the way the A operand is written to accum (if A is not accum itself). I've tried to write down the actual MAC* operation in pseudocode and here's what I've got (simplified - without saturation): Code:
<MAC*>
if (A is not ACCUM) then
    ACCUM[66:63] <- sign_bits(A)
    ACCUM[62:31] <- A
    ACCUM[30:0]  <- 0
fi
ACCUM <- ACCUM + (X * Y)
R <- ACCUM[62:31]
<MACINT*>
if (A is not ACCUM) then
    ACCUM[66:32] <- sign_bits(A)
    ACCUM[31:0]  <- A
fi
ACCUM <- ACCUM + (X * Y)
R <- ACCUM[31:0]
(1)
    macs tmp, 0, x, y
    macints r, tmp, x, 0xFFFFFFFF
and (2)
    macs 0, 0, x, y
    macints r, accum, x, 0xFFFFFFFF
give different results - they are very different beasts. Actually, code (2) does pretty much the same job as
    macints tmp, 0, x, y
    macints r, tmp, x, 0xFFFFFFFF
(if we're not counting saturation, of course)
---
here are three examples that may help to understand (and prove) the operation pseudocode above
Code:
    static a = 0x123
    static b = 0x123
    static c, d
    macs c, 0, a, b ; c = 0
    macints d, accum, 0, 0 ; d = 0x14ac9 !
    end
Code:
    static a = 0x123
    static b = 0x123
    static c, d
    macints c, 0, a, b ; c = 0x14ac9
    macs d, accum, 0, 0 ; d = 0 !
    end
Code:
    static a = 0x123
    static b = 0x123
    static c, d, e
    macs c, 0x1, a, b ; c = 0x1 (accum = 0x100014ac9)
    macints d, accum, 0, 0 ; d = 0x7fffffff (0x100014ac9 saturated at [31:0])
    macs e, accum, 0, 0 ; e = 0x1 (again :)
    end
Accum mysteries @ K1 - I also recall the following thread: http://www.hardwareheaven.com/effects-...e-anomoly.html
So yeah, accum use on K1 sometimes fails (under some really unknown circumstances). To me it seems to be just a sort of hardware bug.
Last edited by Max M.; Feb 22, 2007 at 10:20 AM.
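The MAC*/MACINT* pseudocode in this post is easy to model in software, which makes it possible to replay the three examples without a card. Below is a rough Python sketch, not a description of the real ALU. The class and helper names (`Alu`, `s32`, `sat32`, `ACCUM`) are invented here, the 67-bit accumulator is stood in for by an unbounded Python integer, and saturating R to the signed 32-bit range is an added assumption (the pseudocode leaves saturation out, but the third example depends on it).

```python
ACCUM = object()            # sentinel standing in for the 'accum' operand
INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000

def s32(word):
    """Read a 32-bit register word as a two's-complement integer."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word

def sat32(value):
    """Clamp to the signed 32-bit range (assumed behaviour when writing R)."""
    return max(INT32_MIN, min(INT32_MAX, value))

class Alu:
    def __init__(self):
        self.accum = 0      # unbounded int standing in for the 67-bit accumulator

    def _mac(self, a, x, y, shift):
        if a is not ACCUM:
            # Sign-extended A is placed at bits [shift+31:shift]; lower bits are zero.
            self.accum = s32(a) << shift
        self.accum += s32(x) * s32(y)
        # R is read back from the same bit position A was written to.
        return sat32(self.accum >> shift) & 0xFFFFFFFF

    def macs(self, a, x, y):        # fractional: A and R live at ACCUM[62:31]
        return self._mac(a, x, y, 31)

    def macints(self, a, x, y):     # integer: A and R live at ACCUM[31:0]
        return self._mac(a, x, y, 0)

a = b = 0x123

alu = Alu()                         # first example
c = alu.macs(0, a, b)
d = alu.macints(ACCUM, 0, 0)
print(hex(c), hex(d))               # 0x0 0x14ac9

alu = Alu()                         # third example
c = alu.macs(0x1, a, b)
d = alu.macints(ACCUM, 0, 0)
e = alu.macs(ACCUM, 0, 0)
print(hex(c), hex(d), hex(e))       # 0x1 0x7fffffff 0x1
```

Under these assumptions the model reproduces the register values from all three examples. One thing to note: taking the bit ranges literally, the accumulator after the first instruction of the third example comes out as 0x80014ac9 in this model rather than the 0x100014ac9 given in the comment. Both values saturate to 0x7fffffff at [31:0], so the visible results are the same either way.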
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Thanks for the explanation Max
Good info as always...
As for the 10k1 anomalies, I wish I could figure out exactly what is happening, as it makes it a little hard to optimize the code when I have to confirm that it is doing what it is supposed to do every time I use accum. However, if it is indeed some hardware bug, I guess I am out of luck. I didn't really expect an answer on that anyway, but mainly wanted to document another situation where I see it happening, so that I (and other people) can see which situations might give them trouble, etc.
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
BTW: In both cases (with the anomalies), the manner in which the result is off, follows a similar pattern, so there does seem to be some logic to it.
i.e. Code:
    macs 0, 0, x0, y0;
    macs 0, accum, x1, y1;    ; <- highlighted (not grayed out) in the original post
    macs r1, accum, x2, y2;
    macs r3, 0, 0, 0;         ; <- highlighted (not grayed out) in the original post
    macs 0, 0, x0, y0;
    macs 0, accum, x1, y1;
    macs r2, accum, x2, y2;
So maybe there is some intended design here, but if so, I wonder why it would be different in the 10k2 design?
|
|
|
|
|
#9 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
mmm, that makes sense... sure, there should be some pattern (even if it is a hardware bug)
unfortunately, I can't investigate it in any way since I fried my Live! sometime in 2003... ;(
btw. that buggy fxmix code was something like:
Code:
    ....
    macs tmp, tmp, in5_l, In5Level;
    macs 0, 0, tmp, FX1;
    macs out_l1, accum, mono_in1, FX1;
    macs 0, 0, tmp, FX2;
    macs out_l2, accum, mono_in2, FX2;
    ...
so yeah, it looks similar (sort of)...
>So maybe there is some intended design here, but if so, I wonder why it would be different in the 10k2 design?
yeah, I doubt such crazy stuff was really intended, so of course it looks more like "it was not intended and therefore fixed in k2" :) who knows... (well, I know who knows - but... :))
So, what do you think - as a preliminary conclusion - can we say that on K1 it is better to avoid combining "writing to 0" with "using accum in subsequent instructions"?
Last edited by Max M.; Feb 22, 2007 at 05:09 AM.
|
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Quote:
With the previous anomaly (the one you posted the link for above), 'accum' was not even involved. I am thinking that 'accum' is not the culprit here either, and that again the problem is with leaving the "R" param zero. Maybe this is not a good idea on k1 models, unless we are willing to manually verify that it works as we expect in the code that uses it, and that it has no effect on plugins connected to our plugin. (I seem to recall one case where one plugin was bleeding into another, and I would be willing to bet that it is related to this issue.)
Additionally, using a register (temp or whatever) as the "R" param, even if it serves no purpose other than to prevent this problem, does not affect the usefulness of 'accum'. Leaving the "R" param zero in the previous instruction does not seem to be a requirement for using 'accum' (guard bits still work, etc). The only drawback is that in some cases you might need one extra GPR for this purpose, but that is not so bad.
Last edited by Russ; Feb 22, 2007 at 08:53 AM.
|
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
On a side note: While thinking about this, I had an idea regarding 'temp' registers. It seems to me (correct me if I am wrong) that it should be possible to have some global temp registers (i.e. temp registers that can be used by all plugins). If some GPRs were reserved for this purpose, it could cut down on the total number of GPRs used overall, as each plugin would not have to create its own temp registers (for those plugins that use them anyway).
What do you think?
|
|
|
|
|
#12 | |
|
Tail Razer
Join Date: Jun 2005
Location: Bernyurass, AZ - USA
Posts: 4,027
Rep Power: 0 ![]() ![]() |
My recent re-study made this easy for me to find.
Quote:
Ok - I shut up now....
|
|
|
|
|
|
|
#13 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
>temp registers.
this is how it was supposed to be from the beginning. Temp registers should be global (Creative, Emu and all Linux loaders support this - only the kX loader s...s). Moreover, const registers may be global too.
>as I would think it would 'break' existing plugins
No, it won't.
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
@Maddogg6,
By the very nature of a temp variable, you make no assumptions about its initial value. i.e. the first instruction that uses it uses it as the "R" param, thus overwriting its current value (so its initial value really makes no difference).
@Max,
Yeah, I didn't think about constants, but I suppose if 2 (or more) plugins use the same (non-hardware) constant, then making them global would also save some GPRs. Out of curiosity, if this was intended from the beginning, why is it not being done?
|
|
|
|
|
#15 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
>Out of curiousity, if this was intended from the beginning, why is it not being done?
That's the question. Since the loader is part of the driver, it's all up to E. to implement whatever he thinks is important. At the earlier stages of development we could get by without a good loader, but later there were too many other things to do. No wonder the DSP stuff had the lowest priority - E., personally, has no interest in DSP programming.
(Well, strictly speaking, if we analyse the benefit of using shared temps, it would not be so big: less than ~20 GPRs saved for a typical DSP setup. Although there are some situations where shared temps and consts could give a much bigger benefit - for example, with 2 EQG10 loaded, shared constants would save about 30 GPRs (60 for three instances and so on...).)
Last edited by Max M.; Feb 22, 2007 at 10:57 PM.
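The EQG10 figures above follow a simple pattern: if every instance needs the same set of constants, only the first copy costs anything once they are shared. A tiny Python sketch of that arithmetic, where the figure of 30 constants per instance is only inferred from the "30 for two, 60 for three" numbers in this post and is not taken from the plugin source:

```python
def gprs_saved(instances, shared_per_instance):
    """GPRs saved when all instances reuse one shared copy of the same registers."""
    if instances < 1:
        return 0
    return (instances - 1) * shared_per_instance

EQG10_SHARED_CONSTS = 30    # assumption inferred from the figures quoted in the post

for n in (1, 2, 3):
    print(n, gprs_saved(n, EQG10_SHARED_CONSTS))
# 1 0
# 2 30
# 3 60
```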
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
I understand... It is too bad as for us 10k1 users, every extra GPR helps....
|
|
|
|
|
|
#17 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
it is never too late. After all, why do a few ZS Notebook owners have more priority than a major kX contributor? (hehe, just kidding of course - but...)
|
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Hehe, well considering E. has a ZS Notebook himself (IIRC)...
And I cannot complain, as I was in a similar situation with my rear channels not working on my SB022x model... |
|
|
|
|
|
#19 | |
|
Tail Razer
Join Date: Jun 2005
Location: Bernyurass, AZ - USA
Posts: 4,027
Rep Power: 0 ![]() ![]() |
Quote:
Say 2 plugins are using a 'temp' register called 'V1'. If 'temp' (and, as it turns out, 'const' too) become global, wouldn't they interfere with each other and require changing them to 'statics'? I guess I thought I was understanding some things... maybe not?
edit: it would also just seem to make more sense to me to make a 'global' register, if DANE is in fact 'interpreted'... no?
|
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
Remember that all of the code for all the plugins (the entire DSP) is executed sequentially, so there is no chance for another plugin to change the value of your temp variable while your plugin is using it.
Also, do not forget that all we are really talking about here is an address, i.e. 'temp' points to address xxxx. 'Global' in this respect just means that all plugins can use this address to store temporary data (and again, it is temporary data: it is only valid for one sample cycle, and only valid to your plugin during the time your plugin's code is being executed). As for constants, well, a constant is a constant value, so a plugin cannot change it.
BTW: Do not get confused between the Dane interpreter and the plugin loader.
Last edited by Russ; Feb 23, 2007 at 05:04 AM.
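To make the sequential-execution argument concrete, here is a toy model in Python. It is only a sketch of the idea, not of the kX loader: the register file is a plain dictionary, the two "plugins" and all names (`run_sample`, `double_plus_one`, `negate`) are hypothetical, and each plugin writes its temp before reading it, as described above.

```python
def run_sample(plugins, inputs, shared_temp):
    """Run every plugin once, in order, as the DSP does for one sample."""
    gpr = {}                        # toy register file: address -> value
    outputs = {}
    for index, (name, code) in enumerate(plugins):
        # With sharing, every plugin gets the same scratch address; otherwise one each.
        tmp_addr = 0 if shared_temp else index
        outputs[name] = code(gpr, tmp_addr, inputs[name])
    return outputs, len(gpr)        # results and number of registers consumed

def double_plus_one(gpr, tmp, x):
    gpr[tmp] = 2 * x                # first use writes the temp (it is the "R" operand)
    return gpr[tmp] + 1

def negate(gpr, tmp, x):
    gpr[tmp] = -x                   # overwrites whatever the previous plugin left behind
    return gpr[tmp]

plugins = [("A", double_plus_one), ("B", negate)]
inputs = {"A": 5, "B": 7}
print(run_sample(plugins, inputs, shared_temp=False))   # ({'A': 11, 'B': -7}, 2)
print(run_sample(plugins, inputs, shared_temp=True))    # ({'A': 11, 'B': -7}, 1)
```

The outputs are identical in both runs; only the number of registers used changes. The sharing would break only for a plugin that reads its temp before writing it, which is exactly the assumption a temp is not allowed to make.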
|
|
|
|
|
#21 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
btw. kX's loader is the first candidate to go open source. So maybe it's time.
(I've already given my permission for Dane to be opened - although I'm not happy about that, since the Dane sources suck.) The main problem is that it is in the kernel.
|
|
|
|
|
#22 | ||
|
S-3D enthusiast
|
Quote:
Quote:
|
||
|
|
|
|
|
#23 |
|
h/h member-shmember
Join Date: Dec 2002
Location: Evil Empire
Posts: 2,639
Rep Power: 69 ![]() ![]() ![]() ![]() ![]() ![]() |
hehe. there're no comments at all
to be honest I'm ashamed of it - it is my first C program ("Hello World!").
>Can you explain that please.
Yes, sure - as you probably know, driver programming is very different from higher-level application programming - it has so many restrictions and limits that no one can be happy with it. So besides the problems with debugging (never tried to debug a kernel? hehe - I envy you), there is the problem of implementation - it is really unclear how we can actually make some parts of kX open without making every source code line open. Ah - bad explanation - I'll go further later.
Last edited by Max M.; Feb 24, 2007 at 04:53 AM. Reason: sorry, bad english
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
|
|
|
|
|
|
#25 |
|
kX Project Lead Programmer and Coordinator
Join Date: Dec 2002
Posts: 3,119
Rep Power: 75 ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
Russian variables: he-he
--
so, opensource, opensource... I see no way to do this at the moment.
however, all the registers are accessible from the user-level, so one can write a custom DSP uploader (as a superset on the iKX class), implement the load/unload/update/..._microcode functions in his own way and send me the sources to be integrated into the driver.
the only requirement - use the same data types defined in the kX SDK (if you need additional fields, let me know)
note that the driver uses practically the same functions internally, with the exception of very minor changes
---
here are the prototypes of the functions used in the kernel:
Code:
    // returns pgm id or 0 if failed
    KX_API(int,kx_load_microcode(kx_hw *hw,char *name,dsp_code *code,int code_size,
        dsp_register_info *info,int info_size,int itramsize,int xtramsize,
        char *copyright,char *engine,char *created,char *comment,char *guid,
        int force_pgm_id=0));
    KX_API(int,kx_update_microcode(kx_hw *hw,int pgm_id,char *name,dsp_code *code,int code_size,
        dsp_register_info *info,int info_size,int itramsize,int xtramsize,
        char *copyright, char *engine, char *created, char *comment, char *guid,
        unsigned int flag));
    KX_API(int,kx_unload_microcode(kx_hw *hw,int pgm));
    KX_API(int,kx_connect_microcode(kx_hw *hw,int pgm1,word src,int pgm2,word dst));
    KX_API(int,kx_connect_microcode(kx_hw *hw,int pgm1,char *src,int pgm2,char *dst));
    // if pgm2==-1 -> dst='physical input/physical output';[untranslated]
    KX_API(int,kx_disconnect_microcode(kx_hw *hw,int pgm,word src));
    KX_API(int,kx_disconnect_microcode(kx_hw *hw,int pgm,char *src));
    // if pgm==-1 src='physical input'; [untranslated]
    KX_API(int,kx_get_connections(kx_hw *hw,int pgm,kxconnections *out,int size));
    // if size==0 -> returns needed buffer size
    KX_API(int,kx_set_volume(kx_hw *hw,char *pgm_id,char *name,dword val,dword max=0x7fffffff));
    KX_API(int,kx_set_volume(kx_hw *hw,int pgm_id,word reg,dword val,dword max=0x7fffffff));
    // val=KX_MIN_VOLUME..KX_MAX_VOLUME
    KX_API(int,kx_set_microcode_name(kx_hw *hw,int pgm_id,const char *str,int what=0));
    KX_API(int,kx_enum_microcode(kx_hw *hw,int pgm,dsp_microcode *mc));
    KX_API(int,kx_enum_microcode(kx_hw *hw,char *pgm_id,dsp_microcode *mc));
    KX_API(int,kx_enum_microcode(kx_hw *hw,dsp_microcode *mc_ret,int size));
    KX_API(int,kx_get_microcode(kx_hw *hw,int pgm,dsp_code *code,int code_size,
        dsp_register_info *info,int info_size));
    // info_size & code_size - sizes of code&info buffers (in bytes);
    // if they don't match the ones returned by kx_enum_microcode -> function will fail...
as you can see, these functions are quite close to their iKX:: versions
--
unfortunately, I cannot even release the sources of the DSP functions:
1. you won't be able to compile them without the appropriate headers, some of which are NDA-covered
2. you won't be able to link them, since you don't have the driver object files/libraries
E.
|
|
|
|
|
|
|
HardwareHeaven Extreme Member
Join Date: Jan 2005
Posts: 5,563
Rep Power: 62 ![]() ![]() ![]() ![]() ![]() ![]() |
BTW: Another couple of features that would be useful:
An option to consolidate the DSP code (i.e. move free space to end).
An option to reorder the plugins.
|
|
|