SSE4 instruction set

Mathieu Monnier manao at melix.net
Wed Sep 20 01:49:05 PDT 2006


Hi,

Attached is the promised source code. It confirms the pmulhrsw behavior,
shows that abs(-128) = -128, and shows that pmaddubsw is trickier than
one would have thought: it does an (unsigned char x signed char)
multiplication, followed by a signed word saturated addition.
So pmaddubsw mm0, mm1 != pmaddubsw mm1, mm0 (that's the price to pay
for a much more useful instruction).
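
To make the three behaviors above concrete, here is a minimal scalar
sketch in C that models a single lane of each instruction. It is not the
attached test_sse4.c: the function names (sat16, pmulhrsw_lane,
pabsb_lane, pmaddubsw_lane, run_checks) are hypothetical and chosen only
for illustration, abs(-128) is assumed to refer to the byte form (pabsb),
and the code assumes two's-complement narrowing casts and an arithmetic
right shift of negative values, as on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range (signed word saturation). */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One word lane of pmulhrsw: signed 16x16 multiply, add 0x4000 to round,
 * then keep bits 30..15 of the result. */
int16_t pmulhrsw_lane(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    /* Assumes >> on a negative int32_t is an arithmetic shift. */
    return (int16_t)((p + 0x4000) >> 15);
}

/* One byte lane of pabsb: the result is not saturated. */
int8_t pabsb_lane(int8_t x)
{
    int v = x < 0 ? -x : x;      /* computed in int, so -(-128) is 128 */
    return (int8_t)(uint8_t)v;   /* 128 is 0x80, which reads back as -128 */
}

/* One word lane of pmaddubsw dst, src: the two dst bytes are read as
 * unsigned, the two src bytes as signed, and the sum of the two products
 * is saturated to a signed word. */
int16_t pmaddubsw_lane(const uint8_t dst[2], const uint8_t src[2])
{
    int32_t sum = (int32_t)dst[0] * (int8_t)src[0]
                + (int32_t)dst[1] * (int8_t)src[1];
    return sat16(sum);
}

/* Small self-check covering the cases mentioned in the mail. */
void run_checks(void)
{
    const uint8_t a[2]   = { 0xFF, 0x00 };
    const uint8_t b[2]   = { 0x01, 0x00 };
    const uint8_t hi[2]  = { 0xFF, 0xFF };
    const uint8_t pos[2] = { 0x7F, 0x7F };
    const uint8_t neg[2] = { 0x80, 0x80 };

    assert(pmulhrsw_lane(16384, 16384) == 8192);      /* 0.5 * 0.5 in Q15 */
    assert(pmulhrsw_lane(-32768, -32768) == -32768);  /* wraps, no saturation */
    assert(pabsb_lane(-128) == -128);
    assert(pmaddubsw_lane(a, b) == 255);              /* 255 * 1 */
    assert(pmaddubsw_lane(b, a) == -1);               /* 1 * -1 */
    assert(pmaddubsw_lane(hi, pos) == 32767);         /* 64770 saturates */
    assert(pmaddubsw_lane(hi, neg) == -32768);        /* -65280 saturates */
}
```

The a/b pair is the non-commutative case: the same two byte patterns
give 255 in one operand order and -1 in the other, because only the
source operand is sign-extended.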

Regards,

Mathieu


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_sse4.c
Url: http://cvs.tortall.net/pipermail/yasm-devel/attachments/20060920/85a0e9af/attachment.c 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_sse4.asm
Url: http://cvs.tortall.net/pipermail/yasm-devel/attachments/20060920/85a0e9af/attachment.ksh 

