SSE4 instruction set
Mathieu Monnier
manao at melix.net
Wed Sep 20 01:49:05 PDT 2006
Hi,
Attached is the promised source code. It confirms the pmulhrsw behavior,
it shows that abs(-128) = -128, and it shows that pmaddubsw is trickier
than one would have thought: it does an (unsigned char x signed char)
multiplication, followed by a signed-word saturated addition.
So pmaddubsw mm0, mm1 != pmaddubsw mm1, mm0 (that's the price to pay
for a much more useful instruction).
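
For readers without the attachment, here is a minimal scalar sketch of the two points above. It is not the attached test_sse4.c. The helper names (as_signed, sat16, maddubs_lane, abs_byte) are made up for illustration, and it assumes that the first operand of pmaddubsw is the one read as unsigned bytes and that the abs in question is the byte form (pabsb). Each call models a single 16-bit result lane, not a whole register.

```c
#include <stdint.h>

/* Reinterpret a raw byte as a signed 8-bit value (two's complement). */
static int32_t as_signed(uint8_t b)
{
    return b < 128 ? (int32_t)b : (int32_t)b - 256;
}

/* Saturate a 32-bit value to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One 16-bit lane of "pmaddubsw a, b" (hypothetical scalar model):
 * bytes a0,a1 come from the first operand and are read as unsigned,
 * bytes b0,b1 come from the second operand and are read as signed.
 * The two products are added and the sum is saturated to a signed word. */
static int16_t maddubs_lane(uint8_t a0, uint8_t a1, uint8_t b0, uint8_t b1)
{
    int32_t sum = (int32_t)a0 * as_signed(b0) + (int32_t)a1 * as_signed(b1);
    return sat16(sum);
}

/* Byte absolute value without saturation: +128 does not fit in a
 * signed byte, so -128 maps back to itself. */
static int8_t abs_byte(int8_t x)
{
    if (x == INT8_MIN) return INT8_MIN;
    return (int8_t)(x < 0 ? -x : x);
}
```

With the raw bytes 0xFF,0xFF in one register and 0x01,0x01 in the other, maddubs_lane(0xFF, 0xFF, 0x01, 0x01) gives 510 (255*1 + 255*1), while swapping the operands gives -2 (1*(-1) + 1*(-1)), which is the asymmetry described above.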
Regards,
Mathieu
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_sse4.c
Url: http://cvs.tortall.net/pipermail/yasm-devel/attachments/20060920/85a0e9af/attachment.c
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_sse4.asm
Url: http://cvs.tortall.net/pipermail/yasm-devel/attachments/20060920/85a0e9af/attachment.ksh