SSE4 instruction set
peter at tortall.net
Wed Sep 20 22:21:13 PDT 2006
On Wed, 20 Sep 2006, Mathieu Monnier wrote:
> Attached is the promised source code. It confirms the pmulhrsw behavior,
> shows that abs(-128) = -128, and shows that pmaddubsw is considerably
> trickier than one would have thought: it does an (unsigned char x signed
> char) multiplication, followed by a signed word saturated addition. So
> pmaddubsw mm0, mm1 != pmaddubsw mm1, mm0 (that's the price to pay for a
> much more useful instruction).
Thanks! Your test and your implementation patch (with a couple of very minor
changes) have been committed as r1629.
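
For anyone following the thread, the per-lane behavior described in the quoted
message can be sketched in C roughly as follows. This is only an illustrative
model: it is not the test code attached to the original message, nor the patch
committed as r1629, and the function names (sat16, pmaddubsw_lane, pabsb_lane)
are hypothetical.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/*
 * Model of one 16-bit result lane of "pmaddubsw dst, src".
 * u0, u1: the two adjacent bytes from the destination operand (unsigned).
 * s0, s1: the two adjacent bytes from the source operand (signed).
 * The two products are added and the sum is saturated to a signed word.
 */
static int16_t pmaddubsw_lane(uint8_t u0, uint8_t u1, int8_t s0, int8_t s1)
{
    int32_t sum = (int32_t)u0 * s0 + (int32_t)u1 * s1;
    return sat16(sum);
}

/*
 * Model of one byte lane of pabsb, returning the raw result bits.
 * The negation wraps within 8 bits, so -128 maps to 0x80, which read
 * back as a signed byte is still -128.
 */
static uint8_t pabsb_lane(int8_t v)
{
    uint8_t bits = (uint8_t)v;
    return (v < 0) ? (uint8_t)(0u - bits) : bits;
}
```

With the bytes 0xFF, 0x01 in one register and 0x02, 0x01 in the other,
pmaddubsw_lane(255, 1, 2, 1) gives 255*2 + 1*1 = 511, while swapping the
operands means 0xFF is now read as -1, so pmaddubsw_lane(2, 1, -1, 1) gives
2*(-1) + 1*1 = -1. That is the operand-order dependence noted above.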