SSE4 instruction set

Peter Johnson peter at
Wed Sep 20 22:21:13 PDT 2006

On Wed, 20 Sep 2006, Mathieu Monnier wrote:
> Attached is the promised source code. It confirms the pmulhrsw behavior, 
> shows that abs(-128) = -128, and shows that pmaddubsw is trickier than one 
> would have thought: it does an (unsigned char x signed char) 
> multiplication, followed by a signed-word saturated addition. So 
> pmaddubsw mm0, mm1 != pmaddubsw mm1, mm0 (that's the price to pay for a 
> much more useful instruction).

Thanks!  Your test and your implementation patch (with a couple of very 
minor changes) have been committed as r1629.
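
For anyone reading without the attachment, here is a minimal scalar sketch in C of two of the behaviors described in the quoted message: the operand asymmetry and saturation of pmaddubsw, and the abs(-128) = -128 wraparound. It models a single output lane only. The helper names (as_signed, sat16, pmaddubsw_word, pabsb_byte) are made up for illustration and are not taken from the attached test or the committed patch; treating the first operand as the unsigned one follows the "unsigned char x signed char" wording above.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a raw byte as a signed 8-bit value (two's complement). */
static int32_t as_signed(uint8_t b)
{
    return (b >= 0x80) ? (int32_t)b - 256 : (int32_t)b;
}

/* Clamp a 32-bit intermediate to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One output word of "pmaddubsw dst, src" (hypothetical helper).
 * dst0/dst1: two adjacent raw bytes of the first operand, read as unsigned.
 * src0/src1: the matching raw bytes of the second operand, read as signed.
 * The two products are added and the sum is saturated to a signed word. */
static int16_t pmaddubsw_word(uint8_t dst0, uint8_t dst1,
                              uint8_t src0, uint8_t src1)
{
    int32_t sum = (int32_t)dst0 * as_signed(src0)
                + (int32_t)dst1 * as_signed(src1);
    return sat16(sum);
}

/* Byte-wise absolute value on a raw byte (hypothetical helper).
 * +128 does not fit in a signed byte, so 0x80 (-128) maps back to 0x80. */
static uint8_t pabsb_byte(uint8_t raw)
{
    int32_t s = as_signed(raw);
    if (s < 0)
        s = -s;
    return (uint8_t)s;
}

/* Checks mirroring the claims in the quoted message. */
static void check_quoted_claims(void)
{
    /* Swapping the operands changes the result: 255 * 1 vs. 1 * (-1). */
    assert(pmaddubsw_word(0xFF, 0x00, 0x01, 0x00) == 255);
    assert(pmaddubsw_word(0x01, 0x00, 0xFF, 0x00) == -1);

    /* The pairwise sum saturates: 2 * 255 * 127 = 64770 clamps to 32767. */
    assert(pmaddubsw_word(0xFF, 0xFF, 0x7F, 0x7F) == INT16_MAX);

    /* abs(-128) stays -128 when the result is read back as a signed byte. */
    assert(as_signed(pabsb_byte(0x80)) == -128);
}
```

Calling check_quoted_claims() from a test program should pass silently. The real instructions apply the same per-lane arithmetic across every lane of the register, which this sketch leaves out.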


