From poirierg at gmail.com Tue Dec 2 05:20:18 2008
From: poirierg at gmail.com (Guillaume POIRIER)
Date: Tue, 2 Dec 2008 14:20:18 +0100
Subject: What does -f macho correspond to?
In-Reply-To:
References: <4e03026a0811271405i743b1fdfq789b168c027f60ad@mail.gmail.com>
Message-ID: <4e03026a0812020520y4d6d0f40pd8dd9f2fadcf7ae1@mail.gmail.com>

Hello,

On Fri, Nov 28, 2008 at 7:45 PM, Peter Johnson wrote:
> On Thu, 27 Nov 2008, Guillaume POIRIER wrote:
>>
>> I'm a bit puzzled. When I run yasm -f help, I get, among other things
>> 3 formats for OSX:
>>
>>   macho    Mac OS X ABI Mach-O File Format
>>   macho32  Mac OS X ABI Mach-O File Format (32-bit)
>>   macho64  Mac OS X ABI Mach-O File Format (64-bit)
>>
>> Since to the best of my knowledge, there are just 2 kinds of x86
>> Mach-O formats: 32-bits and 64-bits, what should I expect if I choose
>> "-f macho" on a 32-bits machine, same question for a 64-bits machine?
>>
>> I'm asking this because the split-radix FFT code of FFmpeg somehow
>> leads to code that segfaults when compiled with -f macho32 on a 32-bits
>> machine, and is compiled and runs fine when compiled with "-f macho".
>>
>> Please enlighten me here if possible... Or direct me to the right
>> mailing list if I'm off-topic here.
>
> That's very strange. Essentially all macho does is choose between macho32
> and macho64 based on the bitness of the machine architecture. E.g. -f macho
> -m amd64 should be the same as -f macho64, and -f macho -m x86 (the latter
> is the default x86 machine) should be the same as -f macho32.
>
> What are the other command line arguments? It could be you're hitting an
> odd initialization code path (all 3 share the same code so it's digging
> through conditionals at that layer).

Thanks for the explanation. Though it was logical for me that -f macho
meant "native" wordsize, now at least I'm sure.

I looked into the user code a bit more and found the culprit. It was
indeed not in yasm's code but in FFmpeg's.
Thanks a lot again,

Guillaume
--
One should not give up hope on imbeciles. With a little training, you
can make them into soldiers.
  -- Pierre Desproges

From pcwalton at cs.ucla.edu Wed Dec 3 21:50:49 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Wed, 03 Dec 2008 21:50:49 -0800
Subject: Build failure: gen_x86_insn.py
Message-ID: <49376FB9.2020305@cs.ucla.edu>

Hi,

I get a build failure when attempting to build yasm from svn.
modules/arch/x86/gen_x86_insn.py fails here:

rcstag = "$Id$"
scriptname = rcstag.split()[1]
scriptrev = rcstag.split()[2]

because the array indices are out of range. I'd submit a patch, but
I'm not sure what the goal is here... any help?

Thanks!
Patrick

From peter at tortall.net Thu Dec 4 10:19:29 2008
From: peter at tortall.net (Peter Johnson)
Date: Thu, 4 Dec 2008 10:19:29 -0800
Subject: Build failure: gen_x86_insn.py
In-Reply-To: <49376FB9.2020305@cs.ucla.edu>
References: <49376FB9.2020305@cs.ucla.edu>
Message-ID: <20081204181824.M35579@www.tortall.net>

On Wed, 03 Dec 2008 21:50:49 -0800, Patrick Walton wrote
> I get a build failure when attempting to build yasm from svn.
> modules/arch/x86/gen_x86_insn.py fails here:
>
> rcstag = "$Id$"
> scriptname = rcstag.split()[1]
> scriptrev = rcstag.split()[2]
>
> because the array indices are out of range. I'd submit a patch, but
> I'm not sure what the goal is here... any help?

$Id$ should have expanded to something like $Id: gen_x86_insn.py 1526 ...$
when you checked it out of svn. Are you going through something like git-svn
or hg-svn which doesn't expand keywords?

The scriptname and scriptrev variables are only used to print what version
generated it in a comment in the output file, so it's easy enough to just
catch the exception and force scriptname and scriptrev to fixed values (e.g.
"gen_x86_insn.py" and "HEAD"). I'll get something like that committed later
tonight; in the meantime you can just hardwire the values instead of
splitting from rcstag.
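The fallback described above could look roughly like the sketch below. It
is only an illustration, not the change that was committed: the helper name
parse_rcstag is hypothetical, and the expanded tag in the second call has a
made-up date and author. The default values are the ones suggested in the
message.

```python
def parse_rcstag(rcstag, default_name="gen_x86_insn.py", default_rev="HEAD"):
    """Return (scriptname, scriptrev) taken from an svn $Id$ keyword string.

    Falls back to fixed values when the keyword was never expanded, e.g.
    in a git-svn checkout where the tag is still the literal "$Id$".
    """
    fields = rcstag.split()
    try:
        # Expanded form: "$Id: <file> <revision> ... $"
        return fields[1], fields[2]
    except IndexError:
        # Unexpanded "$Id$" splits into a single field.
        return default_name, default_rev


# Unexpanded keyword (git-svn style checkout): the defaults are used.
print(parse_rcstag("$Id$"))
# Expanded keyword (illustrative values after the revision number).
print(parse_rcstag("$Id: gen_x86_insn.py 1526 2008-12-01 00:00:00Z someone $"))
```

Because the two values only end up in a comment in the generated file,
falling back silently seems reasonable here.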
Peter

From pcwalton at cs.ucla.edu Thu Dec 4 10:24:36 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Thu, 04 Dec 2008 10:24:36 -0800
Subject: Build failure: gen_x86_insn.py
In-Reply-To: <20081204181824.M35579@www.tortall.net>
References: <49376FB9.2020305@cs.ucla.edu>
 <20081204181824.M35579@www.tortall.net>
Message-ID: <49382064.7030001@cs.ucla.edu>

Peter Johnson wrote:
> $Id$ should have expanded to something like $Id: gen_x86_insn.py 1526 ...$
> when you checked it out of svn. Are you going through something like git-svn
> or hg-svn which doesn't expand keywords?

Ah, that's the problem. I was using git-svn. Thanks.

Patrick

From peter at tortall.net Thu Dec 4 12:09:42 2008
From: peter at tortall.net (Peter Johnson)
Date: Thu, 4 Dec 2008 12:09:42 -0800
Subject: Build failure: gen_x86_insn.py
In-Reply-To: <49382064.7030001@cs.ucla.edu>
References: <49376FB9.2020305@cs.ucla.edu>
 <20081204181824.M35579@www.tortall.net>
 <49382064.7030001@cs.ucla.edu>
Message-ID: <20081204200600.M88553@tortall.net>

On Thu, 04 Dec 2008 10:24:36 -0800, Patrick Walton wrote
> Peter Johnson wrote:
> > $Id$ should have expanded to something like $Id: gen_x86_insn.py 1526 ...$
> > when you checked it out of svn. Are you going through something like git-svn
> > or hg-svn which doesn't expand keywords?
>
> Ah, that's the problem. I was using git-svn. Thanks.

FYI, there's a public git-svn mirror of yasm available via
http://git.tortall.net/cgit.cgi/yasm.git/. Obviously I don't check out from
it very often (otherwise I would have caught this problem!), but it should
work otherwise. At the moment it's only making trunk available globally,
although I know that the git-svn repo itself is tracking all branches.

Peter

From dave at sagetv.com Sat Dec 6 16:16:25 2008
From: dave at sagetv.com (David DeHaven)
Date: Sat, 6 Dec 2008 16:16:25 -0800
Subject: Mach-O alignment fix?
Message-ID:

I think there was a misunderstanding about how the align field works
in Mach-O section headers (per note 2.3 in
modules/objfmts/macho/macho-objfmt.c). Through experimentation, it seems
ld64 preserves the section structure by using both the calculated
section address *and* the section alignment. IOW, if section alignment
is 16 bytes and the section address is 0x12345, then the final location
of that particular piece of the section (not of the entire section!)
will be located at an address ending with (0x12345 & 0x0000F) = 0x00005.

I made a little patch that forces section address alignment when
calculated in macho_objfmt_calc_sectsize and have tested a couple of
scenarios that used to crash when using SSE instructions (that require
16 byte alignment). So far it seems to be working.


Tested with xvidcore from CVS HEAD, unmodified.

Pre-patch results: (otool -vl dct/x86_asm/fdct_sse2_skal.o)
dct/x86_asm/fdct_sse2_skal.o:
...
Section
  sectname __const
   segname __DATA
      addr 0x00000b2a
      size 0x00000330
    offset 3102
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes (none)
 reserved1 0
 reserved2 0


Post-patch results:
dct/x86_asm/fdct_sse2_skal.o:
...
Section
  sectname __const
   segname __DATA
      addr 0x00000b30
      size 0x00000330
    offset 3102
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes (none)
 reserved1 0
 reserved2 0

Final alignment of the .rodata section declared in that file follows
those addresses (0x000a1bda pre, 0x000a1be0 post).

Pre-patch causes a crash in fdct_sse2_skal due to the "tan2" array
being mis-aligned:
(gdb) x/8i 0x0007f909
0x7f909 : movdqa 0xa1bea,%xmm4
...

Post (disassembled in otool) does not crash and passes all tests:
0007f909 movdqa 0x000a1be0,%xmm4

-DrD-

Index: modules/objfmts/macho/macho-objfmt.c
===================================================================
--- modules/objfmts/macho/macho-objfmt.c (revision 2155)
+++ modules/objfmts/macho/macho-objfmt.c (working copy)
@@ -978,6 +978,7 @@
     /*@null@ */ macho_objfmt_output_info *info =
         (macho_objfmt_output_info *) d;
     /*@dependent@ *//*@null@ */ macho_section_data *msd;
+    unsigned long alignment = 0, delta = 0;
 
     assert(info != NULL);
     msd = yasm_section_get_data(sect, &macho_section_data_cb);
@@ -991,8 +992,14 @@
     }
 
     /* accumulate size in memory */
+    alignment = yasm_section_get_align(sect);
     msd->vmoff = info->vmsize;
-    info->vmsize += msd->size;
+    if(alignment) {
+        delta = msd->vmoff % alignment;
+        if(delta)
+            msd->vmoff += alignment - delta;
+    }
+    info->vmsize += msd->size + delta; // need to reflect delta in vmsize, but not in section size!
 
     return 0;
 }

From peter at tortall.net Sat Dec 6 18:50:35 2008
From: peter at tortall.net (Peter Johnson)
Date: Sat, 6 Dec 2008 18:50:35 -0800 (PST)
Subject: Mach-O alignment fix?
In-Reply-To:
References:
Message-ID:

Excellent investigation and fix! Thanks for your effort in tracking this
down. Slightly modified version of your patch committed in r2161.

Thanks,
Peter

From dave at sagetv.com Sun Dec 7 12:07:26 2008
From: dave at sagetv.com (David DeHaven)
Date: Sun, 7 Dec 2008 12:07:26 -0800
Subject: Mach-O alignment fix?
In-Reply-To:
References:
Message-ID: <60186E51-B181-41A8-9C09-6FD793BCCC96@sagetv.com>

> /* align both start and end of section */

Ah, good catch, I didn't think to align the end of the section too! :)

-DrD-

From pcwalton at cs.ucla.edu Sun Dec 14 23:08:18 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Mon, 15 Dec 2008 01:08:18 -0600
Subject: Big-endian architectures and GMP
Message-ID: <49460262.1010406@cs.ucla.edu>

Hi,

I'm working on adding 32-bit PowerPC architecture support to yasm,
because hand-coding assembly in gas is like pulling teeth :) Yasm has a
very nice design and the porting was a breeze.

Currently I have a few instructions successfully assembling, but I've
hit an issue in the intnum routines. Big-endian is not supported at the
moment. It seems that the relevant function is BitVector_Block_Store,
which doesn't actually know the difference between big-endian and
little-endian architectures when reading in.
I haven't tested it, but this looks like a bug at first glance.

Is it possible to use GMP instead? The LGPL license is unfortunate, but
BitVector seems to be under LGPL too. Bignums are a complicated problem
and my first instinct would be that intnums and floatnums should take
advantage of the well maintained and tested GMP library.

Patrick

From peter at tortall.net Mon Dec 15 01:07:28 2008
From: peter at tortall.net (Peter Johnson)
Date: Mon, 15 Dec 2008 01:07:28 -0800 (PST)
Subject: Big-endian architectures and GMP
In-Reply-To: <49460262.1010406@cs.ucla.edu>
References: <49460262.1010406@cs.ucla.edu>
Message-ID:

On Mon, 15 Dec 2008, Patrick Walton wrote:
> I'm working on adding 32-bit PowerPC architecture support to yasm, because
> hand-coding assembly in gas is like pulling teeth :) Yasm has a very nice
> design and the porting was a breeze.

Thanks and that's great news! If you're willing to contribute your code
back when you've gotten further along, I'd be happy to integrate it into
the tree.

> Currently I have a few instructions successfully assembling, but I've hit an
> issue in the intnum routines. Big-endian is not supported at the moment. It
> seems that the relevant function is BitVector_Block_Store, which doesn't
> actually know the difference between big-endian and little-endian
> architectures when reading in. I haven't tested it, but this looks like a bug
> at first glance.

It's not really a bug; I've just not implemented it in the intnum wrappers
yet (e.g. see intnum.c:802). The BitVector code itself never claims to do
anything more than little endian (I'm taking advantage of Block_Store's
"portable" format being little endian). Basically the intnum code just
needs to implement big endian by doing looped 8-bit chunk reads.

> Is it possible to use GMP instead? The LGPL license is unfortunate, but
> BitVector seems to be under LGPL too. Bignums are a complicated problem and
> my first instinct would be that intnums and floatnums should take advantage
> of the well maintained and tested GMP library.

BitVector is actually triple-licensed under GPL, LGPL, and the Artistic
License. While the artistic license is poorly worded, my belief is yasm
meets the requirements of the BitVector artistic license in a BSD-like
fashion (e.g. binary only distribution is okay, as BitVector interfaces
are not directly exposed to the user by yasm). There are other
advantages to GMP (e.g. floating point arithmetic becomes possible) but
yes, the LGPL license is an issue. GMP is also adding a dependency,
which I try to avoid to make yasm as self-contained as possible.

Peter

From pcwalton at cs.ucla.edu Mon Dec 15 10:40:00 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Mon, 15 Dec 2008 12:40:00 -0600
Subject: Big-endian architectures and GMP
In-Reply-To:
References: <49460262.1010406@cs.ucla.edu>
Message-ID: <4946A480.6060601@cs.ucla.edu>

Peter Johnson wrote:
> Thanks and that's great news! If you're willing to contribute your code
> back when you've gotten further along, I'd be happy to integrate it into
> the tree.

Of course.

> BitVector is actually triple-licensed under GPL, LGPL, and the Artistic
> License. While the artistic license is poorly worded, my belief is yasm
> meets the requirements of the BitVector artistic license in a BSD-like
> fashion (e.g. binary only distribution is okay, as BitVector interfaces
> are not directly exposed to the user by yasm). There are other
> advantages to GMP (e.g. floating point arithmetic becomes possible) but
> yes, the LGPL license is an issue. GMP is also adding a dependency,
> which I try to avoid to make yasm as self-contained as possible.

How would you feel about a hand-coded 128-bit arithmetic implementation
(just with 4 32-bit words) instead of BitVector? It'd probably be faster
with less code. I could do that if you're ok with it.
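
As a rough illustration of the "4 32-bit words" idea, the sketch below
adds two 128-bit values word by word with carry propagation. It is not
code from yasm or BitVector: the names add128, to_words and from_words
are hypothetical, and the least-significant-word-first layout is an
assumption. Python is used only to show the algorithm; a real
implementation would be written in C.

```python
MASK32 = 0xFFFFFFFF  # value mask for one 32-bit word


def to_words(n):
    """Split a non-negative integer below 2**128 into four 32-bit words,
    least significant word first (assumed layout)."""
    return [(n >> (32 * i)) & MASK32 for i in range(4)]


def from_words(words):
    """Rebuild the integer from four 32-bit words (inverse of to_words)."""
    return sum(w << (32 * i) for i, w in enumerate(words))


def add128(a, b):
    """Add two 128-bit values given as four 32-bit words each.

    Returns (result_words, carry_out); carry_out is 1 when the true sum
    does not fit in 128 bits.
    """
    result = []
    carry = 0
    for wa, wb in zip(a, b):
        total = wa + wb + carry
        result.append(total & MASK32)  # keep the low 32 bits in this word
        carry = total >> 32            # pass the overflow to the next word
    return result, carry


# A carry out of the lowest word ripples into the second word.
words, carry = add128(to_words(0xFFFFFFFF), to_words(1))
print([hex(w) for w in words], carry)
```

The same loop structure would extend to eight words for 256-bit values, or
to 64-bit words, by changing the mask and the word count.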

BitVector has a very strange C style IMO, and it'd avoid the wording
issues with the Artistic License.

Patrick

From peter at tortall.net Mon Dec 15 12:07:26 2008
From: peter at tortall.net (Peter Johnson)
Date: Mon, 15 Dec 2008 12:07:26 -0800
Subject: Big-endian architectures and GMP
In-Reply-To: <4946A480.6060601@cs.ucla.edu>
References: <49460262.1010406@cs.ucla.edu> <4946A480.6060601@cs.ucla.edu>
Message-ID: <20081215194019.M70882@tortall.net>

On Mon, 15 Dec 2008 12:40:00 -0600, Patrick Walton wrote
> How would you feel about a hand-coded 128-bit arithmetic implementation
> (just with 4 32-bit words) instead of BitVector? It'd probably be
> faster with less code. I could do that if you're ok with it.
>
> BitVector has a very strange C style IMO, and it'd avoid the wording
> issues with the Artistic License.

I'm fine with that approach, particularly since you're willing to put in
the effort to write it. :) I think it's a fair amount of work due to all
the arithmetic operations, but fixing the license issue and cleaning up
the code would be great!

Yasm actually uses 256-bit values for intnums, not 128-bit, due to the
need to handle 256-bit constants for AVX. I'd like to retain the ability
to do math on 256-bit quantities if possible for consistency on the
syntax side. The common case is of course 64-bit or less, which is why
intnum is implemented as a union of long and bitvect (the C code
currently doesn't fully take advantage of this for machines with 64-bit
longs).

Yasm uses BitVector for floatnums as well (with 80-bit size), but it
shouldn't be too difficult to change floatnums to use 128-bit arithmetic
and round down to 80 bits (much as we currently round down to 64/32 bits
from 80). I believe the only other place that uses BitVector is in x86,
but there it's simply used as a collection of bits, so 128-bit would be
fine there as well.

It would also be really nice if a 2x64-bit (4x64-bit?)
implementation was relatively straightforward to derive from the
4x32-bit (8x32-bit?) implementation to improve performance on 64-bit
machines. BitVector does this, but in a way that makes the
implementation pretty complex. I'd be fine with simple ifdefs and
ignoring the 16-bit machine case. Yasm won't run in a 16-bit environment
due to memory usage anyway.

Thanks,
Peter

From pcwalton at cs.ucla.edu Fri Dec 19 12:09:48 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Fri, 19 Dec 2008 14:09:48 -0600
Subject: PowerPC relocation and truncation
Message-ID: <494BFF8C.6030509@cs.ucla.edu>

I've been working on the PowerPC relocations and there's a design
decision that I'd like to run by the other yasm developers before I
commit to it.

PowerPC code makes extensive use of truncation of long addresses to
16-bit. There are three relocations in 32-bit PowerPC ELF code, defined
as follows in the spec:

For any address x:
*_LO: x & 0xffff
*_HI: ((x >> 16) & 0xffff)
*_HA: (((x >> 16) + ((x & 0x8000) ? 1 : 0)) & 0xffff)

*_LO and *_HA are the most common relocations.

There is no standard way to represent these expressions in assembly
code. GNU GAS uses fake special symbols for these, so x@l refers to
x & 0xffff, and x@ha refers to the "high adjusted" version of x as
defined above. Apple GAS uses function-like unary operators for these:
lo16(x), hi16(x), and ha16(x).

There are a few solutions I could think of for how to handle these in
Yasm on a syntactic level, using the NASM parser:

(1) No special syntax. Just parse the expression, and, if it's close
enough to the definition of the relocation, emit the relocation in the
object file. The benefit of this is that it's NASM-like. The drawback is
that it would basically require boilerplate lo16(), hi16(), and ha16()
macros in all user code in order to actually use the assembler for
anything - these relocations are used a *lot*.

(2) Define lo16(), hi16(), and ha16() unary operators in the nasm
parser.
The benefit of this is that it could lead to cleaner user code, while
the drawback is the possibility of namespace collisions in other
architectures.

(3) Define lo16(), hi16(), and ha16() unary operators in the nasm parser
only for PowerPC. The drawback of this is adding more
architecture-specific code to the parser - there would have to be some
yasm_arch_get_custom_unary_operators() or similar function added to each
arch.

(4) Use fake ssyms for these and use the "wrt" syntax, like GAS. I don't
think this is a good solution, personally: it seems to be an abuse of
the ssym notation and isn't very NASM-like. People would wonder why
"x >> 16" didn't do what it was supposed to.

Obviously there are benefits and drawbacks to each approach, so I'd like
input before I commit to one.

Thanks,
Patrick

From peter at tortall.net Fri Dec 19 15:26:18 2008
From: peter at tortall.net (Peter Johnson)
Date: Fri, 19 Dec 2008 15:26:18 -0800
Subject: PowerPC relocation and truncation
In-Reply-To: <494BFF8C.6030509@cs.ucla.edu>
References: <494BFF8C.6030509@cs.ucla.edu>
Message-ID: <20081219231419.M1038@www.tortall.net>

On Fri, 19 Dec 2008 14:09:48 -0600, Patrick Walton wrote
> There are a few solutions I could think of for how to handle these
> in Yasm on a syntactic level, using the NASM parser:
> (1) No special syntax. Just parse the expression, and, if it's close
> enough to definition of the relocation, emit the relocation in the
> object file. The benefit of this is that it's NASM-like. The
> drawback is that it would basically require boilerplate lo16(),
> hi16(), and ha16() macros in all user code in order to actually use
> the assembler for anything - these relocations are used a *lot*.
> (2) Define lo16(), hi16(), and ha16() unary operators in the nasm
> parser. The benefit of this is that it could lead to cleaner user
> code, while the drawback is the possibility of namespace collisions
> in other architectures.
> (3) Define lo16(), hi16(), and ha16() unary operators in the nasm
> parser only for PowerPC. The drawback of this is adding more
> architecture-specific code to the parser - there would have to be
> some yasm_arch_get_custom_unary_operators() or similar function
> added to each arch.

Can't we do (1) using standard macros that are enabled only when the ppc
architecture is used, effectively doing (3)? That gives the benefit of
having them without risking namespace collisions. We already do this for
the object formats via the stdmacs member of the yasm_objfmt structure,
see objfmt.h and e.g. elf-objfmt.c line 1326+. Basically we just need to
add a similar stdmacs member to yasm_arch and handle it in yasm.c. I've
always had plans for doing this but simply didn't need it yet; when this
is done, we can also move the [bits] macro handling to be x86-specific
as it should be.

> (4) Use fake ssyms for these and use the "wrt" syntax, like GAS. I
> don't think this is a good solution, personally: it seems to be an
> abuse of the ssym notation and isn't very NASM-like. People would
> wonder why "x >> 16" didn't do what it was supposed to.

Agreed this doesn't feel NASM-like. Note for GAS compatibility we should
add special symbols for just the GAS parser.

Peter

From g.mcgarry at ieee.org Mon Dec 22 15:29:56 2008
From: g.mcgarry at ieee.org (Gregory McGarry)
Date: Mon, 22 Dec 2008 15:29:56 -0800 (PST)
Subject: pcc and yasm
Message-ID: <372467.71572.qm@web50612.mail.re2.yahoo.com>

On windows, pcc uses yasm as the assembler. You can find a snapshot of
pcc for win32 at:

http://pcc.ludd.ltu.se/ftp/pub/win32/

However, yasm6 will not compile many programs and yasm7 segfaults on
many programs. The following example comes from the pcc-test cvs module.

Anyone seen this? I thought I'd ask before setting up a yasm development
system.


simple.s:23: instruction not recognized: `_WndProc'
simple.s:34: instruction not recognized: `_WinMain'
simple.s:81: expected identifier

    .section .rodata
L2586:
    .ascii "myWindowClass\0"
    .globl _g_szClassName
_g_szClassName:
    .byte 109
    .byte 121
    .byte 87
    .byte 105
    .byte 110
    .byte 100
    .byte 111
    .byte 119
    .byte 67
    .byte 108
    .byte 97
    .byte 115
    .byte 115
    .byte 0
    .text
    .align 4
    .globl _WndProc@16
_WndProc@16:
    pushl %ebp
    movl %esp,%ebp
    subl $8,%esp
L2588:
L2590:
    movl 12(%ebp),%eax
    movl %eax,-4(%ebp)
    jmp L2592
L2593:
    pushl 8(%ebp)
    call _DestroyWindow@4
    jmp L2591
L2594:
    pushl $0
    call _PostQuitMessage@4
    jmp L2591
L2595:
    pushl 20(%ebp)
    pushl 16(%ebp)
    pushl 12(%ebp)
    pushl 8(%ebp)
    call _DefWindowProcA@16
    movl %eax,-8(%ebp)
    jmp L2589
L2592:
    cmpl $2,-4(%ebp)
    je L2594
    cmpl $16,-4(%ebp)
    je L2593
    jmp L2595
L2591:
    movl $0,-8(%ebp)
    jmp L2589
L2589:
    movl -8(%ebp),%eax
    leave
    ret $16
    .section .rodata
L2600:
    .ascii "Window Registration Failed!\0"
L2601:
    .ascii "Error!\0"
L2602:
    .ascii "The title of my window\0"
L2604:
    .ascii "Window Creation Failed!\0"
    .text
    .align 4
    .globl _WinMain@16
_WinMain@16:
    pushl %ebp
    movl %esp,%ebp
    subl $84,%esp
L2596:
L2598:
    movl $48,-48(%ebp)
    movl $0,-44(%ebp)
    movl $_WndProc@16,-40(%ebp)
    movl $0,-36(%ebp)
    movl $0,-32(%ebp)
    movl 8(%ebp),%eax
    movl %eax,-28(%ebp)
    pushl $32512
    pushl $0
    call _LoadIconA@8
    movl %eax,-24(%ebp)
    pushl $32512
    pushl $0
    call _LoadCursorA@8
    movl %eax,-20(%ebp)
    movl $6,-16(%ebp)
    movl $0,-12(%ebp)
    movl $_g_szClassName,-8(%ebp)
    pushl $32512
    pushl $0
    call _LoadIconA@8
    movl %eax,-4(%ebp)
    leal -48(%ebp),%edx
    pushl %edx
    call _RegisterClassExA@4
    cmpw $0,%ax
    jne L2599
    pushl $48
    pushl $L2601
    pushl $L2600
    pushl $0
    call _MessageBoxA@16
    movl $0,-84(%ebp)
    jmp L2597
L2599:
    pushl $0
    pushl 8(%ebp)
    pushl $0
    pushl $0
    pushl $120
    pushl $240
    pushl $-2147483648
    pushl $-2147483648
    pushl $13565952
    pushl $L2602
    pushl $_g_szClassName
    pushl $512
    call _CreateWindowExA@48
    movl %eax,-52(%ebp)
    cmpl $0,-52(%ebp)
    jne L2603
    pushl $48
    pushl $L2601
    pushl $L2604
    pushl $0
    call _MessageBoxA@16
    movl $0,-84(%ebp)
    jmp L2597
L2603:
    pushl 20(%ebp)
    pushl -52(%ebp)
    call _ShowWindow@8
    pushl -52(%ebp)
    call _UpdateWindow@4
L2605:
    pushl $0
    pushl $0
    pushl $0
    leal -80(%ebp),%edx
    pushl %edx
    call _GetMessageA@16
    cmpl $0,%eax
    jle L2606
    leal -80(%ebp),%eax
    pushl %eax
    call _TranslateMessage@4
    leal -80(%ebp),%eax
    pushl %eax
    call _DispatchMessageA@4
    jmp L2605
L2606:
    movl -72(%ebp),%eax
    movl %eax,-84(%ebp)
    jmp L2597
L2597:
    movl -84(%ebp),%eax
    leave
    ret $16
    .ident "PCC: pcc 0.9.9 (win32)"

From peter at tortall.net Tue Dec 23 23:49:30 2008
From: peter at tortall.net (Peter Johnson)
Date: Tue, 23 Dec 2008 23:49:30 -0800 (PST)
Subject: pcc and yasm
In-Reply-To: <372467.71572.qm@web50612.mail.re2.yahoo.com>
References: <372467.71572.qm@web50612.mail.re2.yahoo.com>
Message-ID:

On Mon, 22 Dec 2008, Gregory McGarry wrote:
> On windows, pcc uses yasm as the assembler. You can find a snapshot of pcc for win32 at:
>
> http://pcc.ludd.ltu.se/ftp/pub/win32/
>
> However, yasm6 will not compile many programs and yasm7 segfaults on many programs. The following example comes from the pcc-test cvs module.
>
> Anyone seen this? I thought I'd ask before setting up a yasm development system.
>
>
> simple.s:23: instruction not recognized: `_WndProc'
> simple.s:34: instruction not recognized: `_WinMain'
> simple.s:81: expected identifier

I fixed the crash (it was a double-free), but the reason it's erroring out
is the @16, @4, etc. suffixes. In GAS mode, yasm's currently set up to
only take @symbol instead of @number. As I believe the @number should
simply be part of the symbol name, I'll try to fix it that way.
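
The distinction between @symbol and @number can be sketched as below. This
is only one possible reading of the behaviour described, not yasm's lexer:
the function name split_at_suffix is hypothetical, and "sym@gotpcrel" is
used as an illustrative ELF-style relocation modifier.

```python
def split_at_suffix(ident):
    """Split a GAS-style identifier at its last '@'.

    Returns (name, modifier). A purely numeric suffix, as in the stdcall
    decoration "_WndProc@16", is treated as part of the symbol name and
    the modifier is None. A symbolic suffix, as in "sym@gotpcrel", is
    returned separately as the modifier.
    """
    base, sep, suffix = ident.rpartition("@")
    if sep and base and suffix and not suffix.isdigit():
        return base, suffix
    return ident, None


print(split_at_suffix("_WndProc@16"))   # numeric suffix stays in the name
print(split_at_suffix("sym@gotpcrel"))  # symbolic suffix is split off
print(split_at_suffix("_g_szClassName"))
```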

Thanks,
Peter

From g.mcgarry at ieee.org Sat Dec 27 00:51:53 2008
From: g.mcgarry at ieee.org (Gregory McGarry)
Date: Sat, 27 Dec 2008 00:51:53 -0800 (PST)
Subject: pcc and yasm
References: <372467.71572.qm@web50612.mail.re2.yahoo.com>
Message-ID: <780030.1103.qm@web50603.mail.re2.yahoo.com>

> > On windows, pcc uses yasm as the assembler. You can find a snapshot of pcc
> > for win32 at:
> >
> > http://pcc.ludd.ltu.se/ftp/pub/win32/
> >
> > However, yasm6 will not compile many programs and yasm7 segfaults on many
> > programs. The following example comes from the pcc-test cvs module.
> >
> > Anyone seen this? I thought I'd ask before setting up a yasm development
> > system.
> >
> >
> > simple.s:23: instruction not recognized: `_WndProc'
> > simple.s:34: instruction not recognized: `_WinMain'
> > simple.s:81: expected identifier
>
> I fixed the crash (it was a double-free), but the reason it's erroring out
> is the @16, @4, etc. suffixes. In GAS mode, yasm's currently set up to
> only take @symbol instead of @number. As I believe the @number should
> simply be part of the symbol name, I'll try to fix it that way.

Yes, the @ is considered a valid character in the symbol name. In case
it matters, os x makes heavy use of $ in symbol decorations.

Thanks for looking at this. Is there anything else I can do?

From peter at tortall.net Sun Dec 28 00:49:17 2008
From: peter at tortall.net (Peter Johnson)
Date: Sun, 28 Dec 2008 00:49:17 -0800 (PST)
Subject: pcc and yasm
In-Reply-To: <780030.1103.qm@web50603.mail.re2.yahoo.com>
References: <372467.71572.qm@web50612.mail.re2.yahoo.com>
 <780030.1103.qm@web50603.mail.re2.yahoo.com>
Message-ID:

On Sat, 27 Dec 2008, Gregory McGarry wrote:
>> I fixed the crash (it was a double-free), but the reason it's erroring out
>> is the @16, @4, etc. suffixes.
In GAS mode, yasm's currently set up to >> only take @symbol instead of @number. As I believe the @number should >> simply be part of the symbol name, I'll try to fix it that way. > > Yes, the @ is considered a valid character in the symbol name. In case it matters, os x makes heavy use of $ in symbol decorations. $ is already handled properly in yasm gas mode. @ is more difficult to handle as it serves a special purpose in ELF and thus isn't part of the symbol name there (it's used to designate special relocations such as sym at gotpcrel). GNU AS handles this by special-casing the @ character in the lexer for PE targets. I'll probably need to do something similar (but hopefully cleaner than in GNU AS, where it's a #defined token in the lexer lookup table). The cleanest solution I can think of without really hacking the lexer up with conditional rules is to scan identifiers for @ in elf mode. > Thanks for looking at this. Is there anything else I can do? I've been busy with family for Christmas (two 18-month-old nieces and a 5-year-old niece kept me plenty occupied!) and haven't had much time to work on fixing this yet. I'll try to get a fix committed in the next few days. Thanks, Peter From poirierg at gmail.com Tue Dec 2 05:20:18 2008 From: poirierg at gmail.com (Guillaume POIRIER) Date: Tue, 2 Dec 2008 14:20:18 +0100 Subject: What does -f macho correspond to? In-Reply-To: References: <4e03026a0811271405i743b1fdfq789b168c027f60ad@mail.gmail.com> Message-ID: <4e03026a0812020520y4d6d0f40pd8dd9f2fadcf7ae1@mail.gmail.com> Hello, On Fri, Nov 28, 2008 at 7:45 PM, Peter Johnson wrote: > On Thu, 27 Nov 2008, Guillaume POIRIER wrote: >> >> I'm a bit puzzled. 
When I run yasm -f help, I get, among other things >> 3 formats for OSX: >> >> macho Mac OS X ABI Mach-O File Format >> macho32 Mac OS X ABI Mach-O File Format (32-bit) >> macho64 Mac OS X ABI Mach-O File Format (64-bit) >> >> Since to the best of my knowledge, there are just 2 kinds of x86 >> Mach-O formats: 32-bits and 64-bits, what should I expect if I choose >> "-f macho" on a 32-bits machine, same question for a 64-bits machine? >> >> I'm asking this because the split-radix FFT code of FFmpeg somewhat >> lead code that segfaults when compiled with -f macho32 on a 32-bits >> machine, and is compiled and runs fine when compiled with "-f macho". >> >> Please lighten me up here if possible... Or direct me to the right >> mailing list if I'm off-topic here. > > That's very strange. Essentially all macho does is choose between macho32 > and macho64 based on the bitness of the machine architecture. E.g. -f macho > -m amd64 should be the same as -f macho64, and -f macho -m x86 (the latter > is the default x86 machine) should be the same as -f macho32. > > What are the other command line arguments? It could be you're hitting an > odd initialization code path (all 3 share the same code so it's digging > through conditionals at that layer). Thanks for the explanation. Though it was logical for me that -f macho meant "native" wordsize, now I'm least I'm sure. I looked into the user code a bit more and found the culpit. It wasn't indeed in yasm code but on FFmpeg's. Thanks a lot again, Guillaume -- One should not give up hope on imbeciles. With a little training, you can make them into soldiers. -- Pierre Desproges From pcwalton at cs.ucla.edu Wed Dec 3 21:50:49 2008 From: pcwalton at cs.ucla.edu (Patrick Walton) Date: Wed, 03 Dec 2008 21:50:49 -0800 Subject: Build failure: gen_x86_insn.py Message-ID: <49376FB9.2020305@cs.ucla.edu> Hi, I get a build failure when attempting to build yasm from svn. 
modules/arch/x86/gen_x86_insn.py fails here: rcstag = "$Id$" scriptname = rcstag.split()[1] scriptrev = rcstag.split()[2] because the array indices are out of range. I'd submit a patch, but I'm not sure what the goal is here... any help? Thanks! Patrick From peter at tortall.net Thu Dec 4 10:19:29 2008 From: peter at tortall.net (Peter Johnson) Date: Thu, 4 Dec 2008 10:19:29 -0800 Subject: Build failure: gen_x86_insn.py In-Reply-To: <49376FB9.2020305@cs.ucla.edu> References: <49376FB9.2020305@cs.ucla.edu> Message-ID: <20081204181824.M35579@www.tortall.net> On Wed, 03 Dec 2008 21:50:49 -0800, Patrick Walton wrote > I get a build failure when attempting to build yasm from svn. > modules/arch/x86/gen_x86_insn.py fails here: > > rcstag = "$Id$" > scriptname = rcstag.split()[1] > scriptrev = rcstag.split()[2] > > because the array indices are out of range. I'd submit a patch, but > I'm not sure what the goal is here... any help? $Id$ should have expanded to something like $Id: gen_x86_insn.py 1526 ...$ when you checked it out of svn. Are you going through something like git-svn or hg-svn which doesn't expand keywords? The scriptname and scriptrev variables are only used to print what version generated it in a comment in the output file, so it's easy enough to just catch the exception and force scriptname and scriptrev to fixed values (e.g. "gen_x86_insn.py" and "HEAD"). I'll get something like that committed later tonight; in the meantime you can just hardwire the values instead of splitting from rcstag. 
Peter

From pcwalton at cs.ucla.edu Thu Dec 4 10:24:36 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Thu, 04 Dec 2008 10:24:36 -0800
Subject: Build failure: gen_x86_insn.py
In-Reply-To: <20081204181824.M35579@www.tortall.net>
References: <49376FB9.2020305@cs.ucla.edu> <20081204181824.M35579@www.tortall.net>
Message-ID: <49382064.7030001@cs.ucla.edu>

Peter Johnson wrote:
> $Id$ should have expanded to something like $Id: gen_x86_insn.py 1526 ...$
> when you checked it out of svn. Are you going through something like git-svn
> or hg-svn which doesn't expand keywords?

Ah, that's the problem. I was using git-svn. Thanks.

Patrick

From peter at tortall.net Thu Dec 4 12:09:42 2008
From: peter at tortall.net (Peter Johnson)
Date: Thu, 4 Dec 2008 12:09:42 -0800
Subject: Build failure: gen_x86_insn.py
In-Reply-To: <49382064.7030001@cs.ucla.edu>
References: <49376FB9.2020305@cs.ucla.edu> <20081204181824.M35579@www.tortall.net> <49382064.7030001@cs.ucla.edu>
Message-ID: <20081204200600.M88553@tortall.net>

On Thu, 04 Dec 2008 10:24:36 -0800, Patrick Walton wrote
> Peter Johnson wrote:
> > $Id$ should have expanded to something like $Id: gen_x86_insn.py 1526 ...$
> > when you checked it out of svn. Are you going through something like git-svn
> > or hg-svn which doesn't expand keywords?
>
> Ah, that's the problem. I was using git-svn. Thanks.

FYI, there's a public git-svn mirror of yasm available via
http://git.tortall.net/cgit.cgi/yasm.git/. Obviously I don't check out from
it very often (otherwise I should have caught this problem!), but it should
work otherwise. At the moment it's only making trunk available globally,
although I know that the git-svn repo itself is tracking all branches.

Peter

From dave at sagetv.com Sat Dec 6 16:16:25 2008
From: dave at sagetv.com (David DeHaven)
Date: Sat, 6 Dec 2008 16:16:25 -0800
Subject: Mach-O alignment fix?
Message-ID:

I think there was a misunderstanding about how the align field works in
Mach-O section headers (per note 2.3 in
modules/objfmts/macho/macho-objfmt.c). Through experimentation, it seems ld64
preserves the section structure by using both the calculated section address
*and* the section alignment. IOW, if section alignment is 16 bytes and the
section address is 0x12345, then the final location of that particular piece
of the section (not of the entire section!) will be located at an address
ending with (0x12345 & 0x0000F) = 0x00005.

I made a little patch that forces section address alignment when calculated
in macho_objfmt_calc_sectsize and have tested a couple of scenarios that used
to crash when using SSE instructions (that require 16-byte alignment). So far
it seems to be working.


Tested with xvidcore from CVS HEAD, unmodified.

Pre-patch results: (otool -vl dct/x86_asm/fdct_sse2_skal.o)
dct/x86_asm/fdct_sse2_skal.o:
...
Section
  sectname __const
   segname __DATA
      addr 0x00000b2a
      size 0x00000330
    offset 3102
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes (none)
 reserved1 0
 reserved2 0


Post-patch results:
dct/x86_asm/fdct_sse2_skal.o:
...
Section
  sectname __const
   segname __DATA
      addr 0x00000b30
      size 0x00000330
    offset 3102
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes (none)
 reserved1 0
 reserved2 0

Final alignment of the .rodata section declared in that file follows those
addresses (0x000a1bda pre, 0x000a1be0 post).

Pre-patch causes a crash in fdct_sse2_skal due to the "tan2" array being
mis-aligned:
(gdb) x/8i 0x0007f909
0x7f909 : movdqa 0xa1bea,%xmm4
...
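
The address arithmetic behind the dumps above can be checked with a short
sketch. It is not the yasm code and not the change committed to the tree;
align_up and place_section are hypothetical names used only for illustration,
and the only data taken from this message are the addresses shown by otool
and gdb.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (0 = no constraint)."""
    if alignment == 0:
        return offset
    remainder = offset % alignment
    if remainder == 0:
        return offset
    return offset + (alignment - remainder)


def place_section(vmsize, size, alignment):
    """Return (section start, new running vmsize) for one section.

    The padding inserted before the section is alignment - remainder,
    so the running size grows by padding + size.
    """
    start = align_up(vmsize, alignment)
    return start, start + size


# Addresses taken from the otool and gdb output above.
start, end = place_section(0xb2a, 0x330, 16)
print(hex(start), hex(end))        # 0xb30 0xe60
print(0xa1bea % 16, 0xa1be0 % 16)  # 10 0
```

movdqa faults when its memory operand is not 16-byte aligned, which is why
the nonzero remainder of the pre-patch operand address matters.
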

Post (disassembled in otool) does not crash and passes all tests:
0007f909 movdqa 0x000a1be0,%xmm4

-DrD-

Index: modules/objfmts/macho/macho-objfmt.c
===================================================================
--- modules/objfmts/macho/macho-objfmt.c (revision 2155)
+++ modules/objfmts/macho/macho-objfmt.c (working copy)
@@ -978,6 +978,7 @@
     /*@null@ */ macho_objfmt_output_info *info =
         (macho_objfmt_output_info *) d;
     /*@dependent@ *//*@null@ */ macho_section_data *msd;
+    unsigned long alignment = 0, delta = 0;
 
     assert(info != NULL);
     msd = yasm_section_get_data(sect, &macho_section_data_cb);
@@ -991,8 +992,14 @@
     }
 
     /* accumulate size in memory */
+    alignment = yasm_section_get_align(sect);
     msd->vmoff = info->vmsize;
-    info->vmsize += msd->size;
+    if(alignment) {
+        delta = msd->vmoff % alignment;
+        if(delta)
+            msd->vmoff += alignment - delta;
+    }
+    info->vmsize += msd->size + delta; // need to reflect delta in vmsize, but not in section size!
 
     return 0;
 }

From peter at tortall.net Sat Dec 6 18:50:35 2008
From: peter at tortall.net (Peter Johnson)
Date: Sat, 6 Dec 2008 18:50:35 -0800 (PST)
Subject: Mach-O alignment fix?
In-Reply-To:
References:
Message-ID:

Excellent investigation and fix! Thanks for your effort in tracking this
down. Slightly modified version of your patch committed in r2161.

Thanks,
Peter

From dave at sagetv.com Sun Dec 7 12:07:26 2008
From: dave at sagetv.com (David DeHaven)
Date: Sun, 7 Dec 2008 12:07:26 -0800
Subject: Mach-O alignment fix?
In-Reply-To:
References:
Message-ID: <60186E51-B181-41A8-9C09-6FD793BCCC96@sagetv.com>

> /* align both start and end of section */

Ah, good catch, I didn't think to align the end of the section too! :)

-DrD-

From pcwalton at cs.ucla.edu Sun Dec 14 23:08:18 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Mon, 15 Dec 2008 01:08:18 -0600
Subject: Big-endian architectures and GMP
Message-ID: <49460262.1010406@cs.ucla.edu>

Hi,

I'm working on adding 32-bit PowerPC architecture support to yasm, because
hand-coding assembly in gas is like pulling teeth :) Yasm has a very nice
design and the porting was a breeze.

Currently I have a few instructions successfully assembling, but I've hit an
issue in the intnum routines. Big-endian is not supported at the moment. It
seems that the relevant function is BitVector_Block_Store, which doesn't
actually know the difference between big-endian and little-endian
architectures when reading in.
I haven't tested it, but this looks like a bug at first glance.

Is it possible to use GMP instead? The LGPL license is unfortunate, but
BitVector seems to be under LGPL too. Bignums are a complicated problem and
my first instinct would be that intnums and floatnums should take advantage
of the well maintained and tested GMP library.

Patrick

From peter at tortall.net Mon Dec 15 01:07:28 2008
From: peter at tortall.net (Peter Johnson)
Date: Mon, 15 Dec 2008 01:07:28 -0800 (PST)
Subject: Big-endian architectures and GMP
In-Reply-To: <49460262.1010406@cs.ucla.edu>
References: <49460262.1010406@cs.ucla.edu>
Message-ID:

On Mon, 15 Dec 2008, Patrick Walton wrote:
> I'm working on adding 32-bit PowerPC architecture support to yasm, because
> hand-coding assembly in gas is like pulling teeth :) Yasm has a very nice
> design and the porting was a breeze.

Thanks and that's great news! If you're willing to contribute your code
back when you've gotten further along, I'd be happy to integrate it into
the tree.

> Currently I have a few instructions successfully assembling, but I've hit an
> issue in the intnum routines. Big-endian is not supported at the moment. It
> seems that the relevant function is BitVector_Block_Store, which doesn't
> actually know the difference between big-endian and little-endian
> architectures when reading in. I haven't tested it, but this looks like a bug
> at first glance.

It's not really a bug; I've just not implemented it in the intnum wrappers
yet (e.g. see intnum.c:802). The BitVector code itself never claims to do
anything more than little endian (I'm taking advantage of Block_Store's
"portable" format being little endian). Basically the intnum code just
needs to implement big endian by doing looped 8-bit chunk reads.

> Is it possible to use GMP instead? The LGPL license is unfortunate, but
> BitVector seems to be under LGPL too. Bignums are a complicated problem and
> my first instinct would be that intnums and floatnums should take advantage
> of the well maintained and tested GMP library.

BitVector is actually triple-licensed under GPL, LGPL, and the Artistic
License. While the artistic license is poorly worded, my belief is yasm
meets the requirements of the BitVector artistic license in a BSD-like
fashion (e.g. binary only distribution is okay, as BitVector interfaces
are not directly exposed to the user by yasm). There are other
advantages to GMP (e.g. floating point arithmetic becomes possible) but
yes, the LGPL license is an issue. GMP is also adding a dependency,
which I try to avoid to make yasm as self-contained as possible.

Peter

From pcwalton at cs.ucla.edu Mon Dec 15 10:40:00 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Mon, 15 Dec 2008 12:40:00 -0600
Subject: Big-endian architectures and GMP
In-Reply-To:
References: <49460262.1010406@cs.ucla.edu>
Message-ID: <4946A480.6060601@cs.ucla.edu>

Peter Johnson wrote:
> Thanks and that's great news! If you're willing to contribute your code
> back when you've gotten further along, I'd be happy to integrate it into
> the tree.

Of course.

> BitVector is actually triple-licensed under GPL, LGPL, and the Artistic
> License. While the artistic license is poorly worded, my belief is yasm
> meets the requirements of the BitVector artistic license in a BSD-like
> fashion (e.g. binary only distribution is okay, as BitVector interfaces
> are not directly exposed to the user by yasm). There are other
> advantages to GMP (e.g. floating point arithmetic becomes possible) but
> yes, the LGPL license is an issue. GMP is also adding a dependency,
> which I try to avoid to make yasm as self-contained as possible.

How would you feel about a hand-coded 128-bit arithmetic implementation
(just with 4 32-bit words) instead of BitVector? It'd probably be faster
with less code. I could do that if you're ok with it.
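
As a rough illustration of the two ideas discussed in this thread (a
fixed-width integer held in four 32-bit words, and big-endian output produced
one 8-bit chunk at a time), a minimal sketch follows. It is written in Python
for brevity, whereas yasm itself is C; every name in it (to_limbs, add128,
store_bytes) is hypothetical, and none of it is yasm or BitVector code.

```python
MASK32 = 0xFFFFFFFF
LIMBS = 4  # 4 x 32 bits = 128 bits, least significant word first


def to_limbs(value):
    """Split a non-negative integer into four 32-bit words."""
    return [(value >> (32 * i)) & MASK32 for i in range(LIMBS)]


def from_limbs(limbs):
    """Recombine the words into one integer (used here only for checking)."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))


def add128(a, b):
    """Add two 4-word values, propagating the carry word by word.

    The carry out of the top word is dropped, so the result wraps
    modulo 2**128.
    """
    result = []
    carry = 0
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total & MASK32)
        carry = total >> 32
    return result


def store_bytes(limbs, nbytes, bigendian):
    """Emit the low nbytes of the value, one 8-bit chunk per loop step."""
    chunks = []
    for i in range(nbytes):
        word = limbs[i // 4]
        chunks.append((word >> (8 * (i % 4))) & 0xFF)
    if bigendian:
        chunks.reverse()  # most significant byte first
    return bytes(chunks)


total = add128(to_limbs(0xFFFFFFFF), to_limbs(1))
print(hex(from_limbs(total)))                            # 0x100000000
print(store_bytes(to_limbs(0x12345678), 4, True).hex())  # 12345678
```

The same word-by-word carry loop extends to eight words for 256-bit values;
only LIMBS changes.
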

BitVector has a very strange C style IMO, and it'd avoid the wording issues
with the Artistic License.

Patrick

From peter at tortall.net Mon Dec 15 12:07:26 2008
From: peter at tortall.net (Peter Johnson)
Date: Mon, 15 Dec 2008 12:07:26 -0800
Subject: Big-endian architectures and GMP
In-Reply-To: <4946A480.6060601@cs.ucla.edu>
References: <49460262.1010406@cs.ucla.edu> <4946A480.6060601@cs.ucla.edu>
Message-ID: <20081215194019.M70882@tortall.net>

On Mon, 15 Dec 2008 12:40:00 -0600, Patrick Walton wrote
> How would you feel about a hand-coded 128-bit arithmetic
> implementation
> (just with 4 32-bit words) instead of BitVector? It'd probably be
> faster with less code. I could do that if you're ok with it.
>
> BitVector has a very strange C style IMO, and it'd avoid the wording
> issues with the Artistic License.

I'm fine with that approach, particularly since you're willing to put in the
effort to write it. :) I think it's a fair amount of work due to all the
arithmetic operations, but fixing the license issue and cleaning up the code
would be great!

Yasm actually uses 256-bit values for intnums, not 128-bit, due to the need
to handle 256-bit constants for AVX. I'd like to retain the ability to do
math on 256-bit quantities if possible for consistency on the syntax side.
The common case is of course 64-bit or less, which is why intnum is
implemented as a union of long and bitvect (the C code currently doesn't
fully take advantage of this for machines with 64-bit longs).

Yasm uses BitVector for floatnums as well (with 80-bit size), but it
shouldn't be too difficult to change floatnums to use 128-bit arithmetic and
round down to 80 bits (much as we currently round down to 64/32 bits from
80). I believe the only other place that uses BitVector is in x86, but there
it's simply used as a collection of bits, so 128-bit would be fine there as
well.

It would also be really nice if a 2x64-bit (4x64-bit?)
implementation was relatively straightforward to derive from the 4x32-bit
(8x32-bit?) implementation to improve performance on 64-bit machines.
BitVector does this, but in a way that makes the implementation pretty
complex. I'd be fine with simple ifdefs and ignoring the 16-bit machine case.
Yasm won't run in a 16-bit environment due to memory usage anyway.

Thanks,
Peter

From pcwalton at cs.ucla.edu Fri Dec 19 12:09:48 2008
From: pcwalton at cs.ucla.edu (Patrick Walton)
Date: Fri, 19 Dec 2008 14:09:48 -0600
Subject: PowerPC relocation and truncation
Message-ID: <494BFF8C.6030509@cs.ucla.edu>

I've been working on the PowerPC relocations and there's a design decision
that I'd like to run by the other yasm developers before I commit to it.

PowerPC code makes extensive use of truncation of long addresses to 16-bit.
There are three relocations in 32-bit PowerPC ELF code, defined as follows
in the spec:

For any address x:
*_LO: x & 0xffff
*_HI: ((x >> 16) & 0xffff)
*_HA: (((x >> 16) + ((x & 0x8000) ? 1 : 0)) & 0xffff)

*_LO and *_HA are the most common relocations.

There is no standard way to represent these expressions in assembly code.
GNU GAS uses fake special symbols for these, so x@l refers to x & 0xffff,
and x@ha refers to the "high adjusted" version of x as defined above. Apple
GAS uses function-like unary operators for these: lo16(x), hi16(x), and
ha16(x).

There are a few solutions I could think of for how to handle these in Yasm
on a syntactic level, using the NASM parser:

(1) No special syntax. Just parse the expression, and, if it's close enough
to the definition of the relocation, emit the relocation in the object file.
The benefit of this is that it's NASM-like. The drawback is that it would
basically require boilerplate lo16(), hi16(), and ha16() macros in all user
code in order to actually use the assembler for anything - these relocations
are used a *lot*.

(2) Define lo16(), hi16(), and ha16() unary operators in the nasm parser.
The benefit of this is that it could lead to cleaner user code, while the
drawback is the possibility of namespace collisions in other architectures.

(3) Define lo16(), hi16(), and ha16() unary operators in the nasm parser
only for PowerPC. The drawback of this is adding more architecture-specific
code to the parser - there would have to be some
yasm_arch_get_custom_unary_operators() or similar function added to each
arch.

(4) Use fake ssyms for these and use the "wrt" syntax, like GAS. I don't
think this is a good solution, personally: it seems to be an abuse of the
ssym notation and isn't very NASM-like. People would wonder why "x >> 16"
didn't do what it was supposed to.

Obviously there are benefits and drawbacks to each approach, so I'd like
input before I commit to one.

Thanks,
Patrick

From peter at tortall.net Fri Dec 19 15:26:18 2008
From: peter at tortall.net (Peter Johnson)
Date: Fri, 19 Dec 2008 15:26:18 -0800
Subject: PowerPC relocation and truncation
In-Reply-To: <494BFF8C.6030509@cs.ucla.edu>
References: <494BFF8C.6030509@cs.ucla.edu>
Message-ID: <20081219231419.M1038@www.tortall.net>

On Fri, 19 Dec 2008 14:09:48 -0600, Patrick Walton wrote
> There are a few solutions I could think of for how to handle these
> in Yasm on a syntactic level, using the NASM parser:
> (1) No special syntax. Just parse the expression, and, if it's close
> enough to definition of the relocation, emit the relocation in the
> object file. The benefit of this is that it's NASM-like. The
> drawback is that it would basically require boilerplate lo16(),
> hi16(), and ha16() macros in all user code in order to actually use
> the assembler for anything - these relocations are used a *lot*.
> (2) Define lo16(), hi16(), and ha16() unary operators in the nasm
> parser. The benefit of this is that it could lead to cleaner user
> code, while the drawback is the possibility of namespace collisions
> in other architectures.
> (3) Define lo16(), hi16(), and ha16() unary operators in the nasm
> parser only for PowerPC. The drawback of this is adding more
> architecture-specific code to the parser - there would have to be
> some yasm_arch_get_custom_unary_operators() or similar function
> added to each arch.

Can't we do (1) using standard macros that are enabled only when the ppc
architecture is used, effectively doing (3)? That gives the benefit of
having them without risking namespace collisions. We already do this for the
object formats via the stdmacs member of the yasm_objfmt structure, see
objfmt.h and e.g. elf-objfmt.cpp line 1326+. Basically we just need to add a
similar stdmacs member to yasm_arch and handle it in yasm.c. I've always had
plans for doing this but simply didn't need it yet; when this is done, we
can also move the [bits] macro handling to be x86-specific as it should be.

> (4) Use fake ssyms for these and use the "wrt" syntax, like GAS. I
> don't think this is a good solution, personally: it seems to be an
> abuse of the ssym notation and isn't very NASM-like. People would
> wonder why "x >> 16" didn't do what it was supposed to.

Agreed this doesn't feel NASM-like. Note for GAS compatibility we should add
special symbols for just the GAS parser.

Peter

From g.mcgarry at ieee.org Mon Dec 22 15:29:56 2008
From: g.mcgarry at ieee.org (Gregory McGarry)
Date: Mon, 22 Dec 2008 15:29:56 -0800 (PST)
Subject: pcc and yasm
Message-ID: <372467.71572.qm@web50612.mail.re2.yahoo.com>

On windows, pcc uses yasm as the assembler. You can find a snapshot of pcc
for win32 at:

http://pcc.ludd.ltu.se/ftp/pub/win32/

However, yasm6 will not compile many programs and yasm7 segfaults on many
programs. The following example comes from the pcc-test cvs module.

Anyone seen this? I thought I'd ask before setting up a yasm development
system.

simple.s:23: instruction not recognized: `_WndProc'
simple.s:34: instruction not recognized: `_WinMain'
simple.s:81: expected identifier

    .section .rodata
L2586:
    .ascii "myWindowClass\0"
    .globl _g_szClassName
_g_szClassName:
    .byte 109
    .byte 121
    .byte 87
    .byte 105
    .byte 110
    .byte 100
    .byte 111
    .byte 119
    .byte 67
    .byte 108
    .byte 97
    .byte 115
    .byte 115
    .byte 0
    .text
    .align 4
    .globl _WndProc@16
_WndProc@16:
    pushl %ebp
    movl %esp,%ebp
    subl $8,%esp
L2588:
L2590:
    movl 12(%ebp),%eax
    movl %eax,-4(%ebp)
    jmp L2592
L2593:
    pushl 8(%ebp)
    call _DestroyWindow@4
    jmp L2591
L2594:
    pushl $0
    call _PostQuitMessage@4
    jmp L2591
L2595:
    pushl 20(%ebp)
    pushl 16(%ebp)
    pushl 12(%ebp)
    pushl 8(%ebp)
    call _DefWindowProcA@16
    movl %eax,-8(%ebp)
    jmp L2589
L2592:
    cmpl $2,-4(%ebp)
    je L2594
    cmpl $16,-4(%ebp)
    je L2593
    jmp L2595
L2591:
    movl $0,-8(%ebp)
    jmp L2589
L2589:
    movl -8(%ebp),%eax
    leave
    ret $16
    .section .rodata
L2600:
    .ascii "Window Registration Failed!\0"
L2601:
    .ascii "Error!\0"
L2602:
    .ascii "The title of my window\0"
L2604:
    .ascii "Window Creation Failed!\0"
    .text
    .align 4
    .globl _WinMain@16
_WinMain@16:
    pushl %ebp
    movl %esp,%ebp
    subl $84,%esp
L2596:
L2598:
    movl $48,-48(%ebp)
    movl $0,-44(%ebp)
    movl $_WndProc@16,-40(%ebp)
    movl $0,-36(%ebp)
    movl $0,-32(%ebp)
    movl 8(%ebp),%eax
    movl %eax,-28(%ebp)
    pushl $32512
    pushl $0
    call _LoadIconA@8
    movl %eax,-24(%ebp)
    pushl $32512
    pushl $0
    call _LoadCursorA@8
    movl %eax,-20(%ebp)
    movl $6,-16(%ebp)
    movl $0,-12(%ebp)
    movl $_g_szClassName,-8(%ebp)
    pushl $32512
    pushl $0
    call _LoadIconA@8
    movl %eax,-4(%ebp)
    leal -48(%ebp),%edx
    pushl %edx
    call _RegisterClassExA@4
    cmpw $0,%ax
    jne L2599
    pushl $48
    pushl $L2601
    pushl $L2600
    pushl $0
    call _MessageBoxA@16
    movl $0,-84(%ebp)
    jmp L2597
L2599:
    pushl $0
    pushl 8(%ebp)
    pushl $0
    pushl $0
    pushl $120
    pushl $240
    pushl $-2147483648
    pushl $-2147483648
    pushl $13565952
    pushl $L2602
    pushl $_g_szClassName
    pushl $512
    call _CreateWindowExA@48
    movl %eax,-52(%ebp)
    cmpl $0,-52(%ebp)
    jne L2603
    pushl $48
    pushl $L2601
    pushl $L2604
    pushl $0
    call _MessageBoxA@16
    movl $0,-84(%ebp)
    jmp L2597
L2603:
    pushl 20(%ebp)
    pushl -52(%ebp)
    call _ShowWindow@8
    pushl -52(%ebp)
    call _UpdateWindow@4
L2605:
    pushl $0
    pushl $0
    pushl $0
    leal -80(%ebp),%edx
    pushl %edx
    call _GetMessageA@16
    cmpl $0,%eax
    jle L2606
    leal -80(%ebp),%eax
    pushl %eax
    call _TranslateMessage@4
    leal -80(%ebp),%eax
    pushl %eax
    call _DispatchMessageA@4
    jmp L2605
L2606:
    movl -72(%ebp),%eax
    movl %eax,-84(%ebp)
    jmp L2597
L2597:
    movl -84(%ebp),%eax
    leave
    ret $16
    .ident "PCC: pcc 0.9.9 (win32)"

From peter at tortall.net Tue Dec 23 23:49:30 2008
From: peter at tortall.net (Peter Johnson)
Date: Tue, 23 Dec 2008 23:49:30 -0800 (PST)
Subject: pcc and yasm
In-Reply-To: <372467.71572.qm@web50612.mail.re2.yahoo.com>
References: <372467.71572.qm@web50612.mail.re2.yahoo.com>
Message-ID:

On Mon, 22 Dec 2008, Gregory McGarry wrote:
> On windows, pcc uses yasm as the assembler. You can find a snapshot of pcc for win32 at:
>
> http://pcc.ludd.ltu.se/ftp/pub/win32/
>
> However, yasm6 will not compile many programs and yasm7 segfaults on many programs. The following example comes from the pcc-test cvs module.
>
> Anyone seen this? I thought I'd ask before setting up a yasm development system.
>
>
> simple.s:23: instruction not recognized: `_WndProc'
> simple.s:34: instruction not recognized: `_WinMain'
> simple.s:81: expected identifier

I fixed the crash (it was a double-free), but the reason it's erroring out
is the @16, @4, etc. suffixes. In GAS mode, yasm's currently set up to
only take @symbol instead of @number. As I believe the @number should
simply be part of the symbol name, I'll try to fix it that way.

Thanks,
Peter

From g.mcgarry at ieee.org Sat Dec 27 00:51:53 2008
From: g.mcgarry at ieee.org (Gregory McGarry)
Date: Sat, 27 Dec 2008 00:51:53 -0800 (PST)
Subject: pcc and yasm
References: <372467.71572.qm@web50612.mail.re2.yahoo.com>
Message-ID: <780030.1103.qm@web50603.mail.re2.yahoo.com>

> > On windows, pcc uses yasm as the assembler. You can find a snapshot of pcc for win32 at:
> >
> > http://pcc.ludd.ltu.se/ftp/pub/win32/
> >
> > However, yasm6 will not compile many programs and yasm7 segfaults on many programs. The following example comes from the pcc-test cvs module.
> >
> > Anyone seen this? I thought I'd ask before setting up a yasm development system.
> >
> >
> > simple.s:23: instruction not recognized: `_WndProc'
> > simple.s:34: instruction not recognized: `_WinMain'
> > simple.s:81: expected identifier
>
> I fixed the crash (it was a double-free), but the reason it's erroring out
> is the @16, @4, etc. suffixes. In GAS mode, yasm's currently set up to
> only take @symbol instead of @number. As I believe the @number should
> simply be part of the symbol name, I'll try to fix it that way.

Yes, the @ is considered a valid character in the symbol name. In case it matters, os x makes heavy use of $ in symbol decorations.

Thanks for looking at this. Is there anything else I can do?

From peter at tortall.net Sun Dec 28 00:49:17 2008
From: peter at tortall.net (Peter Johnson)
Date: Sun, 28 Dec 2008 00:49:17 -0800 (PST)
Subject: pcc and yasm
In-Reply-To: <780030.1103.qm@web50603.mail.re2.yahoo.com>
References: <372467.71572.qm@web50612.mail.re2.yahoo.com> <780030.1103.qm@web50603.mail.re2.yahoo.com>
Message-ID:

On Sat, 27 Dec 2008, Gregory McGarry wrote:
>> I fixed the crash (it was a double-free), but the reason it's erroring out
>> is the @16, @4, etc. suffixes.
>> In GAS mode, yasm's currently set up to
>> only take @symbol instead of @number. As I believe the @number should
>> simply be part of the symbol name, I'll try to fix it that way.
>
> Yes, the @ is considered a valid character in the symbol name. In case it matters, os x makes heavy use of $ in symbol decorations.

$ is already handled properly in yasm gas mode. @ is more difficult to handle as it serves a special purpose in ELF and thus isn't part of the symbol name there (it's used to designate special relocations such as sym@gotpcrel). GNU AS handles this by special-casing the @ character in the lexer for PE targets. I'll probably need to do something similar (but hopefully cleaner than in GNU AS, where it's a #defined token in the lexer lookup table). The cleanest solution I can think of without really hacking the lexer up with conditional rules is to scan identifiers for @ in elf mode.

> Thanks for looking at this. Is there anything else I can do?

I've been busy with family for Christmas (two 18-month-old nieces and a 5-year-old niece kept me plenty occupied!) and haven't had much time to work on fixing this yet. I'll try to get a fix committed in the next few days.

Thanks,
Peter
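
A rough sketch of the idea in the message above, written in Python rather than yasm's C: the lexer always accepts @ as an identifier character, and only ELF mode scans the finished identifier to split off a relocation suffix. The function name, the return shape, and the object-format strings passed in are made up for illustration, and treating every non-ELF format like PE is an assumption; this is not yasm's actual code.

```python
def split_gas_identifier(ident, objfmt):
    """Split a GAS-mode identifier into (symbol_name, reloc_suffix).

    Hypothetical helper for illustration only, not part of yasm.
    `objfmt` is an object-format name such as "elf32" or "win32".
    """
    if objfmt.startswith("elf"):
        # ELF: text after the first '@' designates a special relocation
        # (e.g. sym@gotpcrel), so it is not part of the symbol name.
        name, sep, suffix = ident.partition("@")
        return (name, suffix) if sep else (ident, None)
    # Assumption: every other format behaves like PE, where '@' is an
    # ordinary symbol character (stdcall names such as _WndProc@16).
    return ident, None


print(split_gas_identifier("_WndProc@16", "win32"))   # ('_WndProc@16', None)
print(split_gas_identifier("sym@gotpcrel", "elf64"))  # ('sym', 'gotpcrel')
```

Keeping the split out of the lexer means the lexer rules stay identical for every object format, which is the point of the "scan identifiers" suggestion.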