Contributed by: Samuel Thibault <samuel.thibault@ens-lyon.org>
It is built on top of the NASM parser and preproc, with the following
notable extensions for TASM syntax:
- case insensitive symbols and filenames,
- support for segment and size of labels, which avoids having to give
them on each memory dereference,
- support for data reservation (e.g. "var dd ?"),
- support for multiples (e.g. "var dd 1 dup 10"),
- little endian string integer constants,
- additional expression operators: shl, shr, and, or, low, high,
- additional offset keyword,
- additional fword and df,
- support for doubled quotes within quotes,
- support for array-like and structure-like notations: t[eax] and
[var].field,
- support for tasm directives: macro, rept, irp, locals, proc, struc,
segment, assume.
Notes:
- Almost all extensions are only effective when tasm_compatible_mode is
set, so the risk of breaking existing behavior should be very small.
- Because the "and" keyword can be an expression operator and an
instruction name, the data pseudo-instructions explicitly switch the
lexer state to INSTRUCTION state to fix the ambiguity.
- In gen_x86_insn.py, several instructions (namely lds and lea) now take
relaxed memory sizes. The reason is that in the case of tasm, the size
of the actual pointed data is passed up to there, and thus any type of
data should be accepted.
With all of this, loadlin can be compiled by yasm with only minor
modifications.
A new TASM-like frontend is also included.
svn path=/trunk/yasm/; revision=2130
These allow arbitrary prefixes and/or suffixes to be added to
externally-visible (GLOBAL, EXTERN, or COMMON) symbol names.
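As a rough illustration of the prefix/suffix idea, here is a minimal sketch of building a decorated name. The helper name decorate_symbol_name and its signature are hypothetical and are not taken from yasm; the sketch only shows the string handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libyasm function): returns a newly allocated
 * string "<prefix><name><suffix>".  A NULL prefix or suffix means "none".
 * The caller frees the result; NULL is returned on allocation failure. */
static char *
decorate_symbol_name(const char *prefix, const char *name, const char *suffix)
{
    size_t plen, nlen, slen;
    char *out;

    assert(name != NULL);
    plen = prefix ? strlen(prefix) : 0;
    nlen = strlen(name);
    slen = suffix ? strlen(suffix) : 0;

    out = malloc(plen + nlen + slen + 1);
    if (!out)
        return NULL;
    if (plen)
        memcpy(out, prefix, plen);
    memcpy(out + plen, name, nlen);
    if (slen)
        memcpy(out + plen + nlen, suffix, slen);
    out[plen + nlen + slen] = '\0';
    return out;
}
```

A caller would apply this only to symbols whose visibility is GLOBAL, EXTERN, or COMMON, leaving local names untouched.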
svn path=/trunk/yasm/; revision=2109
from libyasm core. Now absolute sections are tracked locally to the parser
and the parser generates EQUs directly for labels in absolute sections.
Fixes #106 and #103.
svn path=/trunk/yasm/; revision=1842
been using a mix of tabs and 4 spaces to indent; this looks horrible if
tab size is ever not 8. While I debated converting to tab-only indentation,
that would have had a far higher impact on the source.
svn path=/trunk/yasm/; revision=1825
and common declare. The latter no longer passes through objfmt at parse time;
instead the objfmt must handle them at output time (objfmt-specific
extensions are parsed & stored by the parser). Directives are now handled
using a list (with function pointers) rather than a single function entry
point.
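The "list with function pointers" approach can be sketched as a table lookup. All type and function names below are hypothetical and do not reflect libyasm's actual directive interface; the handlers just return distinct codes so the dispatch is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types; not libyasm's directive API. */
typedef int (*directive_handler)(const char *args);

typedef struct directive {
    const char *name;           /* directive name; NULL ends the list */
    directive_handler handler;  /* function called when the name matches */
} directive;

static int dir_extern(const char *args) { (void)args; return 1; }
static int dir_global(const char *args) { (void)args; return 2; }
static int dir_common(const char *args) { (void)args; return 3; }

/* One list entry per directive instead of one big entry-point function. */
static const directive directives[] = {
    { "extern", dir_extern },
    { "global", dir_global },
    { "common", dir_common },
    { NULL, NULL }
};

/* Returns the handler's result, or -1 if no entry matches the name. */
static int
dispatch_directive(const directive *table, const char *name, const char *args)
{
    assert(table != NULL && name != NULL);
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->handler(args);
    }
    return -1;
}
```

With a list, an object format or parser can contribute its own entries without touching a central switch statement.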
svn path=/trunk/yasm/; revision=1819
yasm_dbgfmt, and yasm_arch. This eliminates a lot of redundant keeping
track of this information in the individual object and debug formats and
also simplifies a fair amount of code.
I'm still not happy with how arch gets passed around in output code, but
there may not be much of an alternative there.
While I'm here, clean up some unused variables and functions and re-enable
the warning for unused variables in configure.ac.
svn path=/trunk/yasm/; revision=1812
build platform files.
While here, fix a few warnings by pushing uintptr_t to a few more register
usages.
Noticed by: rugxulo@gmail.com
svn path=/trunk/yasm/; revision=1786
the C function and data structure wrappers for Pyrex. We now require
Pyrex 0.9.5 to build the Python wrappers, as only >=0.9.5 has working
weakref support. We actually need 0.9.5.1, but it's not yet released
(0.9.5 has a crash bug in enum wrapping that we trigger).
Pyxelator works a lot better with non-anonymous enums/structs, so libyasm
has been scrubbed for this.
Next step: full Yasm data structure inspection.
svn path=/trunk/yasm/; revision=1745
PC-relative relocations (jumps and calls).
- Allow SEG:OFF to be used as just an offset portion (like NASM does).
- Labels in absolute sections that are declared global are given the correct
absolute value in the symbol table.
One difference from NASM:
    label equ 0040h:001eh
    jmp label
in NASM means the same as:
    jmp 001eh (a near jump)
but yasm will treat this the same as:
    jmp 0040h:001eh (a far jump)
I'm still not completely happy with this implementation, but it's workable
and fixes all the bugs I've found so far in absolute handling.
svn path=/trunk/yasm/; revision=1634
exception handling. There are now two layers an error or warning goes
through before it hits the user: first an error is logged via
yasm_error_set() (or yasm_warn_set() for a warning). Only one error may
be set, whereas multiple warnings can be set (yasm_warn_set maintains a
linked list). Then, calling yasm_errwarn_propagate() propagates any error
and/or warning(s) to an errwarns structure and associates the
errors/warnings with a line number at that time; this call also clears the
pending errors/warnings and allows new ones to be set. The propagate
function can safely be called when there are no pending errors/warnings.
In addition, there are some helper errwarn functions that allow clearing of
an error/warning without propagating, getting it separately, etc.
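The two-layer flow described above can be modeled with a small toy. This is a sketch only: the toy_* names are hypothetical, fixed-size arrays stand in for the linked list, and keeping the first error when a second one is set is an assumption rather than something the text states.

```c
#include <assert.h>
#include <string.h>

#define MAX_PENDING_WARNINGS 8
#define MAX_LOGGED 16

/* Layer 1: pending state with no line number attached (global data). */
static const char *pending_error;
static const char *pending_warnings[MAX_PENDING_WARNINGS];
static int num_pending_warnings;

/* Layer 2: a non-global container that pairs messages with line numbers. */
typedef struct logged_msg {
    unsigned long line;
    int is_error;
    const char *text;
} logged_msg;

typedef struct errwarns_log {
    logged_msg msgs[MAX_LOGGED];
    int count;
} errwarns_log;

static void
toy_error_set(const char *text)
{
    /* Only one error may be pending; this sketch keeps the first one. */
    if (!pending_error)
        pending_error = text;
}

static void
toy_warn_set(const char *text)
{
    /* Multiple warnings may be pending at once. */
    if (num_pending_warnings < MAX_PENDING_WARNINGS)
        pending_warnings[num_pending_warnings++] = text;
}

/* Moves anything pending into the log, tags it with the line, and clears
 * the pending state.  Safe to call when nothing is pending. */
static void
toy_propagate(errwarns_log *log, unsigned long line)
{
    int i;

    assert(log != NULL);
    if (pending_error && log->count < MAX_LOGGED) {
        logged_msg *m = &log->msgs[log->count++];
        m->line = line;
        m->is_error = 1;
        m->text = pending_error;
    }
    for (i = 0; i < num_pending_warnings && log->count < MAX_LOGGED; i++) {
        logged_msg *m = &log->msgs[log->count++];
        m->line = line;
        m->is_error = 0;
        m->text = pending_warnings[i];
    }
    pending_error = NULL;
    num_pending_warnings = 0;
}
```

The inner functions never see a line number; only the caller of the propagate step, which knows which source line it is processing, supplies one.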
Still yet to be done: changing most/all uses of yasm_internal_error() into
yasm_error_set(YASM_ERROR_ASSERTION).
The main advantage this change has is making libyasm functions feel much
more library-like, and separating the user code line numbers from the inner
function error handling (e.g. intnum create functions only needed the line
number to trigger errors; this is no longer required).
The set/propagate/etc functions use global data structures to avoid passing
around a pointer to every function. This would need to be made thread-local
data in a threaded app. Errwarns containers (that keep associated line
numbers) are no longer global, so multiple source streams can be processed
separately with no conflict (at least if there's only a single thread of
execution).
svn path=/trunk/yasm/; revision=1521
is support for cross-section relative symbol references using "sym-$".
This generates a 32-bit relative relocation similar to those used for
cross-section jumps and calls.
The bugfix is that in Win64 output, RIP-relative relocations do something
special when there is an immediate value (or anything else) between the
value being relocated and the next instruction. E.g.
"shl dword [sym wrt rip], 5" needs to generate a REL32_1 relocation thanks
to the immediate byte following the RIP-relative value.
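To make the REL32_1 case concrete, here is a minimal sketch of choosing a relocation type from the number of bytes that follow the relocated 32-bit field. The relocation type values are the AMD64 ones from the Microsoft PE/COFF specification; the helper pick_rel32_type is hypothetical and is not the code in coff-objfmt.c.

```c
#include <assert.h>

/* AMD64 COFF relocation types (values per the PE/COFF specification). */
enum {
    IMAGE_REL_AMD64_REL32   = 0x0004,
    IMAGE_REL_AMD64_REL32_1 = 0x0005,
    IMAGE_REL_AMD64_REL32_2 = 0x0006,
    IMAGE_REL_AMD64_REL32_3 = 0x0007,
    IMAGE_REL_AMD64_REL32_4 = 0x0008,
    IMAGE_REL_AMD64_REL32_5 = 0x0009
};

/* Hypothetical helper: trailing_bytes is the distance between the end of
 * the 32-bit relocated field and the start of the next instruction
 * (0 when the field is the last thing in the instruction). */
static int
pick_rel32_type(unsigned int trailing_bytes)
{
    assert(trailing_bytes <= 5);    /* only REL32 .. REL32_5 exist */
    return IMAGE_REL_AMD64_REL32 + (int)trailing_bytes;
}
```

For "shl dword [sym wrt rip], 5" the one-byte immediate follows the displacement, so trailing_bytes is 1 and the sketch selects IMAGE_REL_AMD64_REL32_1.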
* symrec.c (sym_type): add SYM_CURPOS to track labels representing the
current assembly position (e.g. $ in NASM, . in GAS).
(yasm_symtab_define_curpos): New function to create symbols of this type.
(yasm_symrec_is_curpos): Check to see if symbol is SYM_CURPOS type.
(yasm_symrec_get_label, yasm_symrec_print): Update to handle SYM_CURPOS.
* symrec.h (yasm_symtab_define_curpos): Prototype.
(yasm_symrec_is_curpos): Prototype.
* gas-bison.y: Use yasm_symtab_define_curpos when defining '.'.
* nasm-bison.y: Use yasm_symtab_define_curpos when defining '$'.
* value.c (yasm_value_finalize_expr): Look for cross-section
"sym-SYM_CURPOS" combinations and generate curpos-relative value if found.
* coretype.h (yasm_value): Add ip_rel member to designate that curpos_rel
is set due to the value being IP-relative (rather than sym-curpos).
* value.h (yasm_value_initialize, yasm_value_init_sym): Initialize ip_rel.
* x86expr.c (yasm_x86__check_ea): Set ip_rel in addition to curpos_rel if
detected WRT rip.
* x86bc.c (x86_bc_insn_tobytes): Use ip_rel instead of curpos_rel when
adjusting for RIP-relative.
* coff-objfmt.c (coff_objfmt_output_value): Properly adjust output and
generate correct relocations for both curpos_rel and ip_rel. This
includes new generation of REL32_1, REL32_2, etc relocations.
* symrec.c, symrec.h (yasm_symtab_define_label2): Delete.
* coff-objfmt.c (stabs_debgfmt_generate_sections): Change to use
yasm_symtab_define_label() instead.
* win32-curpos.asm, win64-curpos.asm, curpos.asm, curpos-err.asm,
elf_gas64_curpos.asm: New tests for above.
svn path=/trunk/yasm/; revision=1423
fixing underlying HAMT implementation to match. Change is to reflect that
traversal stops when subfunction return is nonzero.
* hamt.h (HAMT_traverse): Update doxygen comment for stop on nonzero instead
of stop on zero.
* hamt.c (HAMT_traverse): Implement.
* symrec.c, xdf-objfmt.c, elf-objfmt.c, coff-objfmt.c: Update return values
for subfunctions.
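The stop-on-nonzero convention can be sketched over a plain array rather than a real HAMT. The names below are hypothetical, and returning the callback's nonzero value to the caller is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*visit_func)(void *item, void *cb_data);

/* Visits items in order.  Stops at the first callback that returns
 * nonzero and hands that value back; returns 0 if all were visited. */
static int
toy_traverse(void **items, size_t n, void *cb_data, visit_func func)
{
    size_t i;

    assert(func != NULL);
    for (i = 0; i < n; i++) {
        int ret = func(items[i], cb_data);
        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Example callback: counts items, and stops with code 7 on a NULL item. */
static int
count_until_null(void *item, void *cb_data)
{
    if (item == NULL)
        return 7;
    ++*(int *)cb_data;
    return 0;
}
```

Under this convention, callbacks that want the traversal to continue must return 0, which is why the subfunctions in the object formats needed their return values updated.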
svn path=/trunk/yasm/; revision=1365
.local sym; .comm sym; rather than directly using .lcomm sym. Handle this
usage as well. While we're here, also implement alignment for .lcomm and
refactor .lcomm handling and alignment handling.
* gas-token.re: Recognize .local.
* gas-bison.y (DIR_LOCAL): Implement .local.
(DIR_COMM): Recognize .local ; .comm case and call define_lcomm().
(DIR_LCOMM): Move functionality into..
(define_lcomm): Here, and implement alignment with..
(gas_parser_align): That now takes raw exprs. The valparam part of that is
now implemented in..
(gas_parser_dir_align): Formerly gas_parser_align.
(DIR_ALIGN): Use gas_parser_dir_align() instead.
* symrec.c (yasm_symtab_get): New function to just get a symbol based on name
without actually "referencing" it.
* symrec.h (yasm_symtab_get): Prototype.
* coretype.h (yasm_sym_vis): Add YASM_SYM_DLOCAL for flagging that a symbol
is explicitly flagged as a local symbol (rather than just default that way).
svn path=/trunk/yasm/; revision=1340
then defined, and warn instead of error if a symbol is declared global and
then defined.
* externdef.asm: Test for the warning.
svn path=/trunk/yasm/; revision=1308
undefined symbols extern if unused rather than causing undef errors.
* symrec.c (yasm_symtab_parser_finalize): Implement.
(symtab_finalize_info): New (more data to pass to
(symtab_parser_finalize_checksym): Update finalize helper.
* yasm.c (main): Update call to yasm_symtab_parser_finalize().
svn path=/trunk/yasm/; revision=1232
object formats without creating duplicate lists of symbols. XDF and COFF
were updated; ELF needs to reorder the symbols on its own, so for now it's
not been updated to use the common implementation.
* hamt.c (HAMTEntry, HAMT, HAMT_destroy)
(HAMT_insert): Change SLIST to STAILQ.
* symrec.c (sym_type): Add SYM_SPECIAL.
(yasm_symtab_define_special): New.
(yasm_symrec_declare): New; includes all functionality from symtab_declare.
(yasm_symtab_declare): Call yasm_symrec_declare now.
(yasm_symrec_is_special): New.
* symrec.h: Add prototypes for above.
* xdf-objfmt.c: Use symrec_data instead of declaring xdf_symtab.
* xdflong.hex, xdfprotect.hex, xdfother.hex: Update due to symbol reordering.
* coff-objfmt.c: Use symrec_data instead of declaring coff_symtab.
* elftimes.hex, elfso.hex, elfabssect.hex: Update due to symbol reordering.
* elfglobext.hex, elf-x86id.hex, elftest.hex, elfso64.hex: Likewise.
* stabs-elf.hex: Likewise.
svn path=/trunk/yasm/; revision=1188
similar) to ELF. They are used identically to NASM's ELF shared object
support.
Due to limited WRT support throughout libyasm, this caused a lot of rippling
changes. A major cleanup needs to be performed later to clear some of this
hackiness up.
* elf-machine.h (func_accepts_size_t): Rename to func_accepts_reloc().
(func_map_reloc_info_to_type): Add parameter ssyms for array of special syms.
(elf_machine_ssym): New; for defining machine-specific special syms.
(elf_machine_handler): Change accepts_reloc_size to accepts_reloc. Add new
ssyms and num_ssyms members.
* elf-x86-x86.c (ssym_index): New; this allows nice indexing of ssym arrays.
(elf_x86_x86_accepts_reloc): Rename of elf_x86_x86_accepts_reloc_size. Add
support for various WRT ssyms.
(elf_x86_x86_map_reloc_info_to_type): Add support for various WRT ssyms.
(elf_x86_x86_ssyms): New array of supported special symbols.
(elf_machine_handler_x86_x86): Update for above changes/additions.
* elf-x86-amd64.c (ssym_index, elf_x86_amd64_accepts_reloc)
(elf_x86_amd64_map_reloc_info_to_type, elf_x86_amd64_ssyms)
(elf_machine_handler_x86_amd64): Likewise.
* elf.h (elf_reloc_entry): Add wrt member.
(elf_set_arch): Add symtab parameter.
(elf_is_wrt_sym_relative): New.
(elf_reloc_entry_create): Add wrt parameter.
* elf.c (elf_set_arch): Allocate special syms from machine level.
(elf_is_wrt_sym_relative): New; search special syms, and report whether a
WRT ssym should be symbol-relative or section-relative.
(elf_reloc_entry_create): Pass WRT and ssyms info down to machine level.
* elf-objfmt.c (yasm_objfmt_elf): Add dotdotsym (..sym) symrec member.
(elf_objfmt_create): Pass symtab to elf_set_arch(). Allocate ..sym symbol.
(elf_objfmt_output_reloc): Update for elf_reloc_entry_create() change.
(elf_objfmt_output_expr): Handle WRT ssym. Make relocation symbol-relative
rather than section-relative if either WRT ..sym or WRT ssym that machine
level desires to be symbol-relative.
* symrec.c (yasm_symrec_get_label): Check for NULL sym->value.precbc; this
is now possible due to the user-accessible special symbols that ELF et al
create, which all have NULL precbc's.
* expr.c (yasm_expr_extract_symrec): Recurse into IDENT's to make more exprs
acceptable.
* coretype.h (yasm_output_reloc_func): Remove rel parameter as it shouldn't
be needed and complexifies writing of the reloc functions.
* stabs-dbgfmt.c (stabs_bc_stab_tobytes): Update output_reloc() call.
* elf-objfmt.c (elf_objfmt_output_reloc): Update to match.
* arch.h (yasm_arch_module): Add intnum_fixup_rel() function, change
intnum_tobytes() to not take rel parameter. The rel functionality is being
separated because sometimes it's desirable not to put the data into the
written intnum (e.g. ELF RELA relocations).
(YASM_ARCH_VERSION): Bump due to above change.
(yasm_arch_intnum_fixup_rel): New wrapper.
(yasm_arch_intnum_tobytes): Update wrapper (removing rel).
* lc3bbc.c (yasm_lc3b__intnum_fixup_rel): New, with code from:
(yasm_lc3b__intnum_tobytes): Remove rel code.
* lc3barch.h (yasm_lc3b__intnum_fixup_rel): New.
(yasm_lc3b__intnum_tobytes): Update.
* lc3barch.c (yasm_lc3b_LTX_arch): Reference yasm_lc3b__intnum_fixup_rel().
* x86bc.c (yasm_x86__intnum_fixup_rel): New, with code from:
(yasm_x86__intnum_tobytes): Remove rel code.
* x86arch.h (yasm_x86__intnum_fixup_rel): New.
(yasm_x86__intnum_tobytes): Update.
* x86arch.c (yasm_x86_LTX_arch): Reference yasm_x86__intnum_fixup_rel().
* xdf-objfmt.c (xdf_objfmt_output_expr): Update to use intnum_fixup_rel() /
new intnum_tobytes().
* bin-objfmt.c (bin_objfmt_output_expr): Likewise.
* coff-objfmt.c (coff_objfmt_output_expr): Likewise.
* elf-objfmt.c (elf_objfmt_output_expr): Likewise.
* nasm-listfmt.c (nasm_listfmt_output_expr): Likewise.
* nasm-bison.y: Change precedence of WRT and : operators: instead of being
the strongest binders they are now the weakest. This is needed to correctly
parse and be able to split WRT expressions. WRT handling is still somewhat
of a hack throughout yasm; we'll fix this later.
* x86expr.c (x86_expr_checkea_distcheck_reg): Don't check for WRT here.
(x86_expr_checkea_getregusage): Add new wrt parameter. Use it to handle
"WRT rip" separately from other operators. Recurse if there's a WRT below
the WRT rip; this is to handle cases like ELF-AMD64's elfso64.asm.
(yasm_x86__expr_checkea): Split off top-level WRT's and feed through
separately to x86_expr_checkea_getregusage().
* x86bc.c (x86_bc_insn_tobytes): Ensure the SUB operation for PC-relative
displacements goes BELOW any WRT expression.
(x86_bc_jmp_tobytes): Likewise.
* elfso.asm, elfso.hex, elfso.errwarn: New 32-bit ELF shared object tests.
* modules/objfmts/elf/tests/Makefile.inc: Include in distribution.
* elfso64.asm, elfso64.hex, elfso64.errwarn: New 64-bit ELF shared object
tests. This is not a good example, as the assembled code doesn't work, but
it at least tests the special symbols.
* modules/objfmts/elf/tests/amd64/Makefile.inc: Include in distribution.
svn path=/trunk/yasm/; revision=1168
potential function as of every bytecode, using the information provided
in revision [1147], and filtering out labels with "." or "$".
* symrec.c: don't add symrec to bytecode unless added to table.
* stabs-dbgfmt.c: remove old inefficient code to use new sym lookup.
* tests/*: create first test, using a copy of elftest.asm.
svn path=/trunk/yasm/; revision=1167
NULL-terminated array of labels that point to this bytecode (as the bytecode
previous to the label). NULL if no labels point to this bytecode.
* bytecode.c (yasm_bc_create_common): Initialize symrecs variable to NULL.
* bytecode.c (yasm_bc_destroy): Delete symrecs variable.
* bytecode.h (yasm_bc__add_symrec): Declare new function.
* bytecode.c (yasm_bc__add_symrec): New.
* symrec.c (yasm_symtab_define_label): Call yasm_bc__add_symrec().
This new functionality is needed to make writing certain dbgfmt routines
easier.
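A minimal sketch of maintaining such a NULL-terminated label array follows. The toy_* types and the function toy_bc_add_symrec are hypothetical stand-ins and do not reproduce the real yasm_bc__add_symrec().

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_symrec {
    const char *name;
} toy_symrec;

typedef struct toy_bytecode {
    /* NULL-terminated array of labels; NULL itself when there are none. */
    toy_symrec **symrecs;
} toy_bytecode;

/* Appends sym to bc's label array.  Returns 0 on success, -1 on
 * allocation failure (the existing array is left untouched). */
static int
toy_bc_add_symrec(toy_bytecode *bc, toy_symrec *sym)
{
    size_t count = 0;
    toy_symrec **grown;

    assert(bc != NULL && sym != NULL);
    if (bc->symrecs)
        while (bc->symrecs[count] != NULL)
            count++;

    /* Room for the existing entries, the new one, and the terminator. */
    grown = realloc(bc->symrecs, (count + 2) * sizeof(*grown));
    if (!grown)
        return -1;
    grown[count] = sym;
    grown[count + 1] = NULL;
    bc->symrecs = grown;
    return 0;
}
```

A debug format can then walk bc->symrecs until it hits NULL to find every label attached to a bytecode.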
svn path=/trunk/yasm/; revision=1147
reporting functions that take a parameter for the line to be displayed in
addition to the line used for sorting. This allows the "previously
defined" message to use the standard errwarn line resolution functions.
The resulting error messages look like gcc output.
Reported by: Edouard Gomez <ed.gomez@free.fr>
svn path=/trunk/yasm/; revision=1074
As yasm has evolved, various minor additions have been made to libyasm to
support the new features. These minor additions have accumulated, and
some contain significant redundancies. In addition, the core focus of
yasm has begun to move away from the front-end commandline program "yasm"
to focusing on libyasm, a collection of reusable routines for use in all
sorts of programs dealing with code at the assembly level, and the modules
that provide specific features for parsing such code.
This libyasm/module update focuses on cleaning up much of the cruft that
has accumulated in libyasm, standardizing function names, eliminating
redundancies, making many of the core objects more reusable for future
extensions, and starting to make libyasm and the modules thread-safe by
eliminating static variables.
Specific changes include:
- Making a symbol table data structure (no longer global). It follows a
factory model for creating symrecs.
- Label symbols now refer only to bytecodes; bytecodes have a pointer to
their containing section.
- Standardizing on *_create() and *_destroy() for allocation/deallocation.
- Adding a standardized callback mechanism for all data structures that
allow associated data. Allowed the removal of objfmt and
dbgfmt-specific data callbacks in their interfaces.
- Unmodularizing linemgr, but allowing multiple linemap instances (linemgr
is now renamed linemap).
- Remove references to lindex; all virtual lines (from linemap) are now
just "line"s.
- Eliminating the bytecode "type" enum, instead adding a standardized
callback mechanism for custom (and standard internal) bytecode types.
This will make it much easier to add new bytecodes, and eliminate the
possibility of type collisions. This also allowed the removal of the
of_data and df_data bytecodes, as objfmts and dbgfmts can now easily
implement their own bytecodes, and the cleanup of arch's bytecode usage.
- Remove the bytecodehead and sectionhead pseudo-containers, instead
making true containers: section now implements all the functions of
bytecodehead, and the new object data structure implements all the
functions of sectionhead.
- Add object data structure: it's a container that contains sections, a
symbol table, and a line mapping for a single object. Every former use
of sectionhead now takes an object.
- Make arch interface and all standard architectures thread-safe:
yasm_arch_module is the module interface; it contains a create()
function that returns a yasm_arch * to store local yasm_arch data; all
yasm_arch_module functions take the yasm_arch *.
- Make nasm parser thread-safe.
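The move from a bytecode "type" enum to per-type callbacks, listed among the changes above, can be sketched as follows. Every name here is hypothetical; the real libyasm callback structure has a different set of members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each bytecode type supplies a table of functions instead of an enum
 * value that the core would have to switch on. */
typedef struct toy_bc_callback {
    void (*destroy)(void *contents);
    size_t (*calc_len)(const void *contents);
} toy_bc_callback;

typedef struct toy_bytecode {
    const toy_bc_callback *callback;  /* identifies the type */
    void *contents;                   /* type-specific data */
} toy_bytecode;

/* A custom type defined outside the core: reserve N bytes. */
typedef struct toy_reserve {
    size_t nbytes;
} toy_reserve;

static void
reserve_destroy(void *contents)
{
    free(contents);
}

static size_t
reserve_calc_len(const void *contents)
{
    return ((const toy_reserve *)contents)->nbytes;
}

static const toy_bc_callback reserve_callback = {
    reserve_destroy,
    reserve_calc_len
};

/* Core code dispatches through the table and knows nothing about types. */
static size_t
toy_bc_calc_len(const toy_bytecode *bc)
{
    assert(bc != NULL && bc->callback != NULL);
    return bc->callback->calc_len(bc->contents);
}

static void
toy_bc_destroy(toy_bytecode *bc)
{
    assert(bc != NULL && bc->callback != NULL);
    bc->callback->destroy(bc->contents);
    bc->contents = NULL;
}
```

Because a type is identified by the address of its callback table, two independently written modules cannot collide the way two enum values could.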
To be done in phase 2: making other module interfaces thread-safe. Note
that while the module interface may be thread-safe, not all modules may be
written in such a fashion (hopefully all the "standard" ones will be, but
this is yet to be determined).
svn path=/trunk/yasm/; revision=1058
- Move config.h and util.h from libyasm (and installed libyasm) to top level.
- Move yasm_* functions from util.h to coretype.h.
- Remove a number of autoconf-related YASM_*_INTERNAL options from libyasm.h.
- Rename YASM_INTERNAL to YASM_LIB_INTERNAL; it now actually means what the
comment describes: enables definitions that violate the yasm_* namespace.
While we're at it, no longer define YASM_LIB_INTERNAL from yasm frontend, so
it's closer to what a real typical libyasm-using application would look like.
svn path=/trunk/yasm/; revision=944
libintl dependency in modules.
Also standardize initialize() and cleanup() functions.
Move replace_extension() from file.c to main.c.
Clean up some extern variable declarations in various places (particularly
nasm-compatible parser).
svn path=/trunk/yasm/; revision=792
the line index. Fixes some minor line number/error message nits due to
incorrect usage of line_index in old global variable method.
svn path=/trunk/yasm/; revision=787
Add delete function for symrec objfmt-specific data to objfmt interface.
Delete declare_data_copy function from objfmt interface (it wasn't being called
from anywhere).
Implement functions missing from dbg objfmt.
svn path=/trunk/yasm/; revision=763
- Add two new bytecode types:
BC_ALIGN (not yet implemented) for performing nice alignment magic.
BC_OBJFMT_DATA for storing objfmt-generated data in more advanced objfmts.
- objfmt structure changes:
Add handling functions for BC_OBJFMT_DATA data.
Allow a number of functions to be NULL.
svn path=/trunk/yasm/; revision=631
just assigned the pointer: but the symrec is deleted (if it's not in the symbol
table) when the expr is deleted. Thus, we need to create a copy of the symrec
instead of just reusing the same value if it's going to be deleted later. This
trickles down to objfmt to copy the objfmt-local data.
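The ownership problem described here can be sketched with a toy expression that holds one symbol. The toy_* names are hypothetical, and the in_table flag is an assumed way of marking symbols owned by the symbol table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct toy_symrec {
    char *name;
    int in_table;   /* nonzero: owned by the symbol table, not by exprs */
} toy_symrec;

typedef struct toy_expr {
    toy_symrec *sym;
} toy_expr;

static toy_symrec *
toy_symrec_create(const char *name, int in_table)
{
    toy_symrec *s = malloc(sizeof(*s));

    if (!s)
        return NULL;
    s->name = malloc(strlen(name) + 1);
    if (!s->name) {
        free(s);
        return NULL;
    }
    strcpy(s->name, name);
    s->in_table = in_table;
    return s;
}

/* Deleting an expr deletes its symrec unless the symbol table owns it. */
static void
toy_expr_destroy(toy_expr *e)
{
    assert(e != NULL);
    if (e->sym && !e->sym->in_table) {
        free(e->sym->name);
        free(e->sym);
    }
    e->sym = NULL;
}

/* Copying must duplicate a symrec that the source expr will later free;
 * a table-owned symrec can be shared by pointer.  Returns 0 on success. */
static int
toy_expr_copy(toy_expr *dst, const toy_expr *src)
{
    assert(dst != NULL && src != NULL);
    if (src->sym && !src->sym->in_table) {
        dst->sym = toy_symrec_create(src->sym->name, 0);
        return dst->sym ? 0 : -1;
    }
    dst->sym = src->sym;
    return 0;
}
```

With a plain pointer assignment in toy_expr_copy, destroying the original would leave the copy pointing at freed memory, which is the failure the commit addresses.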
svn path=/trunk/yasm/; revision=467
"index". This fixes some problems with assumptions made by various parts of
the code that are invalidated when the line number doesn't always increase (e.g.
when the NASM %line directive is used).
Speed fixes are needed to the implementation of the line_* functions in
globals.c before this is finished.
svn path=/trunk/yasm/; revision=424