With GCC, request that it maintain 16-byte stack alignment; the
existing entry points already align the stack via attribute_align_arg.
With clang, do the same as for mingw: disable the aligned stack
and let the assembly functions that require alignment perform it
themselves.
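
For illustration only, a minimal sketch of the entry-point approach
described above. SKETCH_ALIGN_ARG and sketch_entry_point are
hypothetical names, not taken from the tree; the assumption is that
attribute_align_arg maps to GCC's force_align_arg_pointer attribute
on x86, which realigns the stack on function entry.

```c
#include <stdint.h>

/* Hypothetical stand-in for attribute_align_arg (assumption: on x86
 * with GCC-compatible compilers it expands to the attribute below). */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SKETCH_ALIGN_ARG __attribute__((force_align_arg_pointer))
#else
#define SKETCH_ALIGN_ARG
#endif

/* A public entry point: callers may arrive with a stack that is only
 * 4-byte aligned, so the attribute asks the compiler to realign it
 * before any code that depends on 16-byte alignment runs. */
SKETCH_ALIGN_ARG int sketch_entry_point(void)
{
    /* A local that needs 16-byte alignment, e.g. for SIMD loads. */
    _Alignas(16) unsigned char buf[16] = { 0 };

    /* Returns 1 when the buffer sits on a 16-byte boundary. */
    return ((uintptr_t)buf & 15) == 0;
}
```

Internal functions called from such an entry point can then rely on
the compiler keeping the alignment, which is what the GCC case
depends on.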
Signed-off-by: Martin Storsjö <martin@martin.st>