I don't really know how to explain this briefly...
If you don't decorate this with dllimport, then a Cygwin runtime relocation
is used (a so called 'pseudo-reloc'). As the relocation offset is only 32
bits, this can fail on x86_64 if the DLL happens to be loaded more than 2GB
away from the reference.
If you decorate with dllimport, then access is indirected via a pointer,
imp_square_unsigned, which is fixed up by the loader.