The polynomials have 701, 16-bit values. But poly_Rq_mul was reading 32
bytes at offset 1384 in order to get the last 18 of them. This silently
worked for a long time, but when 7153013019 switched to keeping
variables on the stack it was noticed by Valgrind.
This change fixes the overread. Setting watchpoints at the ends of the
two inputs (and one output) now shows no overreads nor overwrites.
BUG=424
Change-Id: Id86c1407ffce66593541c10feee47213f4b95c5d
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/48645
Reviewed-by: David Benjamin <davidben@google.com>
The stack consumption of the HRSS functions is causing issues in
stack-constrained environments. Therefore allocate many variables on the
heap. This means that several HRSS_ functions now allocate, and thus can
fail, where they couldn't before. Callers that ignore the return value
and don't have crash-on-failure mallocs will still be safe, although
things will fail to decrypt later on.
Somehow, this actually makes key generation _faster_ on my machine. (I
don't know. Better alignment? Fewer L1 collisions?) The other operations
are slightly slower, as expected.
Before:
Did 17390 HRSS generate operations in 3054088us (5694.0 ops/sec)
Did 225000 HRSS encap operations in 3000512us (74987.2 ops/sec)
Did 87000 HRSS decap operations in 3014525us (28860.3 ops/sec)
After:
Did 21300 HRSS generate operations in 3026637us (7037.5 ops/sec)
Did 221000 HRSS encap operations in 3008911us (73448.5 ops/sec)
Did 84000 HRSS decap operations in 3007622us (27929.0 ops/sec)
Change-Id: I2312df8909af7d8d250c7c483c65038123f21ad9
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/48345
Commit-Queue: Adam Langley <agl@google.com>
Reviewed-by: David Benjamin <davidben@google.com>