Enable the use of [SU]Int32Size and EnumSize templates for AArch64 (#11102)

Hi,

When benchmarking proto_benchmark from fleetbench on an AArch64 target, we found that clang is able to vectorize these functions and that the vectorized versions perform better than the scalar alternative.

I ran //src/google/protobuf:arena_unittest on aarch64-none-linux-gnu. Should I run any other tests? Also, protobuf used to have its own set of benchmarks, but I can't find them when I query all targets with bazel. Let me know if you'd like me to run anything else; I couldn't find instructions on what the full test run is.

Closes #11102

COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/11102 from avieira-arm:main 5552410a25
PiperOrigin-RevId: 532779004
pull/12841/head
Andre Vieira 2 years ago committed by Copybara-Service
parent 1010f9178f
commit e285d3e307
src/google/protobuf/wire_format_lite.cc (6 lines changed)

@@ -669,7 +669,7 @@ static size_t VarintSize(const T* data, const int n) {
} else if (SignExtended) {
msb_sum += x >> 31;
}
-// clang is so smart that it produces optimal SSE sequence unrolling
+// clang is so smart that it produces optimal SIMD sequence unrolling
// the loop 8 ints at a time. With a sequence of 4
// cmpres = cmpgt x, sizeclass ( -1 or 0)
// sum = sum - cmpres
@@ -712,7 +712,7 @@ static size_t VarintSize64(const T* data, const int n) {
// and other platforms are untested, in those cases using the optimized
// varint size routine for each element is faster.
// Hence we enable it only for clang
-#if defined(__SSE__) && defined(__clang__)
+#if (defined(__SSE__) || defined(__aarch64__)) && defined(__clang__)
size_t WireFormatLite::Int32Size(const RepeatedField<int32_t>& value) {
return VarintSize<false, true>(value.data(), value.size());
}
@@ -730,7 +730,7 @@ size_t WireFormatLite::EnumSize(const RepeatedField<int>& value) {
return VarintSize<false, true>(value.data(), value.size());
}
-#else // !(defined(__SSE4_1__) && defined(__clang__))
+#else // !((defined(__SSE__) || defined(__aarch64__) && defined(__clang__))
size_t WireFormatLite::Int32Size(const RepeatedField<int32_t>& value) {
size_t out = 0;
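
For readers of this diff, here is a minimal sketch of the branch-free counting that the comment in VarintSize describes (four compares against size-class thresholds, each adding 0 or 1 to a running sum). It is an illustration only and not the code in wire_format_lite.cc: the names SketchInt32Size and ReferenceInt32Size are hypothetical, and the sketch assumes the Int32Size case, where negative values are sign-extended to 64 bits and therefore take 10 bytes on the wire.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a protobuf function): total varint bytes needed
// for n int32 values, with negatives sign-extended to 64 bits.
inline std::size_t SketchInt32Size(const std::int32_t* data, int n) {
  std::uint32_t sum = static_cast<std::uint32_t>(n);  // every value takes at least 1 byte
  std::uint32_t msb_sum = 0;                          // number of negative values
  for (int i = 0; i < n; ++i) {
    const std::uint32_t x = static_cast<std::uint32_t>(data[i]);
    msb_sum += x >> 31;  // 1 if the sign bit is set, else 0
    // Each compare yields 0 or 1 and there is no data-dependent branch,
    // which is the shape a vectorizer can turn into compare + subtract.
    sum += x > 0x7Fu;        // needs at least 2 bytes
    sum += x > 0x3FFFu;      // needs at least 3 bytes
    sum += x > 0x1FFFFFu;    // needs at least 4 bytes
    sum += x > 0xFFFFFFFu;   // needs 5 bytes
  }
  // A negative value was counted as 5 bytes above; as a 64-bit varint it
  // takes 10, so add the missing 5 per negative value.
  return static_cast<std::size_t>(sum) + static_cast<std::size_t>(msb_sum) * 5;
}

// Hypothetical per-element reference used only to cross-check the sketch:
// count 7-bit groups of the sign-extended 64-bit value.
inline std::size_t ReferenceInt32Size(const std::int32_t* data, int n) {
  std::size_t total = 0;
  for (int i = 0; i < n; ++i) {
    std::uint64_t v =
        static_cast<std::uint64_t>(static_cast<std::int64_t>(data[i]));
    std::size_t bytes = 1;
    while (v >= 0x80) {
      v >>= 7;
      ++bytes;
    }
    total += bytes;
  }
  return total;
}
```

Calling SketchInt32Size on a small array and comparing against ReferenceInt32Size is a quick sanity check. Whether a given compiler actually vectorizes the loop depends on the target and flags; the commit message reports that clang does so on AArch64.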
