An upcoming performance improvement in `RepeatedPtrField` is incompatible with this API. The improvement is projected to accelerate repeated access to the elements of `RepeatedPtrField`, sequential access in particular.
PA: https://protobuf.dev/news/2024-12-13/
PiperOrigin-RevId: 708439051
Note that the oneof case enum (the enum that signals which case is set, not the value of that case) remains a repr(C) enum with no explicit storage type, so that it stays ABI compatible with the C++ case enum.
PiperOrigin-RevId: 708315860
Update Message.__dir__ method to ensure all proto fields of a message
are included in the list of attributes. This solves the inconsistency
between the Fast C++ and UPB proto implementation. The Fast C++
implementation always returned fields, while UPB never did. Improve
a unit test to detect that.
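
As a rough illustration of the behavior described above (not the actual protobuf implementation), a class can extend `__dir__` so that dynamically resolved field names appear in `dir()`. The `FakeMessage` class and its `_field_names` attribute are hypothetical stand-ins for a message type and the fields listed in its descriptor.

```python
class FakeMessage:
    """Hypothetical stand-in for a generated message class."""

    # Assumed for illustration: names that would come from the descriptor.
    _field_names = ("id", "name", "tags")

    def __init__(self):
        self._values = {}

    def __getattr__(self, name):
        # Only called when normal lookup fails; resolve fields dynamically.
        if name in type(self)._field_names:
            return self._values.get(name)
        raise AttributeError(name)

    def __dir__(self):
        # Start from the default attribute list, then add every field name.
        return sorted(set(super().__dir__()) | set(self._field_names))


msg = FakeMessage()
listed = dir(msg)  # now includes "id", "name", and "tags"
```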
PiperOrigin-RevId: 708262631
Previously, extensions and unknown fields were stored on opposite ends of a growing buffer:
```
|------unknown fields-------|---------unallocated space------|--extensions---|
```
Unknown fields were appended and extensions were prepended during parse. When either side ran into the other, the buffer was reallocated to fit, rounding up to the nearest power of 2. This meant that for a proto with 70,000 bytes of unknown fields, the total memory consumed could be up to 128+256+512+1024+2048+4096+8192+16384+32768+65536+131072=262016 bytes allocated in the arena. In the more common case of a large, length-delimited field it'd be just 131072 bytes; but as a 3.74x increase or a 1.87x increase, that's a lot of extra memory.
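
The arithmetic above can be reproduced with a short sketch. It assumes the buffer starts at 128 bytes (the first term of the sum) and that every outgrown buffer stays allocated in the arena; `doubling_allocations` is a hypothetical helper, not a upb function.

```python
def doubling_allocations(payload_bytes, first_size=128):
    """Return the buffer sizes allocated while growing by powers of two.

    Assumes growth starts at `first_size` bytes and that each outgrown
    buffer remains allocated in the arena.
    """
    sizes = [first_size]
    while sizes[-1] < payload_bytes:
        sizes.append(sizes[-1] * 2)
    return sizes


payload = 70_000
sizes = doubling_allocations(payload)
total = sum(sizes)              # worst case: every intermediate buffer was allocated
final = sizes[-1]               # common case: one buffer sized for the large field
worst_ratio = total / payload   # about 3.74
best_ratio = final / payload    # about 1.87
```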
The new representation still does exponential reallocation, but only for pointers to normal arena allocations. We exploit the fact that arena allocations are aligned to store data about whether the pointer is to an extension or a `upb_StringView` of unknown fields in the low bits of the pointer itself. This costs three pointers of overhead per unknown field and one pointer of overhead per extension, but that's a fixed overhead - we won't over-allocate large buffers for large unknown fields. If this overhead proves to be a problem, more compact representations could be implemented.
In addition, because unknown field bytes are now in their own allocations, they are pointer stable - in the future, this will allow us to exploit aliasing (when enabled during parse) for both unknown fields and lazy extensions (parsed from unknown fields), which can greatly reduce memory use for messages with a lot of unknown, string, or bytes fields.
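
The low-bit tagging described above can be sketched with plain integers standing in for addresses. The 8-byte alignment and the tag values are assumptions chosen for illustration, not upb's actual layout.

```python
ALIGNMENT = 8            # assumed arena alignment; leaves 3 low bits unused
TAG_MASK = ALIGNMENT - 1
TAG_UNKNOWN = 0          # hypothetical tag: points at unknown-field bytes
TAG_EXTENSION = 1        # hypothetical tag: points at an extension


def tag_pointer(address, tag):
    """Pack a small tag into the unused low bits of an aligned address."""
    if address % ALIGNMENT != 0:
        raise ValueError("address is not aligned")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in the low bits")
    return address | tag


def untag_pointer(tagged):
    """Split a tagged value back into (address, tag)."""
    return tagged & ~TAG_MASK, tagged & TAG_MASK


tagged = tag_pointer(0x7F001000, TAG_EXTENSION)
address, tag = untag_pointer(tagged)
```

Because the tag lives inside the pointer itself, distinguishing an extension from unknown-field data needs no extra storage per entry.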
PiperOrigin-RevId: 708058272
In this CL, we add the macro UPB_EXT_PRIMITIVE.
The template specializations are practically identical sans the CppType and UpbFunc called, so we now consolidate via this macro.
Added support for uint32/64, float/double, and bool.
Getting and setting extensions of these types in hpb should all work, and should fetch the proper default value as well (if provided in the .proto).
PiperOrigin-RevId: 707897721
Checking whether the object is a numpy array is pretty expensive, so we only want to do it when the object is not trivially a float.
This makes appending to a repeated field of float ~3x faster in a common case (before, then after):
```
_bench_append, 1000000, 284.1
_bench_extend, 1000000, 209.0
_bench_assign, 1000000, 175.8
_bench_pybind11, 1000000, 3.7
```
```
_bench_append, 1000000, 128.4
_bench_extend, 1000000, 67.1
_bench_assign, 1000000, 57.2
_bench_pybind11, 1000000, 3.5
```
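
The ordering of the checks can be sketched in pure Python. This is only an illustration of the fast-path idea, not the extension's actual code: `to_float` and `is_numpy_array` are hypothetical names, and raising on arrays is a placeholder for whatever the real slow path does.

```python
def to_float(value, is_numpy_array):
    """Convert `value` for a float field, trying the cheap test first.

    `is_numpy_array` is a hypothetical callable standing in for the
    expensive check; it is consulted only when `value` is not a float.
    """
    if type(value) is float:       # cheap exact-type test, the common case
        return value
    if is_numpy_array(value):      # expensive path, taken only when needed
        # Placeholder behavior, assumed for this sketch.
        raise TypeError("expected a scalar, got an array")
    return float(value)


calls = []


def expensive_check(value):
    """Record each invocation so the fast path can be observed."""
    calls.append(value)
    return False


fast = to_float(1.5, expensive_check)   # does not invoke expensive_check
slow = to_float(2, expensive_check)     # invokes expensive_check once
```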
PiperOrigin-RevId: 707811151
I've been gradually moving the location of our UPB tests to make them "more
default" (see an example here:
a02ec0f353)
It turns out that reduced some of our open-source coverage around UPB python
unit tests. In this commit, I temporarily hard-code the tests I've migrated,
and eventually I'll change it into a wildcard expansion to be more robust. We
can't do wildcards right now because not all tests in the
google.protobuf.internal namespace support UPB by default yet.
#test-continuous
PiperOrigin-RevId: 707661811
Before this change, we emitted one .rs file per input .proto file, and a multi-src proto_library was handled by treating the first file as the 'primary' one, which re-exported everything defined in the other files.
After this change, we emit one .rs file per .proto file with all files treated equally, and additionally emit a generated.rs file which re-exports all of them.
PiperOrigin-RevId: 707569226
It was causing the table to have a larger than expected load factor for a
single insertion, and rehashing the table right away afterwards.
PiperOrigin-RevId: 707548927
An upcoming performance improvement in `RepeatedPtrField` is incompatible with this API. The improvement is projected to accelerate repeated access to the elements of `RepeatedPtrField`, sequential access in particular.
PA: https://protobuf.dev/news/2024-12-13/
PiperOrigin-RevId: 707260596
For now the error message describes how to build protoc and
protoc-gen-upb_minitable from source using CMake. Hopefully this will be
temporary, since we should be able to point to prebuilt binaries for both once
we have a real release out.
PiperOrigin-RevId: 707147611