This required some work to unify map entry messages with regular messages, with respect to presence. Before map entry fields could never have presence. Now they can have presence according to normal rules. Note that this only applies to times that the user constructs a map entry directly.
PiperOrigin-RevId: 490611656
We need to sharpen the distinction between messages and extensions in the mini
descriptor encoder, so split the code paths for each.
PiperOrigin-RevId: 480675339
Prior to this CL, users were relying on `field->descriptortype` to get the field type. This almost works, as `field->descriptortype` is almost, but not quite, the field type of the field. In two special cases we deviate from the true field type, for ease of parsing and serialization:
- For open enums, we use `kUpb_FieldType_Int32` instead of `kUpb_FieldType_Enum`, because from the perspective of the wire format, an open enum field is equivalent to int32.
- For proto2 strings, we use `kUpb_FieldType_Bytes` instead of `kUpb_FieldType_String`, because proto2 strings do not perform UTF-8 validation, which makes them equivalent to bytes.
In this CL we add a public API function:
```
// Returns the true field type for this field.
upb_FieldType upb_MiniTableField_Type(const upb_MiniTable_Field* f);
```
This will provide the actual field type for this field.
Note that this CL changes the MiniDescriptor format. Previously MiniDescriptors did not contain enough information to distinguish between Enum/Int32. To remedy this we added a new encoded field type, `kUpb_EncodedType_ClosedEnum`.
PiperOrigin-RevId: 479387672
Optimizes `upb_MiniTable_Enum` for enums with many values (>64) but with relatively dense packing in numeric space.
This CL optimizes both the size and speed of such enums:
- size: 30x code size reduction
- speed: moved from linear search to a constant-time bit test
Negative enum values are still expensive, as they are never put into the bitfield.
PiperOrigin-RevId: 473259819
An enum MiniDescriptor simply encodes a set of valid `int32_t` values, so that the protobuf parser can test whether a given enum value is known or not.
The format implemented here is novel and needs to be documented. In short, the format is:
1. base92 values 0-31: 5-bit mask indicating presence or absence of the next five enum values.
2. base92 values 60-91: varint indicating skip over a region of enum values.
Negative enum values are encoded as their `uint32_t` equivalent.
PiperOrigin-RevId: 442892799