According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part II: Mini table decoder.
PiperOrigin-RevId: 509644446
Prior to this CL we were allocating a MiniTable for each message and then overwriting it later. This could lead to an inconsistent state, and is unnecessary. This CL adds an extra phase to initialization so that the MiniTable is assigned only one time for each message.
PiperOrigin-RevId: 507617479
Previously we were calling the static method `Equals()` which did not take into effect the custom comparator or string for reporting errors.
PiperOrigin-RevId: 507344425
The initial motivation for this change was to fix a bug found by fuzzing. The old fuzz test (built on `cc_fuzz_target()`) detected an infinite loop if a bytes field default has an unterminated `\x` escape.
To fix this bug while expanding fuzz coverage, I created a fuzz test that verifies that we can do a lossless round trip from descriptor -> DefPool -> descriptor. We use C++ as the source of truth for whether a descriptor is valid or not, and what the canonical serialization back to protobuf form should be.
I wrote the new fuzz test using go/FuzzTest, which makes it easier and more readable to use an arbitrary `FileDescriptorSet` as input, while adding test cases for regressions.
The fuzz test highlighted a handful of errors that I subsequently fixed and added regression tests for:
1. The aforementioned unterminated `\x` bug.
2. We were not propagating the `edition` field.
3. We were missing the CheckIdent() check in a few places.
4. We were rejecting files with empty name, whereas C++ allows this.
5. There were a few bugs with escaping string defaults.
Since FuzzTest is Clang-only, I split the `FUZZ_TEST()` invocation from the regression tests, since the latter are portable and should be run on all platforms. Only `FUZZ_TEST()` itself is in a google3/Clang-only file.
PiperOrigin-RevId: 506997362
We're already doing a proper string sort in SortedEnums as of cl/503574792, but then we follow it up with a sort on the char* pointers.
PiperOrigin-RevId: 506778694
Our pkg_tar rule was missing a few template files for python. Using setup.py will guarantee conformance with python package requirements.
Adds MANIFEST.in to include c header/inc files in the source distribution.
PiperOrigin-RevId: 506638326
This was already broken, but wasn't caught in tests because `python3` wasn't defined on our windows machines. To prevent it from breaking the entire downstream Bazel workspace when `python3` exists, we mark it unsupported.
PiperOrigin-RevId: 506318286
A while back, the targets were all renamed from "manylinux" to "linux" in order to allow for local testing. However, we still need the release artifacts to use "manylinux" so this conditions the name on whether it is a release or local testing.
PiperOrigin-RevId: 505831116
Slight optimization that frees us from needing to backtrack up to the owning file def to extract the proto syntax bit. Costs zero additional storage since we already have available unused bits. Also makes the enum def the single source of truth for determining enum syntax - upb_FieldDef_IsClosedEnum() now just passes off to upb_EnumDef_IsClosed() instead of replicating that code.
PiperOrigin-RevId: 505513429
To correctly handle this case we must add the serialized map entry to the unknown fields. Ideally we could merely preserve the map entry's serialized bytes from the input. However this is tricky to do if we are streaming and the previous buffer where the map entry began is no longer available.
This CL fixes this edge case by using the encoder to re-encode the map entry rather than using the input bytes directly.
While this fix is reasonably simple and reliable, it has two unfortunate properties. One is performance: we now must run the encoder to recreate bytes that we already saw in the input.
The other is dependencies: this fix has the unfortunate property of making the decoder depend on the encoder. In applications that only want the decoder but not the encoder, this will increase binary size. But the practical effects of this are probably minimal (the vast majority of applications that depend on the decoder will also use the encoder).
We can revisit this later and see if there is a better way of preserving the input bytes without re-encoding. For now this fix is simple and correct and fixes the fuzz bug.
PiperOrigin-RevId: 505381927
It is not useful to include the argument, because this message is only printed when the argument type is `kUpb_CType_Message`.
PiperOrigin-RevId: 505230804
Currently these functions are hardwired to always return true, but the upstream
code now checks for failures (which will be implemented soon).
PiperOrigin-RevId: 504943663
The initial motivation for this CL was to fix a bug found by fuzzing. But the fuzz bug pointed out a few edge cases that this CL corrects:
1. The core bug is that we were allowing a map entry sub-message to be linked to a group field. This is not allowed in protobuf schemas, but we did not check for this edge case in `upb_MiniTable_SetSubMessage()`, so we were de facto allowing it. This triggered some bad behavior in the parser whereby we pushed a limit without checking its validity first.
2. To defend against this, I added asserts in `upb_MiniTable_SetSubMessage()` to verify the type of the field we are linking, to ensure that a group field is not linked to a map entry sub-message. But this should probably be changed to return an error instead of relying on asserts for this.
3. I changed the fuzz util code that builds the MiniTable so that it will never violate this new invariant. The fuzz util code now can run into situations where a group field has no valid non-map-entry sub-message to select. In those cases it will simply not register any sub-message for that field.
4. Previously group did not support leaving sub-messages unregistered. Previously I added this feature for sub-messages but not for groups. There is no reason why dynamic tree shaking should not work for group fields, so I extended the feature to support groups also.
PiperOrigin-RevId: 504913630
According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part V: Definition protocol buffer converters.
PiperOrigin-RevId: 504817971
According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part III: Definition pool builders.
PiperOrigin-RevId: 504817954
According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part VI: Wire encoder/decoder.
PiperOrigin-RevId: 504810940
According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part IV: Comparison utility.
PiperOrigin-RevId: 504563703
According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part VI: Code generator.
PiperOrigin-RevId: 504563663
According to https://en.cppreference.com/w/c/program/setjmp automatic variables
modified in a function calling setjmp can have indeterminate values. Instead,
refactor all functions calling setjmp so that the function calling setjmp
doesn’t have any local variables.
Part I: JSON encoder/decoder. The code was previously in compliance, but a
convention of avoiding non-const local variables in functions calling setjmp
will make the compliance more obvious.
PiperOrigin-RevId: 502927863
This is a temporary workaround for the fact that Copybara can only observe a maximum of 30 GitHub Actions success events.
This is necessary to unblock upb CLs.
PiperOrigin-RevId: 502908993