Introduce a upb_MessageValue_Zero() function for the cases where we do want a zeroed union (typically a zeroed MessageValue union is not needed).
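As a rough illustration of what a "zero" helper for a value union looks like, here is a minimal sketch. `ExampleValue` and `ExampleValue_Zero()` are hypothetical stand-ins, not upb's actual `upb_MessageValue` definition or implementation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value union; the members are illustrative only and do not
 * mirror upb's real upb_MessageValue. */
typedef union {
  bool bool_val;
  int32_t int32_val;
  int64_t int64_val;
  double double_val;
  const void* ptr_val;
} ExampleValue;

/* Returns a union with every byte cleared, so that reading any integer
 * member yields zero regardless of which member is accessed. */
static ExampleValue ExampleValue_Zero(void) {
  ExampleValue zero;
  memset(&zero, 0, sizeof(zero));
  return zero;
}
```

Callers that immediately assign a specific member do not need this; the helper only matters when the whole union has to start out in a known state.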
PiperOrigin-RevId: 672592274
The goal of the `names.h` convention is to have a single canonical place where a code generator can define the set of symbols it exports to other code generators, and a canonical place where the name mangling logic is implemented.
Each upb code generator now has its own `names.h` file defining the symbols that it owns & exports:
* `third_party/upb/upb_generator/c/names.h` (for `foo.upb.h` files)
* `third_party/upb/upb_generator/minitable/names.h` (for `foo.upb_minitable.h` files)
* `third_party/upb/upb_generator/reflection/names.h` (for `foo.upbdefs.h` files)
This is a significant improvement over the previous situation, where the name mangling functions were commingled in `common.h`/`mangle.h` or sprinkled throughout the generators, with no clear structure for which code generator owns which symbols.
With this structure in place, the visibility lists for the various `names.h` files provide a clear dependency graph for how different generators depend on each other. In general, we want to keep dependencies on the "C" code generator to a minimum, since it is the largest and most complicated of upb's generated APIs, and is also the most prone to symbol name clashes.
Note that upb's `names.h` headers are somewhat unusual, in that we do not want them to depend on C++'s reflection or upb's reflection. Most `names.h` headers in protobuf would use types like `proto2::Descriptor`, but we don't want upb to depend on C++ reflection, especially during its bootstrapping process. We also don't want to force users to build upb defs just to use these name mangling functions. So we use only plain string types like `absl::string_view` and `std::string`.
PiperOrigin-RevId: 672397247
This also increases compliance by adding `default_applicable_licenses` to several `BUILD` files that previously did not have it.
PiperOrigin-RevId: 670784686
This CL is mostly a no-op, except that now google3-only code is actually stripped from OSS, instead of being preserved in `# begin:google_only` blocks.
This follows the conventions of the greater Copybara ecosystem.
PiperOrigin-RevId: 669513564
This simplifies upb by removing differences between google3 and OSS.
This also points upb at the protobuf license, instead of keeping a separate copy around for upb.
PiperOrigin-RevId: 669447145
This behaves the same as the pre-existing upb_Message_SetMessage(), but with the intended naming style (upb_Message_SetMessage should accept an arena and support extensions).
PiperOrigin-RevId: 667998428
Putting it into BUILD files unintentionally forces it on all our downstream users. Instead, we just want to enable this during testing and let them choose for themselves in their builds.
Note that this expands the scope of -Werror to our entire repo for CI, so a number of fixes and opt-outs had to be applied to get this change passing.
Closes #14714
PiperOrigin-RevId: 666903224
Fixes the warning:
```
warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
extern const upb_MiniTable* google__protobuf__OneofDescriptorProto_msg_init();
^
void
```
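For context, a minimal sketch of the kind of change that silences this warning. `ExampleTable` and `example_msg_init` are hypothetical names used only for illustration; they are not the generated upb symbols.

```c
/* Hypothetical table type standing in for a generated table struct. */
typedef struct ExampleTable {
  int field_count;
} ExampleTable;

/* Before (triggers -Wstrict-prototypes): an empty parameter list in C,
 * prior to C23, declares a function with unspecified parameters:
 *
 *   extern const ExampleTable* example_msg_init();
 *
 * After: `(void)` makes this a proper prototype taking no arguments. */
extern const ExampleTable* example_msg_init(void);

static const ExampleTable kExampleTable = {3};

const ExampleTable* example_msg_init(void) { return &kExampleTable; }
```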
PiperOrigin-RevId: 664971019
Our bootstrapping setup compiles multiple versions of the generated code for `descriptor.proto` and `plugin.proto`, one for each stage of the bootstrap. For source files (`.c`), we can always select the correct version of the file in the BUILD rules, but for header files we need to make sure the correct stage's file is always selected via `#include`.
Previously we used `cc_library(includes=[])` to make it appear as though our bootstrapped headers had the same names as the "real" headers. This allowed a lot of the code to be agnostic to whether a bootstrap header was being used, which simplified things because we did not have to change the code performing the `#include`.
Unfortunately, due to build system limitations, this sometimes led to the incorrect header getting included. This should not have been possible, because we had a clean BUILD graph that should have removed all ambiguity about which header should be available. But in non-sandboxed builds, the compiler was able to find headers that were not actually in `deps=[]`, and worse it preferred those headers over the headers that actually were in `deps=[]`. This led to unintended results and errors about layering check violations.
This CL fixes the problem by removing all use of `includes=[]`. We now spell a full pathname to all bootstrap headers, so this class of errors is no longer possible. Unfortunately this adds some complexity, as we have to hard-code these full paths in several places.
A nice improvement in this CL is that `bootstrap_upb_proto_library()` can now only be used for bootstrapping; it only exposes the `descriptor_bootstrap.h` / `plugin_bootstrap.h` files. Anyone wanting to use the normal `net/proto2/proto/descriptor.upb.h` file should depend on the `//net/proto2/proto:descriptor_upb_c_proto` target instead.
PiperOrigin-RevId: 664953196
We were failing to propagate the DefPool's platform to the MiniDescriptor builder. This caused upb's code generators to incorrectly generate a field rep of `kUpb_FieldRep_8Byte` for pointer-typed extension fields instead of the 32-bit clean output:
```
UPB_SIZE(kUpb_FieldRep_4Byte, kUpb_FieldRep_8Byte)
```
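To illustrate why the two-argument form is "32-bit clean", here is a sketch of a size-selection macro of this kind. `EXAMPLE_SIZE` and the enum values are assumptions made for illustration; the real `UPB_SIZE` definition and `kUpb_FieldRep_*` values should be taken from the upb headers.

```c
#include <stdint.h>

/* Hypothetical equivalent of a 32/64-bit selection macro: the choice is
 * made when the generated code is compiled, not when it is generated. */
#if UINTPTR_MAX == 0xffffffff
#define EXAMPLE_SIZE(size32, size64) size32
#else
#define EXAMPLE_SIZE(size32, size64) size64
#endif

/* Illustrative enum values only. */
enum { kExample_Rep4Byte = 2, kExample_Rep8Byte = 3 };

/* A pointer-typed field needs 4 bytes on 32-bit targets and 8 bytes on
 * 64-bit targets, so generated code should emit the macro rather than a
 * hard-coded 8-byte representation. */
static int Example_PointerFieldRep(void) {
  return EXAMPLE_SIZE(kExample_Rep4Byte, kExample_Rep8Byte);
}
```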
PiperOrigin-RevId: 653263168
The general test is written in Rust; extensions are tested in upb, since they are not currently supported in Rust-upb.
PiperOrigin-RevId: 651113583
This should significantly reduce the size of large arenas. Previously, a large arena would nearly double in size if the most recent block filled up. This could end up wasting large amounts of memory. After this CL, we will waste at most the max block size, which defaults to 32k.
This more or less matches the behavior of the C++ arena.
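The difference between the two growth policies can be sketched as follows. This is only an illustration under assumed names (`Example_NextBlockDoubling`, `Example_NextBlockCapped`, `EXAMPLE_MAX_BLOCK_SIZE`); the actual sizing logic in the upb arena may differ in its details.

```c
#include <stddef.h>

/* Assumed cap, matching the 32k default mentioned above. */
#define EXAMPLE_MAX_BLOCK_SIZE ((size_t)32 * 1024)

/* Doubling policy: the next block is as large as everything allocated so
 * far, so a large arena nearly doubles each time a block fills up. */
static size_t Example_NextBlockDoubling(size_t total_allocated,
                                        size_t request) {
  size_t size = total_allocated;
  return size < request ? request : size;
}

/* Capped policy: grow geometrically from the last block, but never past
 * the cap unless a single request needs more than the cap. */
static size_t Example_NextBlockCapped(size_t last_block, size_t request) {
  size_t size = last_block * 2;
  if (size > EXAMPLE_MAX_BLOCK_SIZE) size = EXAMPLE_MAX_BLOCK_SIZE;
  return size < request ? request : size;
}
```

With the capped policy, the unused tail of the newest block is bounded by the cap rather than by the total size of the arena.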
PiperOrigin-RevId: 647802280
When compiling the upb codebase, I noticed one direct call to `longjmp`, although the convention seems to be to use the `UPB_LONGJMP` macro. AFAICT, the macro is mostly a wrapper to improve behavior on macOS, so this may improve behavior there; otherwise it just makes the code more consistent.
Closes #17201
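For readers unfamiliar with the pattern, a wrapper macro of this kind might look like the sketch below. The `EXAMPLE_*` names are hypothetical, and the assumption that the macOS variant maps to `_setjmp`/`_longjmp` (which skip saving and restoring the signal mask) should be checked against upb's port header.

```c
#include <setjmp.h>

/* Hypothetical wrappers: use the variants that do not touch the signal
 * mask on Apple platforms, and the standard functions elsewhere. */
#ifdef __APPLE__
#define EXAMPLE_SETJMP(buf) _setjmp(buf)
#define EXAMPLE_LONGJMP(buf, val) _longjmp(buf, val)
#else
#define EXAMPLE_SETJMP(buf) setjmp(buf)
#define EXAMPLE_LONGJMP(buf, val) longjmp(buf, val)
#endif

static jmp_buf example_err;

/* Error path: always go through the macro, never call longjmp directly. */
static void Example_Fail(void) { EXAMPLE_LONGJMP(example_err, 1); }

/* Returns 0 on success and -1 if the error path was taken. */
static int Example_Run(int should_fail) {
  if (EXAMPLE_SETJMP(example_err) != 0) return -1;
  if (should_fail) Example_Fail();
  return 0;
}
```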
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/17201 from anuraaga:patch-1 14cc027ef2
PiperOrigin-RevId: 645420330
Since statically tree shaken messages can never later become linked, we should not need to use any of the special code in the decoder. By using a distinct "empty" message type, we avoid triggering any of this special behavior. This avoids bugs around hazzers and other presence checks.
Also fixed a bug in the cmake staleness test that was causing test failures.
PiperOrigin-RevId: 643036818
This was previously fixed in C++ (https://github.com/protocolbuffers/protobuf/issues/16549), but not ported to other languages. Delimited field encoding can be inherited by fields where it's invalid, such as non-messages and maps. In these cases, the encoding should be ignored and length-prefixed should be used.
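The rule described above can be sketched as a small predicate. The types and names here (`ExampleField`, `Example_EffectiveEncoding`) are hypothetical and only illustrate the decision, not upb's actual data structures.

```c
#include <stdbool.h>

typedef enum {
  kExample_LengthPrefixed,
  kExample_Delimited
} ExampleEncoding;

/* Hypothetical field description carrying the inherited feature value. */
typedef struct {
  bool is_message;           /* field holds a message type */
  bool is_map;               /* field is a map (its entry is a message) */
  ExampleEncoding inherited; /* encoding inherited from the parent scope */
} ExampleField;

/* Delimited encoding only applies to non-map message fields; everywhere
 * else the inherited value is ignored and length-prefixed is used. */
static ExampleEncoding Example_EffectiveEncoding(const ExampleField* f) {
  if (f->inherited == kExample_Delimited && f->is_message && !f->is_map) {
    return kExample_Delimited;
  }
  return kExample_LengthPrefixed;
}
```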
PiperOrigin-RevId: 642792988
The functionality is enabled when the proto_one_output_per_message option used by C++ Lite is enabled.
This mirrors the behavior of C++ lite protos.
PiperOrigin-RevId: 642327960
The second assert in _upb_EncodeRoundTripFloat fires if val is NaN. This fix just returns the output of the first snprintf.
I am not sure how changes to this repo are made so feel free to ignore this CL.
To test this, you could:
1. Define a proto with a float field:
       message Test {
         float val = 1;
       }
2. In a Python script, import the library, set val to NaN, and try to print it:
       proto = Test(val=float('nan'))
       print(proto)
This causes a core dump due to an assertion failure:
    assert.h assertion failed at third_party/upb/upb/lex/round_trip.c:46 in void _upb_EncodeRoundTripFloat(float, char *, size_t): strtof(buf, NULL) == val
The corresponding change was also made for double.
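The failure mode comes from NaN never comparing equal to itself, so a `strtof(buf, NULL) == val` check cannot pass for NaN. A minimal sketch of the round-trip logic with an early return for NaN is shown below; `Example_RoundTripFloat` is a hypothetical helper written for illustration, not the code in `round_trip.c`.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats `val` with the fewest digits that still parse back to the same
 * float. NaN is returned as printed by the first snprintf, because a
 * comparison against NaN is always false and would trip the assert. */
static void Example_RoundTripFloat(float val, char* buf, size_t size) {
  assert(size >= 32);
  snprintf(buf, size, "%.*g", FLT_DIG, val);
  if (isnan(val)) return;
  if (strtof(buf, NULL) != val) {
    /* FLT_DIG + 3 (9) significant digits always round-trip a float. */
    snprintf(buf, size, "%.*g", FLT_DIG + 3, val);
    assert(strtof(buf, NULL) == val);
  }
}
```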
PiperOrigin-RevId: 637127851