Introduce a upb_MessageValue_Zero() function to use for the cases we do want a zero'd union (typically a zero MessageValue union is not needed)
PiperOrigin-RevId: 672592274
The goal of the `names.h` convention is to have a single canonical place where a code generator can define the set of symbols it exports to other code generators, and a canonical place where the name mangling logic is implemented.
Each upb code generator now has its own `names.h` file defining the symbols that it owns & exports:
* `third_party/upb/upb_generator/c/names.h` (for `foo.upb.h` files)
* `third_party/upb/upb_generator/minitable/names.h` (for `foo.upb_minitable.h` files)
* `third_party/upb/upb_generator/reflection/names.h` (for `foo.upbdefs.h` files)
This is a significant improvement over the previous situation where the name mangling functions were co-mingled in `common.h`/`mangle.h`, or sprinkled throughout the generators, with no clear structure for which code generator owns which symbols.
With this structure in place, the visibility lists for the various `names.h` files provide a clear dependency graph for how different generators depend on each other. In general, we want to keep dependencies on the "C" code generator to a minimum, since it is the largest and most complicated of upb's generated APIs, and is also the most prone to symbol name clashes.
Note that upb's `names.h` headers are somewhat unusual, in that we do not want them to depend on C++'s reflection or upb's reflection. Most `names.h` headers in protobuf would use types like `proto2::Descriptor`, but we don't want upb to depend on C++ reflection, especially during its bootstrapping process. We also don't want to force users to build upb defs just to use these name mangling functions. So we use only plain string types like `absl::string_view` and `std::string`.
PiperOrigin-RevId: 672397247
This also increases compliance by adding `default_applicable_licenses` to several `BUILD` files that previously did not have it.
PiperOrigin-RevId: 670784686
This CL is mostly a no-op, except that now google3-only code is actually stripped from OSS, instead of being preserved in `# begin:google_only` blocks.
This follows the conventions of the greater Copybara ecosystem.
PiperOrigin-RevId: 669513564
Fixes the warning:
```
warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
extern const upb_MiniTable* google__protobuf__OneofDescriptorProto_msg_init();
^
void
```
PiperOrigin-RevId: 664971019
Our bootstrapping setup compiles multiple versions of the generated code for `descriptor.proto` and `plugin.proto`, one for each stage of the bootstrap. For source files (`.c`), we can always select the correct version of the file in the BUILD rules, but for header files we need to make sure the correct stage's file is always selected via `#include`.
Previously we used `cc_library(includes=[])` to make it appear as though our bootstrapped headers had the same names as the "real" headers. This allowed a lot of the code to be agnostic to whether a bootstrap header was being used, which simplified things because we did not have to change the code performing the `#include`.
Unfortunately, due to build system limitations, this sometimes led to the incorrect header getting included. This should not have been possible, because we had a clean BUILD graph that should have removed all ambiguity about which header should be available. But in non-sandboxed builds, the compiler was able to find headers that were not actually in `deps=[]`, and worse it preferred those headers over the headers that actually were in `deps=[]`. This led to unintended results and errors about layering check violations.
This CL fixes the problem by removing all use of `includes=[]`. We now spell a full pathname to all bootstrap headers, so this class of errors is no longer possible. Unfortunately this adds some complexity, as we have to hard-code these full paths in several places.
A nice improvement in this CL is that `bootstrap_upb_proto_library()` can now only be used for bootstrapping; it only exposes the `descriptor_bootstrap.h` / `plugin_bootstrap.h` files. Anyone wanting to use the normal `net/proto2/proto/descriptor.upb.h` file should depend on `//net/proto2/proto:descriptor_upb_c_proto` target instead.
PiperOrigin-RevId: 664953196
We were failing to propagate the DefPool's platform to the MiniDescriptor builder. This caused upb's code generators to incorrectly generate a field rep of `kUpb_FieldRep_8Byte` for pointer-typed extension fields instead of the 32-bit clean output:
```
UPB_SIZE(kUpb_FieldRep_4Byte, kUpb_FieldRep_8Byte)
```
PiperOrigin-RevId: 653263168
This was previously fixed in C++ (https://github.com/protocolbuffers/protobuf/issues/16549), but not ported to other languages. Delimited field encoding can be inherited by fields where it's invalid, such as non-messages and maps. In these cases, the encoding should be ignored and length-prefixed should be used.
PiperOrigin-RevId: 642792988
This "feature" hasn't been implemented yet, but this puts a placeholder down to prevent compatibility issues in future editions. Once we provide versioning support on individual feature values, we don't want them becoming usable from edition 2023 protos
PiperOrigin-RevId: 635956805
All of this information is still available by merging fixed_features and overridable_features. This new split will make validation easier for runtimes that need to do dynamic builds.
PiperOrigin-RevId: 625815212
The new fields fixed_features and overridable_features can be simply merged to recover the old aggregate defaults. By splitting them though, plugins and runtimes get some extra information about lifetimes for enforcement.
PiperOrigin-RevId: 625527117
The only public target here is the edition defaults helper macro, which can be used by external runtimes and plugins. None of this code is C++-specific though, and should be organized higher up. Appropriate aliases are also placed at the top level for public targets
PiperOrigin-RevId: 625392504
This updates all our text parsers and serializers to better handle tag-delimited fields under editions. Under proto2, groups were the only tag-delimited fields possible, and the group name (i.e. the message type) was guaranteed to be unique. Text-format and various generators used this instead of the synthetic field name (lower-cased group name) to represent these fields.
Under editions, we've removed group syntax and allowed any message field to be tag-delimited. This breaks those cases when adding new tag-delimited fields where the message type might not be unique or correspond to the field name. Code generators have already been fixed to treat "group-like" fields using the old behavior, and treat new fields like any other sub-message.
This change addresses the text-format issue. Text parsers will accept *either* the type or field name for "group-like" fields, and only the field name for every other message field. Text serializers will continue to emit the message name for "group-like" fields, but also use the field name for everything else.
This creates some awkward capitalization behavior for fields that happen to *look* like proto2 groups, but it won't lead to any conflicts or invalid encodings. A feature will likely be added to edition 2024 to allow for migration off this legacy behavior.
PiperOrigin-RevId: 622260327
This is needed to make protobuf/bazel package minimal for other proto rules.
Keep 4 public bzl files in upb/bazel. They end up under protobuf/bazel, and they are legitimately used by other repositories.
Move upb_proto_library_internal/* under bazel/private. Those are utilities used in the rules. Moving them one level deeper makes protobuf/bazel package clean for other rules.
Move build_defs.bzl and amalgamation under /upb/bazel. Those are utilities used in the build.
Move lua.BUILD and python* uner /python/dist. Those are used in the WORKSPACE dependency setup.
PiperOrigin-RevId: 621442236