The goal of the `names.h` convention is to have a single canonical place where a code generator can define the set of symbols it exports to other code generators, and a canonical place where the name mangling logic is implemented.
Each upb code generator now has its own `names.h` file defining the symbols that it owns & exports:
* `third_party/upb/upb_generator/c/names.h` (for `foo.upb.h` files)
* `third_party/upb/upb_generator/minitable/names.h` (for `foo.upb_minitable.h` files)
* `third_party/upb/upb_generator/reflection/names.h` (for `foo.upbdefs.h` files)
This is a significant improvement over the previous situation where the name mangling functions were co-mingled in `common.h`/`mangle.h`, or sprinkled throughout the generators, with no clear structure for which code generator owns which symbols.
With this structure in place, the visibility lists for the various `names.h` files provide a clear dependency graph for how different generators depend on each other. In general, we want to keep dependencies on the "C" code generator to a minimum, since it is the largest and most complicated of upb's generated APIs, and is also the most prone to symbol name clashes.
Note that upb's `names.h` headers are somewhat unusual, in that we do not want them to depend on C++'s reflection or upb's reflection. Most `names.h` headers in protobuf would use types like `proto2::Descriptor`, but we don't want upb to depend on C++ reflection, especially during its bootstrapping process. We also don't want to force users to build upb defs just to use these name mangling functions. So we use only plain string types like `absl::string_view` and `std::string`.
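As a rough illustration of what "plain string types only" means in practice, here is a minimal sketch of two `names.h`-style helpers. The namespace and function names (`example::HeaderFilenameFor`, `example::MangleFullName`) are hypothetical and are not taken from the real headers. The sketch uses `std::string_view` instead of `absl::string_view` only so that it compiles without Abseil, and the dot-to-underscore mangling is a simplifying assumption.

```cpp
#include <cassert>
#include <string>
#include <string_view>

namespace example {  // Hypothetical namespace, for illustration only.

// Maps a .proto filename to the header that would hold its C API,
// e.g. "foo.proto" -> "foo.upb.h". Takes and returns plain strings only;
// no descriptor or reflection types are involved.
inline std::string HeaderFilenameFor(std::string_view proto_filename) {
  assert(!proto_filename.empty());
  constexpr std::string_view kSuffix = ".proto";
  std::string_view base = proto_filename;
  if (base.size() >= kSuffix.size() &&
      base.substr(base.size() - kSuffix.size()) == kSuffix) {
    base.remove_suffix(kSuffix.size());
  }
  return std::string(base) + ".upb.h";
}

// Mangles a fully qualified proto name into a C identifier by replacing
// each '.' with '_' (an assumed, simplified mangling scheme).
inline std::string MangleFullName(std::string_view full_name) {
  std::string out(full_name);
  for (char& c : out) {
    if (c == '.') c = '_';
  }
  return out;
}

}  // namespace example
```

Because the inputs are just strings, a caller in another generator can compute a symbol name without building descriptors or upb defs first.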
PiperOrigin-RevId: 672397247
Our bootstrapping setup compiles multiple versions of the generated code for `descriptor.proto` and `plugin.proto`, one for each stage of the bootstrap. For source files (`.c`), we can always select the correct version of the file in the BUILD rules, but for header files we need to make sure the correct stage's file is always selected via `#include`.
Previously we used `cc_library(includes=[])` to make it appear as though our bootstrapped headers had the same names as the "real" headers. This allowed a lot of the code to be agnostic to whether a bootstrap header was being used, which simplified things because we did not have to change the code performing the `#include`.
Unfortunately, due to build system limitations, this sometimes led to the incorrect header getting included. This should not have been possible, because we had a clean BUILD graph that should have removed all ambiguity about which header should be available. But in non-sandboxed builds, the compiler was able to find headers that were not actually in `deps=[]`, and worse, it preferred those headers over the headers that actually were in `deps=[]`. This led to unintended results and errors about layering check violations.
This CL fixes the problem by removing all use of `includes=[]`. We now spell a full pathname to all bootstrap headers, so this class of errors is no longer possible. Unfortunately this adds some complexity, as we have to hard-code these full paths in several places.
A nice improvement in this CL is that `bootstrap_upb_proto_library()` can now only be used for bootstrapping; it only exposes the `descriptor_bootstrap.h` / `plugin_bootstrap.h` files. Anyone wanting to use the normal `net/proto2/proto/descriptor.upb.h` file should depend on the `//net/proto2/proto:descriptor_upb_c_proto` target instead.
PiperOrigin-RevId: 664953196
Picking up #14981 from @dawidcha after several months of radio silence.
Quoting the OP of that PR:
> I have been collaborating with the grpc developers to make it possible to build that library as a Windows DLL - a couple of PRs were already merged, more like [grpc/grpc#34345](https://github.com/grpc/grpc/pull/34345) are pending.
>
> The grpc library incorporates some upb-generated and upbdefs-generated code into grpc.dll, which is referenced by other code that consumes the library. Since this is now a DLL, that code doesn't know how to link to these generated symbols because they are not annotated with `__declspec(dllimport)`.
>
> This PR aims to fix that by introducing a parameter `dllexport_tag` to the upb and upbdefs plugins. That parameter should be a string, e.g. `MYAPP_DLL`, and when set, the extern symbols are annotated with a macro with that name. This can either be set externally to `__declspec(dllimport)` or, as is usual practice, when compiling code into a DLL, the macro `<dllexport_tag>_EXPORT` (i.e. `MYAPP_DLL_EXPORT`) is defined, and when consuming the DLL, `<dllexport_tag>_IMPORT` is defined. If neither is defined, then the `MYAPP_DLL` macro becomes the empty string, which is what you want for building a static library.
>
> This is a continuation of #14230
>
> Fixes: #14255
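
The macro scheme described in the quote can be sketched roughly as follows, assuming `dllexport_tag=MYAPP_DLL`. This is one plausible shape for the preprocessor logic, not the exact text the plugins emit, and `myapp_name_length` is a hypothetical symbol standing in for a generated extern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Resolve the tag macro from the build configuration:
//  - building the DLL:   define MYAPP_DLL_EXPORT
//  - consuming the DLL:  define MYAPP_DLL_IMPORT
//  - static library:     define neither, and the tag expands to nothing
#if defined(MYAPP_DLL_EXPORT)
#define MYAPP_DLL __declspec(dllexport)
#elif defined(MYAPP_DLL_IMPORT)
#define MYAPP_DLL __declspec(dllimport)
#else
#define MYAPP_DLL
#endif

// A hypothetical extern symbol annotated with the tag, the way a
// generated header might annotate its declarations.
extern "C" MYAPP_DLL std::size_t myapp_name_length(const char* name);

// Definition, as it would appear in the translation unit compiled into
// the library (here, a static build where MYAPP_DLL is empty).
extern "C" std::size_t myapp_name_length(const char* name) {
  assert(name != nullptr);
  return std::strlen(name);
}
```

With neither `MYAPP_DLL_EXPORT` nor `MYAPP_DLL_IMPORT` defined, the annotation disappears and the same header works unchanged for a static library on any platform.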
Towards #13726
Closes #14981
Closes #17079
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/17079 from h-vetinari:add_dll_tags 34927b1dde
PiperOrigin-RevId: 642622258
This makes the file layout a bit more consistent with the `protos ->
protos_generator` pattern. I also replaced the `upbc` namespace with
`upb::generator`.
PiperOrigin-RevId: 569264372
This change moves almost everything in the `upb/` directory up one level, so
that for example `upb/upb/generated_code_support.h` becomes just
`upb/generated_code_support.h`. The only exceptions I made to this were that I
left `upb/cmake` and `upb/BUILD` where they are, mostly because that avoids
conflict with other files and the current locations seem reasonable for now.
The `python/` directory is a little bit of a challenge because we had to merge
the existing directory there with `upb/python/`. I made `upb/python/BUILD` into
the BUILD file for the merged directory, and it effectively loads the contents
of the other BUILD file via `python/build_targets.bzl`, but I plan to clean
this up soon.
PiperOrigin-RevId: 568651768
The new rules are:
- `upb_minitable_proto_library()`: contains the MiniTables only
- `upb_c_proto_library()`: contains the C API; depends on the MiniTables
This involved splitting upb code generation into two separate aspects, one for MiniTables and one for the C API.
PiperOrigin-RevId: 565518070
A couple weeks ago we moved upb into the protobuf Git repo, and this change
continues the merger of the two repos by making them into a single Bazel repo.
This was mostly a matter of deleting upb's WORKSPACE file and fixing up a bunch
of references to reflect the new structure.
Most of the changes are pretty mechanical, but one thing that needed more
invasive changes was the Python script for generating CMakeLists.txt,
make_cmakelists.py. The WORKSPACE file it relied on no longer exists with this
change, so I updated it to hardcode the information it needed from that file.
PiperOrigin-RevId: 564810016
This is the second attempt to fix our Git history. This should allow
"git blame" to work correctly in the upb/ directory even though our
automation unexpectedly blew away that directory.
In this CL I'd like to call the existing C++ Protobuf API from the V0 Rust API. Since parts of the C++ API are defined inline and use C++ name mangling, we need to create a "thunks.cc" file that:
1) Generates code for the C++ API functions we use from Rust
2) Exposes these functions without any name mangling (that is, using `extern "C"`)
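To make the two points above concrete, here is a minimal sketch of what such a thunks file might contain. `SimpleMessage` is a hypothetical stand-in for a generated C++ message class, and the `thunk_*` names are invented for illustration; the file that protoc actually generates may look quite different.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a generated C++ message class whose accessors
// are defined inline, so no callable symbol exists until something uses them.
class SimpleMessage {
 public:
  int32_t value() const { return value_; }
  void set_value(int32_t v) { value_ = v; }

 private:
  int32_t value_ = 0;
};

// The thunks: out-of-line functions with C linkage. Calling the inline
// accessors here forces code to be generated for them, and extern "C"
// gives each thunk an unmangled symbol name that other languages can link to.
extern "C" {

SimpleMessage* thunk_SimpleMessage_new() { return new SimpleMessage(); }

void thunk_SimpleMessage_delete(SimpleMessage* msg) { delete msg; }

int32_t thunk_SimpleMessage_value(const SimpleMessage* msg) {
  assert(msg != nullptr);
  return msg->value();
}

void thunk_SimpleMessage_set_value(SimpleMessage* msg, int32_t v) {
  assert(msg != nullptr);
  msg->set_value(v);
}

}  // extern "C"
```

On the Rust side, functions like these would be declared in an `extern "C"` block and called through raw pointers.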
In this CL we add Bazel logic to generate the "thunks" file, compile it, and propagate its object file to linking. We also add logic to protoc to generate this "thunks" file.
The protoc logic is still rather rudimentary. I hope to focus on protoc code quality in my follow-up work on the V0 Rust API using the C++ kernel.
PiperOrigin-RevId: 523479839