This change moves almost everything in the `upb/` directory up one level, so
that for example `upb/upb/generated_code_support.h` becomes just
`upb/generated_code_support.h`. The only exceptions I made to this were that I
left `upb/cmake` and `upb/BUILD` where they are, mostly because that avoids
conflict with other files and the current locations seem reasonable for now.
The `python/` directory is a little bit of a challenge because we had to merge
the existing directory there with `upb/python/`. I made `upb/python/BUILD` into
the BUILD file for the merged directory, and it effectively loads the contents
of the other BUILD file via `python/build_targets.bzl`, but I plan to clean
this up soon.
PiperOrigin-RevId: 568651768
A couple weeks ago we moved upb into the protobuf Git repo, and this change
continues the merger of the two repos by making them into a single Bazel repo.
This was mostly a matter of deleting upb's WORKSPACE file and fixing up a bunch
of references to reflect the new structure.
Most of the changes are pretty mechanical, but one thing that needed more
invasive changes was the Python script for generating CMakeLists.txt,
make_cmakelists.py. The WORKSPACE file it relied on no longer exists with this
change, so I updated it to hardcode the information it needed from that file.
PiperOrigin-RevId: 564810016
This is the second attempt to fix our Git history. This should allow
"git blame" to work correctly in the upb/ directory even though our
automation unexpectedly blew away that directory.
This brings upb into line with C++. PHP already checks this
internally, so this should not be an issue there. Ruby on the
other hand does not currently check this, so this change will
cause our Ruby implementation to reject some programs that
would otherwise have been accepted.
This matches an API already present in proto2
(const DescriptorPool* FileDescriptor::pool()).
However there is a slightly subtle implication here.
In proto2, the relationship between Descriptor and
MessageFactory is 1:many. You can create as many
DynamicMessageFactory instances as you want, and
each one will have its own independent DynamicMessage
prototype and computed layout for the same underlying
Descriptor. In practice the layouts will all be the same,
but one thing that could be distinct is that each can
have its own extension pool, which is a DescriptorPool
that will be searched for extensions when parsing.
In contrast, upb does not have a separate "message
factory" abstraction. That means that each upb_msgdef
has a single distinct layout, in other words a 1:1
correspondence between descriptor and layout. This means
that there is no way to create multiple message types
for the same descriptor that have distinct extension
pools. If you want a different set of extensions, you
must create a separate upb_symtab with a distinct set
of descriptors.
This change further entrenches that upb_filedef:upb_symtab
is a 1:1 relationship. A single upb_filedef cannot be a
member of multiple symbol tables. In practice this was
already true (there is no way to add a single filedef to
multiple symbol tables) but this change codifies this
1:1 relationship.
We used to use a separate "add table" during the upb_symtab_addfile()
operation to make it easier to back out the file if it contained
errors. But this created unnecessary work of re-adding the same symbols
to the main symtab once everything was validated.
Instead we directly add symbols to the main symbols table. If there is
an error in validation, we remove precisely the set of symbols that
were already added.
This also requires using a separate arena for each file. We can fuse
it with the symtab's main arena if the operation is successful.
LoadDescriptor_Upb 61.2µs ± 4% 53.5µs ± 1% -12.50% (p=0.000 n=12+12)
LoadAdsDescriptor_Upb 4.43ms ± 1% 3.06ms ± 0% -31.00% (p=0.000 n=12+12)
LoadDescriptor_Proto2 257µs ± 0% 259µs ± 0% +1.00% (p=0.000 n=12+12)
LoadAdsDescriptor_Proto2 13.9ms ± 1% 13.9ms ± 1% ~ (p=0.128 n=12+12)
* WIP.
* WIP.
* Tests are passing.
* Recover some perf: LIKELY doesn't propagate through functions. :(
* Added some more benchmarks.
* Simplify & optimize upb_arena_realloc().
* Only add owned blocks to the freelist.
* More optimization/simplification.
* Re-fixed the bug.
* Revert unintentional changes to parser.rl.
* Revert Lua changes for now.
* Revert the arena fuse changes for now.
* Added last_size to the arena representation.
* Re-applied Lua changes.
* Implemented upb_arena_fuse().
* Fix the compile by re-ordering statements.
* Improve comments.
* [textformat]: added missing newline when a message opens.
* Added tostring() support to Lua that prints to text format.
Also fixed a gnarly bug that this exposed.
Map parsing/serializing relies on map entries always
having a predictable order. The code that generates
layout was not respecting this in the case of string
keys and primitive values.