This matches an API already present in proto2
(const DescriptorPool* FileDescriptor::pool()).
However, there is a subtle implication here.
In proto2, the relationship between Descriptor and
MessageFactory is 1:many. You can create as many
DynamicMessageFactory instances as you want, and
each one will have its own independent DynamicMessage
prototype and computed layout for the same underlying
Descriptor. In practice the layouts will all be the same,
but one thing that could be distinct is that each can
have its own extension pool, which is a DescriptorPool
that will be searched for extensions when parsing.
In contrast, upb does not have a separate "message
factory" abstraction. That means that each upb_msgdef
has a single distinct layout, in other words a 1:1
correspondence between descriptor and layout. This means
that there is no way to create multiple message types
for the same descriptor that have distinct extension
pools. If you want a different set of extensions, you
must create a separate upb_symtab with a distinct set
of descriptors.
This change further entrenches that each upb_filedef belongs
to exactly one upb_symtab. A single upb_filedef cannot be a
member of multiple symbol tables. In practice this was
already true (there is no way to add a single filedef to
multiple symbol tables), but this change codifies the
relationship.
We used to use a separate "add table" during the upb_symtab_addfile()
operation to make it easier to back out the file if it contained
errors. But this created unnecessary work of re-adding the same symbols
to the main symtab once everything was validated.
Instead we directly add symbols to the main symbol table. If there is
an error in validation, we remove precisely the set of symbols that
were already added.
This also requires using a separate arena for each file. We can fuse
it with the symtab's main arena if the operation is successful.
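The add-directly-then-roll-back approach can be sketched roughly as follows. This is a toy illustration, not upb's implementation: `toy_symtab`, `toy_contains()`, and `toy_addfile()` are hypothetical names, a flat array stands in for the real hash table, "validation" is reduced to a duplicate-name check, and the per-file arena is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy symbol table: a flat array of names, not upb's table. */
#define TOY_MAX_SYMS 16

typedef struct {
  const char *names[TOY_MAX_SYMS];
  size_t count;
} toy_symtab;

static bool toy_contains(const toy_symtab *s, const char *name) {
  for (size_t i = 0; i < s->count; i++) {
    if (strcmp(s->names[i], name) == 0) return true;
  }
  return false;
}

/* Adds every symbol of a "file" directly to the main table. On failure,
 * removes exactly the symbols this call added and leaves the rest intact. */
static bool toy_addfile(toy_symtab *s, const char *const *syms, size_t n) {
  size_t start = s->count; /* Everything at index >= start is ours. */
  assert(s->count <= TOY_MAX_SYMS);
  for (size_t i = 0; i < n; i++) {
    if (s->count == TOY_MAX_SYMS || toy_contains(s, syms[i])) {
      s->count = start; /* Roll back: drop only what this call added. */
      return false;
    }
    s->names[s->count++] = syms[i];
  }
  return true;
}
```

In the successful case nothing is copied a second time, which is the work the separate "add table" used to require.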
LoadDescriptor_Upb           61.2µs ± 4%   53.5µs ± 1%   -12.50%  (p=0.000 n=12+12)
LoadAdsDescriptor_Upb        4.43ms ± 1%   3.06ms ± 0%   -31.00%  (p=0.000 n=12+12)
LoadDescriptor_Proto2         257µs ± 0%    259µs ± 0%    +1.00%  (p=0.000 n=12+12)
LoadAdsDescriptor_Proto2     13.9ms ± 1%   13.9ms ± 1%       ~    (p=0.128 n=12+12)
* Added -Wextra and -Wshorten-64-to-32 and fixed resulting errors.
* Disable -Wshorten-64-to-32 since Kokoro is missing Clang.
* Fixed -Wextra warnings for gcc.
* Reordered UPB_UNUSED() to come after declarations.
* Added another -pedantic fix and log CC version.
* Fix compile error and conditionally run use_bazel.sh.
* Moved set -e after use_bazel.sh.
* Fixed typo in conditional.
- A new PHP-specific upb amalgamation. It contains everything related to upb_msg, but leaves out all of the old handlers-related interfaces and encoders/decoders.
# Schema/Defs Changes
- Changed `upb_fielddef_msgsubdef()` and `upb_fielddef_enumsubdef()` to return `NULL` instead of assert-failing if the field is not a message or enum.
- Added `upb_msgdef_iswrapper()`, to test whether this is a wrapper well-known type.
# Decoder
- Decoder bugfix: when we parse a submessage inside a oneof, we need to clear out any previous data, so we don't misinterpret it as a pointer to an existing submessage.
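A rough sketch of the kind of bug described above, assuming the oneof members share one storage slot. The types and names (`toy_msg`, `toy_submsg`, `toy_get_or_create_sub()`) are hypothetical and do not reflect upb's actual message layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { int32_t value; } toy_submsg; /* hypothetical submessage */

typedef struct {
  uint32_t oneof_case; /* field number of the member that is set, 0 if none */
  union {
    uint64_t scalar;   /* storage when a scalar member is set */
    toy_submsg *sub;   /* storage when the submessage member is set */
  } slot;
} toy_msg;

enum { FIELD_SCALAR = 1, FIELD_SUB = 2 };

/* Returns the submessage to parse into, reusing an existing one only if this
 * oneof member was already the active one. */
static toy_submsg *toy_get_or_create_sub(toy_msg *msg) {
  if (msg->oneof_case != FIELD_SUB) {
    /* Another member (or nothing) was set: clear the shared storage so stale
     * scalar bits are not treated as a pointer to an existing submessage. */
    msg->slot.sub = NULL;
    msg->oneof_case = FIELD_SUB;
  }
  if (msg->slot.sub == NULL) {
    msg->slot.sub = calloc(1, sizeof(toy_submsg));
    assert(msg->slot.sub != NULL);
  }
  return msg->slot.sub;
}
```

Without the clearing step, a previously stored scalar would be read back through `slot.sub` and dereferenced as if it were a valid pointer.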
# JSON Decoder
- Allowed well-known types at the top level to have their special processing.
- Fixed a bug that could occur when parsing nested empty lists/objects, e.g. `[[]]`.
- Made the "ignore unknown" option also be permissive about unknown enumerators by setting them to 0.
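The permissive enumerator handling could look something like the sketch below. `toy_enumval` and `toy_parse_enum()` are hypothetical; the real decoder resolves names through the enum's def rather than a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
  const char *name;
  int32_t number;
} toy_enumval;

/* Looks up an enumerator by name. With ignore_unknown set, an unrecognized
 * name is accepted and mapped to 0; otherwise the lookup fails and *out is
 * left untouched. */
static bool toy_parse_enum(const toy_enumval *vals, size_t n, const char *name,
                           bool ignore_unknown, int32_t *out) {
  assert(out != NULL);
  for (size_t i = 0; i < n; i++) {
    if (strcmp(vals[i].name, name) == 0) {
      *out = vals[i].number;
      return true;
    }
  }
  if (!ignore_unknown) return false;
  *out = 0;
  return true;
}
```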
# JSON Encoder
- Allowed well-known types at the top level to have their special processing.
- Removed all spaces after `:` and `,` characters, to match the old encoder and pass goldenfile tests.
# Message / Reflection
- Changed `upb_msg_hasoneof()` -> `upb_msg_whichoneof()`. The new function returns the `upb_fielddef*` of whichever field in the oneof is set.
- Implemented `upb_msg_clearfield()` and added/implemented `upb_msg_clear()`.
- Added `upb_msg_discardunknown()`. Part of me thinks this should go in a util library instead of core reflection since it is a recursive algorithm.
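To illustrate why discarding unknown fields is a recursive algorithm, here is a minimal sketch over a toy message tree. `toy_node` and `toy_discard_unknown()` are hypothetical, and the depth limit is an addition of this sketch rather than something taken from the change above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

typedef struct toy_node {
  size_t unknown_len;                          /* bytes of unknown-field data */
  struct toy_node *children[TOY_MAX_CHILDREN]; /* submessages; NULL if unset */
} toy_node;

/* Drops unknown data on msg and on every reachable submessage. Returns false
 * if the tree is deeper than max_depth; messages already visited stay
 * cleared in that case. */
static bool toy_discard_unknown(toy_node *msg, int max_depth) {
  assert(msg != NULL);
  if (max_depth <= 0) return false;
  msg->unknown_len = 0;
  for (size_t i = 0; i < TOY_MAX_CHILDREN; i++) {
    toy_node *child = msg->children[i];
    if (child && !toy_discard_unknown(child, max_depth - 1)) return false;
  }
  return true;
}
```

The traversal needs to know which fields hold submessages, which is the reflection dependency that makes the core-versus-util placement a judgment call.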
# Compiler
- Always emit descriptors as an array instead of as a string, to avoid exceeding maximum string lengths. If this becomes a speed issue later we can go back to two separate paths.
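As a small illustration of the two forms: C99 only guarantees support for string literals of up to 4095 characters, whereas an array with an element-by-element initializer is not subject to that per-literal limit. The names `toy_descriptor` and `toy_descriptor_byte()` are hypothetical, and the bytes are a hand-written serialized `FileDescriptorProto` containing only `name = "toy.proto"`, not real generated output.

```c
#include <assert.h>
#include <stddef.h>

/* String form, subject to the compiler's string literal length limit:
 *
 *   static const char toy_descriptor[] = "\n\ttoy.proto";
 *
 * Array form, as a generator might emit it. Field 1 (name) has tag 0x0A
 * ('\n'), followed by the length 9 ('\t') and the nine bytes of the name. */
static const char toy_descriptor[] = {
    '\n', '\t', 't', 'o', 'y', '.', 'p', 'r', 'o', 't', 'o',
};

/* No trailing NUL is stored, so sizeof gives the exact payload size. */
static const size_t toy_descriptor_size = sizeof(toy_descriptor);

static char toy_descriptor_byte(size_t i) {
  assert(i < sizeof(toy_descriptor));
  return toy_descriptor[i];
}
```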
* resolvename is declared to return a bool value, but instead can return NULL. MSVC 2019 does not like that and throws a compile error. Fixed by returning false instead of NULL.
* When compiling with MSVC 2019, the UPB_ASSUME macro expands out to:
do {} if (false && (ok))
That isn't valid C code. Fixed by adding an elif for MSVC that uses __assume(0), which is similar to gcc's __builtin_unreachable according to http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0627r0.pdf.
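A per-compiler macro along these lines might look like the sketch below. `TOY_ASSUME` and `toy_div()` are hypothetical names and this is not necessarily the exact definition that was committed; `__builtin_unreachable()` and `__assume()` are the real GCC/Clang and MSVC intrinsics.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__GNUC__)
/* GCC and Clang: tell the optimizer the failing branch cannot happen. */
#define TOY_ASSUME(expr) \
  do { if (!(expr)) __builtin_unreachable(); } while (0)
#elif defined(_MSC_VER)
/* MSVC: __assume(0) marks this point as unreachable. */
#define TOY_ASSUME(expr) \
  do { if (!(expr)) __assume(0); } while (0)
#else
/* Fallback: a valid no-op. The short-circuit means expr is type-checked
 * but never evaluated at run time. */
#define TOY_ASSUME(expr) do {} while (false && (expr))
#endif

/* Example use: the assert documents the precondition in debug builds, and
 * the hint lets the optimizer rely on it. */
static int toy_div(int a, int b) {
  assert(b != 0);
  TOY_ASSUME(b != 0);
  return a / b;
}
```

Note that the fallback uses `do {} while (...)`, which is the well-formed counterpart of the invalid `do {} if (...)` expansion.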
* WIP.
* WIP.
* Tests are passing.
* Recover some perf: LIKELY doesn't propagate through functions. :(
* Added some more benchmarks.
* Simplify & optimize upb_arena_realloc().
* Only add owned blocks to the freelist.
* More optimization/simplification.
* Re-fixed the bug.
* Revert unintentional changes to parser.rl.
* Revert Lua changes for now.
* Revert the arena fuse changes for now.
* Added last_size to the arena representation.
* Fixed compile errors.
* Fixed compile error and changed benchmarks to do one allocation.
* Added support for proto3 optional to defs.
* Added proto3 optional support. Untested at the moment.
* Changes to support proto3 optional.
* Fixed real oneof count for messages with no fields.
* Fixed compile error and test.
* Added comment about why I'm commenting out the assert.
This makes both the C (.h) and C++ (.hpp) files read nicer
and keeps the core of upb C-only.
Existing users of the C++ wrappers will have to add manual
#includes of the .hpp files.
* WIP.
* Passes most tests.
* A few fixes.
* A few optimizations.
* Some more optimization.
* Update Protobuf to v3.11.4 and Abseil to LTS 2020-02-25
* Use longjmp instead of explicit error checks at every level.
* Used macros for better documentation of ops.
* Fixed bug with map parsing. All tests are passing except a few conformance tests.
* Fixed remaining bugs, all conformance tests pass.
Also ported all of upb to a single UPB_PTR_AT() macro instead of
having multiple .c files define their own.
* Formatted with clang-format.
* Fixes to compile on Linux.
* A few more compile fixes.
* Script to benchmark changes.
* Fixed parenthesis bug in op calculation.
* Updated generated descriptor files.
* WIP.
* Removed trailing enum to fix the Linux build.
* Respect packed=false to fix conformance failures in new protobuf version.
* Small simplification.
* Fixes to decoder.
* Removed stray comment.
Co-authored-by: Yannic Bonenberger <contact@yannic-bonenberger.com>
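The longjmp-based error handling mentioned in this change can be sketched as below. This is a toy decoder under simplifying assumptions (a made-up format of length-prefixed records whose bytes are summed); `toy_decoder`, `toy_fail()`, and the other names are hypothetical and unrelated to upb's real decoder state.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
  const char *ptr, *end;
  jmp_buf err; /* where to jump when the input is malformed */
} toy_decoder;

/* Never returns: unwinds straight to the setjmp() in toy_decode(). */
static void toy_fail(toy_decoder *d) { longjmp(d->err, 1); }

static unsigned char toy_read_byte(toy_decoder *d) {
  if (d->ptr == d->end) toy_fail(d);
  return (unsigned char)*d->ptr++;
}

/* Sums length-prefixed records. No helper here returns an error code, so
 * the parsing code has no per-call error checks. */
static unsigned toy_parse(toy_decoder *d) {
  unsigned sum = 0;
  while (d->ptr != d->end) {
    unsigned char len = toy_read_byte(d);
    while (len--) sum += toy_read_byte(d);
  }
  return sum;
}

/* The single place where errors are observed and turned into a bool. */
static bool toy_decode(const char *buf, size_t size, unsigned *out) {
  toy_decoder d;
  assert(buf != NULL && out != NULL);
  d.ptr = buf;
  d.end = buf + size;
  if (setjmp(d.err)) return false; /* Reached only via toy_fail(). */
  *out = toy_parse(&d);
  return true;
}
```

The trade-off is that any cleanup has to happen at the `setjmp()` site, which works well when all memory comes from an arena that is released in one place.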
Map parsing/serializing relies on map entries always
having a predictable order. The code that generates the
layout was not respecting this in the case of string
keys and primitive values.
* WIP, first version of encoder.
* More progress on text encoder.
* A lot of progress on the text printer.
* Added textencode header file.
* Text encoder now passes conformance tests.
These aren't very stringent though, and more testing is needed.
* Print text into static buffer. Passes all conformance tests.
* Fixed kokoro errors.
* Fix for indent depth when printing map fields.
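A minimal sketch of printing into a fixed-size buffer while tracking indent depth, assuming "static buffer" means a caller-provided buffer with snprintf-like truncation semantics. `toy_textenc` and its helpers are hypothetical and much simpler than the real text encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char *buf;      /* caller-provided output buffer */
  size_t size;    /* capacity of buf */
  size_t written; /* bytes the full output needs, even if truncated */
  int indent;     /* current nesting depth */
} toy_textenc;

/* Appends s, copying only what fits but always counting the full length. */
static void toy_put(toy_textenc *e, const char *s) {
  size_t len = strlen(s);
  assert(e->buf != NULL || e->size == 0);
  if (e->written < e->size) {
    size_t room = e->size - e->written;
    memcpy(e->buf + e->written, s, len < room ? len : room);
  }
  e->written += len;
}

static void toy_indent(toy_textenc *e) {
  for (int i = 0; i < e->indent; i++) toy_put(e, "  ");
}

/* Prints a map-entry-like block and restores the depth afterward, so that
 * whatever is printed next lines up with the opening line. */
static void toy_print_entry(toy_textenc *e, const char *name, const char *key) {
  toy_indent(e);
  toy_put(e, name);
  toy_put(e, " {\n");
  e->indent++;
  toy_indent(e);
  toy_put(e, "key: ");
  toy_put(e, key);
  toy_put(e, "\n");
  e->indent--;
  toy_indent(e);
  toy_put(e, "}\n");
}
```

Because `written` keeps counting past the end of the buffer, a caller can detect truncation by comparing it with `size` and retry with a larger buffer.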
upb/json/parser.rl: In function 'end_member.isra.150':
bazel-out/k8-opt/bin/upb.c:5536:13: error: 'sel' may be used uninitialized in this function [-Werror=maybe-uninitialized]
upb_func *ret = (upb_func *)h->table[s].func;
* Removed reflection and other extraneous things from the core library.
* Added missing files and ran buildifier.
* New CMakeLists.txt.
* Made table its own cc_library() for internal usage.