This change adds support for a OneofDef (upb_oneofdef), which represents
a 'oneof' as introduced by Protocol Buffers. This is semantically a
union type that contains fields and in turn may be added to a
MessageDef. This change does not alter parsing or the handler
abstraction in any way, because a oneof has impact only at a higher
semantic level (i.e., any sort of storage of the fields in a message
object), which is user-specific with respect to upb.
- Added a JSON test that round-trips (parses then re-serializes) several
test messages, ensuring that the re-serialized form matches the
original exactly.
- Added support for printing and parsing symbolic enum names (rather than
integer values) in JSON.
- Updated JSON printer to properly handle string fields that come in
multiple pieces. ('bytes' fields still do not support this, and this
work is more challenging because it requires making the base64 encoder
resumable. Base64 encoding is not separable at an input-byte
granularity, unlike string escaping.)
- Fixed a < vs. <= bug in UTF-8 encoding generation (oops).
Notable changes:
- We now only build things by default that require
no dependencies. So you can build upb even if you
don't have Lua or Google protobuf installed.
- Checked in a pre-built version of the JIT, so you
don't need Lua installed at build time to run DynASM.
It will still notice if you change the .dasc file and
attempt to re-run DynASM in that case.
- The build system now builds all modules of upb into
separate libraries, reflecting the modularity that
is already inherent in upb's design. This should
make it easier to trim the fat.
- removed the GDB JIT interface. I wasn't using it
much; using a .so is easier and more robust.
- rewritten decoder; interpreted decoder is bytecode-based,
JIT decoder no longer falls back to the interpreter.
- C++ improvements: C++11-compatible iterators, upb::reffed_ptr
for RAII refcounting, better upcast/downcast support.
- removed the gross upb_value abstraction from public upb.h.
- Better error reporting for upb::Def setters.
- error reporting for upb::Handlers setters.
- made the start/endmsg handlers a little less special-cased.
Major changes:
- Got rid of all bytestream interfaces in favor of
using regular handlers.
- new Pipeline object represents a upb pipeline, does
bump allocation internally to manage memory.
- proto2 support now can handle extensions.
Many things have changed and been simplified.
The memory-management story for upb_def and upb_handlers
is much more robust; upb_def and upb_handlers should be
fairly stable interfaces now. There is still much work
to do for the runtime component (upb_sink).
Many improvements, too many to mention. One significant
perf regression warrants investigation:
omitfp.parsetoproto2_googlemessage1.upb_jit: 343 -> 252 (-26.53)
plain.parsetoproto2_googlemessage1.upb_jit: 334 -> 251 (-24.85)
25% regression for this benchmark is bad, but since I don't think
there's any fundamental design issue that caused it I'm going to
go ahead with the commit anyway. Can investigate and fix later.
Other benchmarks were neutral or showed slight improvement.
This doesn't reflect any material change in
how I will be working on upb, and I have no
problem making this change. It's still open
source under the BSD license, and I'll still
be working on it well beyond the hours that
constitute a normal job.