This is a sync of our internal developing of JSON parsing and
serialization. It implements native understanding of MapEntry
submessages, so that map fields with (key, value) pairs are serialized
as JSON maps (objects) natively rather than as arrays of objects with
'key' and 'value' fields. The parser also now understands how to emit
handler calls corresponding to MapEntry objects when processing a map
field.
This sync also picks up a bugfix in `table.c` to handle an alloc-failed
case.
This change adds support for a OneofDef (upb_oneofdef), which represents
a 'oneof' as introduced by Protocol Buffers. This is semantically a
union type that contains fields and in turn may be added to a
MessageDef. This change does not alter parsing or the handler
abstraction in any way, because a oneof has impact only at a higher
semantic level (i.e., any sort of storage of the fields in a message
object), which is user-specific with respect to upb.
There are a number of tweaks to get this to work:
- The #include dependence graph wasn't quite complete, and I had to add
a few #includes to get the tool to work.
- I had to change a number of symbol names to avoid conflicts between
'static' definitions in different .c files. This could be avoided if
the tool were smart enough to rename static symbols to have unique
prefixes instead, but (i) this requires semantic understanding of C,
and (ii) the macro-defined static functions (e.g., handlers for
primitive types in several places) would probably trip this up.
Verified that the resulting upb.h/upb.c compiles and doesn't have any
unresolved references.
- Added a JSON test that round-trips (parses then re-serializes) several
test messages, ensuring that the re-serialized form matches the
original exactly.
- Added support for printing and parsing symbolic enum names (rather than
integer values) in JSON.
- Updated JSON printer to properly handle string fields that come in
multiple pieces. ('bytes' fields still do not support this, and this
work is more challenging because it requires making the base64 encoder
resumable. Base64 encoding is not separable at an input-byte
granularity, unlike string escaping.)
- Fixed a < vs. <= bug in UTF-8 encoding generation (oops).
This avoids requiring protoc to be installed
just to build/run the basic tests.
When upb has its own .proto file parser we should
be able to remove this precompiled version from the
repository.
Notable changes:
- We now only build things by default that require
no dependencies. So you can build upb even if you
don't have Lua or Google protobuf installed.
- Checked in a pre-built version of the JIT, so you
don't need Lua installed at build time to run DynASM.
It will still notice if you change the .dasc file and
attempt to re-run DynASM in that case.
- The build system now builds all modules of upb into
separate libraries, reflecting the modularity that
is already inherent in upb's design. This should
make it easier to trim the fat.
- removed the GDB JIT interface. I wasn't using it
much; using a .so is easier and more robust.
- rewritten decoder; interpreted decoder is bytecode-based,
JIT decoder no longer falls back to the interpreter.
- C++ improvements: C++11-compatible iterators, upb::reffed_ptr
for RAII refcounting, better upcast/downcast support.
- removed the gross upb_value abstraction from public upb.h.