* Implement json decoding for Any message.
type url may not appear as the first value in json. As a result,
other data cannot be resolved before resolving type url. To solve that,
this change caches the start and end position of unparsed values and
resolve them in end_any_object when type url has been resolved.
* Handle Any in switch
* Update json parser size
* Fix comments
* Sync upstream
* Add dependency on upb_pb for upb_json
* Debug failed test
* Fix cmake
* Update test generated files
* Remove debug tests
* Fix json ignore unknown
Previously, there were several problems with ignoring unknown in json.
1) After finding a field is unknown, the parser's state is not changed. Thus, there is no way to distinguish whether the parser is dealing with an unknown field or it's just a top level message.
2) Several method didn't respect unknown field, e.g., start_object, end_bool, start_array.
* Update json parser size
* Update json parser size
* Fix json encoding for wrappers, ListValue, Struct and Value.
* Add well_known_type field in upb_msgdef to specify type of well known messages.
* Remove comma at end of enum definition.
* Group number wrappers
* Fix comments
* Refactoring to use is_wellknown_{msg/field}
* Split upb::Arena/upb::Allocator from upb::Environment.
This will allow arenas and allocators to be used
independently of environments, which will be important
for an upcoming change (a message representation).
Overall this design feels cleaner that the previous
Environment/SeededAllocator design.
As part of this change, moved all allocations in upb
to use a global allocator instead of hard-coding
malloc/free. This will allow injecting OOM faults
for more robust testing.
One place that doesn't use the global allocator is
the tracked ref code. Instead of its previous approach
of CHECK_OOM() after every malloc() or table insert, it
simply uses an allocator that does this automatically.
I moved Allocator/Arena/Environment into upb.h.
This seems principled since these are the only types
in upb whose size is directly exposed to users, since
they form the basis of memory allocation strategy.
* Cleaned up some header includes and fixed more malloc -> upb_gmalloc().
* Changes from PR review.
* Don't use UINTPTR_MAX or UINT64_MAX.
* Punt on adding line/file for now.
* We actually can't store (uint64_t)-1, update comment and test.
* JSON parser: always accept both name variants, flag controls which we generate.
My previous commit was based on wrong information about the
proto3 JSON spec. It turns out we need to accept both field
name formats all the time, and a runtime flag should control
which we generate.
* Documented the preserve_proto_fieldnames option.
* Fix bugs in PR.
This was previously broken -- it would try to set
the status object on the parser, but the pointer
was never initialized. Also it didn't report
errors properly to the environment object.
A large part of this change contains surface-level
porting, like moving variable declarations to the
top of the block.
However there are a few more substantial things too:
- moved internal-only struct definitions to a separate
file (structdefs.int.h), for greater encapsulation
and ABI compatibility.
- removed the UPB_UPCAST macro, since it requires access
to the internal-only struct definitions. Replaced uses
with calls to inline, type-safe casting functions.
- removed the UPB_DEFINE_CLASS/UPB_DEFINE_STRUCT macros.
Class and struct definitions are now more explicit -- you
get to see the actual class/struct keywords in the source.
The casting convenience functions have been moved into
UPB_DECLARE_DERIVED_TYPE() and UPB_DECLARE_DERIVED_TYPE2().
- the new way that we duplicate base methods in derived types
is also more convenient and requires less duplication.
It is also less greppable, but hopefully that is not
too big a problem.
Compiler flags (-std=c89 -pedantic) should help to rigorously
enforce that the code is free of C99-isms.
A few functions are not available in C89 (strtoll). There
are temporary, hacky solutions in place.
This is a sync of our internal developing of JSON parsing and
serialization. It implements native understanding of MapEntry
submessages, so that map fields with (key, value) pairs are serialized
as JSON maps (objects) natively rather than as arrays of objects with
'key' and 'value' fields. The parser also now understands how to emit
handler calls corresponding to MapEntry objects when processing a map
field.
This sync also picks up a bugfix in `table.c` to handle an alloc-failed
case.
This change adds support for a OneofDef (upb_oneofdef), which represents
a 'oneof' as introduced by Protocol Buffers. This is semantically a
union type that contains fields and in turn may be added to a
MessageDef. This change does not alter parsing or the handler
abstraction in any way, because a oneof has impact only at a higher
semantic level (i.e., any sort of storage of the fields in a message
object), which is user-specific with respect to upb.
There are a number of tweaks to get this to work:
- The #include dependence graph wasn't quite complete, and I had to add
a few #includes to get the tool to work.
- I had to change a number of symbol names to avoid conflicts between
'static' definitions in different .c files. This could be avoided if
the tool were smart enough to rename static symbols to have unique
prefixes instead, but (i) this requires semantic understanding of C,
and (ii) the macro-defined static functions (e.g., handlers for
primitive types in several places) would probably trip this up.
Verified that the resulting upb.h/upb.c compiles and doesn't have any
unresolved references.
- Added a JSON test that round-trips (parses then re-serializes) several
test messages, ensuring that the re-serialized form matches the
original exactly.
- Added support for printing and parsing symbolic enum names (rather than
integer values) in JSON.
- Updated JSON printer to properly handle string fields that come in
multiple pieces. ('bytes' fields still do not support this, and this
work is more challenging because it requires making the base64 encoder
resumable. Base64 encoding is not separable at an input-byte
granularity, unlike string escaping.)
- Fixed a < vs. <= bug in UTF-8 encoding generation (oops).