* Put oneofs in the same table as fields.
Oneofs and fields are not allowed to have names that conflict,
so we might as well put them all in the same table. This also
allows an efficient operation that looks for both fields and
oneofs in a single lookup.
Added support for OneofDef to Lua to allow testing of this.
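As a rough illustration of the single-lookup idea (the table layout and names below are hypothetical, not upb's actual structures), both kinds of defs can share one name-keyed table because their names are guaranteed not to collide:
```
#include <stddef.h>
#include <string.h>

/* Hypothetical combined entry: a field or a oneof, tagged by kind.  Since
 * field and oneof names cannot conflict, one table holds both and a single
 * probe answers "is this name a field or a oneof?" */
typedef enum { ENTRY_FIELD, ENTRY_ONEOF } entry_kind;

typedef struct {
  const char *name;
  entry_kind kind;
  void *def;            /* points at a field def or a oneof def */
} name_entry;

/* Stand-in for a hash-table lookup over the combined table. */
static const name_entry *find_field_or_oneof(const name_entry *table,
                                             size_t count, const char *name) {
  for (size_t i = 0; i < count; i++) {
    if (strcmp(table[i].name, name) == 0) return &table[i];
  }
  return NULL;
}
```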
* Addressed PR comments.
* Changed schema for JSON test to be defined in a .proto file.
Before, we had lots of code to build these schemas manually,
but this was verbose and made it difficult to extend the
schema. Now we can just write a .proto file, and adding
fields is easy.
To avoid making the tests depend on upbc (and thus Lua),
we check in the generated schema.
* Made protobuf-compiler a dependency of "make genfiles."
* For genfiles, download a recent protoc that can handle proto3.
* Only use new protoc for genfiles.
* Split upb::Arena/upb::Allocator from upb::Environment.
This will allow arenas and allocators to be used
independently of environments, which will be important
for an upcoming change (a message representation).
Overall this design feels cleaner than the previous
Environment/SeededAllocator design.
As part of this change, moved all allocations in upb
to use a global allocator instead of hard-coding
malloc/free. This will allow injecting OOM faults
for more robust testing.
One place that doesn't use the global allocator is
the tracked ref code. Instead of its previous approach
of calling CHECK_OOM() after every malloc() or table insert,
it simply uses an allocator that performs that check
automatically.
I moved Allocator/Arena/Environment into upb.h.
This seems principled, since these are the only types
in upb whose size is directly exposed to users: they
form the basis of the memory allocation strategy.
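To make the OOM-fault idea concrete, here is a minimal sketch of a fault-injecting allocator. The realloc-style single-function-pointer shape of the allocator is an assumption based on the description above, not quoted from upb.h:
```
#include <stddef.h>
#include <stdlib.h>

/* Assumed allocator interface (see upb.h for the real one): a single
 * realloc-style function pointer that handles malloc/realloc/free. */
typedef struct test_alloc test_alloc;
typedef void *test_alloc_func(test_alloc *alloc, void *ptr, size_t oldsize,
                              size_t size);
struct test_alloc { test_alloc_func *func; };

/* Fault-injecting allocator: lets the first allocs_until_fail allocations
 * succeed, then fails every one after that, so OOM handling paths can be
 * exercised deterministically in tests. */
typedef struct {
  test_alloc base;
  int allocs_until_fail;
} failing_alloc;

static void *failing_allocfunc(test_alloc *alloc, void *ptr, size_t oldsize,
                               size_t size) {
  failing_alloc *fa = (failing_alloc *)alloc;
  (void)oldsize;
  if (size == 0) { free(ptr); return NULL; }       /* free request       */
  if (fa->allocs_until_fail-- <= 0) return NULL;   /* inject an OOM here */
  return realloc(ptr, size);
}

static void failing_alloc_init(failing_alloc *fa, int allocs_until_fail) {
  fa->base.func = &failing_allocfunc;
  fa->allocs_until_fail = allocs_until_fail;
}
```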
* Cleaned up some header includes and fixed more malloc -> upb_gmalloc().
* Changes from PR review.
* Don't use UINTPTR_MAX or UINT64_MAX.
* Punt on adding line/file for now.
* We actually can't store (uint64_t)-1, update comment and test.
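For reference, the usual portable replacement for those macros (which some pre-C++11 toolchains only expose from <stdint.h> when __STDC_LIMIT_MACROS is defined) is to compute the value directly; this is the standard idiom, not code from the patch:
```
#include <stdint.h>

/* All-bits-set maximum without UINTPTR_MAX, which some pre-C++11
 * compilers only define when __STDC_LIMIT_MACROS is set before
 * including <stdint.h>. */
static const uintptr_t kMaxUintptr = (uintptr_t)-1;
```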
* JSON parser: always accept both name variants, flag controls which we generate.
My previous commit was based on wrong information about the
proto3 JSON spec. It turns out we need to accept both field
name formats all the time, and a runtime flag should control
which we generate.
* Documented the preserve_proto_fieldnames option.
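For clarity, the two spellings in question are the original proto field name and its lowerCamelCase JSON form. A small illustrative helper (not the upb implementation) shows the conversion the printer applies when preserve_proto_fieldnames is off; the parser now accepts either spelling regardless of the flag:
```
#include <ctype.h>
#include <stddef.h>

/* Illustrative conversion of a proto field name ("foo_bar") to its JSON
 * lowerCamelCase form ("fooBar").  The parser accepts both spellings
 * unconditionally; preserve_proto_fieldnames only controls which one the
 * printer emits.  This is a sketch, not upb's code. */
static void json_name(const char *proto_name, char *out, size_t outlen) {
  size_t j = 0;
  if (outlen == 0) return;
  for (size_t i = 0; proto_name[i] != '\0' && j + 1 < outlen; i++) {
    if (proto_name[i] == '_' && proto_name[i + 1] != '\0') {
      i++;  /* drop the underscore and capitalize the next character */
      out[j++] = (char)toupper((unsigned char)proto_name[i]);
    } else {
      out[j++] = proto_name[i];
    }
  }
  out[j] = '\0';
}
```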
* Fix bugs in PR.
It is entirely optional: MessageDef/EnumDef can still exist
on their own. But this can represent a def's file when it is
desirable to do so (e.g. for code generators).
This approach will require that we change the way we handle
extensions. But I think it will be a good change overall.
Specifically, we previously handled extensions by duplicating
the extended message and then adding the extension as a regular
field to the duplicated message. This required also duplicating
any messages that could reach the extended message.
In the new world we will need a way of declaring and looking up
extensions separately from the message being extended.
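Purely as a sketch of that direction (this is not upb's design, and the names are invented), such a lookup would key extensions by extendee and field number instead of splicing them into a duplicated message:
```
#include <stddef.h>
#include <string.h>

/* Invented registry entry: an extension is identified by the full name of
 * the message it extends plus its field number, and resolved on demand
 * instead of being copied into a duplicated MessageDef. */
typedef struct {
  const char *extendee;   /* full name of the extended message */
  int number;             /* extension field number */
  const void *fielddef;   /* the extension's field definition */
} ext_entry;

static const void *find_extension(const ext_entry *reg, size_t count,
                                  const char *extendee, int number) {
  for (size_t i = 0; i < count; i++) {
    if (reg[i].number == number && strcmp(reg[i].extendee, extendee) == 0) {
      return reg[i].fielddef;
    }
  }
  return NULL;
}
```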
This change also involves some notable changes to the generated
code:
- files are now called foo.upbdefs.h instead of foo.upb.h.
This reflects the fact that we may generate several
different output files for a .proto file, and this one is
just for defs.
- we no longer generate selectors in the .h file.
- the upbdefs.c no longer vends a SymbolTable. Now it vends the
individual messages (and possibly a FileDef later). I think this
will compose better once we can generate files where one
generated file imports another.
We also make the descriptor reader vend a list of FileDefs now.
This is the best conceptual match for parsing a FileDescriptorSet.
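For concreteness, a hypothetical shape for a generated foo.upbdefs.h under the new scheme; the naming and the owner-pointer convention are assumptions, not the actual generated output:
```
/* Hypothetical generated header: each message/enum gets its own accessor,
 * rather than the .c file handing back a fully populated SymbolTable. */
#ifndef FOO_UPBDEFS_H_
#define FOO_UPBDEFS_H_

struct upb_msgdef;    /* from upb/def.h */
struct upb_enumdef;

/* Callers take a ref on the returned def on behalf of "owner". */
const struct upb_msgdef *foo_MyMessage_getmsgdef(const void *owner);
const struct upb_enumdef *foo_MyEnum_getenumdef(const void *owner);

#endif  /* FOO_UPBDEFS_H_ */
```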
It is clear now that oneofs are different from other
defs in one important way: they are not top-level
constructs that can stand on their own in a SymbolTable,
for example. If you are iterating over a list of Defs
in a SymbolTable, there is no chance you will run into
a oneof. To reflect this reality, OneofDef no longer
derives from Def, and UPB_DEF_ONEOF is no longer
an enum value that needs to be handled.
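A tiny sketch of what that simplification means for code that switches over def kinds; the enumerator names here illustrate the idea rather than being quoted from def.h:
```
/* Illustrative def kinds a SymbolTable can now hand back.  There is no
 * longer a oneof member to handle: oneofs are only reachable through
 * their containing MessageDef. */
typedef enum {
  DEF_KIND_MSG,
  DEF_KIND_ENUM,
  DEF_KIND_FIELD
} def_kind;

static const char *def_kind_name(def_kind kind) {
  switch (kind) {
    case DEF_KIND_MSG:   return "message";
    case DEF_KIND_ENUM:  return "enum";
    case DEF_KIND_FIELD: return "field";
    /* No oneof case needed any more. */
  }
  return "unknown";
}
```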
https://google.github.io/styleguide/cppguide.html#Constructor_Initializer_Lists
```
// If the signature and initializer list are not all on one line,
// you must wrap before the colon and indent 4 spaces:
MyClass::MyClass(int var)
: some_var_(var), some_other_var_(var + 1) {
DoSomething();
}
```
and
```
// As with any other code block, the close curly can be on the same
// line as the open curly, if it fits.
MyClass::MyClass(int var)
: some_var_(var) {}
```
```
$ make Q= googlepb
g++ -O3 -std=c++98 -pedantic -Wno-long-long -Wall -Wextra -Wpointer-arith -Wno-unused-private-field -I. -DNDEBUG -c -o obj/upb/bindings/googlepb/bridge.o upb/bindings/googlepb/bridge.cc
In file included from ./upb/handlers.h:22,
from ./upb/bindings/googlepb/bridge.h:42,
from upb/bindings/googlepb/bridge.cc:8:
./upb/def.h: In constructor ‘upb::Pointer<upb::Def>::Pointer(upb::Def*)’:
./upb/def.h:39: error: class ‘upb::Pointer<upb::Def>’ does not have any field named ‘PointerBase’
./upb/def.h:39: error: no matching function for call to ‘upb::PointerBase<upb::Def, upb::RefCounted>::PointerBase()’
./upb/upb.h:246: note: candidates are: upb::PointerBase<T, Base>::PointerBase(T*) [with T = upb::Def, Base = upb::RefCounted]
./upb/upb.h:244: note: upb::PointerBase<upb::Def, upb::RefCounted>::PointerBase(const upb::PointerBase<upb::Def, upb::RefCounted>&)
...
```
The generated code looks like:
```
template <> class Pointer<upb::Def> : public PointerBase<upb::Def, upb::RefCounted> {
public: explicit Pointer(upb::Def* ptr) : PointerBase(ptr) {}
};
```
This falls into https://gcc.gnu.org/bugzilla/show_bug.cgi?id=189
(see also http://stackoverflow.com/questions/8887864/template-base-constructor-call-in-member-initialization-list-error).
Changing the generated code to:
```
template <> class Pointer<upb::Def> : public PointerBase<upb::Def, upb::RefCounted> {
public: explicit Pointer(upb::Def* ptr) : PointerBase<upb::Def, upb::RefCounted>(ptr) {}
};
```
makes it compile, at least on the GCC 4.4.7 we are testing with:
```
$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
```
Prior to this change, an error return was guaranteed to be
accompanied by a short byte count. Now the two concepts are
more orthogonal: there are cases where the entire input is
consumed even though an error was encountered.
Prior to this change:
parse(buf, len) -> len + N
...would indicate that the next N bytes of the input are not
needed, *and* would advance the decoding position by this
much.
After this change:
parse(buf, len) -> len + N
parse(NULL, N) -> N
...can be used to achieve the same thing. But the skip is no
longer performed implicitly: a user who wants it must request
it explicitly, and a user that doesn't want/need to skip can
just write:
```
parsed = parse(buf, len);
if (parsed < len) {
  // Handle suspend, advance stream by "parsed".
} else {
  // Stream was advanced by "len" (even if parsed > len).
}
```
Updated unit tests to test this new behavior, and refactored
test utility code a bit to support it.
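A caller that does want the skip now requests it explicitly; here is a minimal sketch under the same assumed parse() shape as the example above:
```
#include <stddef.h>

/* Hypothetical signature matching the parse() used in the example above. */
typedef size_t parse_func(const char *buf, size_t len);

/* When parse() reports that the next N bytes are unneeded (return > len),
 * explicitly advance past them with a NULL buffer: parse(NULL, N) -> N.
 * Returns the total number of bytes the stream should advance by. */
static size_t parse_and_skip(parse_func *parse, const char *buf, size_t len) {
  size_t parsed = parse(buf, len);
  if (parsed > len) {
    size_t n = parsed - len;       /* bytes the parser does not need */
    return len + parse(NULL, n);   /* perform the skip explicitly    */
  }
  return parsed;  /* < len: suspended; == len: everything consumed */
}
```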
This was previously broken -- it would try to set
the status object on the parser, but the pointer
was never initialized. It also didn't report
errors properly to the environment object.