Includes are now via upb/foo.h.
Files specific to the protobuf format are
now in upb/pb (the core library is concerned
with message definitions, handlers, and
byte streams, but knows nothing about any
particular serialization format).
It can successfully parse SpeedMessage1.
Preliminary results: 750MB/s on Core2 2.4GHz.
That is 2.5x the throughput of proto2.
This isn't apples-to-apples, because
proto2 is parsing to a struct and we are
just doing stream parsing, but for apps
that are currently using proto2, this is the
improvement they would see if they could
move to stream-based processing.
Unfortunately perf-regression-test.py is
broken, and I'm not 100% sure why. It would
be nice to fix it first (to ensure that
there are no performance regressions for
the table-based decoder) but I'm really
impatient to get the JIT checked in.
This is a significant change to the upb_stream
protocol, and should hopefully be the last
significant change.
All callbacks are now registered ahead-of-time
instead of having delegated callbacks registered
at runtime, which makes it much easier to
aggressively optimize ahead-of-time (like with a
JIT).
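A rough sketch of what ahead-of-time registration
can look like. Every name here (handler_table,
handlers_register, dispatch) is hypothetical and
is not the upb API; the point is only that the
table is complete before any data is parsed, so
the dispatch path never changes at runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the upb API. */
typedef void (*value_cb)(void *closure, uint32_t fieldnum, int64_t val);

#define MAX_FIELD 16

typedef struct {
  /* One slot per field number, filled in before parsing starts. */
  value_cb on_value[MAX_FIELD];
} handler_table;

/* Registers a callback for a field. Returns 1 on success, 0 if the
 * field number does not fit in the table. */
static int handlers_register(handler_table *t, uint32_t fieldnum,
                             value_cb cb) {
  if (fieldnum >= MAX_FIELD) return 0;
  t->on_value[fieldnum] = cb;
  return 1;
}

/* Dispatch only reads the table. Because the table is fixed up front,
 * a code generator could in principle specialize this lookup away. */
static void dispatch(const handler_table *t, void *closure,
                     uint32_t fieldnum, int64_t val) {
  if (fieldnum < MAX_FIELD && t->on_value[fieldnum]) {
    t->on_value[fieldnum](closure, fieldnum, val);
  }
}

/* Example callback: adds each delivered value to an int64_t total. */
static void sum_cb(void *closure, uint32_t fieldnum, int64_t val) {
  (void)fieldnum;
  *(int64_t *)closure += val;
}

/* Registers one handler for field 1, then dispatches three values.
 * Only the two values sent to field 1 reach the callback. */
static int64_t demo_sum(void) {
  handler_table t = {{0}};
  int64_t total = 0;
  int ok = handlers_register(&t, 1, sum_cb);
  assert(ok);
  (void)ok;
  dispatch(&t, &total, 1, 40);
  dispatch(&t, &total, 2, 100); /* No handler registered: ignored. */
  dispatch(&t, &total, 1, 2);
  return total;
}
```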
Other impacts of this change:
- You no longer need to have loaded descriptor.proto
as a upb_def to load other descriptors! This means
the special-case code we used for bootstrapping is
no longer necessary, and we no longer need to link
the descriptor for descriptor.proto into upb.
- A client can now register any upb_value as what
will be delivered to their value callback, not
just a upb_fielddef*. This should allow for other
clients to get more bang out of the streaming
decoder.
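To illustrate the second point, here is a minimal
sketch in which a client binds its own value to a
field handler and gets it back in the callback.
bound_value is a hypothetical stand-in for
upb_value, and field_handler and store_at_offset
are made up for this example. This client binds a
struct offset rather than a fielddef pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for upb_value: a small tagged union. */
typedef enum { VAL_INT32, VAL_PTR } val_type;

typedef struct {
  val_type type;
  union {
    int32_t i32;
    void *ptr;
  } u;
} bound_value;

/* The callback receives whatever value was bound at registration. */
typedef void (*value_cb)(void *closure, bound_value bound,
                         int64_t decoded);

typedef struct {
  value_cb cb;
  bound_value bound;
} field_handler;

/* A client-defined message struct (hypothetical). */
typedef struct {
  int64_t id;
  int64_t count;
} my_msg;

/* Interprets the bound value as a byte offset into the closure and
 * stores the decoded value there. */
static void store_at_offset(void *closure, bound_value bound,
                            int64_t decoded) {
  memcpy((char *)closure + (size_t)bound.u.i32, &decoded,
         sizeof(decoded));
}

/* Binds offsetof(my_msg, count) to a handler, delivers one value, and
 * returns what ended up in the struct. */
static int64_t demo_offset_store(void) {
  my_msg m = {0, 0};
  field_handler h;
  h.cb = store_at_offset;
  h.bound.type = VAL_INT32;
  h.bound.u.i32 = (int32_t)offsetof(my_msg, count);
  h.cb(&m, h.bound, 7);
  assert(m.id == 0); /* The neighbouring member is untouched. */
  return m.count;
}
```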
This change unfortunately causes a bit of a performance
regression -- I think largely due to highly
suboptimal code that GCC generates when structs
are returned by value. See:
http://blog.reverberate.org/2011/03/19/when-a-compilers-slow-code-actually-bites-you/
On the other hand, once we have a JIT this should
no longer matter.
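For reference, the two function shapes in question
look roughly like this. This is a sketch with
hypothetical names (varint_result, varint_byval,
varint_out), not code from upb. Both compute the
same thing; which one compiles to better code
depends on the compiler and ABI, so measure
before choosing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result struct: decoded value plus bytes consumed. */
typedef struct {
  uint64_t val;
  int len; /* 0 means the input was truncated or too long. */
} varint_result;

/* Shape 1: the result struct is returned by value. */
static varint_result varint_byval(const uint8_t *p, const uint8_t *end) {
  varint_result r = {0, 0};
  const uint8_t *start = p;
  int shift;
  for (shift = 0; p < end && shift < 64; shift += 7) {
    uint8_t b = *p++;
    r.val |= (uint64_t)(b & 0x7f) << shift;
    if (!(b & 0x80)) {
      r.len = (int)(p - start);
      return r;
    }
  }
  return r; /* len is still 0 here. */
}

/* Shape 2: the value goes through an out-parameter and only a scalar
 * (bytes consumed, 0 on error) is returned. */
static int varint_out(const uint8_t *p, const uint8_t *end,
                      uint64_t *val) {
  const uint8_t *start = p;
  uint64_t v = 0;
  int shift;
  for (shift = 0; p < end && shift < 64; shift += 7) {
    uint8_t b = *p++;
    v |= (uint64_t)(b & 0x7f) << shift;
    if (!(b & 0x80)) {
      *val = v;
      return (int)(p - start);
    }
  }
  return 0;
}

/* Decodes the two-byte varint for 300 with both shapes and returns 1
 * if they agree. */
static int demo_same_result(void) {
  static const uint8_t buf[] = {0xAC, 0x02};
  varint_result r = varint_byval(buf, buf + sizeof(buf));
  uint64_t v = 0;
  int len = varint_out(buf, buf + sizeof(buf), &v);
  assert(r.len == len);
  return r.val == v && v == 300 && len == 2;
}
```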
Performance numbers (before -> after, percent change
in parentheses):
plain.parsestream_googlemessage1.upb_table: 374 -> 396 (5.88)
plain.parsestream_googlemessage2.upb_table: 616 -> 449 (-27.11)
plain.parsetostruct_googlemessage1.upb_table_byref: 268 -> 269 (0.37)
plain.parsetostruct_googlemessage1.upb_table_byval: 215 -> 204 (-5.12)
plain.parsetostruct_googlemessage2.upb_table_byref: 307 -> 281 (-8.47)
plain.parsetostruct_googlemessage2.upb_table_byval: 297 -> 272 (-8.42)
omitfp.parsestream_googlemessage1.upb_table: 423 -> 410 (-3.07)
omitfp.parsestream_googlemessage2.upb_table: 679 -> 483 (-28.87)
omitfp.parsetostruct_googlemessage1.upb_table_byref: 287 -> 282 (-1.74)
omitfp.parsetostruct_googlemessage1.upb_table_byval: 226 -> 219 (-3.10)
omitfp.parsetostruct_googlemessage2.upb_table_byref: 315 -> 298 (-5.40)
omitfp.parsetostruct_googlemessage2.upb_table_byval: 297 -> 287 (-3.37)
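The figure in parentheses on each line is
consistent with a plain percent change relative to
the first number. A minimal sketch of that
arithmetic (percent_change is a hypothetical
helper, not part of perf-regression-test.py):

```c
#include <assert.h>

/* Percent change from `before` to `after`, relative to `before`.
 * Negative results are regressions when higher numbers are better. */
static double percent_change(double before, double after) {
  assert(before != 0.0);
  return 100.0 * (after - before) / before;
}
```

For example, percent_change(374, 396) is about
5.88 and percent_change(616, 449) is about -27.11,
matching the first two lines above.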
The symtab that contains them is now hidden, and
you can look them up by name but there is no access
to the symtab itself, so there is no risk of
mutating it (by extending it, adding other defs
to it, etc).
Unfortunately this degrades hash table lookup performance by
about 8%, which affects the streaming benchmark for googlemessage1
by about 5%. We could get this back at the cost of some memory,
but it would be nice to avoid that.
Sources and sinks communicate by means of a
upb_handlers object, which encapsulates a set of
handler callbacks and will possibly offer richer
semantics in the future like giving specific
fields different callbacks.
The upb_handlers protocol supports delegation, so
sets of handlers can be written in reusable ways.
For example, if a set of handlers is written to
handle a specific .proto type, those handlers can
be used whether that type is at the top level or
whether it is a sub-message of a higher-level type.
Delegation allows the streaming protocol to
properly compose.
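A minimal sketch of delegation, assuming a
simplified event stream. handler_set, event, run,
and the Point/Line handlers are all hypothetical
and much simpler than the real upb_handlers. The
same point_handlers set is used once at the top
level and once via delegation from a parent type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler set; not the real upb_handlers layout. */
typedef struct handler_set handler_set;
struct handler_set {
  void (*on_value)(void *closure, uint32_t fieldnum, int64_t val);
  /* Returns the handlers to delegate to for the sub-message in
   * `fieldnum`, or NULL to ignore its contents. */
  const handler_set *(*on_startsubmsg)(void *closure, uint32_t fieldnum);
};

typedef enum { EV_VALUE, EV_START, EV_END } ev_kind;
typedef struct {
  ev_kind kind;
  uint32_t fieldnum;
  int64_t val;
} event;

#define MAX_DEPTH 8

/* Drives a handler set from a flat list of events, keeping a stack of
 * the handler sets that are active at each nesting level. */
static void run(const handler_set *top, void *closure, const event *ev,
                size_t n) {
  const handler_set *stack[MAX_DEPTH];
  int depth = 0;
  size_t i;
  stack[0] = top;
  for (i = 0; i < n; i++) {
    const handler_set *h = stack[depth];
    switch (ev[i].kind) {
      case EV_VALUE:
        if (h && h->on_value) h->on_value(closure, ev[i].fieldnum, ev[i].val);
        break;
      case EV_START:
        if (depth + 1 >= MAX_DEPTH) return;
        stack[depth + 1] = (h && h->on_startsubmsg)
                               ? h->on_startsubmsg(closure, ev[i].fieldnum)
                               : NULL;
        depth++;
        break;
      case EV_END:
        if (depth > 0) depth--;
        break;
    }
  }
}

/* Handlers for a hypothetical Point{x=1, y=2}: sum every value seen. */
static void point_value(void *closure, uint32_t fieldnum, int64_t val) {
  (void)fieldnum;
  *(int64_t *)closure += val;
}
static const handler_set point_handlers = {point_value, NULL};

/* Handlers for a hypothetical Line{start=1, end=2}: both fields are
 * Points, so delegate to the existing Point handlers. */
static const handler_set *line_startsubmsg(void *closure,
                                           uint32_t fieldnum) {
  (void)closure;
  (void)fieldnum;
  return &point_handlers;
}
static const handler_set line_handlers = {NULL, line_startsubmsg};

/* Point at the top level. */
static int64_t demo_top_level(void) {
  static const event ev[] = {{EV_VALUE, 1, 3}, {EV_VALUE, 2, 4}};
  int64_t sum = 0;
  run(&point_handlers, &sum, ev, sizeof(ev) / sizeof(ev[0]));
  return sum;
}

/* Two Points nested inside a Line, handled by the same Point code. */
static int64_t demo_nested(void) {
  static const event ev[] = {
      {EV_START, 1, 0}, {EV_VALUE, 1, 3},  {EV_VALUE, 2, 4},  {EV_END, 1, 0},
      {EV_START, 2, 0}, {EV_VALUE, 1, 10}, {EV_VALUE, 2, 20}, {EV_END, 2, 0}};
  int64_t sum = 0;
  run(&line_handlers, &sum, ev, sizeof(ev) / sizeof(ev[0]));
  return sum;
}
```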
The "field" entry was only being used to determine
whether we were inside a group, but the "end_offset"
member contains enough information to tell us that.
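One way this can work, sketched under the
assumption that group frames store a sentinel in
end_offset (groups are not length-delimited, so
they have no known end). The frame struct, the
GROUP_END_OFFSET sentinel, and frame_is_group are
hypothetical; the actual encoding in the decoder
may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sentinel: an offset no length-delimited frame would use. */
#define GROUP_END_OFFSET UINT32_MAX

/* Hypothetical decoder stack frame with no separate "field" entry. */
typedef struct {
  uint32_t end_offset; /* Stream offset where this sub-message ends. */
} frame;

/* A frame is a group exactly when it carries the sentinel. */
static int frame_is_group(const frame *f) {
  return f->end_offset == GROUP_END_OFFSET;
}

/* Builds one delimited frame and one group frame and checks both. */
static int demo_frames(void) {
  frame delimited = {128};
  frame group = {GROUP_END_OFFSET};
  assert(!frame_is_group(&delimited));
  return frame_is_group(&group);
}
```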