Fleshed out DESIGN.md a bit more.

pull/13171/head
Joshua Haberman 3 years ago
parent a52fb79965
commit 975ea595f8
  1. DESIGN.md (75 lines changed)
  2. tests/test.proto (5 lines changed)

@@ -160,43 +160,42 @@ together.
 together, their lifetimes are irreversibly joined, such that none of the arena
 blocks in either arena will be freed until *both* arenas are freed with
 `upb_arena_free()`. This is useful when joining two messages from separate
-arenas, making one a sub-message of the other. Fuse is an a very cheap
+arenas (making one a sub-message of the other). Fuse is an a very cheap
 operation, and an unlimited number of arenas can be fused together efficiently.

-## Binary Parsing and Serialzation
+## Reflection and Descriptors

-For binary format parsing and serializing, we use tables of fields known as
-*mini-tables*. (The "mini" distinguishes them from "fast tables", which are
-a larger and more optimized table format used by the fast parser in
-`upb/decode_fast.c`.)
-
-The format of mini-tables is defined in `upb/msg_internal.h`. As the name
-suggests, the format of these mini-tables is internal-only, consumed by the
-parser and serializer, but not available for general use by users. The format
-of these tables is strongly aimed at making the parser and serializer as fast
-as possible, and this sometimes involves changing them in backward-incompatible
-ways.
-
-These tables define field numbers, field types, and offsets for every field.
-It is important that these offsets match the offsets used in the generated
-accessors, for obvious reasons.
-
-The generated `.upb.h` interface exposes wrappers for parsing and serialization
-that automatically pass the appropriate mini-tables to the parser and serializer:
-
-```c
-#include "google/protobuf/descriptor.upb.h"
-
-bool ParseDescriptor(const char *pb_data, size_t pb_size) {
-  // Arena where all messages, arrays, maps, etc. will be allocated.
-  upb_arena *arena = upb_arena_new();
-
-  // This will pass the mini-table to upb_decode().
-  google_protobuf_DescriptorProto* descriptor =
-      google_protobuf_DescriptorProto_parse(pb_data, pb_size, arena);
+upb offers a fully-featured reflection library. There are two main ways of
+using reflection:
+
+1. You can load descriptors from strings using `upb_symtab_addfile()`.
+   The upb runtime will dynamically create mini-tables like what the upb compiler
+   would have created if you had compiled this type into a `.upb.c` file.
+2. You can load descriptors using generated `.upbdefs.h` interfaces.
+   This will load reflection that references the corresponding `.upb.c`
+   mini-tables instead of building a new mini-table on the fly. This lets
+   you reflect on generated types that are linked into your program.
+
+upb's design for descriptors is similar to protobuf C++ in many ways, with
+the following correspondences:
+
+| C++ Type | upb type |
+| ---------| ---------|
+| `google::protobuf::DescriptorPool` | `upb_symtab`
+| `google::protobuf::Descriptor` | `upb_msgdef`
+| `google::protobuf::FieldDescriptor` | `upb_fielddef`
+| `google::protobuf::OneofDescriptor` | `upb_oneofdef`
+| `google::protobuf::EnumDescriptor` | `upb_enumdef`
+| `google::protobuf::FileDescriptor` | `upb_filedef`
+| `google::protobuf::ServiceDescriptor` | `upb_servicedef`
+| `google::protobuf::MethodDescriptor` | `upb_methoddef`
+
+Like in C++ descriptors (defs) are created by loading a
+`google_protobuf_FileDescriptorProto` into a `upb_symtab`. This creates and
+links all of the def objects corresponding to that `.proto` file, and inserts
+the names into a symbol table so they can be looked up by name.

-  bool ok = descriptor != NULL;
-  upb_arena_free(arena);
-  return ok;
-}
-```
+Once you have loaded some descriptors into a `upb_symtab`, you can create and
+manipulate messages using the interfaces defined in `upb/reflection.h`. If your
+descriptors are linked to your generated layouts using option (2) above, you can
+safely access the same messages using both reflection and generated interfaces.

@@ -6,3 +6,8 @@ package upb_test;
 message MapTest {
   map<string, double> map_string_double = 1;
 }
+
+message MessageName {
+  optional int32 field1 = 1;
+  optional int32 field2 = 2;
+}
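
Note on the new "Reflection and Descriptors" text: it says that loading a file "inserts the names into a symbol table so they can be looked up by name". The sketch below is a self-contained toy model of that idea, not upb's implementation. Every `toy_*` type and function is a hypothetical stand-in invented for illustration; the real `upb_symtab` API differs, and the linear scan here is a simplification. The sample def mirrors the `MessageName` message added to `tests/test.proto`, qualified with its `upb_test` package.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins (hypothetical, NOT upb's real types) for the roles played by
 * upb_fielddef, upb_msgdef and upb_symtab. */
typedef struct {
  const char *name;
  int number;
} toy_fielddef;

typedef struct {
  const char *full_name; /* package-qualified, e.g. "upb_test.MessageName" */
  const toy_fielddef *fields;
  size_t field_count;
} toy_msgdef;

#define TOY_SYMTAB_CAP 8

typedef struct {
  const toy_msgdef *msgs[TOY_SYMTAB_CAP];
  size_t count;
} toy_symtab;

/* Looks up a message def by full name; returns NULL if it was never added.
 * A linear scan keeps the sketch short. */
static const toy_msgdef *toy_symtab_lookupmsg(const toy_symtab *s,
                                              const char *full_name) {
  for (size_t i = 0; i < s->count; i++) {
    if (strcmp(s->msgs[i]->full_name, full_name) == 0) return s->msgs[i];
  }
  return NULL;
}

/* Adds a def to the table. Returns 1 on success, or 0 if the table is full
 * or the name is already defined. */
static int toy_symtab_add(toy_symtab *s, const toy_msgdef *m) {
  assert(s != NULL && m != NULL);
  if (s->count == TOY_SYMTAB_CAP) return 0;
  if (toy_symtab_lookupmsg(s, m->full_name) != NULL) return 0;
  s->msgs[s->count++] = m;
  return 1;
}

/* Finds a field of a message by name; returns NULL if there is none. */
static const toy_fielddef *toy_msgdef_findfield(const toy_msgdef *m,
                                                const char *name) {
  for (size_t i = 0; i < m->field_count; i++) {
    if (strcmp(m->fields[i].name, name) == 0) return &m->fields[i];
  }
  return NULL;
}

/* Mirrors MessageName from tests/test.proto (package upb_test). */
static const toy_fielddef message_name_fields[] = {{"field1", 1},
                                                   {"field2", 2}};
static const toy_msgdef message_name_def = {"upb_test.MessageName",
                                            message_name_fields, 2};
```

To try it, zero-initialize a `toy_symtab`, add `message_name_def` with `toy_symtab_add()`, and resolve `"upb_test.MessageName"` with `toy_symtab_lookupmsg()`. Adding the same def a second time is rejected, because each full name may be defined only once in the table.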
