Merge pull request #404 from haberman/readme

Updated some docs and removed/rearranged some obsolete stuff.
pull/13171/head
Joshua Haberman 3 years ago committed by GitHub
commit 723c9723ea
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
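  1. BUILD (8 changed lines)
  2. CONTRIBUTING.md (42 changed lines)
  3. DESIGN.md (72 changed lines)
  4. README.md (150 changed lines)
  5. bazel/BUILD (6 changed lines)
  6. bazel/amalgamate.py (0 changed lines)
  7. bazel/build_defs.bzl (5 changed lines)
  8. examples/bazel/BUILD (46 changed lines)
  9. examples/bazel/WORKSPACE (14 changed lines)
  10. examples/bazel/foo.proto (7 changed lines)
  11. examples/bazel/test_binary.c (43 changed lines)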

@@ -233,18 +233,12 @@ cc_library(
# copybara:strip_for_google3_begin
py_binary(
name = "amalgamate",
srcs = ["tools/amalgamate.py"],
)
upb_amalgamation(
name = "gen_amalgamation",
outs = [
"upb.c",
"upb.h",
],
amalgamator = ":amalgamate",
libs = [
":upb",
":fastdecode",
@@ -267,7 +261,6 @@ upb_amalgamation(
"php-upb.c",
"php-upb.h",
],
amalgamator = ":amalgamate",
libs = [
":upb",
":fastdecode",
@@ -293,7 +286,6 @@ upb_amalgamation(
"ruby-upb.c",
"ruby-upb.h",
],
amalgamator = ":amalgamate",
libs = [
":upb",
":fastdecode",

@@ -1,7 +1,37 @@
## <a name="cla"></a> Signing the CLA
Please sign the [Google Contributor License Agreement
(CLA)](https://cla.developers.google.com/)
before sending pull requests. For any code changes to be
accepted, the CLA must be signed. It's a quick process, I
promise!
# How to Contribute
We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.
## Get in touch
If your idea will take you more than, say, 30 minutes to
implement, please get in touch first via the issue tracker
to touch base about your plan. That will give an
opportunity for early feedback and help avoid wasting your
time.
## Contributor License Agreement
Contributions to this project must be accompanied by a Contributor License
Agreement. You (or your employer) retain the copyright to your contribution;
this simply gives us permission to use and redistribute your contributions as
part of the project. Head over to <https://cla.developers.google.com/> to see
your current agreements on file or to sign a new one.
You generally only need to submit a CLA once, so if you've already submitted one
(even if it was for a different project), you probably don't need to do it
again.
## Code Reviews
All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose. Consult
[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
information on using pull requests.
## Community Guidelines
This project follows [Google's Open Source Community
Guidelines](https://opensource.google/conduct/).

@@ -1,72 +0,0 @@
μpb Design
----------
μpb has the following design goals:
- C89 compatible.
- small code size (both for the core library and generated messages).
- fast performance (hundreds of MB/s).
- idiomatic for C programs.
- easy to wrap in high-level languages (Python, Ruby, Lua, etc) with
good performance and all standard protobuf features.
- hands-off about memory management, allowing for easy integration
with existing VMs and/or garbage collectors.
- offers binary ABI compatibility between apps, generated messages, and
the core library (doesn't require re-generating messages or recompiling
your application when the core library changes).
- provides all features that users expect from a protobuf library
(generated messages in C, reflection, text format, etc.).
- layered, so the core is small and doesn't require descriptors.
- tidy about symbol references, so that any messages or features that
aren't used by a C program can have their code GC'd by the linker.
- possible to use protobuf binary format without leaking message/field
names into the binary.
μpb accomplishes these goals by keeping a very small core that does not contain
descriptors. We need some way of knowing what fields are in each message and
where they live, but instead of descriptors, we keep a small/lightweight summary
of the .proto file. We call this a `upb_msglayout`. It contains the bare
minimum of what we need to know to parse and serialize protobuf binary format
into our internal representation for messages, `upb_msg`.
The core then contains functions to parse/serialize a message, given a `upb_msg*`
and a `const upb_msglayout*`.
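The idea of a compact, name-free layout table can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_field`, `toy_layout`, `toy_set_field`, and `toy_Foo` are hypothetical names invented here, and upb's real `upb_msglayout` is structured differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-field record: no names, only what a parser needs. */
typedef struct {
  uint32_t number; /* Field number from the .proto file. */
  uint16_t offset; /* Byte offset of the field inside the message struct. */
  uint8_t size;    /* Size in bytes of the field's in-memory value. */
} toy_field;

/* Hypothetical per-message summary (a stand-in for a layout table). */
typedef struct {
  const toy_field *fields;
  size_t field_count;
  size_t msg_size;
} toy_layout;

/* Looks up a field by number; returns NULL for unknown fields. */
static const toy_field *toy_find_field(const toy_layout *l, uint32_t number) {
  size_t i;
  for (i = 0; i < l->field_count; i++) {
    if (l->fields[i].number == number) return &l->fields[i];
  }
  return NULL;
}

/* Copies a decoded value into the message at the offset the layout records.
 * Returns 1 on success and 0 if the field number is not in the layout. */
static int toy_set_field(void *msg, const toy_layout *l, uint32_t number,
                         const void *value) {
  const toy_field *f = toy_find_field(l, number);
  if (!f) return 0;
  memcpy((char *)msg + f->offset, value, f->size);
  return 1;
}

/* Example message with two scalar fields (assumed for illustration). */
typedef struct {
  int64_t time;
  int32_t count;
} toy_Foo;

static const toy_field toy_Foo_fields[] = {
  {1, offsetof(toy_Foo, time), sizeof(int64_t)},
  {2, offsetof(toy_Foo, count), sizeof(int32_t)},
};

static const toy_layout toy_Foo_layout = {toy_Foo_fields, 2, sizeof(toy_Foo)};
```

Because the table carries only numbers, offsets, and sizes, generic code can fill in a message without the binary containing any message or field names.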
This approach is similar to [nanopb](https://github.com/nanopb/nanopb) which
also compiles message definitions to a compact, internal representation without
names. However nanopb does not aim to be a fully-featured library, and has no
support for text format, JSON, or descriptors. μpb is unique in that it has a
small core similar to nanopb (though not quite as small), but also offers a
full-featured protobuf library for applications that want reflection, text
format, JSON format, etc.
Without descriptors, the core doesn't have access to field names, so it cannot
parse/serialize to protobuf text format or JSON. Instead this functionality
lives in separate modules that depend on the module implementing descriptors.
With the descriptor module we can parse/serialize binary descriptors and
validate that they follow all the rules of protobuf schemas.
To provide binary compatibility, we version the structs that generated messages
use to create a `upb_msglayout*`. The current initializers are
`upb_msglayout_msginit_v1`, `upb_msglayout_fieldinit_v1`, etc. Then
`upb_msglayout*` uses these as its internal representation. If upb changes its
internal representation for a `upb_msglayout*`, it will also include code to
convert the old representation to the new representation. This will use some
more memory/CPU at runtime to convert between the two, but apps that statically
link μpb will never need to worry about this.
TODO
----
1. revise our generated code until it is in a state where we feel comfortable
committing to API/ABI stability for it. In particular there is an open
question of whether non-ABI-compatible field accesses should have a
fastpath different from the ABI-compatible field access.
1. Add missing features (maps, extensions, unknown fields).
1. Flesh out C++ wrappers.
1. *(lower-priority)*: revise all of the existing encoders/decoders and
handlers. We probably will want to keep handlers, since they let us decouple
encoders/decoders from `upb_msg`, but we need to simplify all of that a LOT.
Likely we will want to make handlers only per-message instead of per-field,
except for variable-length fields.

@@ -1,124 +1,62 @@
# μpb - a small protobuf implementation in C
|Platform|Build Status|
|--------|------------|
|macOS|[![Build Status](https://storage.googleapis.com/upb-kokoro-results/status-badge/macos.png)](https://fusion.corp.google.com/projectanalysis/summary/KOKORO/prod%3Aupb%2Fmacos%2Fcontinuous)|
|ubuntu|[![Build Status](https://storage.googleapis.com/upb-kokoro-results/status-badge/ubuntu.png)](https://fusion.corp.google.com/projectanalysis/summary/KOKORO/prod%3Aupb%2Fubuntu%2Fcontinuous)|
μpb (often written 'upb') is a small protobuf implementation written in C.
upb generates a C API for creating, parsing, and serializing messages
as declared in `.proto` files. upb is heavily arena-based: all
messages always live in an arena (note: the arena can live in stack or
static memory if desired). Here is a simple example:
```c
#include "conformance/conformance.upb.h"
void foo(const char* data, size_t size) {
upb_arena *arena;
/* Generated message type. */
conformance_ConformanceRequest *request;
conformance_ConformanceResponse *response;
arena = upb_arena_new();
request = conformance_ConformanceRequest_parse(data, size, arena);
response = conformance_ConformanceResponse_new(arena);
switch (conformance_ConformanceRequest_payload_case(request)) {
case conformance_ConformanceRequest_payload_protobuf_payload: {
upb_strview payload = conformance_ConformanceRequest_protobuf_payload(request);
// ...
break;
}
case conformance_ConformanceRequest_payload_NOT_SET:
fprintf(stderr, "conformance_upb: Request didn't have payload.\n");
break;
default: {
static const char msg[] = "Unsupported input format.";
conformance_ConformanceResponse_set_skipped(
response, upb_strview_make(msg, sizeof(msg) - 1)); /* Exclude the trailing NUL. */
break;
}
}
/* Frees all messages on the arena. */
upb_arena_free(arena);
}
```
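The arena model described above (every message lives in an arena, the arena may sit in stack or static memory, and everything is released in one step) can be illustrated with a toy bump allocator. This is a minimal sketch under stated assumptions: `toy_arena` and its functions are hypothetical names invented here, not upb's API, and upb's real arena is implemented differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy arena for illustration; this is not upb's implementation. */
typedef struct {
  unsigned char *base; /* Start of the caller-provided buffer. */
  size_t size;         /* Total bytes available in the buffer. */
  size_t used;         /* Bytes handed out so far. */
} toy_arena;

/* Offsets are aligned relative to the buffer start, so the buffer itself
 * should be suitably aligned (for example, an array of long long). */
enum { TOY_ALIGN = 8 };

/* The buffer may be stack, static, or heap memory; the arena never frees it. */
static void toy_arena_init(toy_arena *a, void *buf, size_t size) {
  assert(buf != NULL);
  a->base = buf;
  a->size = size;
  a->used = 0;
}

/* Bump-allocates `n` bytes at the next aligned offset; NULL when full. */
static void *toy_arena_alloc(toy_arena *a, size_t n) {
  size_t start = (a->used + (TOY_ALIGN - 1)) & ~(size_t)(TOY_ALIGN - 1);
  if (start > a->size || n > a->size - start) return NULL;
  a->used = start + n;
  return a->base + start;
}

/* Releases every object at once by resetting the bump pointer. */
static void toy_arena_reset(toy_arena *a) { a->used = 0; }
```

There is no per-object free in this sketch: objects are only released together, which is the property that lets a single call at the end of `foo()` clean up every message.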
# μpb: small, fast C protos
API and ABI are both subject to change! Please do not distribute
as a shared library for this reason (for now at least).
μpb (often written 'upb') is a small
[protobuf](https://github.com/protocolbuffers/protobuf) implementation written
in C.
## Using upb in your project
upb is the core runtime for protobuf language extensions in
[Ruby](https://github.com/protocolbuffers/protobuf/tree/master/ruby),
[PHP](https://github.com/protocolbuffers/protobuf/tree/master/php), and (soon)
Python.
Currently only Bazel is supported (CMake support is partial and incomplete
but full CMake support is an eventual goal).
While upb offers a C API, the C API & ABI **are not stable**. For this reason,
upb is not generally offered as a C library for direct consumption, and there
are no releases.
To use upb in your Bazel project, first add upb to your `WORKSPACE` file,
either as a `git_repository()` or as a `new_local_repository()` with a
Git Submodule. (For an example, see `examples/bazel/` in this repo).
## Features
```python
# Add this to your WORKSPACE file.
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
upb has comparable speed to protobuf C++, but is an order of magnitude smaller
in code size.
git_repository(
name = "upb",
remote = "https://github.com/protocolbuffers/upb.git",
commit = "d16bf99ac4658793748cda3251226059892b3b7b",
)
Like the main protobuf implementation in C++, it supports:
load("@upb//bazel:workspace_deps.bzl", "upb_deps")
- a generated API (in C)
- reflection
- binary & JSON wire formats
- text format serialization
- all standard features of protobufs (oneofs, maps, unknown fields, etc.)
- full conformance with the protobuf conformance tests
upb_deps()
```
upb also supports some features that C++ does not:
Then in your BUILD file you can add `upb_proto_library()` rules that
generate code for a corresponding `proto_library()` rule. For
example:
```python
# Add this to your BUILD file.
load("@upb//bazel:upb_proto_library.bzl", "upb_proto_library")
proto_library(
name = "foo_proto",
srcs = ["foo.proto"],
)
upb_proto_library(
name = "foo_upbproto",
deps = [":foo_proto"],
)
cc_binary(
name = "test_binary",
srcs = ["test_binary.c"],
deps = [":foo_upbproto"],
)
```
- **optional reflection:** generated messages are agnostic to whether
reflection will be linked in or not.
- **no global state:** no pre-main registration or other global state.
- **fast reflection-based parsing:** messages loaded at runtime parse
just as fast as compiled-in messages.
However there are some features it does not support:
Then in your `.c` file you can #include the generated header:
- proto2 extensions (coming soon!)
- text format parsing
- deep descriptor verification: upb's descriptor validation is not as exhaustive
as `protoc`.
```c
#include "foo.upb.h"
## Install
/* Insert code that uses generated types. */
For Ruby, use [RubyGems](https://rubygems.org/gems/google-protobuf):
```
$ gem install google-protobuf
```
## Lua bindings
For PHP, use [PECL](https://pecl.php.net/package/protobuf):
This repo has some Lua bindings for the core library. These are
experimental and very incomplete. These are currently included in
order to validate that the C API is suitable for wrapping. As the
project matures these Lua bindings may become publicly available.
```
$ sudo pecl install protobuf
```
## Contact
## Contributing
Author: Josh Haberman ([jhaberman@gmail.com](mailto:jhaberman@gmail.com),
[haberman@google.com](mailto:haberman@google.com))
Please see [CONTRIBUTING.md](CONTRIBUTING.md).

@@ -22,3 +22,9 @@
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
py_binary(
name = "amalgamate",
srcs = ["amalgamate.py"],
visibility = ["//:__pkg__"],
)

@@ -149,15 +149,16 @@ def _upb_amalgamation(ctx):
outputs = ctx.outputs.outs,
arguments = [ctx.bin_dir.path + "/", ctx.attr.prefix] + [f.path for f in srcs] + ["-I" + root for root in _get_real_roots(inputs)],
progress_message = "Making amalgamation",
executable = ctx.executable.amalgamator,
executable = ctx.executable._amalgamator,
)
return []
upb_amalgamation = rule(
attrs = {
"amalgamator": attr.label(
"_amalgamator": attr.label(
executable = True,
cfg = "host",
default = "//bazel:amalgamate",
),
"prefix": attr.string(
default = "",

@@ -1,46 +0,0 @@
# Copyright (c) 2009-2021, Google LLC
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of Google LLC nor the
# names of its contributors may be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL Google LLC BE LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@upb//bazel:upb_proto_library.bzl", "upb_proto_library")
licenses(["notice"])
proto_library(
name = "foo_proto",
srcs = ["foo.proto"],
)
upb_proto_library(
name = "foo_upbproto",
deps = [":foo_proto"],
)
cc_binary(
name = "test_binary",
srcs = ["test_binary.c"],
copts = ["-std=c99"],
deps = [":foo_upbproto"],
)

@@ -1,14 +0,0 @@
workspace(name = "upb_example")
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
git_repository(
name = "upb",
remote = "https://github.com/protocolbuffers/upb.git",
commit = "d16bf99ac4658793748cda3251226059892b3b7b",
)
load("@upb//bazel:workspace_deps.bzl", "upb_deps")
upb_deps()

@@ -1,7 +0,0 @@
syntax = "proto2";
message Foo {
optional int64 time = 1;
optional string greeting = 2;
}

@@ -1,43 +0,0 @@
/*
* Copyright (c) 2009-2021, Google LLC
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Google LLC nor the
* names of its contributors may be used to endorse or promote products
* derived from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL Google LLC BE LIABLE FOR ANY
* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
* SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <time.h>
#include "examples/bazel/foo.upb.h"
int main() {
upb_arena *arena = upb_arena_new();
Foo* foo = Foo_new(arena);
const char greeting[] = "Hello, World!\n";
Foo_set_time(foo, time(NULL));
/* Warning: the proto will not copy this, the string data must outlive
* the proto. */
Foo_set_greeting(foo, upb_strview_makez(greeting));
upb_arena_free(arena);
}