From ccbc80117ccd98ad529c0a29ec801433d06ab3d1 Mon Sep 17 00:00:00 2001
From: Protobuf Team Bot
Date: Wed, 6 Sep 2023 08:16:26 -0700
Subject: [PATCH] Adds "Protobuf Editions Design: Features" to the GitHub code repository.

PiperOrigin-RevId: 563109674
---
 docs/design/editions/README.md                |   5 +-
 .../protobuf-editions-design-features.md      | 317 ++++++++++++++++++
 2 files changed, 320 insertions(+), 2 deletions(-)
 create mode 100644 docs/design/editions/protobuf-editions-design-features.md

diff --git a/docs/design/editions/README.md b/docs/design/editions/README.md
index 66671a4afd..5ecf5eb1fd 100644
--- a/docs/design/editions/README.md
+++ b/docs/design/editions/README.md
@@ -21,5 +21,6 @@ The following topics are in this repository:
 
 * [What are Protobuf Editions?](what-are-protobuf-editions.md)
 * [Life of an Edition](life-of-an-edition.md)
-* [Editions: Life of a FeatureSet](editions-life-of-a-featureset.md)
-* [Protobuf Design: Options Attributes](protobuf-design-options-attributes.md)
+* [Protobuf Editions Design: Features](protobuf-editions-design-features.md)
+* [Editions: Life of a Featureset](editions-life-of-a-featureset.md)
+* [Edition Naming](edition-naming.md)
diff --git a/docs/design/editions/protobuf-editions-design-features.md b/docs/design/editions/protobuf-editions-design-features.md
new file mode 100644
index 0000000000..f2a98f5d5e
--- /dev/null
+++ b/docs/design/editions/protobuf-editions-design-features.md
@@ -0,0 +1,317 @@
+# Protobuf Editions Design: Features
+
+**Author:** [@haberman](https://github.com/haberman),
+[@fowles](https://github.com/fowles)
+
+**Approved:** 2022-10-13
+
+A proposal to use custom options as our way of defining and representing
+features.
+
+## Background
+
+The [Protobuf Editions](what-are-protobuf-editions.md) project uses "editions"
+to allow Protobuf to safely evolve over time. An edition is formally a set of
+"features" with a default value per feature.
+The set of features or a default value for a feature can only change with the
+introduction of a new edition. Features define the specific points of change and
+evolution on a per-entity basis within a .proto file (entities being files,
+messages, fields, or any other lexical element in the file). The design in this
+doc supplants an earlier design which used strings for feature definition.
+
+Protobuf already supports
+[custom options](https://protobuf.dev/programming-guides/proto2#customoptions)
+and we will leverage these to provide a rich syntax without introducing new
+syntactic forms into Protobuf.
+
+## Sample Usage
+
+Here is a small sample usage of features to give a flavor for how it looks:
+
+```
+edition = "2023";
+
+package experimental.users.kfm.editions;
+
+import "net/proto2/proto/features_cpp.proto";
+
+option features.repeated_field_encoding = EXPANDED;
+option features.enum = OPEN;
+option features.(pb.cpp).string_field_type = STRING;
+option features.(pb.cpp).namespace = "kfm::proto_experiments";
+
+message Lab {
+  // `Mouse` is open as it inherits the file's value.
+  enum Mouse {
+    UNKNOWN_MOUSE = 0;
+    PINKY = 1;
+    THE_BRAIN = 2;
+  }
+  repeated Mouse mice = 1 [features.repeated_field_encoding = PACKED];
+
+  string name = 2;
+  string address = 3 [features.(pb.cpp).string_field_type = CORD];
+  string function = 4 [features.(pb.cpp).string_field_type = STRING_VIEW];
+}
+
+enum ColorChannel {
+  // Turn off the option from the surrounding file
+  option features.enum = CLOSED;
+
+  UNKNOWN_COLOR_CHANNEL = 0;
+  RED = 1;
+  BLUE = 2;
+  GREEN = 3;
+  ALPHA = 4;
+}
+```
+
+## Language-Specific Features
+
+We will use extensions to manage features specific to individual code
+generators.
+
+```
+// In net/proto2/proto/descriptor.proto:
+syntax = "proto2";
+package proto2;
+
+message Features {
+  ...
+  extensions 1000; // for features_cpp.proto
+  extensions 1001; // for features_java.proto
+}
+
+```
+
+This will allow third-party code generators to use editions for their own
+evolution as long as they reserve a single extension number in
+`descriptor.proto`. Using this from a .proto file would look like this:
+
+```
+edition = "2023";
+
+import "third_party/protobuf/compiler/cpp/features_cpp.proto";
+
+message Bar {
+  optional string str = 1 [features.(pb.cpp).string_field_type = CORD];
+}
+```
+
+## Inheritance
+
+To support inheritance, we will specify a single `Features` message that extends
+every kind of option:
+
+```
+// In net/proto2/proto/descriptor.proto:
+syntax = "proto2";
+package proto2;
+
+message Features {
+  ...
+}
+
+message FileOptions {
+  optional Features features = ..;
+}
+
+message MessageOptions {
+  optional Features features = ..;
+}
+// All the other `*Options` protos.
+```
+
+At the implementation level, feature inheritance is exactly the behavior of
+`MergeFrom`:
+
+```
+void InheritFrom(const Features& parent, Features* child) {
+  Features tmp(parent);
+  tmp.MergeFrom(*child);
+  child->Swap(&tmp);
+}
+```
+
+This means that custom backends will be able to faithfully implement
+inheritance without difficulty.
+
+## Target Attributes
+
+While inheritance can be useful for minimizing changes or pushing defaults
+broadly, it can be overused in ways that would make simple refactoring of
+`.proto` files harder. Additionally, not all features are meaningful on all
+entities (for example `features.enum = OPEN` is meaningless on a field).
+
+To avoid these issues, we will introduce "target" attributes on features
+(similar in concept to the "target" attribute on Java annotations).
+
+```
+enum FeatureTargetType {
+  FILE = 0;
+  MESSAGE = 1;
+  ENUM = 2;
+  FIELD = 3;
+  ...
+};
+```
+
+These will restrict the set of entities to which a feature may be attached.
+
+```
+message Features {
+  ...
+
+  enum EnumType {
+    OPEN = 0;
+    CLOSED = 1;
+  }
+  optional EnumType enum = 2 [
+    target = ENUM
+  ];
+}
+```
+
+## Retention
+
+To reduce the size of descriptors in protobuf runtimes, features will be
+permitted to specify retention rules (again similar in concept to "retention"
+attributes on Java annotations).
+
+```
+enum FeatureRetention {
+  SOURCE = 0;
+  RUNTIME = 1;
+}
+```
+
+## Specification of an Edition
+
+An edition is, effectively, an instance of the `Features` proto which forms the
+base for performing inheritance using `MergeFrom`. This allows `protoc` and
+specific language generators to leverage existing formats (like text-format) for
+specifying the value of features at a given edition.
+
+Although naively we would think that field defaults are the right approach, this
+does not quite work, because the default is edition-dependent. Instead, we
+propose adding the following to the protoc-provided `features.proto`:
+
+```
+message Features {
+  // ...
+  message EditionDefault {
+    optional string edition = 1;
+    optional string default = 2; // Textproto value.
+  }
+
+  extend FieldOptions {
+    // Ideally this is a map, but map extensions are not permitted...
+    repeated EditionDefault edition_defaults = 9001;
+  }
+}
+```
+
+To build the edition defaults for a particular edition `current` in the context
+of a particular file `foo.proto`, we execute the following algorithm:
+
+1.  Construct a new `Features feats;`.
+2.  For each field in `Features`, take the value of the
+    `Features.edition_defaults` option (call it `defaults`), and sort it by the
+    value of `edition` (per the total order for edition names; see
+    [Life of an Edition](life-of-an-edition.md)).
+3.  Binary search `defaults` for the latest edition at or before `current`.
+    1.  If the field is of singular, scalar type, use that value as the value of
+        the field in `feats`.
+    2.
+        Otherwise, the value of the field in `feats` is given by merging all of
+        the values less than `current`, starting from the oldest edition.
+4.  For the purposes of this algorithm, `Features`'s fields all behave as if
+    they were `required`; failure to find a default explicitly via the editions
+    default search mechanism should result in a compilation error, because it
+    means the file's edition is too old.
+5.  For each extension of `Features` that is visible from `foo.proto` via
+    imports, perform the same algorithm as above to construct the editions
+    default for that extension message, and add it to `feats`.
+
+This algorithm has the following properties:
+
+*   Language-scoped features are discovered via imports, which is how they need
+    to be imported for use in a file in the first place.
+*   Every value is set explicitly, so we correctly reject too-old files.
+*   Files from "the future" will not be rejected out of hand by the algorithm,
+    allowing us to provide a flag like `--allow-experimental-editions` to make
+    it easier for backends to implement a new edition.
+
+## Edition Zero Features
+
+Putting the parts together, we can offer a potential `Features` message for
+edition zero: [Edition Zero Features](edition-zero-features.md).
+
+```
+message Features {
+  enum FieldPresence {
+    EXPLICIT = 0;
+    IMPLICIT = 1;
+    LEGACY_REQUIRED = 2;
+  }
+  optional FieldPresence field_presence = 1 [
+    retention = RUNTIME,
+    target = FIELD,
+    (edition_defaults) = {
+      edition: "2023", default: "EXPLICIT"
+    }
+  ];
+
+  enum EnumType {
+    OPEN = 0;
+    CLOSED = 1;
+  }
+  optional EnumType enum = 2 [
+    retention = RUNTIME,
+    target = ENUM,
+    (edition_defaults) = {
+      edition: "2023", default: "OPEN"
+    }
+  ];
+
+  enum RepeatedFieldEncoding {
+    PACKED = 0;
+    EXPANDED = 1;
+  }
+  optional RepeatedFieldEncoding repeated_field_encoding = 3 [
+    retention = RUNTIME,
+    target = FIELD,
+    (edition_defaults) = {
+      edition: "2023", default: "PACKED"
+    }
+  ];
+
+  enum StringFieldValidation {
+    REQUIRED = 0;
+    HINT = 1;
+    SKIP = 2;
+  }
+  optional StringFieldValidation string_field_validation = 4 [
+    retention = RUNTIME,
+    target = FIELD,
+    (edition_defaults) = {
+      edition: "2023", default: "REQUIRED"
+    }
+  ];
+
+  enum MessageEncoding {
+    LENGTH_PREFIXED = 0;
+    DELIMITED = 1;
+  }
+  optional MessageEncoding message_encoding = 5 [
+    retention = RUNTIME,
+    target = FIELD,
+    (edition_defaults) = {
+      edition: "2023", default: "LENGTH_PREFIXED"
+    }
+  ];
+
+  extensions 1000; // for features_cpp.proto
+  extensions 1001; // for features_java.proto
+}
+```
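
The default-resolution steps listed under "Specification of an Edition" can be sketched in C++ for the singular, scalar case (step 3.1). This is a minimal illustration under stated assumptions, not protoc's implementation: `EditionDefault`, `FeatureField`, `EditionLess`, and `ResolveEditionDefaults` are hypothetical names, features are modeled as name/value strings rather than a real `Features` message, and plain string comparison is assumed in place of the total order on edition names defined in Life of an Edition.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for Features.EditionDefault.
struct EditionDefault {
  std::string edition;
  std::string value;  // Textproto value, e.g. "OPEN".
};

// Hypothetical stand-in for one singular, scalar field of `Features`.
struct FeatureField {
  std::string name;
  std::vector<EditionDefault> edition_defaults;
};

// Assumption: string comparison stands in for the real edition ordering.
inline bool EditionLess(const std::string& a, const std::string& b) {
  return a < b;
}

// Returns the resolved default for every field, or std::nullopt if some
// field has no default at or before `current` (the "too old" error case).
std::optional<std::map<std::string, std::string>> ResolveEditionDefaults(
    std::vector<FeatureField> fields, const std::string& current) {
  std::map<std::string, std::string> feats;
  for (FeatureField& field : fields) {
    std::vector<EditionDefault>& defaults = field.edition_defaults;
    // Step 2: sort the defaults by edition.
    std::sort(defaults.begin(), defaults.end(),
              [](const EditionDefault& a, const EditionDefault& b) {
                return EditionLess(a.edition, b.edition);
              });
    // Step 3: find the first entry strictly later than `current`; the entry
    // just before it is the latest one at or before `current`.
    auto it = std::upper_bound(
        defaults.begin(), defaults.end(), current,
        [](const std::string& edition, const EditionDefault& d) {
          return EditionLess(edition, d.edition);
        });
    // Step 4: every field behaves as if required.
    if (it == defaults.begin()) return std::nullopt;
    feats[field.name] = std::prev(it)->value;
  }
  return feats;
}
```

With the `edition_defaults` shown in the message above, resolving for `"2023"` would yield `OPEN` for `enum` and `EXPLICIT` for `field_presence`, while any edition that sorts before `"2023"` would be rejected. Merging for non-scalar fields (step 3.2) and extension discovery (step 5) are left out of this sketch.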