# Protobuf Editions Design: Features

**Author:** [@haberman](https://github.com/haberman),
[@fowles](https://github.com/fowles)

**Approved:** 2022-10-13

A proposal to use custom options as our way of defining and representing
features.

## Background

The [Protobuf Editions](what-are-protobuf-editions.md) project uses "editions"
to allow Protobuf to safely evolve over time. An edition is formally a set of
"features" with a default value per feature. The set of features or a default
value for a feature can only change with the introduction of a new edition.
Features define the specific points of change and evolution on a per-entity
basis within a .proto file (entities being files, messages, fields, or any other
lexical element in the file). The design in this doc supplants an earlier design
which used strings for feature definition.

Protobuf already supports
[custom options](https://protobuf.dev/programming-guides/proto2#customoptions)
and we will leverage these to provide a rich syntax without introducing new
syntactic forms into Protobuf.

## Sample Usage

Here is a small sample usage of features to give a flavor for how it looks:

```
edition = "2023";

package experimental.users.kfm.editions;

import "net/proto2/proto/features_cpp.proto";

option features.repeated_field_encoding = EXPANDED;
option features.enum = OPEN;
option features.(pb.cpp).string_field_type = STRING;
option features.(pb.cpp).namespace = "kfm::proto_experiments";

message Lab {
  // `Mouse` is open as it inherits the file's value.
  enum Mouse {
    UNKNOWN_MOUSE = 0;
    PINKY = 1;
    THE_BRAIN = 2;
  }
  repeated Mouse mice = 1 [features.repeated_field_encoding = PACKED];

  string name = 2;
  string address = 3 [features.(pb.cpp).string_field_type = CORD];
  string function = 4 [features.(pb.cpp).string_field_type = STRING_VIEW];
}

enum ColorChannel {
  // Turn off the option from the surrounding file
  option features.enum = CLOSED;

  UNKNOWN_COLOR_CHANNEL = 0;
  RED = 1;
  BLUE = 2;
  GREEN = 3;
  ALPHA = 4;
}
```

## Language-Specific Features

We will use extensions to manage features specific to individual code
generators.

```
// In net/proto2/proto/descriptor.proto:
syntax = "proto2";
package proto2;

message Features {
  ...
  extensions 1000;  // for features_cpp.proto
  extensions 1001;  // for features_java.proto
}
```

This will allow third-party code generators to use editions for their own
evolution as long as they reserve a single extension number in
`descriptor.proto`. Using this from a .proto file would look like this:

```
edition = "2023";

import "third_party/protobuf/compiler/cpp/features_cpp.proto";

message Bar {
  optional string str = 1 [features.(pb.cpp).string_field_type = CORD];
}
```

## Inheritance

To support inheritance, we will specify a single `Features` message that is
added as a field to every kind of options message:

```
// In net/proto2/proto/descriptor.proto:
syntax = "proto2";
package proto2;

message Features {
  ...
}

message FileOptions {
  optional Features features = ..;
}

message MessageOptions {
  optional Features features = ..;
}
// All the other `*Options` protos.
```

At the implementation level, feature inheritance is exactly the behavior of
`MergeFrom`:

```
void InheritFrom(const Features& parent, Features* child) {
  Features tmp(parent);
  // Values set explicitly on the child override the inherited ones.
  tmp.MergeFrom(*child);
  child->Swap(&tmp);
}
```

which means that custom backends will be able to faithfully implement
inheritance without difficulty.

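
To make the merge semantics concrete without depending on the protobuf runtime, here is a minimal sketch of the same inheritance rule. The `Features` struct, its string-valued fields, and `ResolveColorChannel` are hypothetical stand-ins for illustration only; they are not the generated API.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the generated `Features` message. Each field is
// optional so that "unset" can be told apart from an explicit value.
struct Features {
  std::optional<std::string> enum_type;                // features.enum
  std::optional<std::string> repeated_field_encoding;  // features.repeated_field_encoding

  // Mimics MergeFrom for singular fields: a value set in `other` overwrites
  // ours, and anything unset in `other` leaves ours untouched.
  void MergeFrom(const Features& other) {
    if (other.enum_type) enum_type = other.enum_type;
    if (other.repeated_field_encoding) {
      repeated_field_encoding = other.repeated_field_encoding;
    }
  }
};

// Start from the parent's features, then layer the child's explicit settings
// on top, and store the result back into the child.
void InheritFrom(const Features& parent, Features* child) {
  Features tmp(parent);
  tmp.MergeFrom(*child);
  *child = tmp;
}

// Resolves the features of `ColorChannel` from the Sample Usage section:
// the file sets both features, the enum overrides only `features.enum`.
Features ResolveColorChannel() {
  Features file;
  file.enum_type = "OPEN";
  file.repeated_field_encoding = "EXPANDED";

  Features color_channel;
  color_channel.enum_type = "CLOSED";
  InheritFrom(file, &color_channel);
  return color_channel;
}
```

In this sketch, `ResolveColorChannel()` yields `CLOSED` for the enum type (the child's own setting) and `EXPANDED` for the repeated field encoding (inherited from the file).
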
## Target Attributes

While inheritance can be useful for minimizing changes or pushing defaults
broadly, it can be overused in ways that would make simple refactoring of
`.proto` files harder. Additionally, not all features are meaningful on all
entities (for example `features.enum = OPEN` is meaningless on a field).

To avoid these issues, we will introduce "target" attributes on features
(similar in concept to the "target" attribute on Java annotations).

```
enum FeatureTargetType {
  FILE = 0;
  MESSAGE = 1;
  ENUM = 2;
  FIELD = 3;
  ...
};
```

These will restrict the set of entities to which a feature may be attached.

```
message Features {
  ...

  enum EnumType {
    OPEN = 0;
    CLOSED = 1;
  }
  optional EnumType enum = 2 [
    target = ENUM
  ];
}
```

## Retention

To reduce the size of descriptors in protobuf runtimes, features will be
permitted to specify retention rules (again similar in concept to "retention"
attributes on Java annotations).

```
enum FeatureRetention {
  SOURCE = 0;
  RUNTIME = 1;
}
```

## Specification of an Edition

An edition is, effectively, an instance of the `Features` proto which forms the
base for performing inheritance using `MergeFrom`. This allows `protoc` and
specific language generators to leverage existing formats (like text-format) for
specifying the value of features at a given edition.

Field defaults might naively seem like the right approach, but they do not quite
work, because the default is edition-dependent. Instead, we propose adding the
following to the protoc-provided `features.proto`:

```
message Features {
  // ...
  message EditionDefault {
    optional string edition = 1;
    optional string default = 2;  // Textproto value.
  }

  extend FieldOptions {
    // Ideally this is a map, but map extensions are not permitted...
    repeated EditionDefault edition_defaults = 9001;
  }
}
```

To build the edition defaults for a particular edition `current` in the context
of a particular file `foo.proto`, we execute the following algorithm:

1.  Construct a new `Features feats;`.
2.  For each field in `Features`, take the value of the
    `Features.edition_defaults` option (call it `defaults`), and sort it by the
    value of `edition` (per the total order for edition names; see
    [Life of an Edition](life-of-an-edition.md)).
3.  Binary-search `defaults` for the latest edition that is earlier than or
    equal to `current`.
    1.  If the field is of singular, scalar type, use that value as the value of
        the field in `feats`.
    2.  Otherwise, the value of the field in `feats` is given by merging all of
        the values whose edition is earlier than or equal to `current`,
        starting from the oldest edition.
4.  For the purposes of this algorithm, `Features`'s fields all behave as if
    they were `required`; failure to find a default explicitly via the editions
    default search mechanism should result in a compilation error, because it
    means the file's edition is too old.
5.  For each extension of `Features` that is visible from `foo.proto` via
    imports, perform the same algorithm as above to construct the editions
    default for that extension message, and add it to `feats`.

This algorithm has the following properties:

*   Language-scoped features are discovered via imports, which is how they must
    be made available for use in a file in the first place.
*   Every value is set explicitly, so we correctly reject too-old files.
*   Files from "the future" will not be rejected out of hand by the algorithm,
    allowing us to provide a flag like `--allow-experimental-editions` for ease
    of allowing backends to implement a new edition.

## Edition Zero Features

Putting the parts together, we can offer a potential `Features` message for
edition zero: [Edition Zero Features](edition-zero-features.md).

```
message Features {
  enum FieldPresence {
    EXPLICIT = 0;
    IMPLICIT = 1;
    LEGACY_REQUIRED = 2;
  }
  optional FieldPresence field_presence = 1 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "EXPLICIT"
    }
  ];

  enum EnumType {
    OPEN = 0;
    CLOSED = 1;
  }
  optional EnumType enum = 2 [
    retention = RUNTIME,
    target = ENUM,
    (edition_defaults) = {
      edition: "2023", default: "OPEN"
    }
  ];

  enum RepeatedFieldEncoding {
    PACKED = 0;
    EXPANDED = 1;
  }
  optional RepeatedFieldEncoding repeated_field_encoding = 3 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "PACKED"
    }
  ];

  enum StringFieldValidation {
    REQUIRED = 0;
    HINT = 1;
    SKIP = 2;
  }
  optional StringFieldValidation string_field_validation = 4 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "REQUIRED"
    }
  ];

  enum MessageEncoding {
    LENGTH_PREFIXED = 0;
    DELIMITED = 1;
  }
  optional MessageEncoding message_encoding = 5 [
    retention = RUNTIME,
    target = FIELD,
    (edition_defaults) = {
      edition: "2023", default: "LENGTH_PREFIXED"
    }
  ];

  extensions 1000;  // for features_cpp.proto
  extensions 1001;  // for features_java.proto
}
```
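
The default lookup described under "Specification of an Edition" can be sketched for a single singular, scalar feature as follows. This is only an illustration under stated assumptions: `EditionDefault` here is a plain struct standing in for the proto message, `ResolveDefault` and `EditionLess` are hypothetical names, and plain string comparison stands in for the total order on edition names, which is defined in [Life of an Edition](life-of-an-edition.md).

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the `EditionDefault` message.
struct EditionDefault {
  std::string edition;
  std::string value;  // Textproto value, kept as a raw string in this sketch.
};

// Assumption: lexicographic string comparison approximates the total order on
// edition names. The real order is specified elsewhere.
bool EditionLess(const std::string& a, const std::string& b) { return a < b; }

// Returns the default in effect at `current` for a singular scalar feature:
// the entry with the latest edition that is earlier than or equal to
// `current`. Returns nullopt when no such entry exists, which the caller
// should report as a compilation error (the file's edition is too old).
std::optional<std::string> ResolveDefault(std::vector<EditionDefault> defaults,
                                          const std::string& current) {
  // Step 2: sort the defaults by edition.
  std::sort(defaults.begin(), defaults.end(),
            [](const EditionDefault& a, const EditionDefault& b) {
              return EditionLess(a.edition, b.edition);
            });
  // Step 3: binary-search for the first entry strictly later than `current`;
  // the entry just before it is the latest one at or before `current`.
  auto it = std::upper_bound(
      defaults.begin(), defaults.end(), current,
      [](const std::string& cur, const EditionDefault& d) {
        return EditionLess(cur, d.edition);
      });
  if (it == defaults.begin()) return std::nullopt;
  return std::prev(it)->value;
}
```

As an example with made-up data, given the entries `{"2023", "EXPLICIT"}` and a hypothetical later entry `{"2024", "IMPLICIT"}`, `ResolveDefault` returns `EXPLICIT` for edition `2023`, `IMPLICIT` for a "future" edition `2025`, and no value for `2022`, which corresponds to the too-old-file error case. Message-typed features and extensions of `Features` would need the merge step instead and are not covered here.
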