Protocol Buffers - Google's data interchange format (grpc依赖) https://developers.google.com/protocol-buffers/
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 
 
 

4.1 KiB

Edition Evolution

Author: @mcy

Approved: 2022-07-06

Overview

Protobuf Editions give us a mechanism for pulling protobuf files into the future by upgrading their edition. Features are flags associated with syntax items of a .proto file; they can be either set explicitly, or they can be implied by the edition. Features can either correspond to behavior in protoc's frontend (e.g. proto:closed_enums), or to a specific backend (cpp:string_view).

This document seeks to answer:

  • When do we create an edition?
  • How do backends inform protoc of their defaults?

Proposal

Total Ordering of Editions

NOTE: This topic is largely superseded by Edition Naming.

The FileDescriptorProto.edition field is a string, so that we can avoid nasty surprises around needing to mint multiple editions per year: even if we mint edition = "2022";, we can mint edition = "2022b"; in a pinch.

However, we might not own some third-party backends, and they might be unaware of when we decide to mint editions, and might want to mint editions on their own. Suppose that I maintain the Haskell protobuf backend, and I decide to add the haskell:more_monads feature. How do I get this into an edition?

We propose defining a total order on editions. This means that a backend can pick the default not by looking at the edition, but by asking "is this proto older than this edition, where I introduced this default?"

The total order is thus: the edition string is split on '.'. Each component is then ordered by a.len < b.len && a < b. This ensures that 9 < 10, for example.

By convention, we will make the edition be either the year, like 2022, or the year followed by a revision, like 2022.1. Thus, we have the following total ordering on editions:

2022 < 2022.0 < 2022.1 < ... < 2022.9 < 2022.10 < ... < 2023 < ... < 2024 < ...

This means that backends (even though we don't particularly recommend it) can change defaults as often as they like. Thus, if I decide that haskell:more_monads becomes true in 2023, I simply ask file.EditionIsLaterThan("2023"). If it becomes false in 2023.1, a future backend can ask file.EditionIsBetween("2023", "2023.1").

Creating an Edition

In a sense, every edition already exists; it's just a matter of defining features on it.

If the feature is a proto: feature, protoc intrinsically knows it, and it is implemented in the frontend.

If the feature is a backend feature, the backend must be able to produce some kind of proto like message Edition { repeated Feature defaults = 1; } that describes what a specific edition must look like, based on less-than/is-between predicates like those above. That information can be used by protoc to display the full set of features it knows about given its backends. (The backend must, of course, use this information to make relevant codegen decisions.)

What about Editions "From the Future"?

Suppose that in version v5.0 of my Haskell backend I introduced haskell:more_monads, and this has a runtime component to it; that is, the feature must be present in the descriptor to be able to handle parsing the message correctly.

However, suppose I have an older service running v4.2. It is compiled with a proto that was already edition 2023 at build time (alternatively, it dynamically loads a proto). My v5.0 client sends it an incompatible message from "the future". Because a parse failure would be an unacceptable service degradation, we have a couple of options:

  • Editions cannot introduce a feature that requires readers to accept new encodings.
    • Similarly, editions cannot add restrictions that constrain past parsers.
  • Editions may introduce such features, but they must somehow fit into some kind of build horizon.

The former is a reasonable-sounding but ultimately unacceptable position, since it means we cannot use editions if we wanted to, say, make it so that message fields are encoded as groups rather than length-prefixed chunks. The alternative is to define some kind of build horizon.