This is a very narrow edge case: touching a packed extension via generated APIs first and then via reflection will trigger a DCHECK. Otherwise, reflective APIs will work but will not use packed encoding for the extension. This was likely a pre-existing bug dating back to proto3, where it would only be visible on custom options (the only extensions allowed in proto3).
To help qualify this and uncover similar issues, unittest.proto was migrated to editions. This turned up some other minor issues in DebugString and Python.
PiperOrigin-RevId: 675785611
* Fix cord handling in DynamicMessage and oneofs.
This fixes a memory corruption vulnerability for anyone using cord with dynamically built descriptor pools.
* Silence expected ubsan failures from absl::Cord
---------
Co-authored-by: Mike Kruskal <mkruskal@google.com>
Pure Python and upb do not support it and filter out the test. This API does
not exist in any other language (except protobuf C++).
GetDebugString() for cpp extension will be removed in Jan 2025
PiperOrigin-RevId: 662640110
-Raise warnings for deprecated google/protobuf/service.py APIs.
service.py APIs have been marked as deprecated since 2010. These APIs will be
removed in Jan 2025
PiperOrigin-RevId: 653280370
Python dict is now able to be assigned (by create and copy, not reference) and compared with the Protobuf Struct field.
Python list is now able to be assigned (by create and copy, not reference) and compared with the Protobuf ListValue field.
example usage:
dictionary = {'key1': 5.0, 'key2': {'subkey': 11.0, 'k': False},}
list_value = [6, 'seven', True, False, None, dictionary]
msg = more_messages_pb2.WKTMessage(
optional_struct=dictionary, optional_list_value=list_value
)
self.assertEqual(msg.optional_struct, dictionary)
self.assertEqual(msg.optional_list_value, list_value)
PiperOrigin-RevId: 646099987
This was previously fixed in C++ (https://github.com/protocolbuffers/protobuf/issues/16549), but not ported to other languages. Delimited field encoding can be inherited by fields where it's invalid, such as non-messages and maps. In these cases, the inherited encoding should be ignored and length-prefixed encoding should be used.
PiperOrigin-RevId: 642792988
There is a special case where message factories can be confused: if a module
written in C++ with pybind11 links against a self-recursive message, and that
message is part of another message loaded from Python, then the confusion
will happen.
Example:
# This one is also linked into the C++ module.
message SelfRecursive {
optional SelfRecursive self_recursive = 1;
}
# This one is used only in Python and not linked.
message OnlyUsedInPython {
optional SelfRecursive self_recursive = 2;
}
The caching through message_factory::RegisterMessageClass then happens on one
instance of the factory, but traversal with the lookup in another.
This occurs in the pure Python and upb implementations that have their own
default descriptor pools (and thus message factory).
Fix this by using the already passed message factory to register the
message class in the cache.
A test accounts for this case to avoid regressions.
PiperOrigin-RevId: 642551744
Timestamp and Duration now have more support for datetime and timedelta:
- Allows assigning a Python datetime to a protobuf Timestamp field, in addition to the current FromDatetime/ToDatetime (Note: exceptions will be thrown where the supported ranges differ)
- Allows assigning a Python timedelta to a protobuf Duration field, in addition to the current FromTimedelta/ToTimedelta
- Arithmetic between Timestamp, Duration, datetime and timedelta is also supported.
example usage:
from datetime import datetime, timedelta
from event_pb2 import Event
e = Event(start_time=datetime(year=2112, month=2, day=3),
          duration=timedelta(hours=10))
end_time = e.start_time + timedelta(hours=4)
e.duration = end_time - e.start_time
PiperOrigin-RevId: 640639168
Added the following methods:
serialize_length_prefixed(message: Message, output: io.BytesIO) -> None
parse_length_prefixed(message_class: Type[Message], input_bytes: io.BytesIO) -> Message
The output argument of serialize_length_prefixed should be a BytesIO or a custom buffered IO object that the data is written to. The output stream must be buffered, e.g. using
https://docs.python.org/3/library/io.html#buffered-streams.
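The framing idea behind these two methods can be illustrated without protobuf itself. The sketch below assumes the length prefix is a base-128 varint, as in protobuf's other delimited stream formats, and works on raw bytes. The helpers write_frame and read_frame are hypothetical names for illustration, not the library's implementation.

```python
import io
from typing import BinaryIO, List, Optional


def write_varint(value: int, out: BinaryIO) -> None:
    """Write a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("length must be non-negative")
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))  # high bit set: more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read a varint; return None on a clean end of stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        result |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return result
        shift += 7


def write_frame(payload: bytes, out: BinaryIO) -> None:
    """Hypothetical helper: write the payload size, then the payload."""
    write_varint(len(payload), out)
    out.write(payload)


def read_frame(stream: BinaryIO) -> Optional[bytes]:
    """Hypothetical helper: read one size-prefixed payload, or None at EOF."""
    size = read_varint(stream)
    if size is None:
        return None
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("truncated payload")
    return payload


# Write several payloads to one buffered stream, then read them back.
buffer = io.BytesIO()
for payload in (b"first", b"", b"x" * 300):
    write_frame(payload, buffer)

buffer.seek(0)
frames: List[bytes] = []
while (frame := read_frame(buffer)) is not None:
    frames.append(frame)
```

Because each payload carries its own size, several messages can share one stream and be read back one at a time, which is why the output needs to be a buffered stream rather than a single bytes object.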
PiperOrigin-RevId: 638375900
`DynamicCastMessage`/`DownCastMessage`.
The target does not necessarily need to be a generated type. For example, it
also supports `Message` itself. This makes the API friendlier to generic code and less verbose.
Replace all uses of dynamic_cast/down_cast/**ToGenerated with the new names.
Also, remove checks for RTTI in tests where we only need the casts to work. They don't need RTTI anymore.
PiperOrigin-RevId: 638278948
The second assert in _upb_EncodeRoundTripFloat is raised if val is NaN. This fix just returns the output of the first snprintf.
I am not sure how changes to this repo are made so feel free to ignore this CL.
To test this, you could
1. Define a proto with a float field
message Test {
float val = 1;
}
2. In a python script, import the library and then set the val to nan and try to print it.
proto = Test(val=float('nan'))
print(proto)
This will cause a coredump due to assertion error:
assert.h assertion failed at third_party/upb/upb/lex/round_trip.c:46 in void _upb_EncodeRoundTripFloat(float, char *, size_t): strtof(buf, NULL) == val
Added the corresponding change to double too
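The reason the assertion fires is that NaN never compares equal to itself, so a "parse it back and compare" check cannot succeed. Below is a minimal Python sketch of the same round-trip idea for doubles. The function name and the precisions 15 and 17 are assumptions for illustration, not the upb code.

```python
import math


def encode_round_trip_double(val: float) -> str:
    """Format val with the shortest of two precisions that parses back exactly."""
    text = "%.15g" % val
    if math.isnan(val):
        # NaN != NaN, so the equality checks below could never pass.
        # Return the first formatting result, mirroring the fix described above.
        return text
    if float(text) == val:
        return text
    # Fall back to a precision that is enough to round-trip any double.
    text = "%.17g" % val
    assert float(text) == val
    return text


short_form = encode_round_trip_double(0.1)
long_form = encode_round_trip_double(0.1 + 0.2)
nan_form = encode_round_trip_double(float("nan"))
```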
PiperOrigin-RevId: 637127851
- add google.protobuf.proto module
- wrap generated SerializeToString and ParseFromString to the new module:
def serialize(message: Message, deterministic: bool=None) -> bytes:
"""Return the serialized proto."""
def parse(message_class: typing.Type[Message], payload: bytes) -> Message:
"""Given a serialized proto, deserialize it into a Message."""
PiperOrigin-RevId: 632223409
(Second attempt. The first attempt missed ListValue)
The “in” operator will be consistent with HasField but slightly different from Proto Plus.
The detailed behavior of the “in” operator in Nextgen:
* For WKT Struct (to be consistent with old Struct behavior):
-Raise TypeError if the key is not a string
-Check if the key is in struct.fields
* For WKT ListValue (to be consistent with old behavior):
-Check if the key is in list_value.values
* For other messages:
-Raise ValueError if the key is not a string
-Raise ValueError if the string is not a field
-For oneof: check whether any field under the oneof is set
-For has-presence fields: check if set
-For non-has-presence fields (including repeated fields): raise ValueError
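The rules for ordinary (non-WKT) messages can be modelled with a small toy class. This is only a sketch under stated assumptions: FieldModel and MessageModel are hypothetical stand-ins for descriptors and message state, not protobuf types, and the error messages are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldModel:
    """Hypothetical stand-in for a field descriptor plus its current state."""
    has_presence: bool
    is_set: bool = False


@dataclass
class MessageModel:
    """Hypothetical stand-in for a message; models only the "in" rules."""
    fields: Dict[str, FieldModel] = field(default_factory=dict)
    oneofs: Dict[str, List[str]] = field(default_factory=dict)

    def __contains__(self, key: object) -> bool:
        if not isinstance(key, str):
            raise ValueError("key must be a string")
        if key in self.oneofs:
            # A oneof is "in" the message if any of its members is set.
            return any(self.fields[name].is_set for name in self.oneofs[key])
        if key not in self.fields:
            raise ValueError(f"{key!r} is not a field")
        spec = self.fields[key]
        if not spec.has_presence:
            # Covers repeated fields and implicit-presence scalars.
            raise ValueError(f"{key!r} does not track presence")
        return spec.is_set


msg = MessageModel(
    fields={
        "name": FieldModel(has_presence=True, is_set=True),
        "tags": FieldModel(has_presence=False),
        "a": FieldModel(has_presence=True),
        "b": FieldModel(has_presence=True, is_set=True),
    },
    oneofs={"choice": ["a", "b"]},
)
name_present = "name" in msg
choice_present = "choice" in msg
a_present = "a" in msg
```

Raising for fields without presence, rather than returning False, keeps "in" aligned with HasField, which cannot answer the question for such fields either.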
PiperOrigin-RevId: 631143378
The new fields fixed_features and overridable_features can be simply merged to recover the old aggregate defaults. By splitting them though, plugins and runtimes get some extra information about lifetimes for enforcement.
PiperOrigin-RevId: 625527117
Features need to be validated within the pool being built, since the generated pool only contains extensions linked into the binary (e.g. protoc or a runtime building dynamic protos). The generated pool may be missing extensions used in this proto or it may have version skew. Moving to the build pool requires reflective parsing, which in general can't be done from inside the pool's database lock. This required some refactoring to add a post-build validation phase outside of the lock.
For now, the feature support spec is optional and the checks are only applied when it's present. Follow-up changes will add these specs to our existing features and then require them for all FeatureSet extensions.
PiperOrigin-RevId: 623630219
This updates all our text parsers and serializers to better handle tag-delimited fields under editions. Under proto2, groups were the only tag-delimited fields possible, and the group name (i.e. the message type) was guaranteed to be unique. Text-format and various generators used this instead of the synthetic field name (lower-cased group name) to represent these fields.
Under editions, we've removed group syntax and allowed any message field to be tag-delimited. This breaks those cases when adding new tag-delimited fields where the message type might not be unique or correspond to the field name. Code generators have already been fixed to treat "group-like" fields using the old behavior, and treat new fields like any other sub-message.
This change addresses the text-format issue. Text parsers will accept *either* the type or field name for "group-like" fields, and only the field name for every other message field. Text serializers will continue to emit the message name for "group-like" fields, and use the field name for everything else.
This creates some awkward capitalization behavior for fields that happen to *look* like proto2 groups, but it won't lead to any conflicts or invalid encodings. A feature will likely be added to edition 2024 to allow for migration off this legacy behavior.
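The naming rule for text format can be sketched as two small functions. The helper names are hypothetical, and looks_like_group uses a simplified test (a delimited field whose name is the lower-cased message type name); the real check may involve further conditions.

```python
from typing import Set


def looks_like_group(field_name: str, type_name: str, delimited: bool) -> bool:
    """Simplified, assumed test for a "group-like" field."""
    return delimited and field_name == type_name.lower()


def accepted_text_names(field_name: str, type_name: str, group_like: bool) -> Set[str]:
    """Names a text parser would accept for a message field."""
    if group_like:
        return {field_name, type_name}  # either spelling is accepted
    return {field_name}


def emitted_text_name(field_name: str, type_name: str, group_like: bool) -> str:
    """Name a text serializer would emit for a message field."""
    return type_name if group_like else field_name


# A field that looks like a proto2 group keeps the legacy capitalized name.
legacy = looks_like_group("mygroup", "MyGroup", delimited=True)
legacy_names = accepted_text_names("mygroup", "MyGroup", legacy)
legacy_emitted = emitted_text_name("mygroup", "MyGroup", legacy)

# A new tag-delimited field is treated like any other sub-message.
modern = looks_like_group("payload", "MyGroup", delimited=True)
modern_names = accepted_text_names("payload", "MyGroup", modern)
modern_emitted = emitted_text_name("payload", "MyGroup", modern)
```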
PiperOrigin-RevId: 622260327
In the _RegularMessageToJsonObject method, there is no str type conversion for non-bytes type values in the repeated content. This causes an exception in the MessageToJson method, as JSON data does not allow byte-array property values.
BUG EG (contains proto and python code):
------------common.proto------------
message SafetyInfo {
repeated LoginDevice deviceList = 1;
}
message LoginDevice {
optional string uuid = 1 [default = ""];
optional string deviceName = 2 [default = ""];
optional string deviceType = 3 [default = ""];
required uint32 lastTime = 4;
}
------------test code(python)------------
from google.protobuf.json_format import MessageToJson
import common_pb2 # generated from common.proto
pb = common_pb2.SafetyInfo()
pb_hex = "0a4e0a2039323833663530356533363332396338356638623866343832613561323061651211636d4a38426c307a4d20446576696365731a0f6950686f6e6520694f532031332e3720dc85a9b00628010a410a206233356161326632366236343966313536393466663761336263303434323163120950432dcdacd0cbb4ef1a0a57696e646f777320313020d5d2c6e705280010001800200028003001"
pb.ParseFromString(bytes.fromhex(pb_hex))
print(pb)
print(MessageToJson(pb))
Closes #16382
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/16382 from zhangzibao:main 3af2569265
PiperOrigin-RevId: 622156823
The “in” operator will be consistent with HasField but slightly different from Proto Plus.
The detailed behavior of the “in” operator in Nextgen for Struct (to be consistent with old Struct behavior):
-Raise TypeError if the key is not a string
-Check if the key is in struct.fields
The detailed behavior of the “in” operator in Nextgen (for other messages):
-Raise ValueError if the key is not a string
-Raise ValueError if the string is not a field
-For oneof: check whether any field under the oneof is set
-For has-presence fields: check if set
-For non-has-presence fields (including repeated fields): raise ValueError
PiperOrigin-RevId: 621240977