* Fix cord handling in DynamicMessage and oneofs.
This fixes a memory corruption vulnerability for anyone using cord with dynamically built descriptor pools.
* Update staleness. Run using Bazel 6.3.2 docker image
* Silence expected ubsan failures from absl::Cord
---------
Co-authored-by: Mike Kruskal <mkruskal@google.com>
This should address the issue in #14600 by avoiding a conflict on with the
`build` directory created by setup.py on a case-insensitive filesystem.
Closes#14600.
PiperOrigin-RevId: 579859556
Previously we were allocating memory on the message's arena every time we performed a `map[key]` or `map.get(key)` operation. This is unnecessary, as the key's data is only needed ephemerally, for the duration of the lookup, and we can therefore alias the Python object's string data instead of copying it.
This required fixing a bug in the convert.c operation. Previously in the `arena==NULL` case, if the user passes a bytes object instead of a unicode string, the code would return a pointer to a temporary Python object that had already been freed, leading to use-after-free. I fixed this by referencing the bytes object's data directly, and using utf8_range to verify the UTF-8.
Fixes: https://github.com/protocolbuffers/protobuf/issues/14571
PiperOrigin-RevId: 578563555
This PR is fixing small inconsistencies in the paths where the protobuf compiler is searched for in the protobuf python setup.py.
Closes#14318
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/14318 from ottmar-zittlau:main 598d13406f
PiperOrigin-RevId: 573315569
Python 3.12 has removed the `distutils` module, so we need to stop relying on
it. Most of the parts we were using had a straightforward replacement in
`setuptools` or elsewhere in the Python standard library. I couldn't find a
good replacement for `distutils.command.clean`, though. We were only using it
to enable `python setup.py clean`, so I just removed that functionality since
directly invoking setup.py is deprecated anyway.
While I was looking at this I realized that our python/release.sh script is
unused, so I also removed that.
PiperOrigin-RevId: 571035275
The attempt at a more optimized approach doesn't round-trip all values of `datetime` on all platforms because `datetime.fromtimestamp(tzinfo)` is limited by the range of values accepted by `time.gmtime`, which can be substantially narrower than `datetime.min` to `datetime.max`. (The documentation notes that either `OverflowError` or `OSError` can be raised in that case, and that often this is limited to 1970 through 2038, versus 1 to 9999. See also https://github.com/python/cpython/issues/110042, the use of `gmtime` here seems unnecessary when the tzinfo supports the entire range.) So, supporting that whole range would require need fallback logic that uses this general approach anyways, which then requires a redundant set of tests for error behavior that amounts to a reimplementation of the whole function.
In addition, `datetime.fromtimestamp` doesn't support the full precision of `datetime` (https://github.com/python/cpython/issues/109849), which required adding additional code and an additional assumption (that neither tz offsets were sub-second nor tz changes mid-second).
Added test-cases for `datetime.min` in addition to the ones for `datetime.max`. Adjusted the examples and variable names slightly.
PiperOrigin-RevId: 569259168
The previous code did not work because timedelta arithmetic does not do timezone conversions. The result of adding a timedelta to a datetime has the same fixed UTC offset as the original datetime. This resulted in the correct timezone not being applied by `Timestamp.ToDatetime(tz)` whenever the UTC offset for the timezone at the represented moment was not the same as the UTC offset for that timezone at the epoch.
Instead, construct the datetime directly from the seconds part of the timestamp. It would be nice to include the nanoseconds as well (truncated to datetime's millisecond precision, but without unnecessary loss of precision). However, that doesn't work, since there isn't a way to construct a datetime from an integer-milliseconds timestamp, just float-seconds, which doesn't allow some datetime values to round-trip correctly. (This does assume that `tzinfo.utcoffset(dt).microseconds` is always zero, and that the value returned by `tzinfo.utcoffset(dt)` doesn't change mid-second. Neither of these is necessarily the case (see https://github.com/python/cpython/issues/49538), though I'd hope they hold in practice.)
This does take some care to still handle non-standard Timestamps where nanos is more than 1,000,000,000 (i.e. more than a second), since previously ToDatetime handled that, as do the other To* conversion methods.
The bug doesn't manifest for UTC (or any fixed-offset timezone), so it can be worked around in a way that will be correct before and after the fix by replacing `ts.ToDatetime(tz)` with `ts.ToDatetime(datetime.timezone.utc).astimezone(tz)`.
PiperOrigin-RevId: 569012931
This change moves almost everything in the `upb/` directory up one level, so
that for example `upb/upb/generated_code_support.h` becomes just
`upb/generated_code_support.h`. The only exceptions I made to this were that I
left `upb/cmake` and `upb/BUILD` where they are, mostly because that avoids
conflict with other files and the current locations seem reasonable for now.
The `python/` directory is a little bit of a challenge because we had to merge
the existing directory there with `upb/python/`. I made `upb/python/BUILD` into
the BUILD file for the merged directory, and it effectively loads the contents
of the other BUILD file via `python/build_targets.bzl`, but I plan to clean
this up soon.
PiperOrigin-RevId: 568651768
This restores the Python wheel CI runs from the old upb repo with only minor
changes. I had to update a path in one of the `py_wheel` rules and also make a
slight tweak to ensure that the `descriptor.upb_minitable.{h,c}` files make it
into the source wheels. The change in text_format_test.py is not strictly
necessary but is a small simplification I made while I was trying to debug an
issue with CRLF newlines.
I had to update test_util.py to use `importlib` to access the golden files from
the installed `protobuftests` package. I suspect the previous incarnation of
thse test runs was somehow reading the goldens from the repo checkout, but I
think the intention is to read them from `protobuftests` instead. This was a
bit tricky to get working because Python versions before 3.9 do not support
`importlib.resources.files()`. I set up the code to fall back on
`importlib.resources.open_binary()` in that case, but that function does not
support subdirectories, so this required putting an `__init__.py` file inside
the `testdata` directory to make sure it is treated as a Python package.
PiperOrigin-RevId: 567366695
I am getting ready to move almost everything under the upb/ directory up one
level to integrate upb better into its new location in the protobuf repo. This
change makes a few tweaks to prepare for that:
- Delete upb's LICENSE and CONTRIBUTING.md files since we already have similar
files at the top level.
- Rename `//python:python_version` so that it won't conflict later with
`//upb/python:python_version`.
- Move the contents of python/BUILD.bazel out to a Bazel macro to facilitate
merging that BUILD.bazel file with upb/python/BUILD.
PiperOrigin-RevId: 567119840
A couple weeks ago we moved upb into the protobuf Git repo, and this change
continues the merger of the two repos by making them into a single Bazel repo.
This was mostly a matter of deleting upb's WORKSPACE file and fixing up a bunch
of references to reflect the new structure.
Most of the changes are pretty mechanical, but one thing that needed more
invasive changes was the Python script for generating CMakeLists.txt,
make_cmakelists.py. The WORKSPACE file it relied on no longer exists with this
change, so I updated it to hardcode the information it needed from that file.
PiperOrigin-RevId: 564810016
The Python comparison protocol requires that if an object doesn't know how to
compare itself to an object of a different type, it returns NotImplemented
rather than False. The interpreter will then try performing the comparison using
the other operand. This translates, for protos, to:
If a proto message doesn't know how to compare itself to an object of
non-message type, it returns NotImplemented. This way, the interpreter will then
try performing the comparison using the comparison methods of the other object,
which may know how to compare itself to a message. If not, then Python will
return the combined result (e.g., if both objects don't know how to perform
__eq__, then the equality operator `==` return false).
This change allows one to compare a proto with custom matchers such as mock.ANY
that the message doesn't know how to compare to, regardless of whether
mock.ANY is on the right-hand side or left-hand side of the equality (prior to
this change, it only worked with mock.ANY on the left-hand side).
Fixes https://github.com/protocolbuffers/protobuf/issues/9173
PiperOrigin-RevId: 561728156
GetOptions on fields (which parse the _serialized_options) will be called for the first time of parse or serialize instead of Build time.
Note: GetOptions on messages are still called in Build time because of message_set_wire_format. If message options are needed in descriptor.proto, a parse error will be raised in GetOptions(). We can check the file to not invoke GetOptions() for descriptor.proto as long as message_set_wire_format not needed in descriptor.proto.
Other options except message options do not invoke GetOptions() in Build time
PiperOrigin-RevId: 560741182