Users should use the add-on unknown_fields.py support.
Old usage example:
unknown_field_set = msg.UnknownFields()
New usage should be:
from google.protobuf import unknown_fields
unknown_field_set = unknown_fields.UnknownFieldSet(msg)
PiperOrigin-RevId: 589969095
This has been replaced by edition and feature getters. Any code depending on syntax will likely be broken by editions files, and should be migrated to the finer-grained feature helpers.
PiperOrigin-RevId: 589866745
As a prerequisite of enabling go/protobuf-weak-speed-messages downcasts using built-in cast operations need to be changed to cast functions recommended by protobuf team (go/protobuf-downcast-recommendation)
**More information**: go/protobuf-stripping-lsc
#codehealth
PiperOrigin-RevId: 582977861
This also fixes a few minor bugs in the editions implementation that were caught in python/conformance tests, and adds a new SetFeatureSetDefaults API to the def pool for consistency with C++ and other python implementations.
PiperOrigin-RevId: 581384108
This was already fully implemented in C++, but we need to expose features methods to the Python runtime to fully enable it. This also enables conformance and unit-testing for C++.
PiperOrigin-RevId: 581364355
This makes third_party/utf8_range no longer a Git subtree, but instead the
permanent location and source of truth for utf8_range. It is also now
incorporated into the @com_google_protobuf Bazel repo. Utf8_range still has its
own separate CMake build for now, though.
PiperOrigin-RevId: 580682733
There's a test run in test_python.yml that is non-trivial to get working with
Python 3.12 due to some refactoring of our Docker images that would be needed.
But this change updates everything else to add coverage for Python 3.12.
The main changes necessary to get the builds working were to upgrade some Pip
packages via requirements.txt, including in a patch to `rules_fuzzing` that I
plan to upstream soon. I also had to take an explicit dependency on
`setuptools`.
I removed tox.ini, since it was outdated and we have not been actively
maintaining it.
PiperOrigin-RevId: 580548224
This change only covers pure python, and follow-up changes will handle C++/upb variants and actually enable editions support. The C++ one works (as evident from the conformance tests), but needs some APIs added to allow for testing.
PiperOrigin-RevId: 580304039
This should address the issue in #14600 by avoiding a conflict on with the
`build` directory created by setup.py on a case-insensitive filesystem.
Closes#14600.
PiperOrigin-RevId: 579859556
Add PyObject_GC_UnTrack() in deallocation functions for Python types that
have PyTPFLAGS_HAVE_GC set, either explicitly or by inheriting from a type
with GC set. Not untracking before clearing instance data introduces
potential race conditions (if GC happens to run between the partial clearing
and the actual deallocation) and produces a warning under Python 3.11.
(The warning then triggered an assertion failure, which only showed up when
building in Py_DEBUG mode; this therefor also fixes that assertion failure.)
PiperOrigin-RevId: 579827001
Previously we were allocating memory on the message's arena every time we performed a `map[key]` or `map.get(key)` operation. This is unnecessary, as the key's data is only needed ephemerally, for the duration of the lookup, and we can therefore alias the Python object's string data instead of copying it.
This required fixing a bug in the convert.c operation. Previously in the `arena==NULL` case, if the user passes a bytes object instead of a unicode string, the code would return a pointer to a temporary Python object that had already been freed, leading to use-after-free. I fixed this by referencing the bytes object's data directly, and using utf8_range to verify the UTF-8.
Fixes: https://github.com/protocolbuffers/protobuf/issues/14571
PiperOrigin-RevId: 578563555
Previously we were allocating memory on the message's arena every time we performed a `map[key]` or `map.get(key)` operation. This is unnecessary, as the key's data is only needed ephemerally, for the duration of the lookup, and we can therefore alias the Python object's string data instead of copying it.
This required fixing a bug in the convert.c operation. Previously in the `arena==NULL` case, if the user passes a bytes object instead of a unicode string, the code would return a pointer to a temporary Python object that had already been freed, leading to use-after-free. I fixed this by referencing the bytes object's data directly, and using utf8_range to verify the UTF-8.
Fixes: https://github.com/protocolbuffers/protobuf/issues/14571
PiperOrigin-RevId: 578563555
Because pure python builds all descriptors at runtime via reflection, it's unable to parse options during the build of descriptor.proto (i.e. before we've built the options schemas). We always lazily parse these options to avoid this, but that still means options can't be *used* during this build. Since the current build process makes heavy use of features (which previously just relied on syntax), this poses a problem for editions.
To get around this, we just embed the resolved features directly into the gencode for this one file. This will allow us to skip feature resolution for these descriptors and still consider features in their build.
PiperOrigin-RevId: 577495949
This PR is fixing small inconsistencies in the paths where the protobuf compiler is searched for in the protobuf python setup.py.
Closes#14318
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/14318 from ottmar-zittlau:main 598d13406f
PiperOrigin-RevId: 573315569
Python 3.12 has removed the `distutils` module, so we need to stop relying on
it. Most of the parts we were using had a straightforward replacement in
`setuptools` or elsewhere in the Python standard library. I couldn't find a
good replacement for `distutils.command.clean`, though. We were only using it
to enable `python setup.py clean`, so I just removed that functionality since
directly invoking setup.py is deprecated anyway.
While I was looking at this I realized that our python/release.sh script is
unused, so I also removed that.
PiperOrigin-RevId: 571035275
The attempt at a more optimized approach doesn't round-trip all values of `datetime` on all platforms because `datetime.fromtimestamp(tzinfo)` is limited by the range of values accepted by `time.gmtime`, which can be substantially narrower than `datetime.min` to `datetime.max`. (The documentation notes that either `OverflowError` or `OSError` can be raised in that case, and that often this is limited to 1970 through 2038, versus 1 to 9999. See also https://github.com/python/cpython/issues/110042, the use of `gmtime` here seems unnecessary when the tzinfo supports the entire range.) So, supporting that whole range would require need fallback logic that uses this general approach anyways, which then requires a redundant set of tests for error behavior that amounts to a reimplementation of the whole function.
In addition, `datetime.fromtimestamp` doesn't support the full precision of `datetime` (https://github.com/python/cpython/issues/109849), which required adding additional code and an additional assumption (that neither tz offsets were sub-second nor tz changes mid-second).
Added test-cases for `datetime.min` in addition to the ones for `datetime.max`. Adjusted the examples and variable names slightly.
PiperOrigin-RevId: 569259168
The previous code did not work because timedelta arithmetic does not do timezone conversions. The result of adding a timedelta to a datetime has the same fixed UTC offset as the original datetime. This resulted in the correct timezone not being applied by `Timestamp.ToDatetime(tz)` whenever the UTC offset for the timezone at the represented moment was not the same as the UTC offset for that timezone at the epoch.
Instead, construct the datetime directly from the seconds part of the timestamp. It would be nice to include the nanoseconds as well (truncated to datetime's millisecond precision, but without unnecessary loss of precision). However, that doesn't work, since there isn't a way to construct a datetime from an integer-milliseconds timestamp, just float-seconds, which doesn't allow some datetime values to round-trip correctly. (This does assume that `tzinfo.utcoffset(dt).microseconds` is always zero, and that the value returned by `tzinfo.utcoffset(dt)` doesn't change mid-second. Neither of these is necessarily the case (see https://github.com/python/cpython/issues/49538), though I'd hope they hold in practice.)
This does take some care to still handle non-standard Timestamps where nanos is more than 1,000,000,000 (i.e. more than a second), since previously ToDatetime handled that, as do the other To* conversion methods.
The bug doesn't manifest for UTC (or any fixed-offset timezone), so it can be worked around in a way that will be correct before and after the fix by replacing `ts.ToDatetime(tz)` with `ts.ToDatetime(datetime.timezone.utc).astimezone(tz)`.
PiperOrigin-RevId: 569012931
This change moves almost everything in the `upb/` directory up one level, so
that for example `upb/upb/generated_code_support.h` becomes just
`upb/generated_code_support.h`. The only exceptions I made to this were that I
left `upb/cmake` and `upb/BUILD` where they are, mostly because that avoids
conflict with other files and the current locations seem reasonable for now.
The `python/` directory is a little bit of a challenge because we had to merge
the existing directory there with `upb/python/`. I made `upb/python/BUILD` into
the BUILD file for the merged directory, and it effectively loads the contents
of the other BUILD file via `python/build_targets.bzl`, but I plan to clean
this up soon.
PiperOrigin-RevId: 568651768