Adds test coverage for invalid empty strings (e.g. ""), non-numeric strings (e.g. "abc"), and partially-numeric strings (e.g. "12abc"), as well as valid exponential numeric strings ("1e5)
We will target enforcing non-conformant cases that should have failed to parse but didn't in upb in v30.x (our ~annual breaking release in some languages). Conformance failures to accept input we previously failed on can be landed at any time.
PiperOrigin-RevId: 694269337
Empty strings are invalid numeric values. upb must fail to convert JSON objects that contain empty string values for numbers, but it currently does not.
PiperOrigin-RevId: 681623866
This change moves almost everything in the `upb/` directory up one level, so
that for example `upb/upb/generated_code_support.h` becomes just
`upb/generated_code_support.h`. The only exceptions I made to this were that I
left `upb/cmake` and `upb/BUILD` where they are, mostly because that avoids
conflict with other files and the current locations seem reasonable for now.
The `python/` directory is a little bit of a challenge because we had to merge
the existing directory there with `upb/python/`. I made `upb/python/BUILD` into
the BUILD file for the merged directory, and it effectively loads the contents
of the other BUILD file via `python/build_targets.bzl`, but I plan to clean
this up soon.
PiperOrigin-RevId: 568651768
This is the second attempt to fix our Git history. This should allow
"git blame" to work correctly in the upb/ directory even though our
automation unexpectedly blew away that directory.
This PR removes the DSL from the code generator, in anticipation of splitting the DSL out into a separate package.
Given a .proto file like:
```proto
syntax = "proto3";
package pkg;
message TestMessage {
optional int32 i32 = 1;
optional TestMessage msg = 2;
}
```
Generated code before:
```ruby
# Generated by the protocol buffer compiler. DO NOT EDIT!
# source: protoc_explorer/main.proto
require 'google/protobuf'
Google::Protobuf::DescriptorPool.generated_pool.build do
add_file("test.proto", :syntax => :proto3) do
add_message "pkg.TestMessage" do
proto3_optional :i32, :int32, 1
proto3_optional :msg, :message, 2, "pkg.TestMessage"
end
end
end
module Pkg
TestMessage = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.TestMessage").msgclass
end
```
Generated code after:
```ruby
# frozen_string_literal: true
# Generated by the protocol buffer compiler. DO NOT EDIT!
# source: test.proto
require 'google/protobuf'
descriptor_data = "\n\ntest.proto\x12\x03pkg\"S\n\x0bTestMessage\x12\x10\n\x03i32\x18\x01 \x01(\x05H\x00\x88\x01\x01\x12\"\n\x03msg\x18\x02 \x01(\x0b\x32\x10.pkg.TestMessageH\x01\x88\x01\x01\x42\x06\n\x04_i32B\x06\n\x04_msgb\x06proto3"
begin
Google::Protobuf::DescriptorPool.generated_pool.add_serialized_file(descriptor_data)
rescue TypeError => e
# <compatibility code, see below>
end
module Pkg
TestMessage = ::Google::Protobuf::DescriptorPool.generated_pool.lookup("pkg.TestMessage").msgclass
end
```
This change fixes nearly all remaining conformance problems that existed previously. This is a side effect of moving from the DSL (which is lossy) to a serialized descriptor (which preserves all information).
## Backward Compatibility
This change should be 100% compatible with Ruby Protobuf >= 3.18.0, released in Sept 2021. Additionally, it should be compatible with all existing users and deployments. However there is some special compatibility code I inserted to achieve this level of backward compatibility.
Without the compatibility code, there is an edge case that could break backward compatibility. The existing code is lax in a way that the new code would be more strict.
When we use a full serialized descriptor, it will contain a list of all `.proto` files imported by this file (whereas the DSL never added dependencies properly): dfb71558a2/src/google/protobuf/descriptor.proto (L65-L66)
`add_serialized_file` will verify that all dependencies listed in the descriptor were previously added with `add_serialized_file`. Generally that should be fine, because the generated code will contain Ruby `require` statements for all dependencies, and the descriptor will fail to load anyway if the types we depend on were not previously defined in the DescriptorPool.
But there is a potential for problems if there are ambiguities around file paths. For example, consider the following scenario:
```proto
// foo/bar.proto
syntax = "proto2";
message Bar {}
```
```proto
// foo/baz.proto
syntax = "proto2";
import "bar.proto";
message Baz {
optional Bar bar = 1;
}
```
If you invoke `protoc` like so, it will work correctly:
```
$ protoc --ruby_out=. -Ifoo foo/bar.proto foo/baz.proto
$ RUBYLIB=. ruby baz_pb.rb
```
However if you invoke `protoc` like so, and didn't have any compatibility code, it would fail to load:
```
$ protoc --ruby_out=. -I. -Ifoo foo/baz.proto
$ protoc --ruby_out=. -I. -Ifoo foo/bar.proto
$ RUBYLIB=foo ruby foo/baz_pb.rb
foo/baz_pb.rb:10:in `add_serialized_file': Unable to build file to DescriptorPool: Depends on file 'bar.proto', but it has not been loaded (Google::Protobuf::TypeError)
from foo/baz_pb.rb:10:in `<main>'
```
The problem is that `bar.proto` is being referred to by two different canonical names: `bar.proto` and `foo/bar.proto`. This is a user error: each import should always be referred to by a consistent full path. Hopefully user errors of this sort are rare, but it is hard to know without trying.
The code in this PR prints a warning using `warn` if we detect that this edge case has occurred. We will plan to remove this compatibility code in the next major version.
Closes#12319
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/12319 from haberman:ruby-gencode-binary 5c0e8f20b1
PiperOrigin-RevId: 524129023
Implemented in java, c++, python and upb. Also added conformance test.
https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#google.protobuf.Value
where it says:
attempting to serialize NaN or Infinity results in error. (We can't serialize these as string "NaN" or "Infinity" values like we do for regular fields, because they would parse as string_value, not number_value).
PiperOrigin-RevId: 500828964
Implemented in java, c++, python and upb. Also added conformance test.
https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#google.protobuf.Value
where it says:
attempting to serialize NaN or Infinity results in error. (We can't serialize these as string "NaN" or "Infinity" values like we do for regular fields, because they would parse as string_value, not number_value).
PiperOrigin-RevId: 500139380
* Added staleness test for ruby-upb.{c,h} and updated.
* Removed file comment markers, too much trouble for too little benefit.
* Ran clang-format.
* Updated ruby-upb.{c,h}.
* Added missing table code to amalgamation.
* Updated to latest upb, patch no longer needed.
* Reverted changes to third_party sub-modules.
* Added missing unicode file.
* Removed conformance failures for Ruby.
* Some more updates to PHP testing infrastructure (#8576)
* WIP.
* Added build config for all of the tests.
* Use ../src/protoc if it is available, for cases where Bazel isn't available.
* Added test_php.sh.
* Fix for the broken macOS tests.
* Move all jobs to use php80 instead of lots of separate jobs.
* Only pass -t flag if we are running in a terminal.
* Updated php_all job to use new Docker stuff.
* Fixed PHP memory leaks and arginfo errors (#8614)
* Fixed a bunch of incorrect arginfo and a few incorrect error messages.
* Passes mem check test with no leaks!
* WIP.
* Fix build warning that was causing Bazel build to fail.
* Added compatibility code for PHP <8.0.
* Added test_valgrind target and made tests Valgrind-clean.
* Updated Valgrind test to fail if memory leaks are detected.
* Removed intermediate shell script so commands are easier to cut, paste, and modify.
* Passing all Valgrind tests!
* Hoist addref into ObjCache_Get().
* Removed special case of map descriptors by keying object map on upb_msgdef.
* Removed all remaining RETURN_ZVAL() macros.
* Removed all explicit reference add/del operations.
* Added REFCOUNTING.md to Makefile.am.
* Updated upb version and fixed PHP to not get unset message field. (#8621)
* Updated upb version and fixed PHP to not get unset message field.
* Updated changelog.
* Fixed preproc test to handle old versions of Clang withot __has_attribute().
* A second try at fixing __has_attribute().
* Copy __has_attribute() fix to cc file also.
* Updated failure list for PHP for fixed test.
* Updated version of upb for Ruby (#8624)
* Updated upb.
* Preserve legacy behavior for unset messages.
* Updated failure list.
* Updated CHANGES.txt.
* Added erroneously-deleted test file.
* Fixed condition on compatibility code.
* Re-introduced deleted file again, and fixed Rakefile to not delete it.
* Fix generation of test protos.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Added some missing files.
* WIP.
* WIP.
* Updated upb.
* Extension loads, but crashes immediately.
* Gets through the test suite without SEGV!
Still a lot of bugs to fix, but it is a major step!
214 tests, 378 assertions, 37 failures, 147 errors, 0 pendings, 0 omissions, 0 notifications
14.0187% passed
* Test and build for Ruby 3.0
* Fixed a few more bugs, efficient #inspect is almost done.
214 tests, 134243 assertions, 30 failures, 144 errors, 0 pendings, 0 omissions, 0 notifications
18.6916% passed
* Fixed message hash initialization and encode depth checking.
214 tests, 124651 assertions, 53 failures, 70 errors, 0 pendings, 0 omissions, 0 notifications
42.5234% passed
* A bunch of fixes to failing tests, now 70% passing.
214 tests, 202091 assertions, 41 failures, 23 errors, 0 pendings, 0 omissions, 0 notifications
70.0935% passed
* More than 80% of tests are passing now.
214 tests, 322331 assertions, 30 failures, 9 errors, 0 pendings, 0 omissions, 0 notifications
81.7757% passed
Unfortunately there is also a sporadic bug/segfault hanging around
that appears to be GC-related.
* Add linux/ruby30 and macos/ruby30
* Use rvm master for 3.0.0-preview2
* Over 90% of tests are passing!
214 tests, 349898 assertions, 15 failures, 1 errors, 0 pendings, 0 omissions, 0 notifications
92.5234% passed
* Passes all tests!
214 tests, 369388 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
* A bunch of cleanup.
1. Removed a bunch of internal-only symbols from headers.
2. Required a frozen check to get a non-const pointer to a map or array.
3. De-duplicated the code to get a type argument for Map/RepeatedField.
* Removed a bunch more stuff from protobuf.h. There is an intermittent assert failure.
Intermittent failure:
ruby: ../../../../ext/google/protobuf_c/protobuf.c:263: ObjectCache_Add: Assertion `rb_funcall(obj_cache2, (__builtin_constant_p("[]") ? __extension__ ({ static ID rb_intern_id_cache; if (!rb_intern_id_cache) rb_intern_id_cache = rb_intern2((("[]")
), (long)strlen(("[]"))); (ID) rb_intern_id_cache; }) : rb_intern("[]")), 1, key_rb) == val' failed
* Removed a few more things from protobuf.h.
* Ruby 3.0.0-preview2 to 3.0.0
* Require rake-compiler-dock >= 1.1.0
* More progress, fighting with the object cache.
* Passes on all Ruby versions!
* Updated and clarified comment regarding WeakMap.
* Fixed the wyhash compile.
* Fixed conformance tests for Ruby.
Conformance results now look like:
RUBYLIB=../ruby/lib:. ./conformance-test-runner --enforce_recommended --failure_list failure_list_ruby.txt --text_format_failure_list text_format_failure_list_ruby.txt ./conformance_ruby.rb
CONFORMANCE TEST BEGIN ====================================
CONFORMANCE SUITE PASSED: 1955 successes, 0 skipped, 58 expected failures, 0 unexpected failures.
CONFORMANCE TEST BEGIN ====================================
CONFORMANCE SUITE PASSED: 0 successes, 111 skipped, 8 expected failures, 0 unexpected failures.
Fixes include:
- Changed Ruby compiler to no longer reject proto2 maps.
- Changed Ruby compiler to emit a warning when proto2 extensions are
present instead of rejecting the .proto file completely.
- Fixed conformance tests to allow proto2 and look up message by name
instead of hardcoding a specific list of messages.
- Fixed conformance test to support the "ignore unknown" option for
JSON.
- Fixed conformance test to properly report serialization errors.
* Removed debug printf and fixed #inspect for floats.
* Fixed compatibility test to have proper semantics for #to_json.
* Updated Makefile.am with new file list.
* Don't try to copy wyhash when inside Docker.
* Fixed bug where we would forget that a sub-object is frozen in Ruby >=2.7.
* Avoid exporting unneeded symbols and refactored a bit of code.
* Some more refactoring.
* Simplified and added more comments.
* Some more comments and simplification. Added a missing license block.
Co-authored-by: Masaki Hara <hara@wantedly.com>
* Add binary conformance tests for map fields
* Update failure list
* Fix php conformance test
* Fix php conformance test
In 32-bit platform, int64 should be string. However, map iterator returns string key as integer.
* Add more test cases for map
* Update failure list
* Test singular fields are encoded in canonical way
* Defautl values in proto3 should not be encoded.
* Values should be converted to the canonical representation (e.g.,
long int64 value may be truncated for int32 field)
* Update failure list
* Update failure list
* Add conformance tests for explicit packed/unpacked fields
* Fix typo
* Update failure lists
* Update failure list
* Use enum class to make enum scoped
* Add binary conformance test for default repeated fields
1) Both packed and unpacked encoding should be accepted for parsing.
2) Encode should follow the default way for the syntax.
* Uncomment test
* Remove is_primitive
* Add failed tests to failure lists.
* Add failed test to failure list
* Use binary format to specify expected value
Text format cannot distinguish whether repeated field is packed or not.
* Change method name from ToHexString to ToOctString
* Add failed test to failure list
* Add failed test to php's failure list
* Fix comments
* Replace strptime with custom implementation
* Fix ruby strptime
* Fix test
* Fix ruby conformance test
* Use mktime
* Remove EmptyFieldMask from failed conformance test list