Ruby <2.7 does not allow non-finalizable objects to be WeakMap
keys: https://bugs.ruby-lang.org/issues/16035
We work around this by using a secondary map for Ruby <2.7 which
maps the non-finalizable integer to a distinct object.
For now we accept that the entries in the secondary map will never
be collected. If this becomes a problem we can perform a GC pass
every so often that looks at the contents of the object cache to
decide what can be deleted from the secondary map.
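A minimal Ruby sketch of the workaround described above. The real object cache lives in the C extension; the `ObjectCache` class and its `key_for`, `add`, and `get` methods are hypothetical names used only for illustration.

```ruby
# Hypothetical sketch; the real object cache is implemented in C.
class ObjectCache
  def initialize
    @weak = ObjectSpace::WeakMap.new # weak references to cached objects
    @secondary = {}                  # Integer => key object; entries are never removed
  end

  # Map a non-finalizable Integer to a distinct, finalizable key object.
  def key_for(int_key)
    @secondary[int_key] ||= Object.new
  end

  # Store the value under the stand-in key rather than the Integer itself.
  def add(int_key, value)
    @weak[key_for(int_key)] = value
    value
  end

  # Look up via the secondary map; returns nil for unknown keys.
  def get(int_key)
    key = @secondary[int_key]
    key && @weak[key]
  end
end
```

Because `@secondary` holds strong references to the stand-in keys, those entries stay alive indefinitely, which is the trade-off accepted above.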
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Added some missing files.
* WIP.
* WIP.
* Updated upb.
* Extension loads, but crashes immediately.
* Gets through the test suite without SEGV!
Still a lot of bugs to fix, but it is a major step!
214 tests, 378 assertions, 37 failures, 147 errors, 0 pendings, 0 omissions, 0 notifications
14.0187% passed
* Test and build for Ruby 3.0
* Fixed a few more bugs, efficient #inspect is almost done.
214 tests, 134243 assertions, 30 failures, 144 errors, 0 pendings, 0 omissions, 0 notifications
18.6916% passed
* Fixed message hash initialization and encode depth checking.
214 tests, 124651 assertions, 53 failures, 70 errors, 0 pendings, 0 omissions, 0 notifications
42.5234% passed
* A bunch of fixes to failing tests, now 70% passing.
214 tests, 202091 assertions, 41 failures, 23 errors, 0 pendings, 0 omissions, 0 notifications
70.0935% passed
* More than 80% of tests are passing now.
214 tests, 322331 assertions, 30 failures, 9 errors, 0 pendings, 0 omissions, 0 notifications
81.7757% passed
Unfortunately there is also a sporadic bug/segfault hanging around
that appears to be GC-related.
* Add linux/ruby30 and macos/ruby30
* Use rvm master for 3.0.0-preview2
* Over 90% of tests are passing!
214 tests, 349898 assertions, 15 failures, 1 errors, 0 pendings, 0 omissions, 0 notifications
92.5234% passed
* Passes all tests!
214 tests, 369388 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
* A bunch of cleanup.
1. Removed a bunch of internal-only symbols from headers.
2. Required a frozen check to get a non-const pointer to a map or array.
3. De-duplicated the code to get a type argument for Map/RepeatedField.
* Removed a bunch more stuff from protobuf.h. There is an intermittent assert failure.
Intermittent failure:
ruby: ../../../../ext/google/protobuf_c/protobuf.c:263: ObjectCache_Add: Assertion `rb_funcall(obj_cache2, (__builtin_constant_p("[]") ? __extension__ ({ static ID rb_intern_id_cache; if (!rb_intern_id_cache) rb_intern_id_cache = rb_intern2((("[]")), (long)strlen(("[]"))); (ID) rb_intern_id_cache; }) : rb_intern("[]")), 1, key_rb) == val' failed
* Removed a few more things from protobuf.h.
* Ruby 3.0.0-preview2 to 3.0.0
* Require rake-compiler-dock >= 1.1.0
* More progress, fighting with the object cache.
* Passes on all Ruby versions!
* Updated and clarified comment regarding WeakMap.
* Fixed the wyhash compile.
* Fixed conformance tests for Ruby.
Conformance results now look like:
RUBYLIB=../ruby/lib:. ./conformance-test-runner --enforce_recommended --failure_list failure_list_ruby.txt --text_format_failure_list text_format_failure_list_ruby.txt ./conformance_ruby.rb
CONFORMANCE TEST BEGIN ====================================
CONFORMANCE SUITE PASSED: 1955 successes, 0 skipped, 58 expected failures, 0 unexpected failures.
CONFORMANCE TEST BEGIN ====================================
CONFORMANCE SUITE PASSED: 0 successes, 111 skipped, 8 expected failures, 0 unexpected failures.
Fixes include:
- Changed Ruby compiler to no longer reject proto2 maps.
- Changed Ruby compiler to emit a warning when proto2 extensions are
present instead of rejecting the .proto file completely.
- Fixed conformance tests to allow proto2 and look up message by name
instead of hardcoding a specific list of messages.
- Fixed conformance test to support the "ignore unknown" option for
JSON.
- Fixed conformance test to properly report serialization errors.
* Removed debug printf and fixed #inspect for floats.
* Fixed compatibility test to have proper semantics for #to_json.
* Updated Makefile.am with new file list.
* Don't try to copy wyhash when inside Docker.
* Fixed bug where we would forget that a sub-object is frozen in Ruby >=2.7.
* Avoid exporting unneeded symbols and refactored a bit of code.
* Some more refactoring.
* Simplified and added more comments.
* Some more comments and simplification. Added a missing license block.
Co-authored-by: Masaki Hara <hara@wantedly.com>
Message and RepeatedField override clone so that it uses the internal implementation of dup, but Map lacks this override and only implements dup. This can lead to unexpected behavior, since only two of the three complex types behave correctly.
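To illustrate why a custom `dup` without a matching `clone` is surprising, here is a small pure-Ruby sketch. `TinyMap` and `FixedTinyMap` are hypothetical stand-ins, not the real `Google::Protobuf::Map`, and the real fix is in the C extension.

```ruby
# Hypothetical stand-in for a container type; not the real Google::Protobuf::Map.
class TinyMap
  def initialize(storage = {})
    @storage = storage
  end

  def [](key)
    @storage[key]
  end

  def []=(key, value)
    @storage[key] = value
  end

  # Custom dup that copies the underlying storage.
  def dup
    self.class.new(@storage.dup)
  end
end

# Same type, but clone is routed through the custom dup.
class FixedTinyMap < TinyMap
  alias_method :clone, :dup
end

original = TinyMap.new
original[:a] = 1
copy = original.clone   # Object#clone: shallow copy, @storage is shared
copy[:b] = 2
leaked = original[:b]   # the write through the clone is visible here

fixed = FixedTinyMap.new
fixed[:a] = 1
fixed_copy = fixed.clone
fixed_copy[:b] = 2
isolated = fixed[:b]    # nil: the clone has its own storage
```

Aliasing `clone` to `dup` is only one way to keep the two consistent; it ignores the frozen-state handling that `Object#clone` normally performs.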
* Fix a typo
* Fix lots of spelling errors
* Fix a few more spelling mistakes
* s/parsable/parseable/
* Don't touch the third party files
* Cloneable is the preferred C# term
* Copyable is the preferred C++ term
* Revert "s/parsable/parseable/"
This reverts commit 534ecf7675.
* Revert unparseable->unparsable corrections
* WIP.
* WIP.
* Builds and runs. Tests need to be updated to test presence.
* Ruby: proto3 presence is passing all tests.
* Fixed a bug where empty messages had the wrong oneof count.
Previously if you assigned 'nil' to a submessage in proto2
the field would be set to 'nil' but would still have its hasbit
set. This was a clear bug so I'm fixing it outright, even though
it is an observable behavior change.
This patch has almost no change in behaviour where users have not
patched the implementation of new on either a specific proto object
class, or `Google::Protobuf::MessageExts::ClassMethods`. The default
implementation of `new`, and `rb_class_new_instance` have the same
behaviour.
By default when we call `new` on a class in Ruby, it goes to the `Class`
class's implementation:
```ruby
class Foo
end
>> Foo.method(:new).owner
=> Class
```
The `Class` implementation of `new` is (pseudocode, it's actually in C):
```ruby
class Class
  def new(*args, &blk)
    instance = allocate
    # initialize is private, so it has to be invoked via send
    instance.send(:initialize, *args, &blk)
    instance
  end
end
```
`rb_class_new_instance` does the same thing: it calls down to
[`rb_class_s_new`](https://github.com/ruby/ruby/blob/v2_5_5/object.c#L2147),
which calls `rb_class_alloc`, then `rb_obj_call_init`.
`rb_funcall` is a variadic C function for calling a Ruby method on an object.
It takes:
* A `VALUE` on which the method should be called,
* An `ID` identifying the method, usually created with `rb_intern`
to get an ID from a string,
* An integer, the number of arguments passed to the method,
* That number of `VALUE`s, to send to the method call.
`rb_funcall` is the same as calling a method directly in Ruby, and will perform
ancestor chain respecting method lookup.
This means that in C extensions, if nobody has defined the `new` method on any
classes or modules in a class's inheritance chain, calling
`rb_class_new_instance` is the same as calling `rb_funcall(klass,
rb_intern("new"), 0)`. *However*, `rb_class_new_instance` removes the ability
for users to define or monkey patch their own constructors into the objects
created by the C extension.
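The difference can be sketched in plain Ruby. The `PatchedNew` module and `Message` class below are hypothetical; `Message.new` stands in for the `rb_funcall` path and `allocate` plus a direct `initialize` call stands in for `rb_class_new_instance`.

```ruby
# Hypothetical names; a stand-in for user code that patches `new`.
module PatchedNew
  def new(*args, &blk)
    instance = super
    instance.instance_variable_set(:@patched, true)
    instance
  end
end

class Message
  extend PatchedNew

  def initialize
    @fields = {}
  end

  def patched?
    instance_variable_defined?(:@patched)
  end
end

# Roughly what rb_funcall(klass, rb_intern("new"), 0) does: normal method lookup,
# so the patched `new` runs.
via_new = Message.new

# Roughly what rb_class_new_instance does: allocate, then call initialize directly,
# so the patched `new` is never consulted.
via_alloc = Message.allocate
via_alloc.send(:initialize)
```

Here `via_new.patched?` is true while `via_alloc.patched?` is false, which mirrors why a patch on `new` was not applied to decoded objects.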
In Ads, we define [`new`](https://git.io/JvFC9) on
`Google::Protobuf::MessageExts::ClassMethods` to allow us to insert a
monkeypatch which makes it possible to assign primitive values to wrapper type
fields (e.g. Google::Protobuf::StringValue). The monkeypatch we apply works for
objects that we create for the user via the `new` method. Before this commit,
however, the patch does not work for the `decode` method, for the reasons
outlined above. Before this commit, protobuf uses `rb_class_new_instance`.
After this commit, we use `rb_funcall(klass, rb_intern("new"), 0);` to construct
protobuf objects during decoding. While I haven't measured it, this should have
only a very minor performance impact on decoding (`rb_funcall` has to go to the
method cache, which `rb_class_new_instance` does not). It does, however, do
the "more rubyish" thing of respecting the protobuf object's inheritance chain
when constructing objects during decoding.
I have run both Ads and Cloud's test suites for Ruby libraries against this
patch, as well as the protobuf Ruby gem's test suite locally.
* Set execute bit on files if and only if they begin with (#!).
Git only tracks the 'x' (executable) bit on each file. Prior to this
CL, our files were a random mix of executable and non-executable.
This change imposes some order by making files executable if and only
if they have shebang (#!) lines at the beginning.
We don't have any executable binaries checked into the repo, so
we shouldn't need to worry about that case.
* Added fix_permissions.sh script to set +x iff a file begins with (#!).
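The rule "executable if and only if the file starts with `#!`" can be sketched in Ruby as follows. This is not the contents of fix_permissions.sh; the helper names `shebang?` and `fix_permission` are hypothetical, and setting or clearing the execute bit for user, group, and other alike is an assumption.

```ruby
require 'tmpdir'

# True if the file's first two bytes are "#!".
def shebang?(path)
  File.open(path, 'rb') { |f| f.read(2) } == '#!'
end

# Set the execute bits when the file has a shebang, clear them otherwise.
# Returns the resulting permission bits.
def fix_permission(path)
  mode = File.stat(path).mode & 0o777
  new_mode = shebang?(path) ? (mode | 0o111) : (mode & ~0o111)
  File.chmod(new_mode, path) unless new_mode == mode
  new_mode
end
```

A caller would typically run `fix_permission` over every tracked file, for example the paths printed by `git ls-files`.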
* Add failing tests for issues with wrapped values where the value is the default
* Add test for wrapped values without a value set
* Bugfix for wrapper types with default values.
The previous optimizations for wrapper types had a bug that prevented
wrappers from registering as "present" if the "value" field was not
present on the wire.
In practice the "value" field will not be serialized when it is zero,
according to proto3 semantics, but due to the optimization this
prevented it from creating a new object to represent the presence of the
field.
The fix is to ensure that if the wrapper message is present on the wire,
we always initialize its value to zero.
Co-authored-by: Joshua Haberman <jhaberman@gmail.com>
Co-authored-by: Dan Quan <dan@quan.io>
The only case that doesn't work is decoding a wrapper type from JSON
at the top level. This doesn't make sense and probably no users do it,
so I changed it to throw.
`OneOfDescriptor_each` is registered as a Ruby method which takes zero
parameters, which means the C function should take one argument (the receiver).
When Ruby invokes `OneOfDescriptor_each`, it calls it with one argument
only, which is one fewer than `OneOfDescriptor_each` took before
this commit. Calling a function with the wrong number of arguments is
technically undefined behavior.
See also: §6.5.2.2, N1256
We were creating a map decoding frame when starting the *map*,
but clearing the GC slot when finishing each *map entry*. This
means that the decoding frame could be collected in the meantime.
* Rolled forward again with "Updated upb from defcleanup branch..."
Revert "Revert "Updated upb from defcleanup branch and modified Ruby to use it (#5539)" (#5848)"
This reverts commit 1568deab40.
* A few more merge fixes.
* Updated for defcleanup2 branch.
* Fixed upb to define upb_decode().
* Fixed names of nested messages.
* Revert submodule.
* Set -std=gnu90 and fixed warnings/errors.
Some of our Kokoro tests seem to run with this level of warnings,
and the source strives to be gnu90 compatible. Enforcing it for
every build removes the possibility of some errors showing up in
Kokoro/Travis tests only.
* Fixed remaining warnings with gnu90 mode.
I tried to match warning flags with what Ruby appears to do
in our Kokoro tests.
* Initialize values registered by rb_gc_register_address().
* Fixed subtle GC bug.
We need to initialize this marked value before creating the instance.
* Truly fix the GC bug.
* Updated upb for mktime() fix.
* Removed XOPEN_SOURCE as we are not using strptime().
* Removed fixed tests from the conformance failure list for Ruby.
* Fixed memory error related to oneof def names.
* Picked up new upb changes re: JSON printing.
* Uncomment concurrent decoding test.
Prior to this CL, creating an empty message object would create
two empty string objects for every declared field. First we
created a unique string object for the field's default. Then
we created yet another string object when we assigned the
default value into the message: we called #encode to ensure
that the string would have the correct encoding and be frozen.
I optimized these unnecessary objects away with two fixes:
1. Memoize the empty string so that we don't create a new empty
string for every field's default.
2. If we are assigning a string to a message object, avoid creating
a new string if the assigned string has the correct encoding and
is already frozen.
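Both fixes can be sketched in Ruby. The actual change is in the C extension; the constant `EMPTY_STRING` and the helper `frozen_with_encoding` are hypothetical names, and UTF-8 is assumed as the target encoding for illustration.

```ruby
# Fix 1: memoize one shared, frozen empty string instead of creating one per field.
EMPTY_STRING = ''.encode(Encoding::UTF_8).freeze

# Fix 2: reuse the assigned string when it is already frozen and correctly encoded;
# otherwise create a frozen copy in the requested encoding.
def frozen_with_encoding(value, encoding = Encoding::UTF_8)
  return value if value.frozen? && value.encoding == encoding
  value.encode(encoding).freeze
end
```

With this shape, assigning `EMPTY_STRING` as a default allocates nothing, and only unfrozen or differently encoded strings pay for a copy.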