Follow up on #10404, the initial description:
> For the sake of performance, the hex2bin call was removed from the generated PHP code in https://github.com/protocolbuffers/protobuf/pull/8006
However, after this PR all autogenerated files contain binary data, which makes it hard or even impossible to see using common tools (git and github included (i.e. [this file](https://github.com/protocolbuffers/protobuf/blob/main/php/src/GPBMetadata/Google/Protobuf/Struct.php)), as well as some code editors) since most software considers such files as binary ones and not as code.
Alexander suggested using a hex representation of string literals and I updated the original patch a bit, not hex-encoding printable characters.
Benchmarks from [#10404#issuecomment-1635939062](https://github.com/protocolbuffers/protobuf/pull/10404#issuecomment-1635939062)
```
The percent of printable chars: 0.83869 (80112 vs 15408)
hex parsing: 3.22371, length=382090
hex2Bin parsing: 1.18489, length=191059
hexAscii parsing: 0.89524, length=142306 (suggested option)
binary parsing: 0.26437, length=95542
```
Closes#13911
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/13911 from mikhainin:do_not_store_binary_data_inside_php_files 490958e165
PiperOrigin-RevId: 601123255
Also hardened the text format printer against invalid UTF-8 in string fields. The output string will always be valid UTF-8, even if string fields contain invalid UTF-8.
PiperOrigin-RevId: 600990001
It's a little bit awkward that we currently have to take a runtime dependency
on rake even though it is a build tool. This CL adds a TODO to explore moving
the Rakefile logic to extconf.rb so that we can drop the rake dependency.
PiperOrigin-RevId: 600936420
We have a rake based extension `ext/google/protobuf_c/Rakefile` (added since 3.25.0), which depends on `rake`. However, when gem is installed by `bundle` command, `rake` is not guaranteed to be activated by `bundle` (e.g. when `BUNDLE_PATH` is not default and rake is not already installed in `BUNDLE_PATH`). Therefore, we have to explicit declare a runtime dependency on `rake` for the extension, or otherwise the installation will fail with `can't find gem rake (>= 0.a) with executable rake (Gem::GemNotFoundException)`.
Here is a reproduction:
```
root@4d9cb6ba41b1:/build# cat Gemfile
source 'https://rubygems.org'
gem 'google-protobuf'
root@4d9cb6ba41b1:/build# bundle config --local path vendor/bundle
root@4d9cb6ba41b1:/build# bundle
Fetching gem metadata from https://rubygems.org/...........
Resolving dependencies...
Installing google-protobuf 3.25.1 with native extensions
Gem::Ext::BuildError: ERROR: Failed to build gem native extension.
current directory: /build/vendor/bundle/ruby/3.3.0/gems/google-protobuf-3.25.1/ext/google/protobuf_c
rake RUBYARCHDIR\=/build/vendor/bundle/ruby/3.3.0/extensions/aarch64-linux/3.3.0/google-protobuf-3.25.1 RUBYLIBDIR\=/build/vendor/bundle/ruby/3.3.0/extensions/aarch64-linux/3.3.0/google-protobuf-3.25.1
/usr/local/lib/ruby/3.3.0/rubygems.rb:259:in `find_spec_for_exe': can't find gem rake (>= 0.a) with executable rake (Gem::GemNotFoundException)
from /usr/local/lib/ruby/3.3.0/rubygems.rb:278:in `activate_bin_path'
from /usr/local/bin/rake:25:in `<main>'
rake failed, exit code 1
Gem files will remain installed in /build/vendor/bundle/ruby/3.3.0/gems/google-protobuf-3.25.1 for inspection.
Results logged to /build/vendor/bundle/ruby/3.3.0/extensions/aarch64-linux/3.3.0/google-protobuf-3.25.1/gem_make.out
/usr/local/lib/ruby/3.3.0/rubygems/ext/builder.rb:125:in `run'
/usr/local/lib/ruby/3.3.0/rubygems/ext/rake_builder.rb:30:in `build'
/usr/local/lib/ruby/3.3.0/rubygems/ext/builder.rb:193:in `build_extension'
/usr/local/lib/ruby/3.3.0/rubygems/ext/builder.rb:227:in `block in build_extensions'
/usr/local/lib/ruby/3.3.0/rubygems/ext/builder.rb:224:in `each'
/usr/local/lib/ruby/3.3.0/rubygems/ext/builder.rb:224:in `build_extensions'
/usr/local/lib/ruby/3.3.0/rubygems/installer.rb:852:in `build_extensions'
/usr/local/lib/ruby/3.3.0/bundler/rubygems_gem_installer.rb:76:in `build_extensions'
/usr/local/lib/ruby/3.3.0/bundler/rubygems_gem_installer.rb:28:in `install'
/usr/local/lib/ruby/3.3.0/bundler/source/rubygems.rb:205:in `install'
/usr/local/lib/ruby/3.3.0/bundler/installer/gem_installer.rb:54:in `install'
/usr/local/lib/ruby/3.3.0/bundler/installer/gem_installer.rb:16:in `install_from_spec'
/usr/local/lib/ruby/3.3.0/bundler/installer/parallel_installer.rb:132:in `do_install'
/usr/local/lib/ruby/3.3.0/bundler/installer/parallel_installer.rb:123:in `block in worker_pool'
/usr/local/lib/ruby/3.3.0/bundler/worker.rb:62:in `apply_func'
/usr/local/lib/ruby/3.3.0/bundler/worker.rb:57:in `block in process_queue'
<internal:kernel>:187:in `loop'
/usr/local/lib/ruby/3.3.0/bundler/worker.rb:54:in `process_queue'
/usr/local/lib/ruby/3.3.0/bundler/worker.rb:90:in `block (2 levels) in create_threads'
An error occurred while installing google-protobuf (3.25.1), and Bundler cannot continue.
In Gemfile:
google-protobuf
```
Closes#15203
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/15203 from ntkme:ruby-rake-dependency 1318863022
PiperOrigin-RevId: 600916613
Stage 2 of 3.
Fleshed out the bodies required by the aforementioned contracts above.
We create MessageVTable and utilize the `unsafe fns` directly in `message.cc` -- a departure from PrimitiveVTables.
In the final followup CL, we will perform the field_entry swapover and update all unit tests.
PiperOrigin-RevId: 600567890
A fuzz test discovered that our JSON parser will accept "overlong" UTF-8
characters, i.e. encodings that use more bytes than necessary. We should reject
these overlong encodings, because they are not considered valid.
To fix this problem, I updated the JSON lexer to rely on utf8_range for
checking UTF-8 validity. This way, the lexer does not need to do any UTF-8
validation of its own, but just has to interpret the UTF-8 enough to know how
many bytes to read for each character.
PiperOrigin-RevId: 599657903