Merge pull request #4346 from BSBandme/performance_result

Add performance.md and add instruction for linking tcmalloc
7 years ago · a7a746f0b7
parent 227363bc38 745ef89ebf
commit a7a746f0b7
2 changed files with 318 additions and 1 deletions
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@ -3,7 +3,9 @@
 This directory contains benchmarking schemas and data sets that you
 can use to test a variety of performance scenarios against your
-protobuf language runtime.
+protobuf language runtime. If you are looking for performance 
 numbers of officially support languages, see [here](
 https://github.com/google/protobuf/blob/master/docs/Performance.md)
 ## Prerequisite
@ -17,6 +19,11 @@ We are using [google/benchmark](https://github.com/google/benchmark) as the
 benchmark tool for testing cpp. This will be automaticly made during build the
 cpp benchmark.
 The cpp protobuf performance can be improved by linking with [tcmalloc library](
 https://gperftools.github.io/gperftools/tcmalloc.html). For using tcmalloc, you
 need to build [gpertools](https://github.com/gperftools/gperftools) to generate
 libtcmallc.so library.
 ### Java
 We're using maven to build the java benchmarks, which is the same as to build
 the Java protobuf. There're no other tools need to install. We're using
@ -79,6 +86,12 @@ $ make java
 $ make cpp
 ```
 For linking with tcmalloc:
 ```
 $ env LD_PRELOAD={directory to libtcmalloc.so} make cpp
 ```
 ### Python:
 We have three versions of python protobuf implementation: pure python, cpp
--- a/docs/performance.md
+++ b/docs/performance.md
@ -0,0 +1,304 @@
 # Protobuf Perforamcne
 This benchmark result is tested on workstation with processor of Intel® Xeon® Processor E5-2630 and 32GB RAM
 This table contains 3 languages' results:
 * **C++** - For C++ there're 3 kinds of parsing ways:
 	* **new** - This is for using new operator for creating message instance.
    * **new arena** - This is for using arena for creating new message instance.
    * **reuse** - This is for reusing the same message instance for parsing.
 * **Java** - For Java there're 3 kinds of parsing/Serialization ways:
 	* **byte[]** - This is for parsing from a Byte Array.
    * **ByteString** - This is for parsing from a 
    	com.google.protobuf.ByteString.
    * **InputStream** - This is for parsing from a InputStream
 * **Python** - For Pythong there're 3 kinds of python protobuf for testing:
 	* **C++-genereated-code** - This is for using cpp generated code of the 
    	proto file as dynamic linked library.
 	* **C++-reflection** - This is for using cpp reflection, which there's no
    	generated code, but still using cpp protobuf library as dynamic linked 
        library.
 	* **pure-Python** - This is for pure Python version, which don't link with 
    	any cpp protobuf library.
 ## Parsing performance
 <table>
 <tbody><tr>
 <th rowspan="2"> </th>
 <th colspan="3" rowspan="1">C++</th>
 <th colspan="3" rowspan="1">C++ with tcmalloc</th>
 <th colspan="3" rowspan="1">java</th>
 <th colspan="3" rowspan="1">python</th>
 </tr>
 <tr>
 <th colspan="1">new</th>
 <th colspan="1">new arena</th>
 <th colspan="1">reuse</th>
 <th colspan="1">new</th>
 <th colspan="1">new arena</th>
 <th colspan="1">reuse</th>
 <th colspan="1">byte[]</th>
 <th colspan="1">ByteString</th>
 <th colspan="1">InputStream</th>
 <th colspan="1">C++-generated-code</th>
 <th colspan="1">C++-reflection</th>
 <th colspan="1">pure-Python</th>
 </tr>
 <tr>
 <td>google_message1_proto2</td>
 <td>368.717MB/s</td>
 <td>261.847MB/s</td>
 <td>799.403MB/s</td>
 <td>645.183MB/s</td>
 <td>441.023MB/s</td>
 <td>1.122GB/s</td>
 <td>425.437MB/s</td>
 <td>425.937MB/s</td>
 <td>251.018MB/s</td>
 <td>82.8314MB/s</td>
 <td>47.6763MB/s</td>
 <td>3.76299MB/s</td>
 </tr>
 <tr>
 <td>google_message1_proto3</td>
 <td>294.517MB/s</td>
 <td>229.116MB/s</td>
 <td>469.982MB/s</td>
 <td>434.510MB/s</td>
 <td>394.701MB/s</td>
 <td>591.931MB/s</td>
 <td>357.597MB/s</td>
 <td>378.568MB/s</td>
 <td>221.676MB/s</td>
 <td>82.0498MB/s</td>
 <td>39.9467MB/s</td>
 <td>3.77751MB/s</td>
 </tr>
 <tr>
 <td>google_message2</td>
 <td>277.242MB/s</td>
 <td>347.611MB/s</td>
 <td>793.67MB/s</td>
 <td>503.721MB/s</td>
 <td>596.333MB/s</td>
 <td>922.533MB/s</td>
 <td>416.778MB/s</td>
 <td>419.543MB/s</td>
 <td>367.145MB/s</td>
 <td>241.46MB/s</td>
 <td>71.5723MB/s</td>
 <td>2.73538MB/s</td>
 </tr>
 <tr>
 <td>google_message3_1</td>
 <td>213.478MB/s</td>
 <td>291.58MB/s</td>
 <td>543.398MB/s</td>
 <td>539.704MB/s</td>
 <td>717.300MB/s</td>
 <td>927.333MB/s</td>
 <td>684.241MB/s</td>
 <td>704.47MB/s</td>
 <td>648.624MB/s</td>
 <td>209.036MB/s</td>
 <td>142.356MB/s</td>
 <td>15.3324MB/s</td>
 </tr>
 <tr>
 <td>google_message3_2</td>
 <td>672.685MB/s</td>
 <td>802.767MB/s</td>
 <td>1.21505GB/s</td>
 <td>985.790MB/s</td>
 <td>1.136GB/s</td>
 <td>1.367GB/s</td>
 <td>1.54439GB/s</td>
 <td>1.60603GB/s</td>
 <td>1.33443GB/s</td>
 <td>573.835MB/s</td>
 <td>314.33MB/s</td>
 <td>15.0169MB/s</td>
 </tr>
 <tr>
 <td>google_message3_3</td>
 <td>207.681MB/s</td>
 <td>140.591MB/s</td>
 <td>535.181MB/s</td>
 <td>369.743MB/s</td>
 <td>262.301MB/s</td>
 <td>556.644MB/s</td>
 <td>279.385MB/s</td>
 <td>304.853MB/s</td>
 <td>107.575MB/s</td>
 <td>32.248MB/s</td>
 <td>26.1431MB/s</td>
 <td>2.63541MB/s</td>
 </tr>
 <tr>
 <td>google_message3_4</td>
 <td>7.96091GB/s</td>
 <td>7.10024GB/s</td>
 <td>9.3013GB/s</td>
 <td>8.518GB/s</td>
 <td>8.171GB/s</td>
 <td>9.917GB/s</td>
 <td>5.78006GB/s</td>
 <td>5.85198GB/s</td>
 <td>4.62609GB/s</td>
 <td>2.49631GB/s</td>
 <td>2.35442GB/s</td>
 <td>802.061MB/s</td>
 </tr>
 <tr>
 <td>google_message3_5</td>
 <td>76.0072MB/s</td>
 <td>51.6769MB/s</td>
 <td>237.856MB/s</td>
 <td>178.495MB/s</td>
 <td>111.751MB/s</td>
 <td>329.569MB/s</td>
 <td>121.038MB/s</td>
 <td>132.866MB/s</td>
 <td>36.9197MB/s</td>
 <td>10.3962MB/s</td>
 <td>8.84659MB/s</td>
 <td>1.25203MB/s</td>
 </tr>
 <tr>
 <td>google_message4</td>
 <td>331.46MB/s</td>
 <td>404.862MB/s</td>
 <td>427.99MB/s</td>
 <td>589.887MB/s</td>
 <td>720.367MB/s</td>
 <td>705.373MB/s</td>
 <td>606.228MB/s</td>
 <td>589.13MB/s</td>
 <td>530.692MB/s</td>
 <td>305.543MB/s</td>
 <td>174.834MB/s</td>
 <td>7.86485MB/s</td>
 </tr>
 </tbody></table>
 ## Serialization performance
 <table>
 <tbody><tr>
 <th rowspan="2"> </th>
 <th colspan="1" rowspan="2">C++</th>
 <th colspan="1" rowspan="2">C++ with tcmalloc</th>
 <th colspan="3" rowspan="1">java</th>
 <th colspan="3" rowspan="1">python</th>
 </tr>
 <tr>
 <th colspan="1">byte[]</th>
 <th colspan="1">ByteString</th>
 <th colspan="1">InputStream</th>
 <th colspan="1">C++-generated-code</th>
 <th colspan="1">C++-reflection</th>
 <th colspan="1">pure-Python</th>
 </tr>
 <tr>
 <td>google_message1_proto2</td>
 <td>1.39698GB/s</td>
 <td>1.701GB/s</td>
 <td>1.12915GB/s</td>
 <td>1.13589GB/s</td>
 <td>758.609MB/s</td>
 <td>260.911MB/s</td>
 <td>58.4815MB/s</td>
 <td>5.77824MB/s</td>
 </tr>
 <tr>
 <td>google_message1_proto3</td>
 <td>959.305MB/s</td>
 <td>939.404MB/s</td>
 <td>1.15372GB/s</td>
 <td>1.07824GB/s</td>
 <td>802.337MB/s</td>
 <td>239.4MB/s</td>
 <td>33.6336MB/s</td>
 <td>5.80524MB/s</td>
 </tr>
 <tr>
 <td>google_message2</td>
 <td>1.27429GB/s</td>
 <td>1.402GB/s</td>
 <td>1.01039GB/s</td>
 <td>1022.99MB/s</td>
 <td>798.736MB/s</td>
 <td>996.755MB/s</td>
 <td>57.9601MB/s</td>
 <td>4.09246MB/s</td>
 </tr>
 <tr>
 <td>google_message3_1</td>
 <td>1.31916GB/s</td>
 <td>2.049GB/s</td>
 <td>991.496MB/s</td>
 <td>860.332MB/s</td>
 <td>662.88MB/s</td>
 <td>1.48625GB/s</td>
 <td>421.287MB/s</td>
 <td>18.002MB/s</td>
 </tr>
 <tr>
 <td>google_message3_2</td>
 <td>2.15676GB/s</td>
 <td>2.632GB/s</td>
 <td>2.14736GB/s</td>
 <td>2.08136GB/s</td>
 <td>1.55997GB/s</td>
 <td>2.39597GB/s</td>
 <td>326.777MB/s</td>
 <td>16.0527MB/s</td>
 </tr>
 <tr>
 <td>google_message3_3</td>
 <td>650.456MB/s</td>
 <td>1.040GB/s</td>
 <td>593.52MB/s</td>
 <td>580.667MB/s</td>
 <td>346.839MB/s</td>
 <td>123.978MB/s</td>
 <td>35.893MB/s</td>
 <td>2.32834MB/s</td>
 </tr>
 <tr>
 <td>google_message3_4</td>
 <td>8.70154GB/s</td>
 <td>9.825GB/s</td>
 <td>5.88645GB/s</td>
 <td>5.93946GB/s</td>
 <td>2.44388GB/s</td>
 <td>5.9241GB/s</td>
 <td>4.05837GB/s</td>
 <td>876.87MB/s</td>
 </tr>
 <tr>
 <td>google_message3_5</td>
 <td>246.33MB/s</td>
 <td>443.993MB/s</td>
 <td>283.278MB/s</td>
 <td>259.167MB/s</td>
 <td>206.37MB/s</td>
 <td>37.0285MB/s</td>
 <td>12.2228MB/s</td>
 <td>1.1979MB/s</td>
 </tr>
 <tr>
 <td>google_message4</td>
 <td>1.56674GB/s</td>
 <td>2.19601GB/s</td>
 <td>776.907MB/s</td>
 <td>770.707MB/s</td>
 <td>702.931MB/s</td>
 <td>1.49623GB/s</td>
 <td>205.116MB/s</td>
 <td>8.93428MB/s</td>
 </tr>
 </tbody></table>
 \* The cpp performance can be improved by using [tcmalloc](https://gperftools.github.io/gperftools/tcmalloc.html), please follow the (instruction)[https://github.com/google/protobuf/blob/master/benchmarks/README.md] to link with tcmalloc to get the faster result.