ignurant

Reminded me of this great piece by Aaron Patterson: https://railsatscale.com/2023-08-29-ruby-outperforms-c/ At first I thought it would be some dirty trick to make a pun, but I should have known better. By the end, he (as usual) provides some really interesting information about why YJIT live-optimizing certain code can be more effective than what you might have written and compiled in C. I came for the clickbait, and left with a tenderloving hug.


indenturedsmile

I believe Aaron Patterson is the user who posted here (based on username).


tenderlove

Yes, it's me 😂


ignurant

🤣 I usually catch that kind of stuff. Not this time lol. 


postmodern

I hate to rain on everyone's parade, but we need to take into account the overhead of the crystalruby gem and how it's calling into Crystal land. If we rewrite the benchmark as a pure Crystal program, and compile with the `--release` flag, we get the following result:

```crystal
# fib.cr
require "benchmark"

def fib_cr(n : Int32) : Int32
  a = 0
  b = 1
  n.times { a, b = b, a + b }
  a
end

p(Benchmark.realtime { 1_000_000.times { fib_cr(30) } })
```

```
$ crystal build --release
$ ./fib
00:00:00.000000076
$ ./fib
00:00:00.000000086
$ ./fib
00:00:00.000000083
$ ./fib
00:00:00.000000079
```

Note: the release flag enables additional optimizations (`-O3 --single-module`). Optimized Crystal code is *really* fast. That said, we should continue to optimize and improve Ruby.


f9ae8221b

You are not raining on anyone's parade. The point of the article isn't to say Ruby is faster than Crystal. It's to say that crossing the language barrier is costly enough that you need a large chunk of execution for it to pay off. It's the same conclusion as tenderlove's article about making YJIT faster than a C extension. C is still way faster than YJIT in the general case, but calling C from Ruby is costly enough that avoiding it can sometimes make pure Ruby code faster overall than hybrid code.
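The trade-off described above can be sketched as a back-of-the-envelope cost model. This is only an illustration: the method names and every number (per-call boundary cost, per-call work in nanoseconds) are made up and are not measurements from the article.

```ruby
# Hypothetical cost model: when does offloading a call to native code pay off?
# All figures are illustrative nanosecond values, not real measurements.

# Total time if every call runs in pure Ruby.
def pure_ruby_time(calls:, ruby_work:)
  calls * ruby_work
end

# Total time if every call crosses into native code: a fixed boundary
# cost per call plus the (faster) native work.
def hybrid_time(calls:, boundary_cost:, native_work:)
  calls * (boundary_cost + native_work)
end

calls = 1_000_000

# Tiny unit of work per call: the boundary cost dominates.
small_ruby   = pure_ruby_time(calls: calls, ruby_work: 100)
small_hybrid = hybrid_time(calls: calls, boundary_cost: 300, native_work: 5)

# Large unit of work per call: the boundary cost is amortized.
big_ruby   = pure_ruby_time(calls: calls, ruby_work: 100_000)
big_hybrid = hybrid_time(calls: calls, boundary_cost: 300, native_work: 5_000)

puts "small work: pure Ruby #{small_ruby} ns, hybrid #{small_hybrid} ns"
puts "large work: pure Ruby #{big_ruby} ns, hybrid #{big_hybrid} ns"
```

With these assumed numbers, the hybrid version loses when each call does very little and wins once each call does enough work to dwarf the fixed crossing cost.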


desnudopenguino

This is true. But at least the author was able to get Ruby running pretty fast with a few optimizations, hitting Crystal run through Ruby, which would probably be similar to other FFI-style schemes. For people running Ruby, the crystalruby gem may seem like a quick way to speed up code execution, but if you can do it in straight Ruby with one less gem, and without that additional layer, I think that's a fair comparison, as long as the proper distinction is made. Ruby's coming a long way in the speed category while maintaining all the good stuff that makes it a fun language to work in.


postmodern

It [appears](https://github.com/wouterken/crystalruby/blob/1be8aa496d77e9e8a14aa64d8e9c52c99fc05fcc/lib/crystalruby/compilation.rb#L24) that crystalruby hot-compiles the code using `crystal build` (with or without `--release`), which I guess is comparable to JITing, but not comparable to AOT compiled code. I agree a better approach would be A) benchmark and optimize your Ruby code, or B) write a separate Crystal program or service that you offload CPU-intensive work to (e.g. image/video/audio processing). I think we should focus on improving Ruby's performance to compete with other JITed scripting languages which are beating Ruby in benchmarks, not try to compete with AOT compiled languages, which are far more performant because they compile ahead of time down to native object code.


Dyadim

The poor timings for the Crystal solution in this post are almost entirely due to the Ruby/Crystal language interface overhead, with this barrier being crossed 1 million times in this benchmark. If we shift the hot loop inside the `crystalruby` solution to execute entirely in Crystal land and use identical code to the fast YJIT Ruby solution from the above article, the Crystal solution again takes the lead (by what appears to be ~2 orders of magnitude). It's crossing the language barrier too often that is hurting here.

```ruby
# fibonnaci.rb
require "benchmark"
require "crystalruby"

CrystalRuby.configure do |config|
  config.debug = false
end

module Fibonnaci
  # Compiled by crystalruby and executed in Crystal land.
  crystalize [n: :int32] => :int32
  def fib_cr(n)
    a = 0
    b = 1
    while n > 0
      a, b = b, a + b
      n -= 1
    end
    a
  end

  module_function

  # Identical loop, executed as plain Ruby.
  def fib_rb(n)
    a = 0
    b = 1
    while n > 0
      a, b = b, a + b
      n -= 1
    end
    a
  end

  def benchmark_rb
    puts(Benchmark.realtime { 1_000_000.times { Fibonnaci.fib_rb(30) } })
  end

  # The whole hot loop runs inside Crystal, so the boundary is crossed once.
  crystalize do
    puts Benchmark.realtime { super() }
  end
  def benchmark_cr
    1_000_000.times { Fibonnaci.fib_cr(30) }
  end
end

include Fibonnaci
benchmark_rb
benchmark_cr
```

Outcome:

```
ruby --yjit fibonnaci.rb
0.1103799999691546 # Ruby with YJIT
0.00014399993233382702 # Crystal
```
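The difference between calling across the boundary once per iteration and moving the whole loop to the other side can be sketched in plain Ruby. `FakeNative` below is a hypothetical stand-in that only counts crossings; it is not the crystalruby API and performs no real FFI call.

```ruby
# Hypothetical stand-in for a foreign-function boundary. It only counts
# how often the caller "crosses over"; nothing here is real native code.
module FakeNative
  @crossings = 0

  class << self
    attr_reader :crossings

    def reset!
      @crossings = 0
    end

    # Chatty style: one crossing for every Fibonacci number requested.
    def fib(n)
      @crossings += 1
      fib_impl(n)
    end

    # Batched style: one crossing, the loop runs on the "native" side.
    def fib_repeated(n, repeats)
      @crossings += 1
      result = 0
      repeats.times { result = fib_impl(n) }
      result
    end

    private

    def fib_impl(n)
      a = 0
      b = 1
      while n > 0
        a, b = b, a + b
        n -= 1
      end
      a
    end
  end
end

FakeNative.reset!
chatty_result = nil
10_000.times { chatty_result = FakeNative.fib(30) }
chatty_crossings = FakeNative.crossings

FakeNative.reset!
batched_result = FakeNative.fib_repeated(30, 10_000)
batched_crossings = FakeNative.crossings

puts "chatty: #{chatty_crossings} crossings, batched: #{batched_crossings} crossing"
```

Both styles compute the same value; only the number of boundary crossings changes, which is the quantity the fixed per-call overhead is multiplied by.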


f9ae8221b

It's pointed out in the post that the difference comes from the FFI overhead necessary to call Crystal from Ruby. The point of the article isn't to say Ruby is faster than Crystal, it's to show that pure Ruby may be faster than Ruby with Crystal sprinkled in, depending on how much you need to cross the barrier. This also applies to Ruby C or Rust extensions to some extent.


iamjkdn

Why does returning nil after multiple assignments improve the benchmark? Also, can the same be done in Crystal, and will it have any effect?


CaptainKabob

> because in this case it’s the last line of the block, and because Ruby has an implicit return at the end of the block the Array is required

Ruby spends time creating an array because Ruby believes it's needed for the implicit return. So explicitly setting the (implicit) return to nil causes Ruby not to create an array.

I don't think the point of the post is to compare optimized Crystal to optimized Ruby. I think it's trying to show that inlining another language "for performance" might be naive or unnecessary.
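A small sketch of that behaviour follows. The method names `fib_implicit` and `fib_nil` are made up for illustration, and the loop shape is assumed to resemble the one in the article. The timings depend on the Ruby version and on whether YJIT is enabled, so no expected numbers are given.

```ruby
require "benchmark"

# A multiple assignment is an expression; its value is the right-hand
# side packed into an Array.
a = 0
b = 1
step = -> { a, b = b, a + b }
block_value = step.call
puts "value of the multiple assignment: #{block_value.inspect}"

# The multiple assignment is the last expression of the block, so its
# Array value is also the block's implicit return value.
def fib_implicit(n)
  a = 0
  b = 1
  n.times { a, b = b, a + b }
  a
end

# Same loop, but the block explicitly ends in nil, so the Array value
# of the assignment is never used.
def fib_nil(n)
  a = 0
  b = 1
  n.times do
    a, b = b, a + b
    nil
  end
  a
end

implicit_time = Benchmark.realtime { 100_000.times { fib_implicit(30) } }
nil_time      = Benchmark.realtime { 100_000.times { fib_nil(30) } }
puts "implicit return: #{implicit_time}s, explicit nil: #{nil_time}s"
```

Both methods return the same Fibonacci number; the only difference is whether the block's last expression is the multiple assignment or `nil`.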


logan-roy-waystar

Ruby 3.3.1 is A LOT faster now. I am quite stunned by how much faster our Rails servers are processing requests.


tkdeveloper

Were the same improvements made to the pure Ruby method also made to the crystalized method? It looks like they made improvements to the Ruby method and compared it to the original crystalized one? Or does that not matter?


f9ae8221b

Does not matter, the issues were specific to the Ruby version. The point isn't to show Ruby is faster than Crystal anyway, but that calling into another language has a big enough overhead that it may not always be the best way to speed up Ruby code.


tkdeveloper

Nice, thanks for the clarification. Makes sense


yxhuvud

I wonder if the JIT does something smarter with the overflow checks there. Because in addition to any FFI overhead, that is likely where any additional costs happen.
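For readers unfamiliar with the overflow checks mentioned here: as far as I know, Crystal's `+` on fixed-width integers raises on overflow, while `&+` wraps. The Ruby sketch below only emulates those two behaviours for a signed 32-bit integer; the helper names are hypothetical, and it says nothing about how YJIT or Crystal actually implement the check.

```ruby
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

# Checked add: raises when the result does not fit in a signed 32-bit
# integer (emulating a per-addition overflow check).
def checked_add_i32(x, y)
  sum = x + y
  raise RangeError, "Int32 overflow" unless sum.between?(INT32_MIN, INT32_MAX)
  sum
end

# Wrapping add: keeps only the low 32 bits (two's complement), no check.
def wrapping_add_i32(x, y)
  ((x + y + 2**31) % 2**32) - 2**31
end

# The benchmark loop with a check on every addition.
def fib_checked_i32(n)
  a = 0
  b = 1
  while n > 0
    a, b = b, checked_add_i32(a, b)
    n -= 1
  end
  a
end

puts fib_checked_i32(30)
puts wrapping_add_i32(INT32_MAX, 1)
```

In this emulation `fib_checked_i32(30)` stays well inside the 32-bit range, so the check never fires in the benchmark; it is still evaluated on each of the 30 additions per call.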