Porting a Physics Engine to Rust
Bepuphysics2 is a high-performance 3D physics engine written in C#. The library emphasizes performance through cache locality, SIMD, and aggressive compiler optimizations, and it manages memory manually to avoid garbage collection pauses. Porting it to Rust comes with several challenges and is unlikely to provide much, if any, performance improvement. The goal of the port is instead to bolster Rust's 3D ecosystem with a mature, feature-rich physics engine, remove foreign function interface overhead, and integrate with the many popular ECS libraries in Rust.
Challenges
There are several important differences between C# and Rust that make porting a physics engine challenging.
1. Many C# syntax features and standard library APIs have no direct Rust counterpart and need to be replaced.
2. Rust and C# each have their own idioms, and a port should use Rust's idioms rather than transliterate C#.
3. Rust's strict ownership system can make it difficult to translate C# code, especially code that deals with manual memory management and alignment. At times this requires a lot of unsafe code and fighting the borrow checker.
4. Rust's portable SIMD API (std::simd, behind the nightly portable_simd feature) is still unstable and incomplete. Sometimes it is necessary to call architecture-specific intrinsics directly instead of using portable_simd.
5. The Bepuphysics2 library has few tests, which makes it difficult to ensure that the ported library is correct.
Strategy
The first step in porting the library involves identifying dependencies and C# APIs. Next I need to find all self-contained components that can be ported and tested independently. Lastly, I need to write comprehensive tests for the ported code to ensure correctness.
Studying the code base reveals two major parts: the physics engine itself and the utilities code, which is self-contained. The physics engine relies heavily on the utilities code, a collection of math functions and high-performance data structures that take advantage of SIMD. In C#, SIMD is achieved through the Vector<T> struct in the System.Numerics namespace. This struct is a wrapper around SIMD instructions that relies on JIT optimizations to generate platform-specific SIMD instructions. Fallback implementations are provided for platforms that do not support SIMD instructions. The number of lanes in a Vector<T> is determined at runtime based on the CPU. Rust has no direct equivalent, because the lane count of std::simd::Simd is a compile-time constant. Fortunately, it was trivial to approximate this behavior by creating a wrapper type and using conditional compilation:
    // I omitted the implementations for the other architectures for brevity.
    // This code targets the ARM (aarch64) architecture.
    // Note: requires nightly Rust (`portable_simd` and `generic_const_exprs`).
    #[cfg(target_arch = "aarch64")]
    const fn preferred_byte_size() -> usize {
        #[cfg(all(
            target_feature = "neon",
            not(any(target_feature = "sve", target_feature = "sve2"))
        ))]
        {
            16
        }
    }

    pub const fn optimal_lanes<T>() -> usize {
        const fn max(a: usize, b: usize) -> usize {
            if a > b { a } else { b }
        }
        max(preferred_byte_size() / std::mem::size_of::<T>(), 2)
    }

    pub type Vector<T> = std::simd::Simd<T, { optimal_lanes::<T>() }>;
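Because the lane count is fixed when the crate is compiled, the selection logic can be exercised on stable Rust without std::simd. The following is a minimal sketch, not the port's actual code: the x86_64 byte widths are assumptions (the original omits them), and `LANES_F32` and `WideF32` are hypothetical names used only for illustration.

```rust
/// Preferred SIMD register width in bytes for the compilation target.
/// The x86_64 values are assumptions for illustration; the 16-byte default
/// mirrors the NEON case in the code above.
pub const fn preferred_byte_size() -> usize {
    if cfg!(all(target_arch = "x86_64", target_feature = "avx512f")) {
        64
    } else if cfg!(all(target_arch = "x86_64", target_feature = "avx")) {
        32
    } else {
        // SSE, NEON, and the fallback for any other target.
        16
    }
}

/// Number of lanes of `T` that fit in the preferred register, with a floor of 2.
/// `T` must not be zero-sized, otherwise the division fails at compile time.
pub const fn optimal_lanes<T>() -> usize {
    let lanes = preferred_byte_size() / std::mem::size_of::<T>();
    if lanes > 2 { lanes } else { 2 }
}

/// Hypothetical constant: lane count for `f32`, evaluated at compile time.
pub const LANES_F32: usize = optimal_lanes::<f32>();

/// A plain array standing in for a SIMD vector of `f32`.
pub type WideF32 = [f32; LANES_F32];
```

Unlike C#'s Vector<T>, the width chosen here depends on the target features enabled at build time (for example via `-C target-feature` or `-C target-cpu`), not on the CPU the binary eventually runs on.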
The C# code base also returns many results through out parameters, for example:

    /// Computes the length of a vector.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static void Length(in Vector3Wide v, out Vector<float> length)
    {
        length = Vector.SquareRoot(v.X * v.X + v.Y * v.Y + v.Z * v.Z);
    }
A small Rust macro can stand in for C#'s out parameters:

    /// Provides a zero-cost abstraction for out parameters similar to C#'s `out` keyword.
    ///
    /// # Examples
    /// ```ignore
    /// // Instead of:
    /// let mut result = MaybeUninit::<Symmetric3x3Wide>::uninit();
    /// Symmetric3x3Wide::scale(&self, rhs, unsafe { result.as_mut_ptr().as_mut().unwrap() });
    /// let result = unsafe { result.assume_init() };
    ///
    /// // You can write:
    /// let result = out!(Symmetric3x3Wide::scale(&self, rhs));
    /// ```
    /// In C#, the equivalent would be:
    /// ```csharp
    /// Symmetric3x3Wide.Scale(ref this, rhs, out var result);
    /// ```
    #[macro_export]
    macro_rules! out {
        ($e:ident :: $method:ident ( $($arg:expr),* )) => {{
            // The called method must fully initialize the out parameter.
            let mut __result = std::mem::MaybeUninit::uninit();
            $e::$method($($arg,)* unsafe { &mut *(__result.as_mut_ptr()) });
            unsafe { __result.assume_init() }
        }};
    }
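For comparison, here is a sketch of two alternatives that avoid creating a `&mut T` pointing at uninitialized memory: passing `&mut MaybeUninit<T>` and initializing it with `MaybeUninit::write`, or simply returning the value. This is an illustration on stable Rust, not the port's code. `Wide`, this four-lane `Vector3Wide`, `length_out`, and `length_via_out` are hypothetical stand-ins for the SIMD types and functions in the engine.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a SIMD vector: four `f32` lanes in a plain array.
pub type Wide = [f32; 4];

/// Hypothetical stand-in for the engine's `Vector3Wide`: one array per component.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vector3Wide {
    pub x: Wide,
    pub y: Wide,
    pub z: Wide,
}

impl Vector3Wide {
    /// Out-parameter style: the caller provides uninitialized storage and the
    /// callee initializes it through `MaybeUninit::write`.
    pub fn length_out(v: &Vector3Wide, length: &mut MaybeUninit<Wide>) {
        length.write(Self::length(v));
    }

    /// Return-value style: computes the length of the vector held in each lane.
    pub fn length(v: &Vector3Wide) -> Wide {
        std::array::from_fn(|i| {
            (v.x[i] * v.x[i] + v.y[i] * v.y[i] + v.z[i] * v.z[i]).sqrt()
        })
    }
}

/// Calls the out-parameter function and returns the initialized value.
pub fn length_via_out(v: &Vector3Wide) -> Wide {
    let mut result = MaybeUninit::<Wide>::uninit();
    Vector3Wide::length_out(v, &mut result);
    // SAFETY: `length_out` always writes `result` before returning.
    unsafe { result.assume_init() }
}
```

The return-value form is the more idiomatic Rust, and the compiler can often elide the copy. The out-parameter form stays closer to the C# signatures, which may make a line-by-line port easier to review.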
Adding ARM Support
The Bepuphysics2 library is currently not optimized for ARM. Instead, some SIMD operations fall back to the generic Vector<T> code path on ARM rather than using dedicated intrinsics. Here is an example from the MathHelper.cs file:
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Vector<float> FastReciprocal(Vector<float> v)
    {
        if (Avx.IsSupported && Vector<float>.Count == 8)
        {
            return Avx.Reciprocal(v.AsVector256()).AsVector();
        }
        else if (Sse.IsSupported && Vector<float>.Count == 4)
        {
            return Sse.Reciprocal(v.AsVector128()).AsVector();
        }
        else
        {
            return Vector<float>.One / v;
        }
        //TODO: Arm!
    }
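The Rust port adds a NEON path for ARM:

    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;
    #[cfg(target_arch = "aarch64")]
    use std::arch::aarch64::*;

    #[inline(always)]
    pub fn fast_reciprocal(v: Vector<f32>) -> Vector<f32> {
        // `Vector<f32>` has a lane count fixed at compile time, so each branch is selected
        // with `cfg(target_feature)` and assumes the vector is as wide as the widest enabled
        // register (64, 32, or 16 bytes). Unaligned loads avoid any alignment assumptions.
        #[cfg(all(target_arch = "x86_64", target_feature = "avx512f"))]
        unsafe {
            let v512 = _mm512_loadu_ps(v.as_array().as_ptr());
            std::mem::transmute(_mm512_rcp14_ps(v512))
        }
        #[cfg(all(target_arch = "x86_64", target_feature = "avx", not(target_feature = "avx512f")))]
        unsafe {
            let v256 = _mm256_loadu_ps(v.as_array().as_ptr());
            std::mem::transmute(_mm256_rcp_ps(v256))
        }
        #[cfg(all(target_arch = "x86_64", target_feature = "sse", not(target_feature = "avx")))]
        unsafe {
            let v128 = _mm_loadu_ps(v.as_array().as_ptr());
            std::mem::transmute(_mm_rcp_ps(v128))
        }
        #[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
        unsafe {
            let v_neon = vld1q_f32(v.as_array().as_ptr());
            std::mem::transmute(vrecpeq_f32(v_neon))
        }
        // Exact reciprocal on every other target.
        #[cfg(not(any(
            all(target_arch = "x86_64", target_feature = "sse"),
            all(target_arch = "aarch64", target_feature = "neon")
        )))]
        {
            v.recip()
        }
    }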
Testing
Tests are still being implemented for each component in the utilities library.
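Approximate operations such as the fast reciprocal cannot be tested for exact equality, so one option is to compare them against a scalar reference within a relative tolerance. The sketch below shows how such a test could look and is not taken from the port's test suite. `fast_reciprocal4`, `max_relative_error`, and the fixed four-lane array type are hypothetical, and the code uses only stable `std::arch` intrinsics.

```rust
/// Approximate reciprocal of four lanes. Uses the SSE or NEON estimate
/// instruction when it is enabled at compile time, and exact division otherwise.
pub fn fast_reciprocal4(v: [f32; 4]) -> [f32; 4] {
    #[cfg(all(target_arch = "x86_64", target_feature = "sse"))]
    {
        use std::arch::x86_64::{_mm_loadu_ps, _mm_rcp_ps, _mm_storeu_ps};
        let mut out = [0.0f32; 4];
        // SAFETY: SSE is enabled at compile time and both pointers cover four `f32`s.
        unsafe { _mm_storeu_ps(out.as_mut_ptr(), _mm_rcp_ps(_mm_loadu_ps(v.as_ptr()))) };
        out
    }
    #[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
    {
        use std::arch::aarch64::{vld1q_f32, vrecpeq_f32, vst1q_f32};
        let mut out = [0.0f32; 4];
        // SAFETY: NEON is enabled at compile time and both pointers cover four `f32`s.
        unsafe { vst1q_f32(out.as_mut_ptr(), vrecpeq_f32(vld1q_f32(v.as_ptr()))) };
        out
    }
    #[cfg(not(any(
        all(target_arch = "x86_64", target_feature = "sse"),
        all(target_arch = "aarch64", target_feature = "neon")
    )))]
    {
        v.map(|x| 1.0 / x)
    }
}

/// Largest relative error of `approx` against the exact reciprocal over `inputs`.
/// Inputs must be non-zero and finite.
pub fn max_relative_error(
    approx: impl Fn([f32; 4]) -> [f32; 4],
    inputs: &[[f32; 4]],
) -> f32 {
    let mut worst = 0.0f32;
    for &input in inputs {
        let got = approx(input);
        for lane in 0..4 {
            let exact = 1.0 / input[lane];
            worst = worst.max(((got[lane] - exact) / exact).abs());
        }
    }
    worst
}
```

A test can then assert that `max_relative_error(fast_reciprocal4, &inputs)` stays below a chosen bound such as `1.0e-2`. That bound is deliberately loose. The estimate instructions differ in precision between architectures, so a real test would likely pick a tolerance per target.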