# Mathematical Performance Analysis

This document provides a detailed analysis of the performance characteristics of Uniswap V3 pricing functions in the MEV bot, including benchmark results and optimization impact.

## Benchmark Methodology

All benchmarks were run on the same hardware configuration:

- OS: Linux
- CPU: Intel(R) Core(TM) i5-5350U CPU @ 1.80GHz
- Go version: 1.24+

Benchmarks were executed using the command:

```sh
go test -bench=. -benchmem ./pkg/uniswap/
```

## Detailed Benchmark Results

### SqrtPriceX96 to Price Conversion

| Function | Operations/sec | Time/op | Memory/op | Allocs/op |
|----------|----------------|---------|-----------|-----------|
| Original (`SqrtPriceX96ToPrice`) | 908,228 | 1406 ns/op | 472 B/op | 9 allocs/op |
| Cached (`SqrtPriceX96ToPriceCached`) | 1,240,842 | 1060 ns/op | 368 B/op | 6 allocs/op |
| Optimized (`SqrtPriceX96ToPriceOptimized`) | 910,021 | 1379 ns/op | 520 B/op | 10 allocs/op |

**Improvement with Caching:**

- Performance: 24.6% faster
- Memory: 22.0% reduction
- Allocations: 33.3% reduction

### Price to SqrtPriceX96 Conversion

| Function | Operations/sec | Time/op | Memory/op | Allocs/op |
|----------|----------------|---------|-----------|-----------|
| Original (`PriceToSqrtPriceX96`) | 827,798 | 1324 ns/op | 480 B/op | 13 allocs/op |
| Cached (`PriceToSqrtPriceX96Cached`) | 973,719 | 1072 ns/op | 376 B/op | 10 allocs/op |
| Optimized (`PriceToSqrtPriceX96Optimized`) | 763,767 | 1695 ns/op | 496 B/op | 14 allocs/op |

**Improvement with Caching:**

- Performance: 19.0% faster
- Memory: 21.7% reduction
- Allocations: 23.1% reduction

### Tick to SqrtPriceX96 Conversion

| Function | Operations/sec | Time/op | Memory/op | Allocs/op |
|----------|----------------|---------|-----------|-----------|
| Original (`TickToSqrtPriceX96`) | 1,173,708 | 1018 ns/op | 288 B/op | 8 allocs/op |
| Optimized (`TickToSqrtPriceX96Optimized`) | 1,000,000 | 1079 ns/op | 320 B/op | 9 allocs/op |

For this conversion the optimized variant is slightly slower and allocates more than the original.

### SqrtPriceX96 to Tick Conversion

| Function | Operations/sec | Time/op | Memory/op | Allocs/op |
|----------|----------------|---------|-----------|-----------|
| Original (`SqrtPriceX96ToTick`) | 719,307 | 1628 ns/op | 440 B/op | 9 allocs/op |
| Optimized (`GetTickAtSqrtPrice`) | 707,721 | 1654 ns/op | 504 B/op | 11 allocs/op |

### Tick Calculation Helpers

| Function | Operations/sec | Time/op | Memory/op | Allocs/op |
|----------|----------------|---------|-----------|-----------|
| `GetNextTick` | 1,000,000,000+ | 0.4047 ns/op | 0 B/op | 0 allocs/op |
| `GetPreviousTick` | 1,000,000,000+ | 0.4728 ns/op | 0 B/op | 0 allocs/op |

## Performance Analysis

### Key Performance Insights

1. **Caching Constants is Highly Effective**: The most significant performance improvements came from caching the expensive constants (2^96 and 2^192) rather than recomputing them on each function call.


2. **Memory Allocations are a Bottleneck**: Functions with fewer memory allocations consistently perform better. The cached versions reduced allocations by 23-33%, which directly correlates with the improved timings.


3. **uint256 Optimization Results are Mixed**: While uint256 operations can be faster for certain calculations, our uint256 variants showed mixed results: `SqrtPriceX96ToPriceOptimized` was roughly on par with the original (slightly faster per op, but with more allocations), and `PriceToSqrtPriceX96Optimized` actually performed worse.

4. **Simple Functions are Extremely Fast**: Helpers such as `GetNextTick` and `GetPreviousTick`, which perform plain integer arithmetic, run in under a nanosecond with zero allocations.

### Bottleneck Identification
|
|
|
|
Profiling revealed that the primary performance bottlenecks are:
|
|
|
|
1. **Memory Allocation**: Creating new big.Float and big.Int objects for calculations
|
|
2. **Constant Computation**: Repeatedly calculating 2^96 and 2^192
|
|
3. **Type Conversions**: Converting between different numeric types (big.Int, big.Float, uint256)
|
|
|
|
## Optimization Impact on MEV Bot

These optimizations have a significant impact on the MEV bot's performance:

1. **Higher Throughput**: With 12-24% faster pricing calculations, the bot can process more arbitrage opportunities per second.

2. **Lower Latency**: Reduced execution time for critical-path calculations means faster decision-making.

3. **Reduced Resource Usage**: Fewer memory allocations mean less pressure on the garbage collector, resulting in more consistent performance.

4. **Scalability**: The optimizations make it more feasible to run the bot on less powerful hardware or to run multiple instances simultaneously.

## Recommendations

1. **Continue Using Cached Versions**: The cached versions of `SqrtPriceX96ToPrice` and `PriceToSqrtPriceX96` should be used in production, as they provide consistent performance improvements.

2. **Re-evaluate the uint256 Approach**: The mixed results with uint256 optimizations suggest that more careful analysis is needed. Profile specific use cases to determine when uint256 actually provides benefits.

3. **Monitor Performance in Production**: Continue monitoring performance metrics in production to identify any new bottlenecks that emerge under real-world conditions.

4. **Consider Lookup Tables**: For frequently used values, pre-computed lookup tables could provide additional performance improvements.

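Recommendation 4 might look like the following memoized-table sketch for tick-to-sqrt-price conversion. All names are hypothetical, the float64 math is only an approximation of the real fixed-point calculation, and a production version would pre-populate the hot ticks and guard the map with a mutex (this sketch is single-goroutine only):

```go
package main

import (
	"fmt"
	"math"
	"math/big"
)

// sqrtRatioCache memoizes previously computed ticks.
var sqrtRatioCache = map[int]*big.Int{}

// computeSqrtRatio is a float64 approximation of the full tick math,
// sqrtPriceX96 = sqrt(1.0001^tick) * 2^96, standing in for TickToSqrtPriceX96.
func computeSqrtRatio(tick int) *big.Int {
	r := new(big.Float).SetFloat64(math.Sqrt(math.Pow(1.0001, float64(tick))))
	q96 := new(big.Float).SetInt(new(big.Int).Lsh(big.NewInt(1), 96))
	out, _ := new(big.Float).Mul(r, q96).Int(nil)
	return out
}

// tickToSqrtPriceX96Cached answers repeated ticks from the table and falls
// back to the full calculation on a miss.
func tickToSqrtPriceX96Cached(tick int) *big.Int {
	if v, ok := sqrtRatioCache[tick]; ok {
		return v
	}
	v := computeSqrtRatio(tick)
	sqrtRatioCache[tick] = v
	return v
}

func main() {
	fmt.Println(tickToSqrtPriceX96Cached(0)) // tick 0 encodes price 1.0, i.e. 2^96
}
```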
## Future Work

1. **Profile Real-World Usage**: Conduct profiling of the bot under actual arbitrage-detection workloads to identify additional optimization opportunities.

2. **Explore Approximation Algorithms**: For less precision-sensitive calculations, faster approximation algorithms could be considered.

3. **Investigate Assembly Optimizations**: For critical paths, hand-optimized assembly implementations could provide further gains.

4. **Expand Benchmark Suite**: Add more comprehensive benchmarks that cover edge cases and a wider range of input values.