Saturday, May 29, 2010
Intel SSE performance issue
The particle system's render buffer generation has been heavily refactored to improve performance. A strange performance issue turned up during this process. You can see two versions of the same code below. On an AMD CPU there is no performance difference between the first and second rendering code fragments, but on an Intel Core i5 the difference is huge: the first version generates only 10M particles per second, while the second reaches 60M particles per second!
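For anyone who wants to reproduce a particles-per-second figure like the ones above, here is a minimal timing sketch. It is not the code from this post: the `Vertex` layout, `fill_buffer`, and `particles_per_second` are hypothetical stand-ins, and it assumes a C++11 compiler for `<chrono>`, which Visual C++ 2008 does not provide.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout; the real render buffer format is not shown in the post.
struct Vertex { float x, y, z, size; };

// Hypothetical stand-in for one variant of the buffer generation routine.
void fill_buffer(std::vector<Vertex>& out) {
    for (std::size_t i = 0; i < out.size(); ++i) {
        float f = static_cast<float>(i);
        out[i] = Vertex{f, f * 0.5f, f * 0.25f, 1.0f};
    }
}

// Calls `fill` on `buf` `repeats` times and returns particles written per second.
template <typename Fill>
double particles_per_second(Fill fill, std::vector<Vertex>& buf, int repeats) {
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        fill(buf);
    }
    auto end = std::chrono::steady_clock::now();
    double seconds = std::chrono::duration<double>(end - start).count();
    return static_cast<double>(buf.size()) * repeats / seconds;
}
```

Passing each variant of the fill routine to `particles_per_second` with the same buffer size and repeat count gives numbers that can be compared directly between an AMD and an Intel machine.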
The compiler is "Visual C++ 2008 Express Edition". It looks like L1 cache misses are occurring... I will try to post the disassembly later.
What is the status on a modern PC?