DirectCompute limits what the GPU can do. Even when the GPU is physically able to write data into several UAV (unordered access view) targets, Direct3D11 does not allow it. As a result, we can't use DX10-level cards for physics simulation via the Direct3D11 API.
This is a small test application that runs a simple particle physics simulation. There are 40625 particles on a static mesh consisting of 1392 triangles.
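The code of the test application is not shown here, so the following is only a minimal CPU-side sketch of what one step of such a simulation might look like. It assumes explicit Euler integration, a brute-force test of every particle against every triangle, and a simple bounce with energy loss. All names (`Vec3`, `Triangle`, `Particle`, `segmentHitsTriangle`, `stepParticles`, `restitution`) are hypothetical and not taken from the application.

```cpp
#include <vector>

// Hypothetical types for illustration; not the application's actual data layout.
struct Vec3 { float x, y, z; };

static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

struct Triangle { Vec3 a, b, c; };
struct Particle { Vec3 position, velocity; };

// Moller-Trumbore test: does the segment p0 -> p1 cross the triangle?
// On a hit, 'normal' receives the (unnormalized) triangle normal.
static bool segmentHitsTriangle(Vec3 p0, Vec3 p1, const Triangle& t, Vec3& normal) {
    const float eps = 1e-7f;
    Vec3 dir = p1 - p0;
    Vec3 e1 = t.b - t.a;
    Vec3 e2 = t.c - t.a;
    Vec3 pv = cross(dir, e2);
    float det = dot(e1, pv);
    if (det > -eps && det < eps) return false;   // segment parallel to the triangle
    float inv = 1.0f / det;
    Vec3 tv = p0 - t.a;
    float u = dot(tv, pv) * inv;                 // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 qv = cross(tv, e1);
    float v = dot(dir, qv) * inv;                // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    float s = dot(e2, qv) * inv;                 // position along the segment
    if (s < 0.0f || s > 1.0f) return false;
    normal = cross(e1, e2);
    return true;
}

// Advances every particle by one time step (assumed scheme: explicit Euler).
void stepParticles(std::vector<Particle>& particles,
                   const std::vector<Triangle>& mesh,
                   float dt, Vec3 gravity, float restitution) {
    for (Particle& p : particles) {
        p.velocity = p.velocity + gravity * dt;
        Vec3 next = p.position + p.velocity * dt;
        bool hit = false;
        for (const Triangle& t : mesh) {
            Vec3 n;
            if (segmentHitsTriangle(p.position, next, t, n)) {
                // Reflect the velocity about the triangle plane and damp it.
                float k = 2.0f * dot(p.velocity, n) / dot(n, n);
                p.velocity = (p.velocity - n * k) * restitution;
                hit = true;
                break;
            }
        }
        if (!hit) p.position = next;  // on a hit the particle keeps its position this step
    }
}
```

In a CUDA, OpenCL, or DirectCompute version, the outer loop would presumably map to one thread per particle, with each thread writing back both the position and the velocity of its particle.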
GTX260 CUDA, Windows 7: 11.5 ms per frame
GTX260 CUDA, Linux: 13.3 ms per frame
GTX260 OpenCL, Windows 7: 13.9 ms per frame
GTX260 OpenCL, Linux: 15.6 ms per frame
HD5850 Direct3D11, Windows 7: 15.6 ms per frame