0.1ms sounds about right for copying 1MB over a bus that's roughly 16GB/s, so I'd be inclined to believe that number. It should scale approximately linearly.
You have to bear in mind that the CPU timer isn't just timing how long it takes the CPU to do useful work, but how long it takes the GPU to catch up and do all its outstanding work. By calling Map you've required the GPU to catch up and execute all the work in its queue, do the copy and signal to the CPU that it's done. The more work the GPU has to run prior to the call to "CopyResource", the longer the CPU has to sit there and wait for it to complete. For that reason, I wouldn't expect the CPU timer to ever record a very low value in the region of 0.1ms no matter how small the copy is.