WOW! Those differences are crazy!
Please - I know benchmarking takes a lot of time - but could you check
something: those filesystems differ a lot in when they flush data from
the cache to disk. Have you made sure that you measured the time the
copy really needs? I mean the difference between:
$ sync; time cp source dest
$ sync; time (cp source dest; sync)
Only the second one measures somewhat correctly, because it also waits
until the cached data has actually been written to disk.
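
If it helps, here is a rough sketch of how the two measurements could be
run back to back. It assumes bash (for the `time` keyword), and the
function name bench_copy is just something I made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper (not from any tool): time the same copy twice,
# once without and once with the final flush to disk.
bench_copy() {
    local src=$1 dest=$2

    # Start from a clean state so earlier dirty pages do not skew the result.
    sync
    echo "cp only (may just fill the cache):"
    time cp "$src" "$dest"

    # Remove the copy and flush again before the second run.
    rm -f "$dest"
    sync
    echo "cp + sync (includes writing the data out):"
    time (cp "$src" "$dest"; sync)
}
```

Call it as `bench_copy source dest`. The timings are printed on stderr,
and the gap between the two shows how much work was still sitting in the
cache when cp returned.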