Issues to solvers, multi core, stress parameter
I would be most interested to learn how Z88 performs on some of the recently released hardware, such as the Nvidia Pascal series cards with very many (> 2500) CUDA cores, and the Intel Core i7 processors with 8 and 10 cores.
Are there any licensing issues which preclude running any Z88Aurora solvers on AMD CPUs, such as the soon to be released Ryzen 8-core processor? Is the PARDISO solver or other Z88 software restricted to Intel processors by an Intel MKL license?
More generally, are there guidelines for building a computer to run Z88Aurora well?
There are no hardware guidelines for Z88Aurora, as there are no limitations to specific hardware. You can run it on any x86(-64) hardware, such as Intel and AMD platforms.
We personally use Intel CPUs ranging from Core Duo to recent Core i5/i7 or Xeon CPUs (Haswell generation). We do not expect any problems with Skylake or Kaby Lake. On my private machine Z88Aurora runs fine with an i7-6700K (Skylake).
In the past we had an issue with a user running an AMD APU, but it's unclear whether the APU or other software/drivers caused the problem. Theoretically, AMD CPUs should work fine for everything. Problems could occur because of the Intel MKL library we are using: Intel does not say it is limited to Intel systems and many AMD systems do work as expected, but there's no guarantee. As far as I can remember, that was the only support case which could be related to AMD hardware...
Regarding CUDA: we have a Tesla K40 running fine - and very fast - with our Z88Aurora GPU addon (http://en.z88.de/z88aurora-gpu-addon), but the direct solver PARDISO still outperforms it (with a good CPU) on some projects. The iterative solvers, which are used on either CPU or GPU because of their low memory usage, are slower than the RAM-eating direct solver PARDISO. For large projects where the RAM is insufficient for PARDISO and you would have to use an iterative solver on your CPU, the CUDA solver can speed up the calculation 2-4 times. The limit then is the VRAM on your graphics card, so be sure to buy one with at least 6 GB. Gamer cards should be fine as well; the speed advantage of the Quadros or Teslas does not always make up for the higher price. Look for high memory bandwidth, many CUDA cores and enough VRAM.
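The rule of thumb above can be written down as a small decision helper. This is only a sketch: the function `choose_solver` and its parameters are made up for illustration and are not part of Z88Aurora, and the memory requirements are values you would have to estimate for your own project.

```python
def choose_solver(direct_need_gb, iterative_need_gb, ram_gb, vram_gb=0.0):
    """Pick a solver type following the rule of thumb from the post.

    All names and inputs are hypothetical; nothing here is a Z88Aurora API.
    direct_need_gb    -- estimated RAM the direct solver (PARDISO) would need
    iterative_need_gb -- estimated memory an iterative solver would need
    ram_gb            -- installed system RAM
    vram_gb           -- VRAM of a CUDA-capable card (0 if there is none)
    """
    # The direct solver is usually the fastest option if the model fits in RAM.
    if direct_need_gb <= ram_gb:
        return "direct (PARDISO, CPU)"
    # Otherwise an iterative solver is needed; the GPU variant is limited by VRAM.
    if iterative_need_gb <= vram_gb:
        return "iterative (CUDA, GPU)"
    # Fall back to the iterative solver on the CPU if it fits in RAM.
    if iterative_need_gb <= ram_gb:
        return "iterative (CPU)"
    return "model too large for this machine"


# Example: a model too big for PARDISO in 16 GB RAM, with a 6 GB card installed.
print(choose_solver(direct_need_gb=40, iterative_need_gb=4, ram_gb=16, vram_gb=6))
```

The order of the checks mirrors the post: PARDISO first if RAM allows, then the GPU solver, then the CPU iterative solver as the last resort.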
For the new Ryzen: as our comparisons have shown, the number of cores does not help much once it exceeds 6-8 cores (HyperThreading is also useless with our PARDISO solver). Much more important is the single-core speed, so for example a 4C CPU at 4.8 GHz will probably be faster than an 8C CPU at 3.8 GHz. This is due to the overhead generated by splitting the calculation across more cores and to the parts of the program which cannot be parallelized. If you are interested in the comparisons, I can upload them tomorrow when I'm at work.
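The effect of the non-parallelizable parts can be illustrated with Amdahl's law. The sketch below is a simplified model, not a measurement: the parallel fractions are assumed values chosen for illustration (not taken from PARDISO), runtime is assumed to scale inversely with clock speed, and the splitting overhead mentioned above is ignored.

```python
def relative_time(parallel_fraction, cores, clock_ghz):
    """Amdahl-style runtime estimate in arbitrary units (lower is faster).

    parallel_fraction is the share of the work that can run in parallel;
    the remaining share runs on a single core.
    """
    serial = 1.0 - parallel_fraction
    return (serial + parallel_fraction / cores) / clock_ghz


# Compare the two CPUs from the post for a few assumed parallel fractions.
for p in (0.5, 0.7, 0.9):
    t_4c = relative_time(p, cores=4, clock_ghz=4.8)
    t_8c = relative_time(p, cores=8, clock_ghz=3.8)
    faster = "4C @ 4.8 GHz" if t_4c < t_8c else "8C @ 3.8 GHz"
    print(f"p = {p:.1f}: 4C = {t_4c:.4f}, 8C = {t_8c:.4f} -> {faster} is faster")
```

In this model the 4C CPU wins as long as less than roughly 74 % of the work can be parallelized; any real overhead from distributing the work shifts that threshold further in favor of the faster cores.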
I attached a diagram of our PARDISO-comparison with different CPUs and HDD/SSD.
Unfortunately it's in German, so here are translations of the labels:
Parallelisierung des FE-Gleichungslösers PARDISO = parallelization of the FE equation solver PARDISO
relative Rechenzeit in % = relative calculation time in %
CPU-Kerne = CPU cores
P.S. The CPUs used were: Xeon E3-1271 v3 (4C/8T), Xeon E5-1650 (6C/12T) and 2x Xeon E5-2687W (8C/16T).