
Suitable simulation hardware



Stefan Kroll
June 18, 2013, 06:50 AM
Has anybody had a deeper look into the different hardware components of a system and their influence on simulation performance? I think this would be really interesting, since most of us have a somewhat limited budget for such things and thus have to optimize their spending a bit. (otherwise, maybe we could just settle for this?: http://top500.org/blog/lists/2013/06/press-release/ :-) )
Some examples:

we found that using SSD drives can improve simulation performance quite significantly, but just picking the ‘best’ drive available is simply not possible (would cost us about 8000(!) €/800 GB right now)
‘traditionally’ we use XEON processors for our simulation hardware; we have no real experience with core i7 or AMD CPUs. Is ECC really necessary for Infoworks simulations? Are there any other differences?
Will CPUs that are doing well in benchmarks (eg: http://www.cpubenchmark.net/high_end_cpus.html) be the most desirable solution for Infoworks? Can Infoworks (ICM 2D apart) really make efficient use of, say a twin Xeon or an Opteron 16 core? For some 1D simulations we see no improvement when stepping up from 3 to 4 cores…
For combined 1D+2D simulations in ICM: should we rather focus on a heavy CPU and settle for a low cost cuda GPU, or get a high end GPU and add whatever CPU the remaining budget would still allow?


So my question would be: Given a certain budget, how do you choose your simulation hardware?

Kristian Ravnkilde
June 20, 2013, 12:23 AM
Ah, loads of petaflops! That's what we need! But I'm finding sims run well on my (employer's) Core i7 based machine. The 16GB RAM makes a big difference over my previous one. I have a compatible GPU, but I don't find it makes the most dramatic difference, depending on the model - if the 1D part is large or difficult, the 2D part won't be the critical element. I'd go for CPU and RAM first - you can add a graphics card later when there's spare cash lying around :(.

Stefan Kroll
June 20, 2013, 02:05 AM
Thank you for your input, Kristian! Could you post the details of your i7? I'm still trying to figure out how many cores (the 1D part of) a model can actually efficiently use...

Kristian Ravnkilde
June 20, 2013, 02:46 AM
It's a Core i7-2600 processor running at 3.40 GHz. It has 4 cores, each running 2 threads, so you get 8 threads in total (a bit like having two processors). IWCS and ICM can access all of them. IWCS will hog them all, but with ICM you can choose how many to allow it to access, leaving some processor time for other applications.
You do need lots of RAM to work effectively. I have 16 GB, and have not managed to fill it up so far!
Works for me, but other configurations may work as well.

David Garcia
June 25, 2013, 06:44 AM
We've done a bit of benchmarking here with our 1D CS models and found that the sweet spot is 4 core Intel processors with 8-16 GB of RAM. We can throw more cores and more RAM at it, but eventually you end up with diminishing returns. The marginal gains from each extra core or gig of RAM just get smaller and smaller. We have found that hard drive speeds don't really affect model simulation times, but have a huge impact on loading results.

When I spec model computers I use the following guidelines:
1. Intel Core i7. Always the most recent model (4th gen just came out) and the fastest that the company can afford at the time. (This is the single most important factor of 1D simulation time)
2. 16 GB of RAM. CS doesn't use very much, but you'll want plenty of memory available so that you can do multiple things at once. It is extremely common to have CS, ArcGIS, Excel, and sometimes AutoCAD all open at the same time.
3. SSD boot drive + 2 to 4 TB Western Digital Black for working folder. This provides a fast SSD for booting and opening programs, and a large, fast storage drive for saving results. With a multi-terabyte HD, you don't have to worry about having to delete old simulation results to make room for new ones. Just store everything in case it is ever needed again.

I don't have any experience with 2D simulations, but from what I understand you just need a decent nVidia video card for this.

Kristian Ravnkilde
June 25, 2013, 06:55 AM
That sounds excellent. Maybe they will get me an SSD drive next time.
And yes, any good video card will do for general use. But to take advantage of the ability of ICM to use the GPU to speed up 2D sims, you need one of a specific range, see http://blog.innovyze.com/2012/09/18/which-gpu-card-is-best-for-infoworks-icm/.
See also the Innovyze blog http://blog.innovyze.com/2012/12/14/runtime-comparisons-for-2d-models-in-infoworks-icm/
I'd spend money on the CPU and RAM in the first instance, and upgrade the video card when there's spare cash and/or a real need! It only makes a great difference if you have a lot of 2D flow going on, as there's a bit of an overhead in transferring data to the GPU from the CPU. The GPU can't process the 1D part of the model at all.

Robert Dickinson
June 25, 2013, 06:58 AM
Good benchmarking notes, David, especially the value of having a fast hard drive for loading models and model results. In InfoSWMM we also find that having two cores generally speeds up the simulation by 50 percent, but the third and fourth cores each add much less than 50 percent. Eventually, as you get to 8 or 16 cores, the overhead of coordinating multiple cores actually slows the simulation down compared to using fewer cores. It is important to be able to define how many cores are allocated to the simulation; I typically like to use 3 out of 4 cores, leaving one core for my other programs.
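
As a rough illustration of why the later cores add so little, the figures above can be put through Amdahl's law. This is only a sketch under stated assumptions: it reads "speeds up by 50 percent" as a 1.5x speedup on two cores, and it assumes the simulation splits cleanly into a serial part and a perfectly parallel part. Neither is confirmed for InfoSWMM or InfoWorks, and the function names are made up for this example.

```python
def parallel_fraction(speedup: float, cores: int) -> float:
    """Invert Amdahl's law: the fraction of work that must run in parallel
    to produce the observed speedup on the given number of cores."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


def amdahl_speedup(p: float, cores: int) -> float:
    """Ideal speedup for parallel fraction p, ignoring coordination overhead."""
    return 1.0 / ((1.0 - p) + p / cores)


# Assumption: 2 cores give a 1.5x speedup (one reading of "50 percent").
p = parallel_fraction(speedup=1.5, cores=2)

for n in (1, 2, 3, 4, 8, 16):
    print(f"{n:2d} cores: {amdahl_speedup(p, n):.2f}x")
```

Under these assumptions about two thirds of the work is parallel, so four cores give roughly 2x and no number of cores can exceed 3x. Amdahl's law on its own never predicts a slowdown; the slowdown reported at 8 or 16 cores comes from coordination overhead, which this sketch leaves out.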

Stefan Kroll
June 26, 2013, 09:23 AM
Hi all, sorry for the delay. This is all very interesting input!

Especially interesting is the confirmation of David and Robert: in 1D, it seems there’s really no point in going beyond 4 (physical) cores. I had hoped this would not be the case (somehow you always hope there’s just something wrong with the models you tested...) since it means we won’t see a big speedup for 1D models in the future. :-/

Do you really see no impact of disk speed on simulation speed? On 2 otherwise identical systems I used RAID 1 or RAID 0 for a model that creates (and saves) quite a lot of results. RAID 0 results in consistently shorter simulation times, by about 25 %. That’s almost as much as the difference between 8 logical cores (using hyperthreading) and 4 (no hyperthreading), which is 31 %. It seems that the disk usage during simulation is higher than during the final scan of results (since this is a single-threaded process). So I guess you keep your local root on the SSD and ‘upload’ the results to the multi-TB drive then?

Over the last few days we ran some tests in preparation for the use of 2D models, with a pure 2D model that Andrew Walker was kind enough to lend us. Conclusion: for pure 2D models, the fastest PC we have at the moment is a laptop, simply because it comes with an NVIDIA Quadro 3000M (not exactly the quickest card around). A Tesla C2075 though would be really nice as we could tuck it away in a box in our simulation room and use it remotely – which apparently won’t work with GeForce cards.

From all this, I’d so far retain the following ranking (for mostly 1D simulations with occasional 2D):


CPU with focus on high single core performance
SSD
16+ GB RAM
Remaining money goes into a CUDA-compatible card


Agreed? :-)

Andrew Walker
June 28, 2013, 12:23 PM
I managed to get my hands on a Dell server with 4 Xeon processors, each with 10 cores (so, that's a total of 40 cores in all). This machine was used to do a series of runs with a reasonably large 1D only model. The runs were done in ICM, but the results would be the same in CS as the model was simply migrated from CS to ICM and run without any changes to the network.

As you can see from the image below, I also experienced the big jump in performance between 2 and 4 cores, but after that things tailed off quite quickly. In fact, if you look closely, you'll see that from about 28 cores and upwards, the simulation started taking longer! Why? Well, it is at this point that the task of sub-dividing the bits of the simulation amongst all the cores takes more time than the gain you get from using the extra cores.

http://i643.photobucket.com/albums/uu153/atsw26/Innovyze/Multi-ProcessorSupport3_zpsc62bce65.jpg
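
The turnaround described above can be sketched with a toy runtime model in which every extra core adds a small fixed coordination cost. This is not how ICM actually schedules its work, and the parameter values are hypothetical: they were picked so that the minimum lands at 28 cores purely for illustration, not fitted to the measurements in the image.

```python
def runtime(cores: int, serial: float = 20.0, parallel: float = 78.4,
            overhead_per_core: float = 0.1) -> float:
    """Toy model: serial work + evenly divided parallel work
    + a coordination cost that grows with the number of cores.
    All values are in arbitrary time units and are hypothetical."""
    return serial + parallel / cores + overhead_per_core * cores


def best_core_count(max_cores: int) -> int:
    """Core count with the shortest modelled runtime, searched by brute force."""
    return min(range(1, max_cores + 1), key=runtime)


for n in (1, 2, 4, 8, 16, 28, 40):
    print(f"{n:2d} cores: {runtime(n):6.2f} time units")

print("fastest at", best_core_count(40), "cores")
```

In this model the gain from dividing the parallel work shrinks with every core while the overhead keeps growing linearly, so beyond the crossover each additional core makes the run slower rather than faster.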

Stefan Kroll
July 1, 2013, 04:33 AM
Nice! I want one too!! :-)

Did you do these tests with or without hyperthreading? Could you also share some specs of this beautiful piece of hardware? Would you say that the point of increased inefficiency is strongly dependent on the structure and size of the modelled network, or would you expect it always to be around 20 to 30 cores?

Andrew Walker
July 1, 2013, 05:26 AM
Hi Stefan,

The Server was a Xeon based system running at 2.4 GHz. I don't seem to have a note of the exact model number, but it had 32GB of memory and was running Windows Server 2008 R2.

As a general rule, on the models we tested, the graph always flattened out once you were in the 20-30 cores area.

Appreciating where the 'drop' happens is always good to know of course, as one of the features of InfoWorks ICM is that you can limit a particular simulation to a certain number of cores. This means that you can optimise runs on a powerful machine to ensure each has just the right amount of resources. For example, on the machine I was using we experimented with limiting the cores/threads available to an individual simulation. We found that 8 threads seemed to work well, which allowed us to run 4 concurrent simulations and still leave enough resource for the Server to manage regular background Windows tasks.

http://i643.photobucket.com/albums/uu153/atsw26/Innovyze/MaxThreads_zpsc72f503a.jpg
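
The arithmetic behind that set-up can be sketched as below. It assumes the 40 cores count as 40 usable threads (whether hyperthreading was enabled is not stated), and the number of threads held back for Windows is a hypothetical figure, not one taken from the server. The function name is made up for this example.

```python
def concurrent_simulations(total_threads: int, threads_per_sim: int,
                           reserved_threads: int = 0) -> int:
    """How many simulations fit at once if each is capped at
    threads_per_sim and some threads are held back for the OS."""
    if threads_per_sim < 1:
        raise ValueError("threads_per_sim must be at least 1")
    usable = max(total_threads - reserved_threads, 0)
    return usable // threads_per_sim


# Assumed figures: 40 threads in total, 8 per simulation,
# 4 kept free for background Windows tasks (hypothetical).
print(concurrent_simulations(40, 8, reserved_threads=4))
```

With nothing reserved, five 8-thread simulations would fit on 40 threads. Holding a few threads back gives the four concurrent runs mentioned above.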

Andrew

David Garcia
July 1, 2013, 05:31 AM
Does that type of set up require 4 licences? One for each simultaneous simulation?

Andrew Walker
July 1, 2013, 05:43 AM
No! For the example above you only need 1 seat/licence. The ICM licence provides authority for a particular PC or Server to run simulations. If the machine is a powerful server then you can make it run concurrent simulations (as above). If you wanted to run four simulations on four different machines at the same time, then you would need four licences/seats.

By comparison, products like InfoWorks CS or InfoSWMM can only perform one simulation at a time with a single licence.

Andrew

Robert Dickinson
July 1, 2013, 05:47 AM
These are really good tips, Andrew.

I did want to mention that we also have the ability to select the number of cores used during the simulation for InfoSWMM and H2OMAP SWMM. We limit the number of cores to 8. We also allow the user to save only a Domain or Selection Set to make the output file smaller for those models run in continuous simulation, which may generate gigabytes of binary output files if the whole network is saved.
