Commit aea36fd9 authored by Antonio Ragagnin

Edit test_gpu_treebuild_on_leonardo.md

parent e040f6a4
Pipeline #25720 passed
......@@ -4,6 +4,8 @@ The `hotwheels` tree build can run either in serial or in parallel. The parallel
Shown above is the scaling test for inserting `1e7` particles into the tree. The algorithm scales very well with OpenMP threads. The GPU code performs comparably to the CPU code running with 4-8 threads, so the GPU setup is suggested when the number of cores per MPI rank is limited. For OpenMP-dominated runs, on the other hand, the CPU tree build scales much better than the GPU one.
Note that allowing multiple particles per leaf reduces the tree build time (fewer nodes to create and less recursion). This choice is strongly encouraged, as it also has the great advantage of reducing the tree walk time (work in progress).
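To illustrate why a larger leaf capacity means fewer nodes and less recursion, here is a minimal sketch in Python. It is not the `hotwheels` implementation: it uses a toy 1D binary tree over evenly spaced points, and the names `count_nodes` and `leaf_capacity` are hypothetical, chosen only for this example.

```python
def count_nodes(points, lo, hi, leaf_capacity, depth=0, max_depth=64):
    """Count the nodes of a toy 1D binary tree built over `points` in [lo, hi).

    A cell becomes a leaf once it holds at most `leaf_capacity` points;
    otherwise it is split at its midpoint and both halves are visited.
    `max_depth` guards against endless splitting of coincident points.
    """
    if len(points) <= leaf_capacity or depth >= max_depth:
        return 1  # leaf node
    mid = 0.5 * (lo + hi)
    left = [p for p in points if p < mid]
    right = [p for p in points if p >= mid]
    return (
        1  # this internal node
        + count_nodes(left, lo, mid, leaf_capacity, depth + 1, max_depth)
        + count_nodes(right, mid, hi, leaf_capacity, depth + 1, max_depth)
    )


# 1024 evenly spaced points in [0, 1): an assumption made for reproducibility.
points = [(i + 0.5) / 1024 for i in range(1024)]

for cap in (1, 8):
    print(cap, count_nodes(points, 0.0, 1.0, cap))
# prints:
# 1 2047
# 8 255
```

In this toy setup, going from one to eight particles per leaf cuts the node count from 2047 to 255 and the recursion depth from 10 to 7. The actual savings in `hotwheels` depend on the particle distribution and on the tree type.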
Improvements are in progress, so these results may change in the future. Below is a job script for running the scaling test on the Leonardo BOOSTER machine.
```bash
......