Move the content of the loop over scales in a separate function, so it can be called with fresh arguments every time to check parallelism and dependencies.