Peano
Loading...
Searching...
No Matches
UpdateParticle_MultiLevelInteraction_StackOfLists_ContiguousParticles.py
Go to the documentation of this file.
1# This file is part of the Peano project. For conditions of distribution and
2# use, please see the copyright notice at www.peano-framework.org
3from peano4.solversteps.ActionSet import ActionSet
4
5import jinja2
6
8
10
11
13 """!
14
15 Tree walker to realise particle-particle interactions with particles held by different adjacent cells or on different mesh levels
16
17 This code snippet creates a tree walker, i.e. an action set that runs
18 through the underlying spacetree top down and builds up
19 @ref page_toolbox_particles_mesh_traversal "the active set and the local set".
20 These sets can then be used to trigger mesh events.
21 It works if and only if you use continuous storage per vertex as introduced
22 with the @ref toolbox_particles_memorypool "memory pools".
23
24 Within touchCellFirstTime(), the routine exposes the following data
25 besides the generic information of any action set:
26
27 - ***marker*** The CellMarker is avaialable to a compute kernel and allows you
28 to query spatial information. This one is actually one of the generic objects
29 which every action set has available, but we highlight that it is there, as
30 the @ref page_toolbox_particles_mesh_traversal "discussion on uniqueness for particle-particle interactions"
31 explicitly refers to the marker.
32 - ***activeParticles*** The list of active particles is a list of pointers. It
33 follows the @ref page_toolbox_particles_mesh_storage "generic definition of active sets".
34 - ***localParticles*** This is a list of local particles, i.e. particles which
35 are tied to the cell's adjacent vertices. While they reside on the cell's
36 level, we have no guarantee that they reside within the current cell. They
37 might also be contained within its halo of marker.h()/2.
38
39 The additional field _activeParticles is an alias for activeParticles.
40
41
42 The code is slightly more sophisticated compared than its cousin, the
43 UpdateParticle_MultiLevelInteraction_Sets, as it exposes more information:
44
45
46 - ***numberOfActiveParticlesPerVertex*** This alias for
47 _numberOfActiveParticlesPerVertex is a sequence of integers
48 which tells you how many particles within activeParticles are added by one
49 particular vertex.
50 - ***numberOfCoalescedLocalParticlesPerVertex*** is a sequence of integers which
51 tells you how many particles within localParticles are
52 added by each adjacent vertex.
53
54 Both pieces of information are important if you work with memory pools
55 as discussed below.
56
57
58
59 ## Different particles species
60
61 Both active and local set are of the same type if you stick
62 to the default value of None for active_particle_set. But you can also
63 let particles of different type interact by selecting a local particle
64 set that is different to the active one. Users can inject code to modify
65 the local set per cell and hence to alter particle states.
66
67
68 ## Realisation and data consistency
69
70 This tree walker realises the two sets via two lists: Per cell, we
71 append the local set to the active set and memorise on a stack how
72 many of these particles have been added. When we backtrace, i.e.
73 ascend within the tree again, this memorised size is used to remove
74 the last n elements from the active set.
75
76 So the active set grows while we descend in the tree, and it is
77 truncated when we ascend again.
78
79 This is a very efficient implementation yet fails if particles move
80 up and down within the tree while we walk. It should work perfectly
81 fine if particles do not move. If particles move up and down, the
82 present algorithms might have redundant pointers within the active set.
83 It is therefore absolutely key that this action set is not used in
84 combination with any other action set which lifts data, as a lift
85 destroys contributions made by coarser mesh cells. If particles drop,
86 we also cannot use this action set, as, again, it might corrupt the
87 active sets constructed by coarser meshes. The issue with the lifts
88 notably becomes apparent once you take into account that lifts trigger
89 a scatter on gathered vertices. That is, we descend within the tree and
90 build up our continuous index spaces. Then, later, we decide that we
91 lift one particle. This lift means that particles associated with a
92 coarser vertex are scattered. However, our active set assumes that
93 we had a valid, continous memory for them. So all of our active sets
94 are now invalid. Sieves in contrast work, as long as their action
95 set is used before the current one. They merely add data to the
96 active set that's constructed next.
97
98 - No lifts.
99 - No drops.
100 - Sieves are fine as long as they happen before touchVertexFirstTime()
101 of this action set.
102 - Lifts into a global sieve list are fine, as long as they happen in
103 touchVertexLastTime() called after this action set.
104
105 In the case of a dynamic setup, i.e. in combinatino with the action
106 set UpdateParticleGridAssociation, you might be better of with using
107 the set-based implementation of the particle-particle interaction.
108
109
110
111 ## Further plug-in points
112
113 Besides the cell kernel which is there to realise particle-to-particle
114 interactions, we also have a vertex kernel which we call whenever a
115 vertex is loaded for the first time. That is, you can assume that the
116 vertex kernel has been launched for all @f$ 2^d @f$ vertices of a cell before
117 its cell kernel is triggered. The vertex kernel is also passed on the
118 active set (from the coarser level). Its local set is all the particles
119 whose centre lies within the square/cube around the vertex with mesh size
120 h. So this area goes h/2 along each each coordinate axis into the
121 neighbouring cells.
122
123
124 ## Flattening of particle arrays
125
126@todo Some of the syntax here might be outdated and recipes might be redundant. Compare general toolbox docu.
127
128 The routine not only keeps track of how many particles we add per iteration.
129 It also keeps track of the number of particles per vertex that have been
130 added. If you know that you work with a @ref page_toolbox_particles_mesh_storage "memory pool", i.e. a consecutive
131 storage of particles, you can use this additional meta information within
132 your kernel invocations to work with loops over consecutive arrays (AoS)
133 rather than sole link lists. You can exploit that you know something
134 about individual chunks indexed by the pointers.
135
136 To exploit this additional information, you cannot use localParticles anymore,
137 i.e. you have to use localParticles. This means, you
138 might have to check if a particle is contained within a cell manually.
139 Once you rewrite your kernels that way, you can exploit the integers
140 _numberOfActiveParticlesPerVertex and numberOfCoalescedLocalParticlesPerVertex:
141
142 @image html UpdateParticle_MultiLevelInteraction_StackOfLists_ContiguousParticles_continuous-memory.png
143
144 Assume you enter a cell and you get the localParticles
145 list hosting 19 pointers. Further to that, you receive the array
146 numberOfLocalParticlesPerVertex with four integer entries. You know that the
147 first five entries point to one continuous chunk of particles in memory, the
148 next seven do so as well, and so forth. This holds if and only if you work
149 with a memory pool which maintains vertex-local or globally continuous
150 particles.
151
152 I do recommend to call toolbox::particles::ensureThatParticleListsEncodeContinuousMemoryChunks()
153 for your data to ensure that the memory layout above is preserved. Simply
154 add the calls
155
156 ~~~~~~~~~~~~~~~~~~~~~~~~~~
157 ::toolbox::particles::ensureThatParticleListsEncodeContinuousMemoryChunks(
158 activeParticles,
159 _numberOfActiveParticlesPerVertex
160 );
161 ::toolbox::particles::ensureThatParticleListsEncodeContinuousMemoryChunks(
162 localParticles,
163 numberOfLocalParticlesPerVertex
164 );
165 ~~~~~~~~~~~~~~~~~~~~~~~~~~
166
167 to your cell-wise compute kernel. Furthermore, you might want to check if all
168 vertices are properly gathered:
169
170 ~~~~~~~~~~~~~~~~~~~~~~~~~~
171 ::toolbox::particles::ensureAllParticleListsAreGatheredOrEmpty(
172 fineGridVerticesMyParticleSetName
173 );
174 ~~~~~~~~~~~~~~~~~~~~~~~~~~
175
176 The actual compute kernels then typically look similar to the following code
177 snippet:
178
179
180 ~~~~~~~~~~~~~~~~~~~~~~~~~~
181 typename std::list<T*>::const_iterator localParticlesIterator = localParticles.begin();
182
183 for (auto localParticlesChunkSize: numberOfLocalParticlesPerVertex) {
184 T* localParticlesChunk = *localParticlesIterator;
185 std::advance( localParticlesIterator, localParticlesChunkSize );
186
187 for (int localParticleNumberInThisChunk=0; localParticleNumberInThisChunk<localParticlesChunkSize; localParticleNumberInThisChunk++) {
188 if (
189 marker.isContained( localParticlesChunk[localParticleNumberInThisChunk].getX() )
190 and
191 // ensure that particle is not updated twice
192 ) {
193 [...]
194 }
195 }
196 }
197 ~~~~~~~~~~~~~~~~~~~~~~~~~~
198
199
200 Iterating over the active set follows the same pattern. Most codes will
201 have to couple local and active sets and hence use a kernel structure
202 similar to
203
204
205 ~~~~~~~~~~~~~~~~~~~~~~~~~~
206 typename std::list< typename ParticleContainer::value_type >::const_iterator localParticlesIterator = localParticles.begin();
207
208 for (auto localParticlesChunkSize: numberOfLocalParticlesPerVertex) {
209 typename ParticleContainer::value_type localParticlesChunk = *localParticlesIterator;
210 std::advance( localParticlesIterator, localParticlesChunkSize );
211
212 typename std::list< typename ParticleContainer::value_type >::const_iterator activeParticlesIterator = activeParticles.begin();
213 for (auto activeParticlesChunkSize: numberOfActiveParticlesPerVertex) {
214 typename ParticleContainer::value_type activeParticlesChunk = *activeParticlesIterator;
215 std::advance( activeParticlesIterator, activeParticlesChunkSize );
216
217 computeInnerLoop(
218 marker,
219 localParticlesChunk,
220 localParticlesChunkSize,
221 activeParticlesChunk,
222 activeParticlesChunkSize
223 );
224 }
225 }
226 ~~~~~~~~~~~~~~~~~~~~~~~~~~
227
228 This inner loop is where the magic happens, i.e. where we can exploit the
229 fact that localParticlesChunk and activeParticlesChunk both point to coninuous
230 memory regions. This means that computeInnerLoop is structurally simple:
231
232
233 ~~~~~~~~~~~~~~~~~~~~~~~~~~
234 void computeHydroForceInnerLoop(
235 const peano4::datamanagement::CellMarker& marker,
236 Particle* particlesAssociatedWithLocalVertex, // das ist falsch
237 int numberOfLocalParticles,
238 Particle* activeParticles,
239 int numberOfActiveParticles
240 ) {
241 for (int activeParticleNumberInThisChunk=0; activeParticleNumberInThisChunk<numberOfActiveParticles; activeParticleNumberInThisChunk++) {
242 for (int localParticleNumberInThisChunk=0; localParticleNumberInThisChunk<numberOfLocalParticles; localParticleNumberInThisChunk++) {
243 // read from activeParticles[activeParticleNumberInThisChunk]
244 // write to particlesAssociatedWithLocalVertex[localParticleNumberInThisChunk]
245 }
246 }
247 }
248 ~~~~~~~~~~~~~~~~~~~~~~~~~~
249
250
251 ## Dynamic adaptive mesh refinement
252
253 In the second sweep of any adaptive mesh refinement, we might drop particles
254 and hence destroy our nicely arranged memory layout.
255 Once particles move up and down the mesh hierarchy, the entries in
256 _numberOfActiveParticlesPerVertex are all invalid.
257 In such a case, we have to switch off any coalesced memory access.
258
259
260 The @ref page_toolbox_particles_mesh_traversal "generic mesh traversal description" provides
261 further context and details.
262 The important detail here is:
263 As soon as we might suspect that the memory layout is corrupted, we have to
264 assume that our pointers are all ov the place and do not point to continuous
265 data anymore. Even worse, we know that previously built-up active sets might be
266 invalid. We have built them up top-down, but when we drop data out of the
267 active set, they change on-the-fly.
268
269
270
271 ## Optimising coalesced compute kernels
272
273 Compute kernels can be optimised quite aggressively, but fiddling out how they
274 can be optimised is tedious. I recommend to use MAQAO and Intel's compiler
275 optimisation feedback. The page @ref toolbox_particles_memorypool holds
276 additional information. It is also worth studying @ref page_swift_performance_optimisation "Swift's performance optimisation"
277 remarks with respect to vecorisation as well as Swift's discussion of @ref swift_particle_lifecycle Particle Lifecycle "vectorised kernels".
278
279 Compute kernels can be optimised quite aggressively, but fiddling out how they
280 can be optimised is tedious. I recommend to use MAQAO and Intel's compiler
281 optimisation feedback. Here are some lessons learned:
282
283 - You cannot collapse the two for loops. That would lead to a scattered
284 iteration space in memory and hence make any vectorisation impossible.
285 - The active loop has to be the inner one. I do not understand why, but
286 Intel's icpx refuses to vectorise if the local loop is the inner one.
287 - The compiler has to be able to inline. According to https://clang.llvm.org/docs/AttributeReference.html#always-inline-force-inline
288 this is something one should be able to control via the annotation
289
290 [[clang::always_inline]]
291
292 but this led to multiple compiler crashes (again with Intel). To some
293 degree this makes sense: The function has to be in the header! I played
294 around with ipo, but that did not resolve the fundamental issues.
295 - You might have to recursively study all inlined methods and make the
296 stuff they use inline as well.
297 - Typically, you have to embed the inner-most loop into a statement alike
298 ~~~~~~~~~~~~~~~~~~~~~~~
299 if (
300 marker.isContained(particlesAssociatedWithLocalVertex[localParticleNumberInThisChunk].getX())
301 and
302 not particlesAssociatedWithLocalVertex[localParticleNumberInThisChunk].getCellHasUpdatedParticle()
303 ) {
304 // read from activeParticles[activeParticleNumberInThisChunk]
305 // write to particlesAssociatedWithLocalVertex[localParticleNumberInThisChunk]
306 }
307 ~~~~~~~~~~~~~~~~~~~~~~~
308
309 The @ref page_peano_performance_optimisation "generic Peano optimisation" page
310 provides some useful hints how to guide these optimisation through the
311 compiler. It also makes sense to study swift2::kernels::computeHydroForce()
312 which was one of the first routines that we optimised along these ideas.
313
314 """
315
317 self,
318 particle_set,
319 particle_particle_interaction_kernel,
320 touch_vertex_first_time_compute_particle_update_kernel=None,
321 touch_vertex_last_time_compute_particle_update_kernel=None,
322 prepare_traversal_kernel="",
323 unprepare_traversal_kernel="",
324 additional_includes="",
325 active_particle_set=None,
326 ):
327 """!
328
329 Initialise object
330
331 ## Arguments
332
333 particle_set: ParticleSet
334
335 particle_particle_interaction_kernel: String holding C++ code
336
337 active_particle_set: ParticleSet or None
338 You can compare different particles species with this argument. It
339 allows you to make the active particles stem from another species than
340 the local ones that you actually update. Pick None if both sets are of
341 the same type.
342
343 touch_vertex_first_time_compute_particle_update_kernel: String or None
344 Can be empty, but if you wanna move particles, then a minimal example
345 string passed equals
346
347 for (auto* localParticle: localParticles ) {
348 localParticle->setMoveState( globaldata::Particle::MoveState::NotMovedYet );
349 }
350
351 i.e. you will have a variable localParticle available in this kernel
352 and it is a pointer.
353
354
355 """
356 super(
357 UpdateParticle_MultiLevelInteraction_StackOfLists_ContiguousParticles, self
358 ).__init__(descend_invocation_order=1, parallel=False)
359
360 self._particle_set = particle_set
361 self.d = {}
362 self.d["LOCAL_PARTICLE"] = particle_set.particle_model.name
363 self.d["LOCAL_PARTICLES_CONTAINER"] = particle_set.name
364 if active_particle_set == None:
365 self.d["ACTIVE_PARTICLE"] = particle_set.particle_model.name
366 self.d["ACTIVE_PARTICLES_CONTAINER"] = particle_set.name
367 else:
368 self.d["ACTIVE_PARTICLE"] = active_particle_set.particle_model.name
369 self.d["ACTIVE_PARTICLES_CONTAINER"] = active_particle_set.name
370 self.d[
371 "PARTICLE_PARTICLE_INTERACTION_KERNEL"
372 ] = particle_particle_interaction_kernel
373 self.d[
374 "TOUCH_VERTEX_FIRST_COMPUTE_KERNEL"
375 ] = touch_vertex_first_time_compute_particle_update_kernel
376 self.d[
377 "TOUCH_VERTEX_LAST_COMPUTE_KERNEL"
378 ] = touch_vertex_last_time_compute_particle_update_kernel
379 self.d["ADDITIONAL_INCLUDES"] = additional_includes
380 self.d["PREPARE_TRAVERSAL_KERNEL"] = prepare_traversal_kernel
381 self.d["UNPREPARE_TRAVERSAL_KERNEL"] = unprepare_traversal_kernel
382
383 __Template_TouchVertexFirstTime = jinja2.Template(
384 """
385 {% if TOUCH_VERTEX_FIRST_COMPUTE_KERNEL!=None %}
386 auto& assignedParticles = fineGridVertex{{LOCAL_PARTICLES_CONTAINER}};
387 int numberOfCoalescedAssignedParticles = fineGridVertex{{LOCAL_PARTICLES_CONTAINER}}.isGathered() ? fineGridVertex{{LOCAL_PARTICLES_CONTAINER}}.size() : -1;
388
389 {{TOUCH_VERTEX_FIRST_COMPUTE_KERNEL}}
390 {% endif %}
391"""
392 )
393
394 __Template_TouchVertexLastTime = jinja2.Template(
395 """
396 {% if TOUCH_VERTEX_LAST_COMPUTE_KERNEL!=None %}
397 auto& assignedParticles = fineGridVertex{{LOCAL_PARTICLES_CONTAINER}};
398 int numberOfCoalescedAssignedParticles = fineGridVertex{{LOCAL_PARTICLES_CONTAINER}}.isGathered() ? fineGridVertex{{LOCAL_PARTICLES_CONTAINER}}.size() : -1;
399
400 {{TOUCH_VERTEX_LAST_COMPUTE_KERNEL}}
401 {% endif %}
402"""
403 )
404
405 __Template_TouchCellFirstTime = jinja2.Template(
406 """
407 std::list< globaldata::{{LOCAL_PARTICLE}}* > localParticles;
408 std::vector< int > numberOfCoalescedLocalParticlesPerVertex( TwoPowerD );
409
410 _numberOfActiveParticlesAdded.push_back(0);
411 for (int i=0; i<TwoPowerD; i++) {
412 // Keep track of number of particles
413 _numberOfActiveParticlesPerVertex.push_back( fineGridVertices{{ACTIVE_PARTICLES_CONTAINER}}(i).size() );
414 _numberOfActiveParticlesAdded.back() += fineGridVertices{{ACTIVE_PARTICLES_CONTAINER}}(i).size();
415 numberOfCoalescedLocalParticlesPerVertex[i] = fineGridVertices{{ACTIVE_PARTICLES_CONTAINER}}(i).size();
416
417 // Add particles to active particle set
418 for (auto* p: fineGridVertices{{ACTIVE_PARTICLES_CONTAINER}}(i) ) {
419 _activeParticles.push_back( p );
420 }
421
422 // Construct the two local particle sets
423 for (auto* p: fineGridVertices{{LOCAL_PARTICLES_CONTAINER}}(i) ) {
424 localParticles.push_back( p );
425 }
426 }
427
428 logDebug( "touchCellFirstTime(...)", "size of local/active particles=" << localParticles.size() << "/" << _activeParticles.size() << " at " << marker.toString() );
429
430 {% if PARTICLE_CELL_UPDATE_KERNEL!=None %}
431 std::list< globaldata::{{ACTIVE_PARTICLE}}* >& activeParticles = _activeParticles;
432 auto& numberOfCoalescedActiveParticlesPerVertex = _numberOfActiveParticlesPerVertex;
433 assertion1(
434 not numberOfCoalescedActiveParticlesPerVertex.empty()
435 and
436 (numberOfCoalescedActiveParticlesPerVertex.size() % TwoPowerD == 0),
437 numberOfCoalescedActiveParticlesPerVertex.size()
438 );
439 {{PARTICLE_PARTICLE_INTERACTION_KERNEL}}
440 {% endif %}
441"""
442 )
443
444 __Template_TouchCellLastTime = jinja2.Template(
445 """
446 assertion( not _numberOfActiveParticlesAdded.empty() );
447 assertion( _activeParticles.size()>=_numberOfActiveParticlesAdded.back() );
448 assertion( _numberOfActiveParticlesPerVertex.size()>=TwoPowerD );
449
450
451 // More elegant way to write
452 // -------------------------
453 // for (int i=0; i<_numberOfActiveParticlesAdded.back(); i++) {
454 // _activeParticles.pop_back();
455 //}
456 _activeParticles.erase(
457 std::prev(_activeParticles.end(),_numberOfActiveParticlesAdded.back() ),
458 _activeParticles.end()
459 );
460
461 _numberOfActiveParticlesPerVertex.erase(
462 std::prev(_numberOfActiveParticlesPerVertex.end(),TwoPowerD ),
463 _numberOfActiveParticlesPerVertex.end()
464 );
465
466 _numberOfActiveParticlesAdded.pop_back();
467"""
468 )
469
470 __Template_EndTraversal = jinja2.Template(
471 """
472 assertion1( _activeParticles.empty(), (*_activeParticles.begin())->toString() );
473"""
474 )
475
476 def get_body_of_operation(self, operation_name):
477 result = "\n"
478 # if operation_name==ActionSet.OPERATION_BEGIN_TRAVERSAL:
479 # result = self.__Template_BeginTraversal.render(**self.d)
480 if operation_name == ActionSet.OPERATION_TOUCH_CELL_FIRST_TIME:
481 result = self.__Template_TouchCellFirstTime.render(**self.d)
482 if operation_name == ActionSet.OPERATION_TOUCH_CELL_LAST_TIME:
483 result = self.__Template_TouchCellLastTime.render(**self.d)
484 if operation_name == ActionSet.OPERATION_TOUCH_VERTEX_FIRST_TIME:
485 result = self.__Template_TouchVertexFirstTime.render(**self.d)
486 if operation_name == ActionSet.OPERATION_TOUCH_VERTEX_LAST_TIME:
487 result = self.__Template_TouchVertexLastTime.render(**self.d)
488 if operation_name == ActionSet.OPERATION_END_TRAVERSAL:
489 result = self.__Template_EndTraversal.render(**self.d)
490 return result
491
493 return " return std::vector< peano4::grid::GridControlEvent >();\n"
494
496 return self.d["PREPARE_TRAVERSAL_KERNEL"]
497
499 return self.d["UNPREPARE_TRAVERSAL_KERNEL"]
500
502 return __name__.replace(".py", "").replace(".", "_")
503
505 return False
506
507 def get_includes(self):
508 result = jinja2.Template(
509 """
510#include "tarch/multicore/Lock.h"
511#include "toolbox/particles/particles.h"
512#include "vertexdata/{{LOCAL_PARTICLES_CONTAINER}}.h"
513#include "globaldata/{{LOCAL_PARTICLE}}.h"
514#include "vertexdata/{{ACTIVE_PARTICLES_CONTAINER}}.h"
515#include "globaldata/{{ACTIVE_PARTICLE}}.h"
516
517{{ADDITIONAL_INCLUDES}}
518
519#include <list>
520#include <vector>
521"""
522 )
523 return result.render(**self.d)
524
525 def get_attributes(self):
526 result = jinja2.Template(
527 """
528 std::list< globaldata::{{ACTIVE_PARTICLE}}* > _activeParticles;
529 std::vector< int > _numberOfActiveParticlesAdded;
530 std::vector< int > _numberOfActiveParticlesPerVertex;
531 int _spacetreeId;
532"""
533 )
534 return result.render(**self.d)
535
537 return """
538 _spacetreeId = treeNumber;
539 """
Default superclass for any data model in Peano which is stored within the grid.
Definition DaStGen2.py:47
Action set (reactions to events)
Definition ActionSet.py:6
__init__(self, particle_set, particle_particle_interaction_kernel, touch_vertex_first_time_compute_particle_update_kernel=None, touch_vertex_last_time_compute_particle_update_kernel=None, prepare_traversal_kernel="", unprepare_traversal_kernel="", additional_includes="", active_particle_set=None)
Initialise object.