אירועים
אירועים והרצאות בפקולטה למדעי המחשב ע"ש הנרי ומרילין טאוב
גילי יבנה (הרצאה סמינריונית למגיסטר)
יום שלישי, 21.02.2017, 13:00
The cost of accessing shared objects that are stored in remote memory, while
neglecting accesses to shared objects that are cached in the local memory,
is evaluated by the number of remote memory references (RMRs) in an execution.
two flavours of this measure- cache-coherent (CC) and distributed shared memory
(DSM)-model two popular shared-memory architectures. The number of RMRs,
however, does not take into account the granularity of memory accesses, namely,
the fact that accesses to the shared memory are performed in blocks.
This thesis proposes a new measure, called block RMRs, counting the number of
remote memory references while taking into account the fact that shared objects
can be grouped into blocks. On the one hand, this measure reflects the fact that
the RMR paid for bringing a shared object to the local memory might save another
RMR for bringing another object placed at the same block. On the other hand,
this measure accounts for false sharing: the fact an RMR may be paid when
accessing an object due to a concurrent access to another object in the same
block.
This paper proves that in the CC model finding an optimal placement, i.e.,
grouping of objects into blocks, is NP-hard when a block can store three objects
or more; the result holds even if the sequence of accesses is known in advance.
In the DSM model, the answer depends on the cost of invalidating data
throughout the system. If cache coherence is supported (i.e., some mechanism
exists to inform processes that the data in their local memory is no longer
valid), and if the cost of invalidation is negligible compared to the cost of
an RMR, then finding an optimal solution is NP-hard. If invalidation is not
negligible, an optimal layout can be approximated within an additive factor
(depending on the number of processes), if the sequence of accesses is known
in advance.
In both the CC and the DSM models, finding an optimal placement is NP-hard when
objects have different sizes, even for a single process.