{"id":79,"date":"2007-10-25T20:28:37","date_gmt":"2007-10-26T04:28:37","guid":{"rendered":"http:\/\/www.airs.com\/blog\/archives\/79"},"modified":"2007-10-25T20:29:27","modified_gmt":"2007-10-26T04:29:27","slug":"single-threaded-memory-model","status":"publish","type":"post","link":"https:\/\/www.airs.com\/blog\/archives\/79","title":{"rendered":"Single Threaded Memory Model"},"content":{"rendered":"<p>One more round on the parallel programming theme.  There has been some recent discussion on the <a href=\"http:\/\/gcc.gnu.org\/ml\/gcc\/2007-10\/msg00398.html\">gcc<\/a> and <a href=\"http:\/\/lkml.org\/lkml\/2007\/10\/24\/673\">LKML<\/a> mailing lists about an interesting case.<\/p>\n<blockquote>\n<p>static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;<br \/>\nstatic int acquires_count = 0;<\/p>\n<p>int<br \/>\ntrylock()<br \/>\n{<br \/>\n  int res;<\/p>\n<p>  res = pthread_mutex_trylock(&#038;mutex);<br \/>\n  if (res == 0)<br \/>\n    ++acquires_count;<\/p>\n<p>  return res;<br \/>\n}\n<\/p><\/blockquote>\n<p>On x86 processors, current gcc will optimize this to use a comparison, an add with carry flag, and an unconditional store to <code>acquires_count<\/code>.  This eliminates a branch, but it means that the variable is written to even if the lock is not held.  That introduces a race condition.<\/p>\n<p>This is clearly permitted by the C language standard, which in general describes a single threaded model.  The standard says nothing about precisely when values are written to memory.  It simply says that if you assign a value to a global variable, then when you read the global variable you should see that value.<\/p>\n<p>It may seem that this should be an invalid optimization: the compiler should not move stores out of a conditionally executed basic block.  Several people have reacted to this test case in this way.  However, note that the first function call could be anything&#8211;it could even be a test of a global variable.  
And note that the store could be a load&#8211;we could load a global variable twice, protected by some conditional both times.  If we require that this code be safe in a multi-threaded environment, then we cannot coalesce the loads.  That would be OK with people who write C as a fancy assembly language.  But it would significantly hurt performance for complex programs, especially in C++, in which the conditionals would be in different functions and then inlined together.<\/p>\n<p>So what should the compiler do, given a choice between optimizing complex C++ code and helping to generate correct complex multi-threaded code?  Complex C++ code is much more common than complex multi-threaded code.  I think the compiler is making the right choice.  And it conforms to the language standard.<\/p>\n<p>For this code to be correct in standard C, the variable needs to be marked as <code>volatile<\/code>, or it needs to use an explicit memory barrier (which requires compiler-specific magic&#8211;in the case of gcc, a <code>volatile asm<\/code> with an explicit memory clobber).  But many people don&#8217;t see that&#8211;including <a href=\"http:\/\/lkml.org\/lkml\/2007\/10\/25\/186\">Linus Torvalds<\/a>.<\/p>\n<p>I think this is a nice example of my earlier point, which is that our current models of parallel programming are generally too hard for people to use.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One more round on the parallel programming theme. There has been some recent discussion on the gcc and LKML mailing lists about an interesting case. 
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; static int acquires_count = 0; int trylock() { int res; res = pthread_mutex_trylock(&#038;mutex); if (res == 0) ++acquires_count; return res; } On x86 processors, current [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[],"class_list":["post-79","post","type-post","status-publish","format-standard","hentry","category-programming"],"_links":{"self":[{"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/posts\/79","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/comments?post=79"}],"version-history":[{"count":0,"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/posts\/79\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/media?parent=79"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/categories?post=79"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.airs.com\/blog\/wp-json\/wp\/v2\/tags?post=79"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}