Archive for February, 2011

Monetization of Public Space

The current rumors say that Facebook has a paper valuation of $50 billion and Twitter has one of $10 billion. Those numbers are certainly part of a mini-bubble today, but they may be supportable tomorrow. It’s interesting to note that the only service these companies offer is a space for people to do things, a space in which people provide their own information. This is not nothing, as it involves paying for the space to host photographs and videos and for network costs. But Facebook and Twitter are not being valued for being hosting providers. They are being valued for creating a shared public space which many people choose to enter. (WordPress and LiveJournal are other examples of online companies that simply create a shared space; Google and Bing, by contrast, provide a real service, supplying information beyond what people do on their sites.)

Is there anything comparable in the real world? The real world is full of shared public spaces, but they are not generally money-making operations. I think of places like parks or shared urban spaces like Times Square. These are typically built by the government or by some public cooperative at a cost, rather than by a private corporation as a source of revenue.

What’s the key difference? It’s that it is much cheaper to create an online space which can hold several million people than it is to create a physical space that large. It’s cheap enough that even the relatively small returns you get from billboards may be enough to pay for creating that space. And, since it is online, you can have targeted billboards, which may pay somewhat more.

I think this becomes an interesting difference between the online world and the real world. In the online world, the shared public spaces are not paid for by society; they are paid for by private companies. There is no notion of a right of public access on the Internet. There is no equivalent of the homeless person on the corner or the busker singing for change. I don’t know if these differences are good or bad.

People are already working on goggles which bring the online world into the real world, adding annotations for the objects that we look at. It will be a simple further step for these goggles to edit out the things we don’t want to see, and to add targeted billboards in their place. As the goggles would be sold by private corporations, they would bring the monetization of public spaces from the online world into the real world as well. This is obviously a very speculative idea, but it would in some ways be the logical culmination of people’s withdrawal from cities and shared society into an increasingly private life. Again, I don’t know if this would be good or bad, but if it happens it has the potential to be truly different.

Comments (8)

Race Conditions

The Go language is generally safe, in that errors in your program do not lead to unpredictable crashes (unless you use the facilities in the unsafe package). There is one classic cause of problems, however, which Go does not protect you from: race conditions. A race condition occurs when one goroutine modifies a variable and another reads it or modifies it without any synchronization.
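
To make that concrete, here is a minimal example of such a race. It is an invented illustration, not code from any real program: two goroutines increment a shared counter with no synchronization, so their read-modify-write operations can interleave and the final count is unpredictable.

    package main

    import "fmt"

    func main() {
        counter := 0
        done := make(chan bool)
        for i := 0; i < 2; i++ {
            go func() {
                for j := 0; j < 1000; j++ {
                    counter++ // unsynchronized read-modify-write: a race
                }
                done <- true
            }()
        }
        <-done
        <-done
        fmt.Println(counter) // may print anything up to 2000
    }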

Go makes correct synchronization easy, following the slogan of “do not communicate by sharing memory; instead, share memory by communicating.” Race conditions only occur when sharing memory. However, although correct synchronization is easy, the language does not enforce it. Race conditions are possible and I have seen them in real Go programs.
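
In the “share memory by communicating” style, the racy counter above can be restructured so that a single goroutine owns the variable and the others send increments over a channel; no two goroutines ever touch the memory at the same time. This is one sketch of a correct structure, not the only one:

    package main

    import "fmt"

    func main() {
        inc := make(chan int)    // carries increments to the owner
        result := make(chan int) // carries the final total back
        go func() {
            counter := 0
            for n := range inc {
                counter += n // only this goroutine touches counter
            }
            result <- counter
        }()
        done := make(chan bool)
        for i := 0; i < 2; i++ {
            go func() {
                for j := 0; j < 1000; j++ {
                    inc <- 1
                }
                done <- true
            }()
        }
        <-done
        <-done
        close(inc)
        fmt.Println(<-result) // always prints 2000
    }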

Is it possible for a language to be safe with regard to race conditions? In a language that supports multiple threads of execution, it’s always going to be possible to construct race conditions by doing things like having multiple threads write to the same offset in a file. Let’s rule out that somewhat exotic variety and concentrate on race conditions which don’t involve external constructs. Let’s also not worry about other threading problems like deadlock or starvation; they are real problems, but they are usually easier to understand and avoid.

Race conditions in memory always involve global variables, in which I include memory which can be accessed via a pointer. When more than one thread can access a global variable, the variable can become a nonlocal side effect: something which can affect program behaviour while changing in a way that is not obvious from reading the program code. When that variable can change without any synchronization, you may have a race condition.

One approach, then, is to prohibit variable modifications. When no variables can be modified, no race conditions are possible. This may sound odd to programmers used to conventional imperative languages like C++, Java, or Go, but it is the approach used by pure functional languages. There are few truly pure functional languages outside of academia, but languages like Haskell approach the goal by clearly delineating the impure operations (in Haskell’s case, by confining them to the IO type). Functional languages generally do not have explicit threading models, but the lack of side effects makes it possible to parallelize them automatically. For example, when a function is called with two arguments, both arguments can be evaluated in parallel; since there is no mutable global state, there is no way for the evaluation of one argument to affect the evaluation of the other.

While functional languages have some clear benefits, I think they face a problem of history. They’ve been around for a long time, but they are not widely used. It’s possible that this is merely a matter of education, but I think it’s a more fundamental issue. The real world is full of global state and non-local side effects. I think it’s hard for programmers to adapt to languages which don’t support them. Operations that are conceptually very simple, like sorting a list of numbers, become harder to understand in a functional language. Functional languages will always have adherents and will always have good uses in specific domains, but I think widespread adoption is unlikely. Go is obviously not a functional language.

A somewhat less strict approach for avoiding race conditions is for different threads of execution to not share memory. Without shared memory, all thread communication must use a limited set of communication operations which can avoid races. The Erlang language follows this model. Erlang is a functional language, though a less pure one than Haskell. Erlang has an explicit thread model. However, threads do not share memory, and can only communicate via message passing. In effect this enforces Go’s slogan: all sharing must be done by communication. This model also has a nice advantage in that threads can run on the same machine or on different machines with no change in the program, only a change in efficiency.

Go did not adopt this model because of efficiency considerations. Threads communicating through shared memory are far more efficient than threads communicating via message passing. With shared memory you can pass around pointers to large data structures; without it, large data structures must be sent as a whole. Real implementations of Erlang use a variety of optimizations to reduce the cost of sending large data structures, but that means that programmers have to be aware of the available optimizations when writing programs.
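
A small sketch of the difference, using an invented Big type: in Go, the channel send below transfers a single pointer, where a no-shared-memory model would have to copy or serialize the whole structure. The price is that the program must respect an ownership convention the language does not enforce.

    package main

    import "fmt"

    // Big stands in for a large data structure.
    type Big struct {
        data [1 << 20]byte // one megabyte
    }

    func main() {
        ch := make(chan *Big)
        go func() {
            b := &Big{}
            b.data[0] = 42
            ch <- b // sends one word (a pointer), not a megabyte
            // By convention the sender must not touch b after the
            // send; nothing in the language enforces this.
        }()
        b := <-ch // the receiver now owns the structure
        fmt.Println(b.data[0])
    }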

One could argue that correct program behaviour is always more important than efficiency, and that the efficiency advantages of shared memory should be disregarded in the name of correctness. However, that argument does not hold up in the real world, where efficiency does matter. Erlang made the right choice for an environment in which threads can run on different machines and can even move from one machine to another. Go may not be the right choice for that environment. Of course, it is also not the environment in which most programs run.

I don’t know of any other approaches to avoiding race conditions in a programming language, though I’m sure there are some. One possibility would be for the language to maintain a global synchronization state. Each operation on a global variable would then be annotated with the current state. If two operations were done using an incompatible state, meaning operations done by different threads with no explicit synchronization, you would get an error.

If this were done purely dynamically, the result would be similar to ThreadSanitizer or Helgrind. With language support, it could be constructed to avoid all race conditions. However, since race conditions would only be detected dynamically, your program could still crash occasionally at runtime. This would be much better than the current situation, in which your program can behave unpredictably at runtime. The dynamic information would also provide a lot of the information required to debug the problem.

This approach could also be used statically, but I suspect it would require fairly extensive annotations on functions to indicate their synchronization status. It might be possible to deduce some of that statically using whole program analysis, but in a language like Go with channels I think that would be difficult in practice and impossible in theory (i.e., some programs would provably require annotations, and most programs would in practice require annotations).

So what would the dynamic approach look like? It would basically mean instantiating the Go memory model in each operation, tracking the happens-before relationships. For example, give every goroutine a ticker. When a goroutine sends a value on a channel or acquires or releases a lock, increment the ticker. When sending a value on a channel or releasing a lock, attach the current goroutine and its tick. When receiving a value on a channel or acquiring a lock, record a happens-before relationship from the attached goroutine/ticker to the current goroutine/ticker. When modifying any global variable, including any shared memory, annotate the variable with the goroutine and its ticker. When reading any global variable, look at the current annotation, and require that there be a happens-before relationship between that annotation and the executing goroutine.
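
As a sanity check on the idea, here is a toy sketch of that bookkeeping. It uses vector clocks, the standard data structure for tracking happens-before relationships; every name in it (VClock, G, Shadow, the on* hooks) is invented for illustration, and a real implementation would live in the runtime and instrument every memory access.

    package race

    // VClock records, for each goroutine ID, the latest tick this
    // code path has seen from that goroutine.
    type VClock map[int]int

    // happensBefore reports whether everything recorded in a is also
    // known to b, i.e. a's history happens before b's current point.
    func (a VClock) happensBefore(b VClock) bool {
        for g, t := range a {
            if b[g] < t {
                return false
            }
        }
        return true
    }

    // merge folds b into a, keeping the larger tick for each goroutine.
    func (a VClock) merge(b VClock) {
        for g, t := range b {
            if t > a[g] {
                a[g] = t
            }
        }
    }

    // G is the per-goroutine tracking state: an ID and a ticker,
    // generalized to a vector clock.
    type G struct {
        id    int
        clock VClock
    }

    // onRelease runs when the goroutine sends on a channel or releases
    // a lock: bump its own tick and snapshot the clock, so the snapshot
    // can travel with the channel value or the lock.
    func (g *G) onRelease() VClock {
        g.clock[g.id]++
        snap := make(VClock, len(g.clock))
        snap.merge(g.clock)
        return snap
    }

    // onAcquire runs when the goroutine obtains the snapshot by
    // receiving from the channel or acquiring the lock: everything the
    // sender did up to that point now happens before what follows here.
    func (g *G) onAcquire(snap VClock) {
        g.clock.merge(snap)
        g.clock[g.id]++
    }

    // Shadow is the annotation kept alongside one variable or memory word.
    type Shadow struct {
        writer VClock // clock of the last write; nil if never written
    }

    // onWrite stamps the variable with the writing goroutine's clock.
    func (s *Shadow) onWrite(g *G) {
        s.writer = make(VClock, len(g.clock))
        s.writer.merge(g.clock)
    }

    // onRead reports whether the read is ordered after the last write;
    // false means we have observed an actual race.
    func (s *Shadow) onRead(g *G) bool {
        return s.writer == nil || s.writer.happensBefore(g.clock)
    }

Note that, following the description above, the sketch stamps writes and checks reads; a full detector would also record reads, so that a write racing with an earlier read is caught.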

What’s nice about this approach is that it doesn’t require any change to the language or to your program. What’s not nice is that you have to pay a heavy execution cost, and you catch only the race conditions that actually occur, not all the race conditions that could occur. It would be nice if the heavy execution cost could be mitigated by static analysis, but in general I don’t think it can be. Escape analysis can prove that some variable reads don’t require checking annotations, but I think that even simple programs will defy such analysis for most reads.

On the other hand, with some caching, perhaps the execution cost would not be prohibitive. Most reads are going to be of values written by the current goroutine. That requires an extra read and comparison for each memory read, which would be bad but perhaps supportable. Since one would presumably want to track individual memory words, memory usage would double. Alternatively, since the Go memory model is written in terms of variables, one could instead have an extra word per variable. Then pointer accesses would require identifying the appropriate variable, even if the pointer were to the middle of the variable.

I doubt this could ever be the default way to run Go, but it would certainly be interesting to try an implementation and find the real performance effects.

Comments (15)

Index Funds

It’s a general guideline that people who do not follow the market obsessively should invest in index funds. Compared to actively managed mutual funds, they are lower cost, and over time they tend to perform about as well, particularly once costs are taken into account. Compared to investing in individual stocks, they are much lower risk, because they are more diversified. Also, people who do not follow the market obsessively tend to be extremely bad at timing their moves in and out of the market.

However, investing in index funds only makes sense if you believe that the overall stock market is going to go up over time. The New York Times has a nice graph showing the S&P 500 return on an annual basis since 1920 compared with inflation. There are plenty of time spans where the S&P 500 did not keep up with inflation—the dark red areas. Note that in the light red the index did keep up with inflation, but the average return was less than 3%. The green areas are relatively infrequent.

What this tells us is that the stock market tends to increase slightly faster than inflation on average, and that there are big chunks of time where it does not. But why should the stock market increase faster than inflation at all? While the stock market clearly often has nonsensical prices for individual stocks, those seem unlikely to persist across the whole market for decades at a time. Growing faster than inflation over a long period implies that money is coming in from somewhere. It seems safe to assume that people are not simply moving an ever-larger share of their savings into the stock market, so the money must be newly created. Normally creating money leads to inflation over time, yet here we see the stock market growing faster than inflation. So it seems to me that the money must come from a slow but steady increase in the size of the overall economy, presumably due to increased productivity.

This means that you should invest in index funds if you believe that the economy is going to continue to grow. It also means that, realistically, you cannot expect a large return over time. The economy has historically grown, but it has not grown all that much faster than inflation. You should certainly take that into account if you expect to retire on your savings. In particular, it’s easy to trick yourself into looking at the absolute values of your projected savings without taking inflation into account.
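
As a back-of-the-envelope illustration of that trap (the rates below are assumptions chosen for the example, not historical figures), here is the same projection in nominal dollars and in today’s dollars:

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        const (
            principal = 10000.0 // initial investment
            nominal   = 0.07    // assumed nominal annual return
            inflation = 0.03    // assumed annual inflation
            years     = 30
        )
        futureNominal := principal * math.Pow(1+nominal, years)
        // Real (inflation-adjusted) return: (1+nominal)/(1+inflation) - 1.
        realRate := (1+nominal)/(1+inflation) - 1
        futureReal := principal * math.Pow(1+realRate, years)
        fmt.Printf("projected balance: $%.0f\n", futureNominal) // about $76,000
        fmt.Printf("in today's dollars: $%.0f\n", futureReal)   // about $31,000
    }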

It’s also interesting to consider that while the growth of the economy does over time mean that there is more money to invest, there is no rule which says that that money must be invested in the stock market. That is what has historically happened, but why should we expect it to always happen in the future? The short-term market has become a mug’s game, in which professional traders with high-powered computers will outdo any ordinary person. The long-term market is full of established companies whose executives work with directors to siphon off ever-increasing salaries and engineer short-term stock pops at the expense of the long-term management of the company. I don’t see any signs that these problems are leading most investors to avoid the stock market, but it seems possible that it could happen, at least as possible as having housing prices drop across the entire country for the first time ever.

So what should you do with your money? I think the main thing to do, assuming you don’t care to follow it obsessively, is to not count on it to grow fast. There are a lot of very safe investments you can make, such as Treasury Inflation-Protected Securities (TIPS). It may be a comfort to know that you won’t lose money.

And what if you don’t have any money to invest, as is true for the majority of people in the U.S.? Then you’ll need to plan to live on Social Security; it’s not much money, but it’s much better than nothing. There are few jobs with pensions any more, and existing pensions are under heavy threat. Or, you could just never retire; at least you’ll be helping the economy grow for everybody else.

Comments (2)