Patterns in static

Your own personal economist

Notes from your economist pal, Eric Blair






24 October 08. Capital, liquidity, and the crash


Today's guest blog is from Mr AF, of Washington, Columbia. Last time, I talked about how money turns into money from the perspective of the squishy concept of our human perception of value. Mr AF works for a federal agency that is deeply concerned with the not-at-all squishy problem of measuring the solvency of U.S. banks. Many of the details I was comfortable glossing over in my micro-focused context are of vital importance in his macro context. Notably, where I wrote of reserve requirements, I should have been talking about capital requirements; and where I wrote about velocity, Mr AF writes of liquidity.

Also, Mr AF is careful to distinguish among the different levels of money: M0 = cash, M1 = M0 + checking-type accounts, M2 = M1 + savings-type accounts, M3 = M2 + still less liquid accounts.

Bank reserve requirements are governed by the Fed's Reg D, specifically §204.9, which says that banks have to hold reserves of somewhat less than 10 percent against transaction deposits (basically, checking accounts), and there are no reserve requirements whatsoever on nontransaction accounts (e.g., savings accounts and CDs). You can see from FDIC data that nontransaction deposits make up about 90 percent of deposits, so for all practical purposes the effect of reserve requirements on slowing lending is nil.

What really constrains lending are capital requirements, the details of which vary by regulator. (Note: there are four main federal bank regulators, the Fed, the OCC, the FDIC, and the OTS, plus 50 state regulators, plus the SEC et al. stick their noses in as well...) These are terribly complicated affairs, much more difficult to compute than your standard 1/R money multiplier, where R is the reserve requirement. Capital is meant to absorb losses when the bank's loans go bad. The United States is in the process of implementing Basel II regulations--these are international standards set by the Bank for International Settlements. Some banks get to use pretty standard ratios and some (the big guys) have to use complicated mathematical formulas (PDF). Either way, though, the money multiplier is based more on capital requirements than on reserve requirements. Note that the capital requirements for Treasury securities are 0 percent, and are just 2 percent for high-grade corporate loans and high-quality mortgages.

Why am I making a big deal out of the difference between capital and reserve requirements? Because it goes to the heart of modern understanding of money and liquidity. Why doesn't the Fed care about reserve requirements any more? Because as long as the bank is solvent (has positive capital), it can borrow the money it needs for unexpected deposit outflows from the fed funds market, the discount window, the Federal Home Loan Banks, issuing commercial paper, selling loans or securities, etc. That is, in a perfectly liquid market, all that matters is solvency.

This is important because the primary thing that is special about money is that it is perfectly liquid. Everyone must accept cash as payment (“legal tender” and whatnot). However, if you own a bunch of assets (remember, loans are assets of banks and deposits are liabilities) and can sell the assets for their true/fair/market value whenever you want, then you basically have cash, and the money supply is the entire value of the country's assets. This includes loans, bonds, stocks, houses, microwaves, leaves of grass, etc., anything that has value. This is somewhere in the neighborhood of $90 trillion according to the Flow of Funds data (PDF). Compare this to M1 of about $1.4 trillion and M2 of about $7.5 trillion, according to the Fed's H.6 release. The Fed doesn't publish M3 any more but it is probably less than twice M2. Some of the assets are pure assets (like a house without a mortgage) and some are leveraged (i.e., the owner of the asset also has liabilities). Don't forget that money itself is a liability on the Fed's balance sheet.

Nobody wants to actually hold any more cash (either physically or metaphorically) than they have to--cash is a depreciating asset, it has no rate of return and loses value with inflation. Ideally we would have a society without cash and just trade assets back and forth--Fischer Black envisioned “a world without money.” However, there are sometimes transaction costs associated with trading assets (which is why we need a medium of exchange in the first place) and so cash still has a special place in the system. Think of liquidity as how easy it is to turn an asset into cash. You can turn your house into cash but it will take a 6 percent commission and several months. You can turn a Treasury into cash a lot more cheaply and quickly. So Treasuries are basically the same thing as cash, while houses are like cash but at, say, a 20 percent discount (discounted for both the monetary and time costs of turning them into cash). William Barnett has done a lot of good work trying to figure out the liquidity-weighted amount of money in the system.

So here's the thing: in a very liquid world, the money that the Fed metaphorically prints is pretty much irrelevant--people are basically trading assets and just swap into money briefly to complete the transaction. The Fed can print a lot more money or a lot less and it doesn't really matter to the participants; their money is determined by the size of their balance sheet, which is determined by capital requirements. Google “endogenous money supply” for a lot more on this subject.

Velocity, as traditionally defined in macroeconomics 101, is equal to PY/M, or basically nominal output divided by the nominal money supply. However, this equation founders on the question of how to define the money supply (putting aside problems with measurement of P and Y for the moment). I've just argued that you can't use M1 or M2 for this; you need to measure total liquidity-adjusted assets in the system. But if liquidity is based on “animal spirits,” search frictions, and all that sort of thing that the Fed can't control, then the Fed's actions are pretty much irrelevant. Which makes the concept of velocity pretty much irrelevant--much more important is liquidity. And what M1, M2, M3, etc. measure is the amount of assets in the system relative to the base money supply of Fed notes and reserves.
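To see how much the answer depends on the choice of M, here is a rough sketch that computes V = PY/M for the three candidate measures mentioned earlier. The M1, M2, and total-asset figures are the approximate ones quoted above; the nominal output figure of $14 trillion is an assumed round number for illustration, not something taken from the post, and the function name is made up.

```python
# Velocity V = PY / M under different definitions of the money stock.
# All figures are in trillions of dollars.
NOMINAL_OUTPUT = 14.0  # ASSUMPTION: a round illustrative figure for P*Y

MONEY_MEASURES = {
    "M1": 1.4,           # approximate figure quoted in the post
    "M2": 7.5,           # approximate figure quoted in the post
    "all assets": 90.0,  # approximate Flow of Funds total quoted in the post
}

def velocity(nominal_output, money_stock):
    """Nominal output divided by the nominal money stock."""
    return nominal_output / money_stock

for name, stock in MONEY_MEASURES.items():
    print(f"{name:>10}: V = {velocity(NOMINAL_OUTPUT, stock):.2f}")
```

Under these assumed numbers, velocity runs from about 10 when M is M1 to well under 1 when M is every asset in the country, which is the sense in which the equation depends almost entirely on what you decide to count as money.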

So what matters is liquidity in the system, which you may have noticed has basically vanished over the last year. A lot of mortgage-backed securities are going to eventually pay off at much better rates than they are currently priced at but there is no liquidity out there whatsoever. Think of it as an increase in money demand, to come back to the basic IS/LM model that modern economists scoff at (PDF). There is a decrease in demand for less liquid assets and an increase in demand for more liquid ones. That's why you have the TED spread at record levels--there is not really that much credit risk differential, it's a liquidity differential.

Trying to figure out what are the determinants of liquidity is one of the main research topics of modern macroeconomics, in my opinion.



on Friday, October 24th, techne said

“Trying to figure out what are the determinants of liquidity is one of the main research topics of modern macroeconomics, in my opinion.”

As a neuroscientist/behaviorist, it seems like panic and the herd mentality has got to be an important determinant. Is this what Kahneman’s Nobel Prize is all about? What’s the state of that research and how is it being applied here?

“There is a decrease in demand for less liquid assets and an increase in demand for more liquid ones.”

(puts on another hat) As a thirtysomething non-economist who currently has a bit o' extra money, how can I turn this to my advantage? :)

on Saturday, October 25th, AF said

Good news, techne, you can get paid for holding illiquid assets. The problem is that you might see them go down in value before they go up. It's hard for individuals to buy specific securities directly, but I think I saw recently that Pimco is up to like an 80% weighting in mortgage-backed securities; you might look into that. Note that I am not an investment adviser.

I don't think that Kahneman did much research in this area specifically -- he is better known for looking at how individuals think about risk, but I am not sure that he has done much about the role of group mentality or anything. Robert Shiller is more the name I think of here, although he is not a psychologist or neuro-economist or anything. But none of these guys have come up with (in my opinion) very interesting or useful results besides saying "herding happens" and "people are irrational". Why do they herd, when does it happen, how/should we control it? The literature on information cascades is interesting but hasn't really gone anywhere either.

on Monday, October 27th, AF said

I just found out that the Bank of Canada eliminated all reserve requirements in the early 1990s: http://www.bankofcanada.ca/en/review/summer05/djoudad.html
The Canadian banks still have to meet capital requirements though (which were in some ways more restrictive than in the U.S., which is part of the reason that none of the Canadian banks have blown up so far).



10 October 08. Velocity, risk, and the crash


OK, about every day, somebody asks me to explain the bailout package, the credit crunch, the subprime mortgage thing, or the concept of ownership of land.

There are so many sources that can give you the basics of what's going on that I'm not even gonna bother to give you a list of links; ask your local newspaper or search engine. I'll touch on that, but this will primarily be a topic essay going back to my favorite question: the creation of value.

Or, in this case, the creation of money. You know the government prints money (economists call this M0), and for the purposes here, that's all easy enough. But there are other means.

The first, still in the hands of the government, is simply to print bonds. A promise from the U.S. government to hand you cash 90 days in the future is basically equivalent to cash.

But it gets more interesting when other parties throughout the economy get to produce money.

Reserves
We'll start with our banks, who re-loan the same money repeatedly. Say the bank has $100 million in deposits, in air-conditioned quarters (i.e., cold, hard cash). It then lends out $90 million. The people getting the loans will tell you `we have $90 million,' and the people holding savings accounts will take out their bank statements and say `we hold $100 million.'

The bank has just generated $90 million in cash.

In fact, the requirements for banks are to hold in the ballpark of 10% of actual savings, and they can lend out the rest. Naturally, every bank will lend out to the hilt of that allowance. So if the Treasury prints a $100 bill and hands it to somebody, who puts it in the bank, the Treasury has really just put $190 in the economy.

Margin
The brokerage firm can do the same sort of trick with its clients. Using the deposits and positions on hand, it can lend cash to investors, so somebody who walks in with $100 can buy $200 worth of stock. This is known as buying on margin. Your $200 worth of stock is probably only going to shift a marginal amount in a week or two, to maybe $180 at the worst, so the $100 that the brokerage lent you is just going to sit there doing nothing under most conditions. The name margin emphasizes that we just care about the variation at that last 10% or 20% of the price. A person buying on margin deposits only 50% of the stock price and thus comes closer to just buying and selling the marginal changes.

Despite the story in the last paragraph, the 50% requirement is not insurance or an attempt to ensure that you'll be able to pay back the brokerage's loan. No, the margin deposit comes from Federal Reserve Regulation T, and it's the Federal Reserve that sets the rules because borrowing on margin is another means of printing money. At the brokerage firm's accounting department, it's very much like the bank's lending out of savings.

We typically only sell items we already own, but that's just how squares think. Why not sell the stock first, thus having a debt of so many shares, then buy them later? This is what you would do if you expect the share price to fall: sell today at $100, buy tomorrow at $90, and pull a $10 profit. This is known as a short sale. For our purposes, in that first step, you walked into your brokerage house with a 50% margin deposit of $1,000, and after selling your imaginary shares, walked out with $2,000. Sweet: you just printed yourself some cash, which you can use for other purposes anywhere else in the economy, until you decide to flatten out your short.

Returning to a prior example, the Treasury prints a $100 bill, and I put it in the bank, which then lends out $90 to a trader, who hands it in as a margin deposit to make a short sale, walking away with $180. And my savings account still says I have a hundred bucks in savings. What will the trader do with her newly-minted $180? Maybe she'll put it in her own savings account, so the bank can loan $162 to somebody else.
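The chain in that example is easy to trace with a few lines of Python. This is only a sketch of the story as told here, assuming the ballpark 10% reserve and the 50% margin deposit, and treating the short sale the same simplified way the text does; the function names are invented for illustration.

```python
RESERVE_RATIO = 0.10  # ballpark reserve figure used in the text
MARGIN_RATIO = 0.50   # margin deposit the text attributes to Regulation T

def bank_loan(deposit, reserve_ratio=RESERVE_RATIO):
    """Amount a bank can lend out against a deposit."""
    return deposit * (1 - reserve_ratio)

def short_sale_cash(margin_deposit, margin_ratio=MARGIN_RATIO):
    """Cash a trader walks away with in the text's simplified short-sale story."""
    return margin_deposit / margin_ratio

printed = 100.0                                # the Treasury's $100 bill, deposited
loan_to_trader = bank_loan(printed)            # the bank lends 90% of the deposit
trader_cash = short_sale_cash(loan_to_trader)  # margin doubles the trader's cash
next_loan = bank_loan(trader_cash)             # lendable once she redeposits it

print(f"loan to trader: ${loan_to_trader:.0f}")
print(f"trader's cash:  ${trader_cash:.0f}")
print(f"next loan:      ${next_loan:.0f}")
```

The last figure is 90% of $180, or $162, which is the roughly $160 the bank can lend on to the next person.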

All is full of value
So, back to value. Picture somebody handing you $100. I don't know how you'd respond to that. Maybe you picture what you'll buy with it, or maybe you have an inherent sense in your gut, down where you feel love and hate, that you've been given a piece of paper with some value.

Now consider the papers above. If somebody hands you a $100 90-day bond, you don't get to contemplate the size of Benjamin Franklin's forehead, but you can more-or-less have the same experience of weighing the paper and contemplating the same sort of purchases.

Although we don't often get handed stocks or loan certificates, they have the same weight, and can be traded for the same sorts of things that one could buy with cash. If holding a $100 stock in your hand doesn't have the right weight in your gut, then just sell it for the equivalent value in Benjamins and go back to buying as normal.

The point here is that our currency--the store of our economy's value--is not just bills printed by the Treasury, but all of these financial papers. We care about the velocity of all of it, including the bonds, loans, stocks, and whatever else.

Velocity
Mortgages and other loans are frequently bought and sold. E.g., my own mortgage was sold even before I could make my first payment. There are many reasons for selling a debt: one company may be good with retail customer-wrangling but another may be more efficient with monthly servicing; a company may want to offset the risk from another position with the semi-reliable income from being on the receiving side of a debt; a company may have an opinion that interest rates will move in a way the first company didn't expect; a company may want to bundle several loans to pool risk.

Whatever happens, selling the loan frees up the original lender to make more loans.

If you've been following all the subprime mortgage stories, (OK, here's one link on the subprime mortgage crisis in audio or transcript format), then you know that these subprime mortgages were re-packaged and sold, then re-re-packaged and resold, and so on. Then, as the loans at the base of all this reselling started to default, the loans stopped moving.

Money has a velocity--the rate at which it changes hands; macroeconomists have reduced the measurement of this velocity to a two-decimal-point science. High velocity is the sign of a good economy. Getting back to a fundamental axiom of microeconomics, every trade makes both sides marginally better off, so more trades mean more marginal improvements. Money sitting in the bank is sitting; money being traded is hopefully building something and making people happy.

Or if you think I'm being too airy about perceived value, then go back to the above examples: I put $100 in the bank, and Joe borrows $90. Joe puts it in the bank, and Joe's bank now has the reserves to loan out $81 to Jane. Jane puts $81 in the bank, and Econ 103 students have to work out how much money is in the system after a hundred such deposit-loan steps. Hint: a thousand bucks. If we cut this off at the third loan, then there's only about $340 in the system. So not only does trading add value in a conceptual sense, but in the context of the reserve/margin system, it creates value on the accounting ledgers, too.
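The deposit-loan arithmetic is a geometric series, and a short Python sketch can check the numbers. It assumes the simplified story told here: every bank lends out exactly 90% of each deposit, and every loan is redeposited in full. The function name and parameters are made up for illustration.

```python
def money_in_system(initial_deposit, lend_fraction, n_loans):
    """Total deposits after n_loans rounds of lend-and-redeposit.

    Assumes each bank lends out lend_fraction of every deposit and the
    borrower puts the whole loan back in a bank.
    """
    total = 0.0
    deposit = initial_deposit
    for _ in range(n_loans + 1):  # the original deposit plus n_loans redeposits
        total += deposit
        deposit *= lend_fraction  # the next loan becomes the next deposit
    return total

# Cut off at the third loan: 100 + 90 + 81 + 72.90
print(f"{money_in_system(100, 0.9, 3):.2f}")

# A hundred deposit-loan steps: just shy of the limit
print(f"{money_in_system(100, 0.9, 100):.2f}")

# The limit of the series is the initial deposit divided by the reserve ratio
print(f"{100 / 0.1:.2f}")
```

With a 10% reserve, the series converges to $100/0.1 = $1,000, the thousand bucks of the hint, while stopping after the third loan leaves $343.90.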

At this point, you can see that one of the most fundamental questions the U.S. government faces is also one of the most difficult: how many dollars are there in the system? If we print a billion more, how many billions more will spawn therefrom? If we nudge the various requirements by half a percent, how will that grow or shrink the dollar count? This column is of course leaving out a thousand details, like how Panama, El Salvador, and Ecuador officially don't have their own currencies, but just use US dollars for all affairs, and have their own wheels-within-wheels of dollar multipliers, while many other countries unofficially do the same thing.

The dollar count is fundamental, because too much leads to inflation spirals and too little leads to a sort of block in the economic plumbing where nobody can pay each other or get cash for new projects, and yet we can only make educated guesses about how much cash is in the system. But the overwhelming consensus at the moment is that we're on the blocked-plumbing side of things--the so-called `credit crunch' that the recent bailout bill is intended to alleviate. These banks deputized to produce money can't do it unless they sell off the existing loans, but those loans aren't moving, because so many of them are terrible and worthless investments. Those lousy mortgages are simply sinks of value, where it sits and waits and prevents banks from re-using it for other purposes. These loans have become brakes on money's--and thus the economy's--velocity. So the bailout bill gives the government a means to produce a few billion more dollars by buying those crappy loans, and thus freeing the banks to re-use their money-producing powers.

Of course, our government frees up this money partly by printing bonds. Why don't they just print a few billion in $100 bills and then drop them from helicopters? There are explanations for this, but they involve ritual goat sacrifice, which I'm not willing to make for the sake of this superficial discussion. If you want my official opinion about the bailout bill, I personally remain skeptical.

Risk
We like velocity, and elements of the system, like the margin allowance that lets traders make twice as many trades, are intended to maximize velocity. But the risks inherent in all of these velocity-generating and money-producing tricks are clear enough. What if all the depositors come asking for their money at the same time? What if you short sell a stock, making $100, and then the stock price rises to infinity? You've just lost infinity minus a hundred dollars, and now you can't pay back the bank that lent you the original $100, and once you default on that loan they don't have the cash depositors entrusted them with. Authors often use game metaphors here: maybe house of cards or domino effect. Things are not so fragile as that implies. Me, I think the appropriate metaphor is Twister.

One intuitive response to a disastrous calling-in of all risks at once, like we're seeing now, is to say that we shouldn't be taking all these risks, and should go back to just having the darn Treasury print money and leave it at that.

First, we can't go back. The size of the economy, in the sense of how many dollars people think they have, is calibrated to this system where banks and others can contribute this multiplier effect to the cash printed by the Treasury. There's a right amount of money in the system to keep things lubricated, and the current balance depends on these margin-type rules. Also, I really can't picture how banking would work without inducing at least some multiplier (though you can easily argue that 90% is too loose). This doesn't mean that we can't have checks on the system--and more are clearly needed--but the basic contours of the system aren't going anywhere.

If we repealed Regulation T and banned stock trades on margin, then the Treasury would more-or-less have to print that money itself. Same with the bank loans that turn $100 into $190. The government hands off the responsibility of producing money to others, and the cost of that delegation is a risk that the so-entrusted banks will fall apart.

So, you pull out your bank account statement, and it says you have $100 in checking. Where does that value come from? It is partly based upon a promise from the Federal government, but partly a chain of promises from your bank, other people who deposited money into the bank, the other side of the loan or stock transaction that let people make that deposit, and so on and so forth. That $100 is not backed by trust in the U.S. government, but by trust in the whole of the economy.




on Friday, October 10th, someone you know said

You are so frickin' smart. Just saying. If people don't read your blog then they just don't get it (I know, this is WAPO's thing).

on Saturday, October 18th, techne said

“That $100 is not backed by trust in the U.S. government, but by trust in the whole of the economy.”

ooo! Chills.

Sarah Vowell was on the Daily Show recently and referenced this bit from the first fireside chat:


After all there is an element in the readjustment of our financial system more important than currency, more important than gold, and that is the confidence of the people. Confidence and courage are the essentials of success in carrying out our plan. You people must have faith; you must not be stampeded by rumors or guesses. Let us unite in banishing fear. We have provided the machinery to restore our financial system; it is up to you to support and make it work.

It is your problem no less than it is mine. Together we cannot fail.


Quite a contrast to the messages we are receiving today--partisans would say that's due to a failure of leadership, but it seems that the global media's news cycle requirements are playing a bigger role.



24 September 08. Causality and ethics


There's a platitude that it is ethics that distinguishes humans from the rest of the natural world. In prior posts, you've seen me say that humans are distinguished by their ability and tendency to perceive causal relationships. These two statements are closely related: without causality, there can be no ethics.

Some causal chains are obvious, even to young children: if I drop a plate, then it breaks. If I kick the dog, the dog will bite me. For those that are not so obvious, you can help your child by laying it all out line by line. Here is Joe. Joe committed a misdeed. As a result, Joe's misdeed came back to him and he suffered. Here is Jane. She committed a virtuous act, and as a result, she was rewarded for it. The end.

“Person does good, is rewarded; person does bad, is punished” may sound simplistic, but it is the canonical format used by most of the stories we hear or see or read. The modern version of Little Red Riding Hood (as alluded to last time), all of Aesop's Fables, the one about Snow White and the vainglorious queen, any romantic comedy, they all tie reward to the virtuous and punishment to the misbehaving. We'll get to the stories that don't riff on that theme below.

These stories help us to move up the ladder of causal subtlety from mechanical misdeeds like kicking the dog to societal issues like littering. Thus, causal stories of the form virtue ⇒ reward and ill behavior ⇒ punishment are really central to building a society.

It so happens that religious stories directly fit into the same structure: the omnipotent overseer makes certain that good ⇒ reward and bad ⇒ punishment. Where no simple causal mechanism exists, the omnipotent overseer defines one.

The lit
I think it's so completely obvious that morality is taught through causal chains that I don't feel much compulsion to provide a host of references, but let me give you one or two so you know I'm not entirely making this up.

First, we can point to Jean Piaget, an oft-cited pioneer in the academic study of child development. Among other things, he wrote many books on how children develop cause-and-effect relationships, and one entitled The Moral Judgment of the Child (that has almost no discussion of causality). So this could be traced back to Piaget's writings circa 1930 if you were so inclined.

The intro pages to Karniol (1980) give a nice summary of the modern interpretation of Piaget's moral stories, and examples of how kids sometimes take the causal story to what we consider an absurd extreme (e.g., the boy stole the bike ⇒ the bridge collapsed). She also ran experiments on about 150 elementary school children. They were read skeletal stories of the form “Joe stole money. Later, Joe fell down the stairs.” or “Jane lied. Later, Jane fell in a puddle.” There were a range of types of causality, including immanent causality (the result is because of something inside the person), asyndetic[1] and/or mediated causality (it was the person's action, but mediated via another force), or chance causality (which is delightfully not jargon). Chance causality explanations were basically the least popular, ranging in use among the five grades from 16 to 34 percent; mediated causality ranged from 58 to 86 percent usage; immanent causality ranged from 23 to 47 percent.

That's the first experiment; the final experiment, using only kids who'd given a mediated causality response in the first experiments, and a story in which the kid in the story gets struck by lightning, was able to induce a greater recourse to chance causality among the listeners (70%). But the first two experiments (and another story in the third experiment where the boy breaks his leg) still show that if there is no causal story spelled out, the brain of the listener will probably invent one. If you want more, Karniol gives a dozen or so other papers that come to similar conclusions: even the youngest kids will see a link between a person's actions and the eventual outcome when there is a relationship to be had, and will invent one when there isn't.

Variants of the story
Now that the canonical story is ingrained in us, hard, there are all sorts of variants that turn our causal expectations around. Some just make for a better story, but others begin to show flaws in the system.

The ending to Moby Dick was so gut-wrenching because it was so outside of the entire framework. I'm a bit amazed that it got published and sold well enough that we've heard of it, given how much it bucks convention.

Adult fiction is filled with what we call moral ambiguity, by which we mean that the virtuous aren't rewarded and the evil aren't punished. This is not to be confused with stories that create tension by allowing the bad guy to win halfway through, getting the princess or the thousand pounds of gold bullion (both props play the same rôle in the typical story). In those half-win stories, tension comes from our knowledge that the inevitable downfall will only be worse after the temporary victory.

Many bookshelves have been filled with Dark Knight-type stories about characters of ambiguous virtue. But we humans have an easy solution for these stories: if we are firmly wired to see virtue ⇒ reward, then we eventually start to see reward ⇒ virtue. In logic class, it'd be a blatant error to conclude the second relation from the first, but we're not talking about logic, we're talking about how people think.

If you're an Objectivist, you learn that whatever it takes to gain reward is by definition virtuous. If you follow other sorts of commerce-oriented ethical systems, then you follow a similar but looser line. And as the cliché goes, might makes right. In the other direction, I've heard more than enough people give me a line like `it's not illegal, so it's not unethical', which in this context means no punishment ⇒ not evil.[2]

Or, it's easy for both kids and adults to miss the cause that led to the final outcome. It's downright cliché that the protagonist is attractive and the antagonist ugly, from which we are taught that attractive ⇒ reward; ugly ⇒ punishment. Add this to the last paragraph, and we find that unattractive = evil, which I find really is how a lot of people think.

If the virtuous are always rewarded and the evil always punished, then anybody who is being punished must be doing something wrong. If we see a person, or a group of people (grouped by language, size of nose, or genitalia), and find that they are doing worse than others, our brains work overtime to fill in the blank in the relation ______ ⇒ punishment. E.g., if they hadn't eaten from the Tree of Knowledge of Good and Evil, they wouldn't be worse off.

Now, all those stories are really just practice for what happens here in reality, where we write our own stories. The non-fiction evening news is making a huge effort to fulfill our expectations: the evil have to be punished, and (from time to time) the virtuous have to be rewarded. As viewers, our expectations about how the world should be are very high. If the assailant doesn't go to jail, then we're left with the frustration of a story cut short just before the resolution. If their country is evil, and our country is virtuous, then there is tension until we find a way to bring about some sort of punishment for them, preferably in a manner that brings rewards to our contractors.

And so we see a great deal of our legislative and interpersonal effort put into making sure that rewards and punishments are eventually paid out, even though the only real benefit may be the sense of resolution that comes from making the world fit the stories we were told as kids.

We all have these virtue ⇒ reward and evil ⇒ punishment relations tattooed to the inside of our foreheads. Our parents made sure of it, by teaching us ethical causal stories at the same time that we were learning more mechanistic causal stories. If they didn't present us such stories, we'd just make up our own. But the mechanical relationships like I drop the plate ⇒ the plate breaks are much more robust than the relationship between nice behavior and reward, to the point that we can easily invent unverifiable relationships, like how a pretty face and big muscles imply virtue, or spilling one's seed is evil, or that whatever person we've never met before is getting exactly what he or she deserves. The ability to develop and understand causal stories, which makes us human, gives us ethical beliefs, and allows us to construct a society, is exactly the same force that lets us dress up self-interested behavior as virtue, makes us pine for retribution against perceived slights, and nudges us to wish ill upon those who look or behave differently from our ideal.

@article{karniol:immanent,
title = "A Conceptual Analysis of Immanent Justice Responses in Children",
author = "Karniol, Rachel",
journal = "Child Development",
volume = 51, number = 1,
pages = "118-130",
publisher = "Blackwell Publishing on behalf of the Society for Research in Child Development",
year = "1980"
}



Footnotes

... asyndetic1
Syndetic: Serving to unite or connect; connective, copulative.
... evil.2
It's not as if I know what the True and Correct ethical system is, but an ethical system that directly equates individual benefit with ethics is really just the state of nature calling itself ethics, and a rejection of the idea that we humans can develop beyond biology.


[link][a comment]

on Wednesday, September 24th, Spoofy said

Nice post. What happens if an attractive person spills his seeds? Is this person still evil? OR does this go back to the ambiguity thing= only ugly people that spill seeds are truly evil.



3 September 08. Google OS (aka Chrome)

[PDF version]

OK, Ms ABR of Washington, Columbia asked me to write about Google's new browser, so here goes. I'm typing fast, editing lightly, and posting on an odd-numbered day.

Google's browser is an attempt to shift the position of a long-running search for balance, over where work is to be done. So this discussion of the browser has to start with a brief history of networked computing.

We begin with your mainframes of old, like before-we-were-born old, which often had terminals attached. Terminals, like the terminus of a railroad station, were the end of a line out of the central system, where the end in this case has a screen and a keyboard attached. You would send requests from your little end of the line, they would go to the mainframe, and then it would send results back down the line. Thus, these terminals were called dumb (or dummy) terminals, because they did no thinking; they just relayed keyboard presses and displayed the output.

This is why the personal computer revolution was so interesting: you now had terminals that looked like dummy terminals (like the TRS-80) that were capable of doing things on their end of the line. So home users, who had no mainframe to attach to, were increasingly using these little terminals to do independent work that the dummies could never do.

Now, put a mainframe capable of math on one end, and a terminal capable of doing math on the other. The key question for the rest of the essay: who does the processing?

To make this more concrete, jump forward to the Internet age. You type in a web address, and the server sends back a big block of text. That's dummy terminal mode, where your computer is doing minimal thinking. Now say you go to a site with silly Flash or Java games. You go to the site, you get a bar that says `loading' on the screen for a minute, and then you play your game on your screen, without really talking to the server. Now things are reversed: the server just read your request and dumped back data, and your PC does all the work.

Or say you go to Gmail. It has a `loading' bar like a Flash game. But the server is active, because it's trying to find your new mail, starred mail, spam count, and so on. But your PC is active, because it's opening and closing window bits without talking to the server, autocompleting and highlighting things when your mouse is in just the right place, and so on. There's a sweet spot between work on the server side and work on the client side; a lot of people think Google has hit it. No citations today. But try typing 'Google sweet spot' into, uh, a search engine. Me, I think Google has missed it: my email should not need a `loading' bar, but that's just opinion.

Virtual machines
Why not have the client do everything? That's the clear trend, but it's been tried before, and past efforts were not as victorious as hoped. Recall Java, which emerged with much hype in the mid-1990s as the way to get networked computing onto our increasingly smart client PCs. In retrospect, we can see Java's failings pretty clearly. First and of least importance, it emerged in the middle of the object-oriented fad of computing, and the language itself went way overboard.

Second, it relied on a virtual machine (VM) that never ran as well as we'd have hoped. Sun promised to write a VM for any device (telephone, Windows box, Linux box) that would handle the guts and details, and then you'd write a program in one language--Java--that runs on all these machines. But the VMs were all a little different: at the least, your telephone has buttons that your PC doesn't and vice versa, so how do you write something that works in both places? But the big virtual machine difference was between Sun's virtual machine and Microsoft's. Microsoft's Java machine was designed to be incompatible, as recorded in a ton of court documents. You'll recall the press about the Microsoft antitrust case, which was mostly about Microsoft killing the Netscape browser, but the real crux of the case was about how the browser carried a Java VM, and Microsoft felt it important to kill the VM.

So once you download a Java program, it might not run. Running from a virtual machine instead of native to the hardware, it might run slowly. And finally, there was the downloading issue: a Java program is too much for a guy living in 1995 with spotty AOL dialup to use without frustration.

But the virtual machine idea was a good one. It's a fabulously attractive idea to have a code-running box that manages all the low-level work, so programmers can do the high-level stuff. It's so fabulous that Microsoft does it: their .NET framework basically allows you to write in any language, then translate it to their .NET machinery to run on a Windows box. This is exactly the abstraction Java did, but .NET is written around Windows machines. The virtual machine idea predates Java. Infocom games, like Zork or the Hitchhiker's Guide to the Galaxy, were data files for a virtual machine. The Infocom VM is easy to rewrite; I could get one for my telephone.

Your browser is a virtual machine. Every browser can read JavaScript (whose code has no discernible relationship to Java--the naming similarity is pure advertising), and can run Flash, and load Java programs. That's why Google's mail program can run on basically any machine as long as you have a browser to interpret its JavaScript.

The family tree
One of my favorite things about how modern computers work is the fork/exec model. I won't bother with details, but programs can start other programs. Every process has a parent process (unless the parent died, in which case it's an orphan), and no program can spawn out of nowhere: it needs a parent. This is how the entire thing works, from boot to shutdown: you start with init, then init forks off a new program, say the bash shell. Then the bash shell forks off a browser when you type firefox at the command prompt. Then you open a lot of tabs in Firefox.
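For those who do want a few of the details, here is a minimal sketch of the fork/exec dance in Python on a POSIX system. The function name `spawn` is made up for illustration; `os.fork`, `os.execvp`, and `os.waitpid` are the standard library's thin wrappers around the system calls of the same names.

```python
import os
import sys


def spawn(argv):
    """Fork a child, have it exec argv, and return the child's exit code.

    POSIX only: os.fork is not available on Windows.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed (e.g. program not found); leave immediately
            # without running the parent's cleanup handlers in the child.
            os._exit(127)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Child was killed by a signal; report it as a negative number.
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    child_code = "import os; print('child', os.getpid(), 'of parent', os.getppid())"
    exit_code = spawn([sys.executable, "-c", child_code])
    print("parent", os.getpid(), "saw exit code", exit_code)
```

A shell launching a browser is roughly this, with the browser's name in place of the little Python one-liner.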

The process model gives you stability, because the children are only vaguely related to their parents (mostly via carefully-controlled interprocess communication), and if the parent has issues, then they won't affect the child, and vice versa. It's the operating system's job to make sure that this is the case, and to make sure that the processor gives fair time to every process running, where by `fair time' I mean access to the hard drive, the processor, and other physical resources the OS is taking care of.

So back to Firefox, which does not spawn child processes (to speak of). It's one monolithic blob to the operating system, not a family, so, e.g., if one blob of Javascript fails in one place, then all the others will also be stuck.

Google Chrome is prolific: it is designed to spawn lots of children. For every web page you have open, you should have a separate process. So let's review: you have a Javascript program (aka a web page) in one tab, and that tab is its own process that the operating system treats equally to every other program. Yup, sounds like a standalone virtual machine to me, exactly like the Java VM or Microsoft's .NET.
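A toy version of the process-per-tab idea can be sketched with Python's `subprocess` module. The `tabs' below are hypothetical one-line scripts standing in for web pages, and `run_tabs` is a name I made up; none of this is Chrome's actual code. The only point is that when one child crashes, its siblings and the parent carry on.

```python
import subprocess
import sys

# Hypothetical "pages": tiny Python programs standing in for tabs.
TABS = {
    "mail": "print('mail loaded')",
    "broken": "raise RuntimeError('bad script')",  # this tab crashes
    "docs": "print('docs loaded')",
}


def run_tabs(tabs):
    """Run each tab in its own OS process; return {name: exit code}."""
    # Start every tab first so they all run side by side.
    procs = {
        name: subprocess.Popen(
            [sys.executable, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        for name, code in tabs.items()
    }
    # Then collect exit codes; a crashed child is just a nonzero code.
    return {name: proc.wait() for name, proc in procs.items()}


if __name__ == "__main__":
    for name, code in run_tabs(TABS).items():
        print(name, "ok" if code == 0 else f"crashed (exit code {code})")
```

Put all three scripts in one process instead, and the exception in the middle one would take the whole thing down with it.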

So Google has taken those last steps to make our typical programming languages of the Web exactly the languages you need to write standalone programs for any operating system. With a few lines of Javascript and HTML, you can write and distribute a standalone Windows program.

Or to put it more directly: the operating system now gives equal treatment to Google Docs and Microsoft Office.

Critique and politics
The Google VM will definitely benefit Google: they've got the lead in programmers who speak the language that their VM speaks. Does that make their browser evil? Maybe, but as evil goes, this is pretty beneficial to everybody (except Microsoft), because another VM choice may allow some fun new applications.

In fact, Google has made their code available under a relatively corporate-friendly member of the family of free software licenses (BSD). Why? Because they don't care about vending VMs, they want to make sure that absolutely everybody has such a VM, so that it's feasible to write for the Google VM rather than for .NET or whatever other toolkits might be hanging around. How getting people to choose Javascript over .NET will turn into $$$ for Google is left as an exercise for the reader.

Oh, here's one hint (along one of several threads): go back to the problem of balancing work on the client and server ends of the cable. If Google gives you software that grabs more processor time on your PC for Google Docs, then it can redesign things so that its servers in California don't have to think so much. Google doesn't have to spend cash on new servers--they just use more processor time on your PC. Google is thinking maybe you can pay the darn electricity bill for once.

Further, mainframes are not particularly smart. From my own experience buying servers for research, the big boxes are designed to push lots of data through a pipe, hold a big database, mesh together into an army of servers, and otherwise handle lots of little requests. But the processor on some servers is identical to the processor on a high-end PC, and ten cheap PCs would easily run circles around one blade of a server. So the only way that Google could feasibly make a million instances of Docs smarter is to push work out to the clients.

As a digression, all this processor-seeking touches upon one of my personal pet peeves: VMs are slow. As I type, I'm waiting for Amarok to add an album to my playlist. This is not something that should require waiting for (op--it's done), but Amarok is written in Ruby, which allowed for all sorts of nifty widgets that would take longer to write from scratch. Hey, just click a performer and pull up their Wikipedia page in your music player, all while you're waiting for their music to actually play. So I'm not sure if we can expect too much richness from Google's new virtual machine, though maybe for once the promises that it'll be better with next year's faster processors will actually come true.

But that's all the critique I've got. Google has taken that last step to turn the web pages of the type in which they specialize into bona fide applications that the operating system treats as such. That's nifty, and means that we can expect our web pages to turn increasingly complex and to increasingly take advantage of the processing power on our end of the cable.


[link][2 comments]

on Thursday, September 4th, AB said

Well thanks for this post Mr. Blair. I do have a few questions: If Chrome really does slow down people's home computers, won't there be a public outcry (and eventual disuse). Also: Has anyone else heard of the IP issues that seem to be surrounding Chrome?

on Thursday, September 4th, the author said

--Your processor has 100% time to allocate to everything running. Before, if you had Word and firefox running two tabs, then Word gets 50% time, Firefox tab 1 gets 25% time, and Firefox tab 2 gets 25% (very, very roughly). If they're all separate processes, each gets 33% time. Overload with the first setup just pushes Firefox to crawl; overload in the second case slows Word down too. [very, very roughly.]

If you care about the performance of those two tabs running Google Calc and Google Reader over Word's performance, then great, you like the additional boost. If you care about Word's performance, have enough tabs open, are sensitive enough to performance issues to notice, the "very, very roughly" part doesn't have too big of an impact, resource allocation is sufficiently consistent that there's a pattern, and you catch that pattern, then maybe you'll be annoyed that the new browser takes more resources from other standalone programs.

--There was a bit of an outcry over some boilerplate language in the user license that all the data that passes through Chrome is the property of Google, we own your passwords, and so on. They retracted it pretty much the next day, and I'm willing to believe that it was really just lawyers overcovering their asses, and not really what Google was after. Anyway, this is on a BSD license, you can pick through for any monitoring code and see how Google is watching, if at all.



26 August 08. Statistics as unbearable longing

[PDF version]

Introductory logistics: This is part two; see the beginning here. Also, I just finished writing a textbook on statistical computing--about three hours ago as I write this, and I'm half relieved about getting my time back, half anxious about how long it will take before I get my first complaint letter about how the betas on page 317 should be in boldface, and half amazed that I managed to write something like that. I suppose this is a bit of catharsis after writing hundreds of pages of math and data technique.

Statistics--by which I mean all of mathematical inquiry aimed at explaining the real world, and sometimes even plain measurement--has fundamental failings for its intended purpose of allowing us humans to better understand the world.

Picking up from last time, statistics can never prove. The real world is uncertain and messy, but mathematics is pure and certain and unwavering. Mix the two together, and what do you get? An uncertain mess.

Our language is inclined toward our desire to accept things as true. Statistical language makes at least a half-assed effort--maybe even a three-quarter-assed effort--to retain skepticism at all times. A good and pedantic hypothesis test comes up with two outcomes: reject or fail to reject. This seems appropriately skeptical, and it means that if somebody is snooping around in the data beforehand, and inappropriately failed to reject, then it's no big deal: we just learned nothing from that experiment (in formal stats language, the test had insufficient power).
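To make the two-outcome vocabulary concrete, here is a minimal sketch of a test that can only ever say `reject' or `fail to reject'. The permutation test of a difference in means is my choice for illustration, not a method this post depends on; the function name, the 5% threshold, and the number of shuffles are all assumptions.

```python
import random
from statistics import mean


def permutation_test(xs, ys, n_perm=2000, alpha=0.05, seed=0):
    """Two-sided permutation test of H0: xs and ys have the same mean.

    Returns (p_value, verdict), where verdict is one of the two
    pedantic outcomes: "reject H0" or "fail to reject H0".
    """
    rng = random.Random(seed)
    observed = abs(mean(xs) - mean(ys))
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Re-split the shuffled pool into groups of the original sizes.
        diff = abs(mean(pooled[:len(xs)]) - mean(pooled[len(xs):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (hits + 1) / (n_perm + 1)
    verdict = "reject H0" if p_value < alpha else "fail to reject H0"
    return p_value, verdict


if __name__ == "__main__":
    print(permutation_test(range(10), range(100, 110)))
    print(permutation_test([1, 2, 3, 4, 5], [1, 2, 3, 4, 5]))
```

Note that there is no branch that returns `accept H0': the weaker verdict is the only alternative on offer.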

But this fine point breaks down at every opportunity, because we long for certainty, and statistics just won't give it to us.

The first breakdown (and non-math-geeks are welcome to skip to the next paragraph) is that the system is asymmetric regarding what should be symmetric hypotheses. Given two variables, we typically wind up with H0: the variables are equal, and H1: the variables are not equal. The above reject/fail-to-reject language typically refers to H0. Failing to reject H0 is appropriately indefinite, but rejecting H0 is definite, not-squishy language stating that we know the variables are not equal, because we reject the claim of equality. Since the researcher is probably trying to show that the variables are different, the language is slightly skewed in favor of the researcher. In an ideal world, perhaps we'd say that the test fails to reject and fails to accept. Then when we fail to reject one hypothesis, we're failing to accept the alternate, which has the same level of confidence on both sides.

The second problem is that even that little bit of legalese, fail to reject, is hard to keep in place for long--it turns into accept even in many stats textbooks, especially the ones with a `tude that tries to make statistics fun. And don't expect the phrase to ever appear in the newspaper: my brief search of the NYT turns up one op-ed making exactly the point I'm making here, one correct use of the phrase, and assorted cruft. The longing for certainty is just too strong to let weak language stand.

But there are benefits to accepting the weakness of statistics. If we bear in mind that statistics can not prove, then my lament last time about how all our published positive results are doomed to be too confident is not so bad. An article with a solid result from a statistical test should simply slightly raise our confidence in whatever they found. If the research was especially carefully conducted, then it will raise our confidence a lot. Perhaps another article will come by next year that bolsters our belief or cuts it down a bit.

So after incredibly tedious and careful mathematical contortions, the best result we can get is that the human reader believes the claim a little bit more.

Some people are disappointed by the inability of mathematics to touch the core of what we as humans want, and just reject the entire project. Forget all those studies: they either tell us what we already know or are a pile of sophistry that will be contradicted next week. That's extreme. Our measurements are never perfect, but we make them. We're surrounded by black boxes that we'll never be able to crack open, and situations where we know any measurement will be imprecise. Despite knowing that we'll never be able to fully and truly understand anything, we still try.

Correlation is not causation, but neither is anything else
But to really make our model good, we need to tell a story, almost invariably of the form A causes B. Unfortunately, statistics has no concept of causality.

This is one of those philosophy of science things that you could expound on forever, though I won't go into it too deeply here. But the concept of causation happens only inside the human brain. It's not something we can measure, perhaps with a causality ruler (or a more portable causality tape), and then write down that A causes B with 3.2 causal units, but C causes B with 8.714 causal units. There are intuitive ways to measure a causal claim, like saying that if A always comes before B, then A causes B; in direct correspondence, there are easy ways to break such a simple measure, like how Christmas card sales cause Christmas.

But people like stories. As kids, we're taught how the world works via causal stories, which are not just lists of incidents but chains of events. Because granny was ill, Ms Hood took her basket of food and went walking over the river and through the woods; because the wolf was evil, he conspired to eat Ms Hood; because Ms Hood was virtuous, she was saved. A story where a bunch of unconnected, seemingly random things happen is just not satisfying, and correlation without causation is dissatisfying in exactly the same way.

You could take the basic intuition about how causality works and build machinery to draw causal flowcharts, which give a wealth of means to reject the flowchart; look up structural equation modeling or read Pearl (2000). But apply the above rule that statistics can never prove a model of the world: statistics can never prove a causal model of the world--and this case is only worse because we're not even entirely certain about how to measure or even identify causality. As with any model, stats can bolster or cut down our confidence in the causal claim, but that's where it ends.

Of course, people fake it all the time. You will rarely if ever find a newspaper article declaring a correlation without strongly implying (if not directly stating) that the statistical model showed a causal link. Get your favorite researcher drunk and he or she will stop talking about correlations and start talking about causation, even though everybody in the room knows that it's just a mathematical mirage.

There's so much that we want to understand about our world and those around us that we'll never come close to. We're just guessing at reality based on our sadly limited information, and nothing makes that more evident and visceral than statistics.

Relevant previous entries:
The one about how people often reject academic studies without consideration.

@book{perl:causality,
author="Judea Pearl",
title="Causality",
publisher="Cambridge University Press",
year=2000, month=mar
}



[link][a comment]

on Thursday, August 28th, SueDoc said

I failed to reject your mom's null, if you know what I mean.



6 August 08. The two sides of the statistical war

[PDF version]

There is a little war in the statistical world. Like other little wars, like Mac vs PC or Ford versus Chevy or Protestant versus Catholic, everybody who isn't on one of the teams has no idea how to differentiate between the two sides. Also, there is no resolution to the central question.

John Tukey (1977, pp 1-2) gives a metaphor of the detective and the judge. The detective gathers all the evidence he can, regardless of whether the evidence will be admissible in court or whether it proves guilt or innocence. He just compiles as thick a notebook as possible and worries about sorting it out later. The judge does the sorting. She is bound by law to ignore some evidence, and is comfortable ignoring most of the detective's notebook as irrelevant to the final, narrow question before the court.

Data-oriented inquiry has a very similar division, of descriptive modeling and hypothesis testing. The descriptive modeling step simply gathers information and puts it into a human-comprehensible format. The hypothesis test uses the strict laws that you forgot from statistics class to make a more objective statistical claim.

In the last episode, we saw many examples of descriptive modeling: take in all the airline prices, and list all the patterns you--or a computer--can find. Find the smallest demographic/marketing subgroup who all want to vote for Obama. Observe that pesticide use has been going up with time, and cancer rates have gone up with time.

There are two steps to take from there, one of which (developing a causal link) I won't talk about until next time. The main step is the hypothesis test, wherein you come up with some means of verifying the claim that the relationship you just found is what you claimed it was.

We need those extra steps because correlations could be sheer coincidence, meaning that they may reflect a true statement about the data at hand, but we shouldn't rely on them next week, or claim that there is some causal story that made that correlation happen. Stupid coincidences happen all the time and are easy to manufacture.

The problem with all our wonderful technology, however, is that as the power of your relation-searching machinery goes up, the power of your hypothesis testing diminishes. Here are two questions:

Randomly draw a person from the U.S. population. What are the odds that that person makes more than $1m?

Randomly draw 350 million people from the U.S. population. What are the odds that the wealthiest person in your list makes more than $1m?

The odds in the second case will be much higher, because we took pains in that one to pick the wealthiest person we could. That is, the first is a hypothesis about just data, the second is a hypothesis about an order statistic of data.
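The arithmetic behind the two questions is short enough to sketch. Assume the draws are independent and that a single person clears $1m with probability p; then the chance that the wealthiest of n draws clears it is 1 - (1 - p)^n. The value of p below is a placeholder I made up, not a real statistic, and the function name is likewise hypothetical.

```python
import math


def prob_max_exceeds(p, n):
    """P(at least one of n independent draws exceeds the threshold),
    where p (with 0 <= p < 1) is the chance that a single draw does."""
    # 1 - (1 - p)**n, computed via log1p/expm1 so tiny p stays accurate.
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    p = 0.002  # hypothetical share of people making over $1m; not real data
    for n in (1, 1_000, 350_000_000):
        print(f"n = {n:>11,}: {prob_max_exceeds(p, n):.6f}")
```

With n = 1 the answer is just p; as n grows, the answer heads to one no matter how small p started out.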

Now say that you have a list of variables before you.

Claim based on intuition that A is correlated to B1. What are the odds that your claim passes a hypothesis test at the 95% confidence level?

Write down the best correlation between A and B1, B2, ..., B1,000,000. What are the odds that your best correlation passes a hypothesis test at the 95% confidence level?

With a big enough list of variables, you are guaranteed to find a correlation (or any other model) that passes any hypothesis test you want.
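Here is a small simulation of that claim, as a sketch rather than a proof. Everything is pure noise, so any correlation found is coincidence by construction. The sample size of 30, the 1,000 candidate variables (a million would just take longer), and the function names are all my assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def best_noise_correlation(n_obs=30, n_vars=1000, seed=0):
    """Correlate noise A against n_vars noise Bs; return the largest |r|."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_vars):
        b = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(a, b)))
    return best


if __name__ == "__main__":
    # With 30 observations, |r| above roughly 0.36 passes a two-sided 5% test.
    print("one intuitive guess:", round(best_noise_correlation(n_vars=1), 3))
    print("best of 1,000 tries:", round(best_noise_correlation(n_vars=1000), 3))
```

The single guess usually fails the test, as it should for noise; the best of a thousand clears the bar comfortably.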

You've read stories like this before: researcher inspects the data very carefully, eventually stumbles upon a relationship that works, thinks about how it makes sense that those two variables are related, and then publishes. With luck, it's something quirky enough to get into the NYT, Economist, or any other pop science outlet that happily reports one-off, unreplicated studies about how a crazy and unexpected variable has an important effect on the things we care about.

And that's the core of the conflict. The descriptive camp points out that it can develop badass means of testing a thousand hypotheses, and the hypothesis testing camp points out that once they do that and pick the best correlation out of a thousand, all the hypothesis tests are basically invalid until modifications are made that the descriptive kids won't bother to make.

There are a few ways by which we can have too many hypotheses. The simplest is to just have a systematic list of a few million possibilities in need of testing. If we can get a million genetic markers from a drop of blood, which we can do, then we need to correct for that as we run a million hypothesis tests. People usually do the corrections in this case.
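One common form of such a correction is Bonferroni's, sketched below. I am assuming this one purely for illustration, since it is the simplest to write down; the function names are mine, and the first formula treats the tests as independent.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive across m independent tests,
    each run at significance level alpha, when every null is true."""
    return 1 - (1 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the family-wise error
    rate at or below alpha across m tests (Bonferroni correction)."""
    return alpha / m


if __name__ == "__main__":
    m = 1_000_000  # e.g. a million genetic markers
    per_test = bonferroni_threshold(0.05, m)
    print("uncorrected family-wise error:", familywise_error(0.05, m))
    print("per-test threshold after correction:", per_test)
    print("corrected family-wise error:", familywise_error(per_test, m))
```

The price of the correction is that each individual marker now has to clear a bar a million times higher.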

Before moving on to the real disasters, let me note that some people reject the discussion to this point. If variables A and B2891 are truly and honestly correlated, then that fact is true no matter whether we ran exactly one test or ran a million. There is no Heisenberg weirdness here: observing the correlations does not change them.

However, our tests and how we interpret them are changing. A hypothesis test makes sense only in a given environment, and that environment has to include the data, how the data was gathered, cleaned, and pre-inspected, and what other tests are being run at the same time. In the cookbook-format manual, none of this gets mentioned: the recipe calls for a list of numbers, mashed into a certain statistic, compared to a certain table, and you're done. But once a human observer comes along, you're already out of the textbook.

But the people who don't quite get the concept of the multiple testing problem don't get much cred. It's subtle and easy to get wrong, but people eventually work it out. If you write a loop to run every regression of a list of twenty variables against some outcome (usually GDP or some overall productivity number), then you are guaranteed to find an excellent fit to your data, and you will have no proof that what you found is any good, and nobody will respect you.

No, that's not where the debate lies.

Eyeballing multiple testing
Here's another way to get too many hypotheses: given a list of twenty variables, you can produce what is called a Trellis™ or lattice plot, which gives a 2-D dot plot of every variable against every other. It's not hard to put plots for twenty variables on a screen, and then scan to find the pair whose line is sharpest and shows the best correlation. Congratulations, you've just run 20×19/2 = 190 hypothesis tests (the 380 off-diagonal panels show each pair twice). When tested more formally, the correlation you just spotted is almost guaranteed to pass, even if your data is pure noise. Or you can try any of a multitude of other visualizations that will similarly allow you to see hundreds of relations at once.
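The eyeball scan can be imitated in a few lines, as a sketch under assumptions: twenty columns of pure noise with 30 observations each, and a cutoff of |r| > 0.361, which is roughly the two-sided 5% critical value at that sample size. A 20-by-20 grid has 380 off-diagonal panels, but each pair of variables appears twice, so the code walks the 190 distinct pairs. The function names are made up.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def scan_noise_pairs(n_vars=20, n_obs=30, cutoff=0.361, seed=1):
    """Return (pairs examined, pairs with |r| above cutoff, sharpest |r|)."""
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    # One |r| per distinct pair of columns: what the eye does with the grid.
    rs = [
        abs(pearson(data[i], data[j]))
        for i, j in itertools.combinations(range(n_vars), 2)
    ]
    passing = sum(r > cutoff for r in rs)
    return len(rs), passing, max(rs)


if __name__ == "__main__":
    n_pairs, passing, sharpest = scan_noise_pairs()
    print(f"{n_pairs} pairs scanned, {passing} look significant, "
          f"sharpest |r| = {sharpest:.3f}")
```

At a 5% level you would expect somewhere around nine or ten of the 190 noise pairs to look significant, and the eye goes straight to the best of them.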

The DataViz field is trendy right now. There are a few icons of the field who are working hard on self-promotion, such as Edward Tufte, whose books show how graphs can be cleaned up, chartjunk eliminated, and grainy black and white fliers from the 1970s cleaned up through the use of finely detailed illustrations in full color. John Tukey's Exploratory Data Analysis (cited above) is aggressively quirky, and encourages disdain for the hypothesis testing school.

These guys, and their followers, are right that we could do a whole lot better with our data visualizations, and that the stuff based on facilitating fitting the line with a straightedge should have been purged at least twenty years ago.

The underlying philosophy, however, is humanist to a fault. The claim is that the human brain is the best data-processor out there, and our computers still can't see a relationship among a blob of dots as quickly as our eye/brain combo can. This is true, and a fine justification for better graphical data presentation. And hey, we humans would all rather look at plots than at tables of numbers.

A slice from plate nine of the Rorschach series of inkblots.
Figure One: If you don't see faces, you're crazy. Oh, and there's a penis and vagina in every inkblot too.

But the argument forgets that humans are so good at seeing relationships among blobs of dots that we often see patterns in static (there's a word for this tendency: apophenia). We look at clouds and see bunnies, or read the horoscope and think that it's talking directly to us, or listen to a Beatles song about playground equipment and think it's telling us to kill people. Given ten scatterplots, you will find a pattern--in fact, if a psychologist were to show you a series of ten seemingly random inkblots and you didn't see a reasonable number of patterns in them, the psychologist might consider you to be mentally unhealthy in any of a number of ways.

Better data visualization doesn't address the problem of apophenia. In fact, following Tukey's lead, the people who focus on clean testing are characterized as not seeing the value of all these full-color diagrams. They're wearing blinders for the sake of being good Boy Scouts and not seeing the trees and grass and chirping birds around them. Conversely, the testing people generally see little value in all these full-color plots and want to go back to inferring things.

So this is the current battleground in the descriptive-versus-testing war. No side can win--there are no tests for overtesting, so this is all just intuition and opinion. We can write down in a cookbook that if your data-analysis model includes a series of ten tests, you need to make such-and-such an order-statistic correction. But how do you write into a textbook model framework that you surfed charts of the data for forty-five minutes, including eight 3D plots and two Trellis™ diagrams?

Further, both sides are necessary, and both sides have valid points. So this is a perfect recipe for sniping back and forth forever.

But overcharting (and defining what that means) is not where the true problem lies.

The looming problem
After all, there is a middle ground, where a person comes in with some idea of what the data will say, rather than waiting for the scatterplot of Delphi to reveal it. Then the researcher refines the original idea in dialog with the data. The closer something fits prior human beliefs, the more we are inclined to accept it, so the researcher is not on a pure fishing expedition, but is not wearing blinders to what the data has to say.

So one researcher could be reasonable--but what happens when there are thousands of reasonable researchers? When a relevant and expensive data set has been released, a large number of people will look at it. I've been to an annual conference attended by about a hundred people built entirely around a single data set, and who knows how many weren't able to fly out. With so many humans looking at the same set of numbers, every reasonable hypothesis will be tested. Even if every person maintains the discipline of balancing data exploration against testing, we as a collective do not.

Every person was careful to not test every option, so the order statistic problem seemed to be dodged. But the environment is not just one researcher at a computer; it is thousands across the country. Collectively, a thousand hypothesis tests were run, and journals are heavily inclined to publish only those that scored highly on the tests. So it's the multiple testing problem all over again, but in the context of the hundreds or thousands of researchers around the planet studying the same topic. Try putting that into a cookbook description of a test's environment.

There's no short-term solution to this one.

In the next episode, I'll take this a little further.

@book{tukey:eda,
author = "John W Tukey",
title = "Exploratory Data Analysis",
publisher = "Addison-Wesley",
year = 1977
}



[link][no comments]
