PHYS771 Lecture 9: Quantum
Scott Aaronson
There are two ways to teach quantum mechanics. The first way — which for most physicists today is still the only way — follows the historical order in which the ideas were discovered. So, you start with classical mechanics and electrodynamics, solving lots of grueling differential equations at every step. Then you learn about the “blackbody paradox” and various strange experimental results, and the great crisis these things posed for physics. Next you learn a complicated patchwork of ideas that physicists invented between 1900 and 1926 to try to make the crisis go away. Then, if you’re lucky, after years of study you finally get around to the central conceptual point: that nature is described not by probabilities (which are always nonnegative), but by numbers called amplitudes that can be positive, negative, or even complex.
Today, in the quantum information age, the fact that all the physicists had to learn quantum mechanics this way seems increasingly humorous. For example, I’ve had experts in quantum field theory — people who’ve spent years calculating path integrals of mind-boggling complexity — ask me to explain the Bell inequality to them. That’s like Andrew Wiles asking me to explain the Pythagorean Theorem.
As a direct result of this “QWERTY” approach to explaining quantum mechanics – which you can see reflected in almost every popular book and article, down to the present — the subject acquired an undeserved reputation for being hard. Educated people memorized the slogans — “light is both a wave and a particle,” “the cat is neither dead nor alive until you look,” “you can ask about the position or the momentum, but not both,” “one particle instantly learns the spin of the other through spooky action-at-a-distance,” etc. — and also learned that they shouldn’t even try to understand such things without years of painstaking work.
The second way to teach quantum mechanics leaves a blow-by-blow account of its discovery to the historians, and instead starts directly from the conceptual core — namely, a certain generalization of probability theory to allow minus signs. Once you know what the theory is actually about, you can then sprinkle in physics to taste, and calculate the spectrum of whatever atom you want. This second approach is the one I’ll be following here.
So, what is quantum mechanics? Even though it was discovered by physicists, it’s not a physical theory in the same sense as electromagnetism or general relativity. In the usual “hierarchy of sciences” — with biology at the top, then chemistry, then physics, then math — quantum mechanics sits at a level between math and physics that I don’t know a good name for. Basically, quantum mechanics is the operating system that other physical theories run on as application software (with the exception of general relativity, which hasn’t yet been successfully ported to this particular OS). There’s even a word for taking a physical theory and porting it to this OS: “to quantize.”
But if quantum mechanics isn’t physics in the usual sense — if it’s not about matter, or energy, or waves, or particles — then what is it about? From my perspective, it’s about information and probabilities and observables, and how they relate to each other.
Ray Laflamme: That’s very much a computer-science point of view.
Scott: Yes, it is.
My contention in this lecture is the following: Quantum mechanics is what you would inevitably come up with if you started from probability theory, and then said, let’s try to generalize it so that the numbers we used to call “probabilities” can be negative numbers. As such, the theory could have been invented by mathematicians in the 19th century without any input from experiment. It wasn’t, but it could have been.
Ray Laflamme: And yet, with all the structures mathematicians studied, none of them came up with quantum mechanics until experiment forced it on them…
Scott: Yes — and to me, that’s a perfect illustration of why experiments are relevant in the first place! More often than not, the only reason we need experiments is that we’re not smart enough. After the experiment has been done, if we’ve learned anything worth knowing at all, then hopefully we’ve learned why the experiment wasn’t necessary to begin with — why it wouldn’t have made sense for the world to be any other way. But we’re too dumb to figure it out ourselves!
Two other perfect examples of “obvious-in-retrospect” theories are evolution and special relativity. Admittedly, I don’t know if the ancient Greeks, sitting around in their togas, could have figured out that these theories were true. But certainly — certainly! — they could’ve figured out that they were possibly true: that they’re powerful principles that would’ve at least been on God’s whiteboard when She was brainstorming the world.
In this lecture, I’m going to try to convince you — without any recourse to experiment — that quantum mechanics would also have been on God’s whiteboard. I’m going to show you why, if you want a universe with certain very generic properties, you seem forced to one of three choices: (1) determinism, (2) classical probabilities, or (3) quantum mechanics. Even if the “mystery” of quantum mechanics can never be banished entirely, you might be surprised by just how far people could’ve gotten without leaving their armchairs! That they didn’t get far until atomic spectra and so on forced the theory down their throats is one of the strongest arguments I know for experiments being necessary.
A Less Than 0% Chance
Alright, so what would it mean to have “probability theory” with negative numbers? Well, there’s a reason you never hear the weather forecaster talk about a -20% chance of rain tomorrow — it really does make as little sense as it sounds. But I’d like you to set any qualms aside, and just think abstractly about an event with N possible outcomes. We can express the probabilities of those outcomes by a vector of N real numbers:
(p_1, …, p_N).
Mathematically, what can we say about this vector? Well, the probabilities had better be nonnegative, and they’d better sum to 1. We can express the latter fact by saying that the 1-norm of the probability vector has to be 1. (The 1-norm just means the sum of the absolute values of the entries.)
But the 1-norm is not the only norm in the world — it’s not the only way we know to define the “size” of a vector. There are other ways, and one of the recurring favorites since the days of Pythagoras has been the 2-norm or Euclidean norm. Formally, the Euclidean norm means the square root of the sum of the squares of the entries. Informally, it means you’re late for class, so instead of going this way and then that way, you cut across the grass.
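To make the two norms concrete, here is a tiny Python comparison (my own illustration, not part of the lecture):

```python
import numpy as np

v = np.array([0.3, -0.4, 0.5])

one_norm = np.sum(np.abs(v))        # sum of absolute values
two_norm = np.sqrt(np.sum(v ** 2))  # square root of the sum of squares

print(one_norm)                      # 1.2
print(two_norm, np.linalg.norm(v))   # ~0.707, and NumPy's built-in norm agrees
```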
Now, what happens if you try to come up with a theory that’s like probability theory, but based on the 2-norm instead of the 1-norm? I’m going to try to convince you that quantum mechanics is what inevitably results.
Let’s consider a single bit. In probability theory, we can describe a bit as having a probability p of being 0, and a probability 1-p of being 1. But if we switch from the 1-norm to the 2-norm, we no longer want two numbers that sum to 1; we want two numbers whose squares sum to 1. (I’m assuming we’re still talking about real numbers.) In other words, we now want a vector (α,β) where α² + β² = 1. Of course, the set of all such vectors forms the unit circle in the plane.
The theory we’re inventing will somehow have to connect to observation. So, suppose we have a bit that’s described by this vector (α,β). Then we’ll need to specify what happens if we look at the bit. Well, since it is a bit, we should see either 0 or 1! Furthermore, the probability of seeing 0 and the probability of seeing 1 had better add up to 1. Now, starting from the vector (α,β), how can we get two numbers that add up to 1? Simple: we can let α² be the probability of a 0 outcome, and let β² be the probability of a 1 outcome.
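Here is a short sketch of that measurement rule in Python (my own illustration; the amplitudes 0.6 and 0.8 are just an example satisfying α² + β² = 1):

```python
import numpy as np

rng = np.random.default_rng(0)

alpha, beta = 0.6, 0.8        # real amplitudes with alpha**2 + beta**2 = 1

def measure(alpha, beta, shots=100_000):
    """Sample outcomes: 0 with probability alpha^2, 1 with probability beta^2."""
    return rng.choice([0, 1], size=shots, p=[alpha**2, beta**2])

outcomes = measure(alpha, beta)
print(outcomes.mean())        # close to beta**2 = 0.64
```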
But in that case, why not forget about α and β, and just describe the bit directly in terms of probabilities? Ahhhhh. The difference comes in how the vector changes when we apply an operation to it. In probability theory, if we have a bit that’s represented by the vector (p,1-p), then we can represent any operation on the bit by a stochastic matrix: that is, a matrix of nonnegative real numbers where every column adds up to 1. So for example, the “bit flip” operation — which changes the probability of a 0 outcome from p to 1-p — can be represented as follows:

[ 0  1 ] [  p  ]     [ 1-p ]
[ 1  0 ] [ 1-p ]  =  [  p  ]
Indeed, it turns out that a stochastic matrix is the most general sort of matrix that always maps a probability vector to another probability vector.
Exercise 1 for the Non-Lazy Reader: Prove this.
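The exercise asks for a proof; here is merely a numerical illustration in Python (my own, not from the lecture) of stochastic matrices acting on a probability vector, starting with the bit flip:

```python
import numpy as np

p = 0.3
prob_vector = np.array([p, 1 - p])   # (Pr[0], Pr[1])

# Bit flip: a stochastic matrix (nonnegative entries, every column sums to 1).
bit_flip = np.array([[0, 1],
                     [1, 0]])
print(bit_flip @ prob_vector)        # [0.7, 0.3]: the two probabilities swap

# Any stochastic matrix maps a probability vector to another probability vector.
noisy = np.array([[0.9, 0.2],
                  [0.1, 0.8]])       # columns sum to 1
out = noisy @ prob_vector
print(out, out.sum())                # nonnegative entries, and the sum is 1.0
```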
But now that we’ve switched from the 1-norm to the 2-norm, we have to ask: what’s the most general sort of matrix that always maps a unit vector in the 2-norm to another unit vector in the 2-norm?
Well, we call such a matrix a unitary matrix — indeed, that’s one way to define what a unitary matrix is! (Oh, all right. As long as we’re only talking about real numbers, it’s called an orthogonal matrix. But same difference.) Another way to define a unitary matrix, again in the case of real numbers, is as a matrix whose inverse equals its transpose.
Exercise 2 for the Non-Lazy Reader: Prove that these two definitions are equivalent.
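Again, a small numerical sketch (my own): a rotation matrix preserves the 2-norm, and its transpose is its inverse.

```python
import numpy as np

theta = 0.73                                       # any angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # a real unitary (orthogonal) matrix

v = np.array([0.6, 0.8])                           # a unit vector in the 2-norm

print(np.linalg.norm(v), np.linalg.norm(R @ v))    # both 1.0: the 2-norm is preserved
print(np.allclose(R.T, np.linalg.inv(R)))          # True: inverse equals transpose
print(np.allclose(R.T @ R, np.eye(2)))             # equivalently, R^T R = I
```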
Gus Gutoski: So far you’ve given no motivation for why you’ve set the sum of the squares equal to 1, rather than the sum of the cubes or the sum of the fourth powers…
Scott: I’m gettin’ to it — don’t you worry about that!
This “2-norm bit” that we’ve defined has a name, which as you know is qubit. Physicists like to represent qubits using what they call “Dirac ket notation,” in which the vector (α,β) becomes α|0〉 + β|1〉, where |0〉 denotes the basis vector (1,0) and |1〉 denotes the basis vector (0,1). One advantage of this notation is that it stays manageable even for states of many qubits: you just write down the basis states that have nonzero amplitude, omitting all of the 0 entries.
So given a qubit, we can transform it by applying any 2-by-2 unitary matrix — and that leads already to the famous effect of quantum interference. For example, consider the unitary matrix

U  =  1/√2 · [ 1  -1 ]
             [ 1   1 ]

which takes a vector in the plane and rotates it by 45 degrees counterclockwise. Now consider the state |0〉. If we apply U once to this state, we’ll get (|0〉 + |1〉)/√2: a state that, if we looked at it, would give 0 with 1/2 probability and 1 with 1/2 probability. But if we apply U a second time, without looking in between, we’ll get exactly |1〉. So applying a “randomizing” operation to a “random” state produces a deterministic outcome! Intuitively, the two paths leading to the outcome |0〉 have amplitudes 1/2 and -1/2, so they interfere destructively and cancel out, while the two paths leading to |1〉 both have amplitude 1/2 and interfere constructively.
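Here is a minimal NumPy sketch of that interference calculation (my own code; none of the names come from the lecture):

```python
import numpy as np

# 45-degree counterclockwise rotation: a real (orthogonal) unitary matrix.
U = np.array([[1, -1],
              [1,  1]]) / np.sqrt(2)

ket0 = np.array([1.0, 0.0])   # |0>

once = U @ ket0               # (|0> + |1>)/sqrt(2): measuring gives 0 or 1 with prob. 1/2
twice = U @ once              # |1>: the two paths to |0> cancel, the paths to |1> add

print("after one application :", once,  "probabilities:", once ** 2)
print("after two applications:", twice, "probabilities:", twice ** 2)
```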
Again one can ask: did God have to use the tensor product? Could She have chosen some other way of combining quantum states into bigger ones? Well, maybe someone else can say something useful about this question — I have trouble even wrapping my head around it! For me, saying we take the tensor product is almost what we mean when we say we’re putting together two systems that exist independently of each other.
As you all know, there are two-qubit states that can’t be written as the tensor product of one-qubit states. The most famous of these is the EPR (Einstein-Podolsky-Rosen) pair:

(|00〉 + |11〉)/√2.
If a mixed state ρ of a composite system AB can be written as a mixture of product states, that is, as ρ = Σ_i p_i ρ_i^A ⊗ ρ_i^B for some probabilities p_i, then we say ρ is separable. Otherwise we say ρ is entangled.
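As a small side check in code (my own sketch, not from the lecture), here is one way to see that the EPR pair is not a tensor product of one-qubit states: a two-qubit pure state with amplitudes (a, b, c, d), reshaped into a 2-by-2 matrix, is a product state exactly when ad - bc = 0.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# A genuine product state: |0> tensored with (|0> + |1>)/sqrt(2).
product_state = np.kron(ket0, (ket0 + ket1) / np.sqrt(2))

# The EPR pair (|00> + |11>)/sqrt(2).
epr = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

def is_product(psi):
    """A two-qubit pure state (a, b, c, d) is a tensor product of one-qubit
    states exactly when its 2x2 reshaping has zero determinant (ad - bc = 0)."""
    a, b, c, d = psi
    return np.isclose(a * d - b * c, 0.0)

print(is_product(product_state))  # True
print(is_product(epr))            # False: the EPR pair is entangled
```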
Now let’s come back to the question of how many real parameters are needed to describe a mixed state. Suppose we have a (possibly-entangled) composite system AB. Then intuitively, it seems like the number of parameters needed to describe AB — which I’ll call d_AB — should equal the product of the number of parameters needed to describe A and the number of parameters needed to describe B:

d_AB = d_A d_B.
If amplitudes are complex numbers, then happily this is true! An N-dimensional mixed state is described by an N-by-N Hermitian density matrix, which has N² independent real parameters. So letting N_A and N_B be the number of dimensions of A and B respectively, we have

d_AB = (N_A N_B)² = N_A² N_B² = d_A d_B.
But what if the amplitudes are real numbers? In that case, an N-by-N density matrix would be real and symmetric, so we’d only have N(N+1)/2 independent real parameters. And it’s not the case that, if N = N_A N_B, then

N(N+1)/2 = [N_A(N_A+1)/2] · [N_B(N_B+1)/2].
Question: Can this same argument be used to rule out quaternions?
Scott: Excellent question. Yes! With real numbers the left-hand side is too big, whereas with quaternions it’s too small. Only with complex numbers is it juuuuust right!
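Here is a quick numerical check of that “Goldilocks” claim (my own sketch). For an N-by-N density matrix, the independent real parameters number N(N+1)/2 in the real case (symmetric matrices), N² in the complex case (Hermitian matrices), and N(2N-1) in the quaternionic case:

```python
def real_params(N):        return N * (N + 1) // 2   # real symmetric
def complex_params(N):     return N * N              # complex Hermitian
def quaternion_params(N):  return N * (2 * N - 1)    # quaternionic Hermitian

NA, NB = 2, 2                # two qubits, for concreteness
N = NA * NB

for name, d in [("real", real_params),
                ("complex", complex_params),
                ("quaternion", quaternion_params)]:
    lhs = d(N)               # parameters describing the composite system AB
    rhs = d(NA) * d(NB)      # product of the subsystem parameter counts
    print(f"{name:10s}  d_AB = {lhs:3d}   d_A * d_B = {rhs:3d}")

# real        d_AB =  10   d_A * d_B =   9   (left-hand side too big)
# complex     d_AB =  16   d_A * d_B =  16   (just right)
# quaternion  d_AB =  28   d_A * d_B =  36   (left-hand side too small)
```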
There’s actually another phenomenon with the same “Goldilocks” flavor, which was observed by Bill Wootters — and this leads to my third reason why amplitudes should be complex numbers. Let’s say we choose a quantum state