Table of contents
1 From perturbations to statistical observables
 1.1 What do CMB experiments measure?
 1.2 The CMB powerspectrum
 1.3 Line of sight integration
 1.4 The matter powerspectrum
 1.5 Polarisation and gravitational waves
2 Physical understanding of the observables
 2.1 The CMB powerspectrum
 2.1.1 What the line of sight integrals tell us
 2.1.2 Acoustic oscillations
 2.1.2.1 The free oscillator
 2.1.2.2 Baryon loading
 2.1.2.3 Radiation driving
 2.1.2.4 Diffusion damping
 2.1.2.5 Summary
 2.1.5 Cosmological parameter dependence of the CMB powerspectra
 2.2 Matter powerspectrum
 2.2.1 Growth of matter perturbations
 2.2.2 Perturbations in real space
 2.3 Summary
From perturbations to statistical observables
What do CMB experiments measure?
There has to date been over $50$ experiments to measure the cosmic microwave background using telescopes on Earth, balloon experiments high in the atmosphere and satellites in space (COBE, WMAP, Planck). Telescopes on earth usually focus on measuring CMB in small patches on the sky whereas the satelitte experiments have mapped out the CMB over the whole sky (or as much as we can see; we cannot see through the galaxy). Telescope and balloon experiments also have to deal with all the issues of having the light go through the atmosphere, but on the other hand they are much cheaper than sending a satelitte into space and they can choose to scan the same part of the sky many many times to beat down the noise and get better resolution than what we would get from a fixed time allsky satellite experiment. Both of these therfore play an important role in measuring the CMB. The observational status today is that we have managed to measure the CMB temperature to as good precision as its almost possible to get and we have also good measurements on Emode polarisation. The main piece missing now, which is the main goal of future CMB experiments, is to measure the missing piece  the socalled Bmode polarisation  that are generated by gravitational waves in the early Universe (but we don't know in what amount). One can give several courses in the history of these experiments, how to do these kinds of experiments (scanning strategy, detector technology), dealing with foregrounds (the CMB is not the only source of photons in our Universe) and how to do the data analysis in the best possible way (see e.g. BeyondPlanck), but we won't go into more details in this course. The data analysis part is what the CMB group at our insitiute is great at and if you want to work on this we have many master projects on this topic availiable every year. Anyway this course is (luckily) about theory so lets move on to that.
So what do these experiments really measure and how do we connect this to the theory we have gone through? They measure the temperature of CMB photons $T(\hat{n})$ as functions of direction on the sky (which is the same as saying we measure the temperature as a function on the unit sphere). Subtracting off the mean temperature over all directions what we get is therefore $\frac{\delta T}{T} = \Theta(\vec{x}, \hat{p},t)$ where $\Theta$ is the perturbation we solved for (though we have this in fourierspace), $\vec{x}$ is our positon, $\hat{p}$ is the direction on the sky the photons hits the detectors and the time $t$ is obviously today. We can expand this in spherical harmonics as $$\Theta(\vec{x},\hat{p},t) = \sum_{\ell=1}^\infty\sum_{m=\ell}^{\ell} a_{\ell m}(\vec{x},t) Y_{\ell m}(\hat{p})$$ The goal of an experiment is to accurately measure the $a_{\ell m}$'s. However what the temperature is in any given direction is not something the theory can tell us (the initial conditions are that of a random field). To make a connection to the theory we derived we need to look at the statistical properties of this temperature field.
The CMB powerspectrum
The expansion of $\Theta(\vec{x},\hat{p},t)$ above implies that $$a_{\ell m}(\vec{x},t) = \int d\Omega_{\hat{p}} Y_{\ell m}^*(\hat{p})\Theta(\vec{x},\hat{p},t)$$ Expressing $\Theta$ in terms of its Fourier components this becomes $$a_{\ell m}(\vec{x},t) = \int\frac{d^3k}{(2\pi)^3}e^{i\vec{k}\cdot\vec{x}}\int d\Omega_{\hat{p}} Y_{\ell m}^*(\hat{p})\Theta(\vec{k},\hat{p},t)$$
Inflation predicts that the initial perturbations (like $\Theta$) is (very close to being) a gaussian random. Consequently the $a_{\ell m}$'s will also be a gaussian random field with mean $0$ and some variance $C_\ell$, i.e. $$\left<a_{\ell m}a^*_{\ell' m'}\right>= \delta_{\ell\ell'}\delta_{mm'}C_\ell$$ where the brackets denotes an ensamble average (average over many different realizations of our Universes). And since its gaussian all the statistical information about $\Theta$ is contained in $C_\ell$ (in practice there will be some deviations from pure gaussianity depending on the model of inflation so higher order moments does potentially contain interesting information). For a fixed $\ell$, each $a_{\ell m}$ have the same variance and for each value of $\ell$ there are $2\ell+1$ possible values of $m$. Thus in an experiment after measuring the $a_{\ell m}$'s we can estimate the powerspectrum as $$\hat{C}_\ell = \frac{1}{2\ell +1}\sum_{m=\ell}^\ell a_{\ell m}^2$$ There is an important difference with what we measure and what the theory says. The theory is formulated in terms on an ensamble average (average of many Universes) whereas we only have one Universe to measure in. This is where the ergothic assumption comes into play (see the section "PowerSpectrum" in the Appendix for more info) which tells us that as long as we have enough modes the mean of these will be the same as the ensamble average. This means that for large $\ell$'s we have enough statistical power to accurately estimate $C_\ell$, while for very low $\ell$ we will run into cosmic variance: we have very few $m$'s to estimate $C_\ell$ from so there is a fundamental uncertainity that is due to us only having one single Universe to measure this in that we cannot get around. The variance of this uncertanity is given by $\frac{\text{Var}(C_\ell)}{C_\ell^2} = \frac{2}{2\ell+1}$ (we have $C_\ell = \frac{1}{2\ell +1}\sum_m a_{\ell m}^2$ which follows a $\chi^2$ distribution). This uncertanity is what is depicted by the green shaded region (and the large red errorbars) for low $\ell$ in Figure 1 and shows that our current measurements for temperature fluctuations is close to as good as we can get.
To derive an expression for $C_\ell$ we expand $\Theta$ in multipoles $$a_{\ell m}(\vec{x},t) = \sum_\ell (2\ell + 1)(i)^\ell\int\frac{d^3k}{(2\pi)^3}e^{i\vec{k}\cdot\vec{x}}\int d\Omega_{\hat{p}} Y_{\ell m}^*(\hat{p})\mathcal{P}_\ell(\mu)\Theta_\ell(\vec{k},t)$$ where $\mu = \hat{p} \cdot \hat{k}$ which gives us $$a_{\ell m}a_{\ell' m'}^* = \sum_{\ell_1,\ell_2} (2\ell_1 + 1)(2\ell_2+1)(i)^{\ell_1\ell_2}\int\frac{d^3kd^3k'}{(2\pi)^6}e^{i(\vec{k}  \vec{k}')\cdot\vec{x}}\int d\Omega_{\hat{p}} d\Omega_{\hat{p}'} Y_{\ell m}^*(\hat{p}) Y_{\ell'm'}(\hat{p}')\mathcal{P}_{\ell_1}(\mu)\mathcal{P}_{\ell_2}(\mu')\Theta_{\ell_1}(\vec{k},t) \Theta_{\ell_2}^*(\vec{k}',t)$$
Now remember that when we solve this numerically we put $\Theta_\ell(\vec{k},t) = \Theta^{\rm code}_\ell(k,t) \mathcal{R}_{\rm ini}(\vec{k})$ where the initial value of the curvature perturbation $\mathcal{R}_{\rm ini}\propto \Psi$ was set to unity, i.e. we factored out the common initial condition from inflation before solving (which is perfectly fine since the equation system is linear). We must now put it back in. When we take the ensamble average of the above this will lead to a term $\left<\mathcal{R}_{\rm ini}(\vec{k})\mathcal{R}^*_{\rm ini}(\vec{k}')\right> = (2\pi)^3\delta(\vec{k}  \vec{k}')P(k)$ with $P(k) = \frac{2\pi^2}{k^3}\mathcal{P}_\mathcal{R}(k)$ being the primordial powerspectrum (with $\mathcal{P}_\mathcal{R}(k) = A_s(k/k_{\rm pivot})^{n_s1}$) and the deltafunction will enforce $\vec{k} = \vec{k}'$ taking care of one of the integrals. $$\left<a_{\ell m}a_{\ell' m'}^*\right> = \sum_{\ell_1,\ell_2} (2\ell_1 + 1)(2\ell_2+1)(i)^{\ell_1\ell_2}\int\frac{d^3k}{(2\pi)^3}\int d\Omega_{\hat{p}} d\Omega_{\hat{p}'} Y_{\ell m}^*(\hat{p}) Y_{\ell'm'}(\hat{p}')\mathcal{P}_{\ell_1}(\mu)\mathcal{P}_{\ell_2}(\mu')\Theta^{\rm code}_{\ell_1}(k,t)\Theta^{\rm code}_{\ell_2}(k,t) \frac{2\pi^2\mathcal{P}_\mathcal{R}(k)}{k^3}$$
For the next step we need $\int d\Omega_{\hat{p}} \mathcal{P}_\ell(\mu) Y_{\ell'm}^*(\hat{p}) = \delta_{\ell\ell'} \frac{4\pi}{2\ell+1} Y^*_{\ell m}(\hat{k})$ which takes care of two of the integrals (and the two Kroneker delta kills both of the sums) leaving us with $$\left<a_{\ell m}a_{\ell' m'}^*\right> = \delta_{\ell\ell'}(4\pi)^2\int\frac{d^3k}{(2\pi)^3}Y_{\ell m}(\hat{k})Y_{\ell m'}^*(\hat{k})\Theta^{\rm code}_\ell(k,t)^2 \frac{2\pi^2\mathcal{P}_\mathcal{R}(k)}{k^3}$$
Finally splitting the $k$integral into an integral over angles ($\Omega_{\hat{k}}$ now means over the angles of $k$) using that $\int d\Omega_{\hat{k}} Y_{\ell m}(\hat{k})Y_{\ell m'}^*(\hat{k}) = \delta_{mm'}$ we get $$\left<a_{\ell m}a_{\ell' m'}^*\right> = \delta_{\ell\ell'}\delta_{mm'}4\pi\int\frac{dk}{k}\mathcal{P}_\mathcal{R}(k)\Theta^{\rm code}_\ell(k,t)^2$$ so $$C_\ell = 4\pi \int \mathcal{P}_\mathcal{R}(k)\Theta^{\rm code}_\ell(k)^2\frac{dk}{k}$$ where the time is taken to be today (which is the time when we observe the CMB, so $\Theta^{\rm code}_\ell(k) = \Theta^{\rm code}_\ell(k,t=t_{\rm today})$). The difference wrt what is in Chapter 8.5.2 in Dodelson is that he expresses this in terms of $P(k) = \frac{2\pi^2}{k^3}\mathcal{P}_\mathcal{R}(k)$ which gives us the equivalent formulation $$C_\ell = \frac{2}{\pi}\int k^2 P(k)\Theta^{\rm code}_\ell(k)^2 dk$$
The expression above tells us how to go from the stuff we computed numerically, the temperature multipoles $\Theta_\ell$, to the theoretical prediction for the CMB angular powerspectrum. This can be directly compared to the equivalent quantity measured in experiments.
Line of sight integration
If you have seen the CMB powerspectrum you might have noticed that we are able to measure it up to $\ell \sim \mathcal{O}(1000)$'s. The formula we derived for $C_\ell$ requires us to know $\Theta_\ell$ and we also remember that to solve the equations for a given $\Theta_\ell$ requires us to solve all the previous $\Theta_0,\Theta_1,\ldots,\Theta_\ell$'s. This is a bit of an issue: do we need to solve a coupled system of $1000$'s of equation? The answer used to be yes! In principle its not bad. If you have coded up the equations for a general $\ell_{\rm max}$ then you just have to set one number to do so. However this makes it an expensive calculation. Lucikly there is a way around this and this technique is called lineofsight integration. Not only is it going to help us compute the CMB spectrum much much faster, but it is also the key to a simple understanding of the CMB powerspectrum. This is covered in Chapter 8.5 in Dodelson.
Remember after having found the equation for the photon temperature perturbation $\Theta = \Theta(t,\vec{k},\hat{p})$ $$\frac{\partial \Theta}{\partial t} + \frac{ik\mu}{a} \Theta + (\frac{\partial \Phi}{\partial t} + \frac{i k\mu}{a}\Psi) = \frac{d\tau}{dt}[\Theta_0  \Theta + i\mu v_b  \frac{3\mu^21}{4}\Pi]$$ the first thing we did was to expand it in multipoles. However this is quite a simple ODE so we can first formally integrate it by writing it as $$\frac{\partial (\Theta + \Psi)}{\partial \eta} + \left(ik\mu  \frac{d\tau}{d\eta}\right)(\Theta + \Psi) + \frac{\partial (\Phi\Psi)}{\partial \eta} = \frac{d\tau}{d\eta}[\Theta_0 + \Psi + \frac{\Pi}{4} + i\mu v_b  \frac{3\mu^2}{4}\Pi]$$ and introducing an integrating factor $e^{ik\mu(\eta  \eta_0)  \tau}$ where $\eta_0$ is the conformal time today. The equation then becomes $$ \begin{align} \frac{\partial [(\Theta + \Psi)e^{ik\mu(\eta  \eta_0)  \tau}]}{\partial \eta} &= \frac{\partial (\Psi\Phi)}{\partial \eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} \\ &\frac{d\tau}{d\eta}e^{\tau}e^{ik\mu(\eta  \eta_0)}[\Theta_0 + \Psi + \frac{\Pi}{4}] \\ & \frac{d\tau}{d\eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} i\mu v_b \\ &+ \frac{d\tau}{d\eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} \frac{3\mu^2}{4}\Pi \end{align} $$ Introducing the visibility function $g = \frac{d\tau}{d\eta}e^{\tau}$ this becomes $$ \begin{align} \frac{\partial [(\Theta + \Psi)e^{ik\mu(\eta  \eta_0)  \tau}]}{\partial \eta} &= \frac{\partial (\Psi\Phi)}{\partial \eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} \\ &+ g e^{ik\mu(\eta  \eta_0)}[\Theta_0 + \Psi + \frac{\Pi}{4}] \\ &+ ge^{ik\mu(\eta  \eta_0)} i\mu v_b \\ & e^{ik\mu(\eta  \eta_0)} \frac{3\mu^2}{4}g\Pi \end{align} $$ Integrating it from the early Universe till today we obtain $$ \begin{align} (\Theta + \Psi)_{\rm today}  (\Theta + \Psi)_{\rm ini} e^{ik\mu(\eta_{\rm ini}  \eta_0)}e^{\tau_{\rm ini}} &= \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, g e^{ik\mu(\eta  \eta_0)}[\Theta_0 + \Psi + \frac{\Pi}{4}] \\ &+ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, \frac{\partial (\Psi\Phi)}{\partial \eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} \\ &+ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, ge^{ik\mu(\eta  \eta_0)} i\mu v_b \\ & \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, e^{ik\mu(\eta  \eta_0)} \frac{3\mu^2}{4} g\Pi \end{align} $$ where we have used $\tau_0 = 0$. The contribution from the second term on the left hand side is irrelevant since its weighted by $e^{\tau_{\rm ini}} \approx 0$ since the optical depth is huge early on. We are almost ready to take multipoles. The terms that have a $e^{ik\mu(\eta  \eta_0)}$ term is easy to deal with as they are directly related to the definition of the socalled spherical bessel function, but the terms with $\mu$ and $\mu^2$ is a bit annoying in this respect so first lets talk about how we can get ridd of them whih will simplify everything as we will see. For these we can use a similar trick as when working with the Fourier transform. Recall that when taking the Fourier transform of a gradient $\nabla f$ we simply get a factor $i\vec{k}$ times $\hat{f}$. The integral we have here is similar to a Fourier transforms so the same thing applies here: we can replace $\mu$ by $\frac{1}{ik}\frac{d}{d\eta}$ where the derivative will apply to anything $\mu$ multiplies expect the exponential, i.e. in the integrand $\mu f e^{ik(\eta\eta_0)} \to \frac{1}{ik}\frac{df}{d\eta} e^{ik(\eta\eta_0)}$ and similarly $\mu^2$ can be replaced by $\frac{1}{k^2}\frac{d^2}{d\eta^2}$ and so on. The formal justification for this is the same as we used for proving the Fourier transform derivative relation which is good old integration by parts (also here the boundary terms vanish). Writing out the argument in detail: start by using $(ik \mu)^n e^{ik\mu(\eta  \eta_0)} = \frac{d^n}{d\eta^n} e^{ik\mu(\eta  \eta_0)}$ to get $$ \begin{align} (\Theta + \Psi)_{\rm today} &= \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, ge^{ik\mu(\eta  \eta_0)}[\Theta_0 + \Psi + \frac{\Pi}{4}]\\ & + \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, \frac{\partial (\Psi\Phi)}{\partial \eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} \\ & {\color{red}{+ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \,[\frac{d}{d\eta}e^{ik\mu(\eta  \eta_0)}] \frac{1}{k}gv_b}} \\ & {\color{blue}{+ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \,[\frac{d^2}{d\eta^2}e^{ik\mu(\eta  \eta_0)}] \frac{3}{4k^2}g\Pi}} \end{align} $$ then do integration by parts on the two terms that have derivatives of the exponential to get $$ \begin{align} (\Theta + \Psi)_{\rm today} &= \int_{\eta_{\rm ini}}^{\eta_0}d\eta \, g e^{ik\mu(\eta  \eta_0)}[\Theta_0 + \Psi + \frac{\Pi}{4}] \\ &+ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \,\frac{\partial (\Psi\Phi)}{\partial \eta}e^{\tau}e^{ik\mu(\eta  \eta_0)} \\ &{\color{red}{ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \,e^{ik\mu(\eta  \eta_0)} \frac{1}{k}\frac{d}{d\eta}(gv_b)}}\\ &{\color{blue}{+ \int_{\eta_{\rm ini}}^{\eta_0}d\eta \,e^{ik\mu(\eta  \eta_0)} \frac{3}{4k^2}\frac{d^2}{d\eta^2}(g\Pi)}} \end{align} $$ We can now finally take multipoles of this expression by multiplying by $\frac{i^\ell}{2}P_\ell(\mu)$ and integrate $\mu$ over $[1,1]$. The term on the left is easy: by definition of the multipoles we just get $\Theta_\ell$. The $\Psi$term does not have a directional ($\mu$) dependence so this term will only give a nonzero term for $\ell=0$ which is nothing but an (unobservable) shift in the CMB temperature. The other terms give rise to spherical bessel functions: $j_\ell(x) = \frac{i^\ell}{2}\int_{1}^1 P_\ell(x) e^{ix\mu} d\mu$. The end result is simply $$\Theta_\ell^{\rm today}(k) = \int_{\eta_{\rm ini}}^{\eta_0}d\eta\left[g[\Theta_0 + \Psi + \frac{\Pi}{4}] + \frac{\partial (\Psi\Phi)}{\partial \eta}e^{\tau}  \frac{1}{k}\frac{d}{d\eta}(gv_b) + \frac{3}{4k^2}\frac{d^2}{d\eta^2}(g\Pi)\right]j_\ell(k(\eta_0\eta))$$ or in the form we will implement it in the code where derivatives are written in terms of $x = \log a$ $$\Theta_\ell^{\rm today}(k) = \int_{x_{\rm ini}}^{0}dx\left[\tilde{g}[\Theta_0 + \Psi + \frac{\Pi}{4}] + \frac{\partial (\Psi\Phi)}{\partial x}e^{\tau}  \frac{1}{k}\frac{d}{dx}(\mathcal{H}\tilde{g}v_b) + \frac{3}{4k^2}\frac{d}{dx}(\mathcal{H}\frac{d}{dx}(\mathcal{H}\tilde{g}\Pi))\right]j_\ell(k(\eta_0\eta))$$ where $\tilde{g} = \frac{d\tau}{dx} e^{\tau} = \mathcal{H} g$ is the visibility function in terms of $x$. Let's reflect a bit on this formula: it tells us that we can compute any multipole $\Theta_\ell$ from simply knowing $\Pi,\Phi,\Psi,\Theta_0,v_b$ and its time derivatives and performing this integral. The highest photon multipole that enter the equation above is $\Theta_2$ (in $\Pi$) so we only have to choose $\ell_{\rm max}$ large enough to ensure that this one is computed accurately which is typically something like $\ell_{\rm max} = 68$. This simple calculation have reduced the computational needs of having $1000$'s of coupled ODEs down to a few dusin! We'll get back to the physical interpretation of this formula.
The only downside here is that we also need to know how to compute the spherical bessel function. See the math notes for this.
In hinsight this is a very obvious solution, however it was not realized until 1996 when Uros Seljak and Matias Zaldarriaga pointed this out in the paper "A Line of Sight Approach to Cosmic Microwave Background Anisotropies". Every single EinsteinBoltzmann solver ever made since then have implemented this technique.
The matter powerspectrum
Computing the matter powerspectrum is much easier than the CMB powerspectrum, its simply the square of the fourier components of the matter overdensity field: $$P(k,x) = \Delta_M(k,x)^2$$ where $\Delta_M(k,x) \equiv \frac{c^2k^2\Phi(k,x)}{\frac{3}{2}\Omega_{M 0} a^{1} H_0^2}$ and $\Omega_{M 0}$ is the total matter density parameter today. This is often plotted as $k$ in units of $h/\text{Mpc}$ versus $P(k)$ in units of $(\text{Mpc}/h)^3$. For what we compute in the numerical project (where the curvature perturbations are normalized to $1$ at the initial time) we must multiply this by the primordial powerspectrum to get the full result so $$P(k,x) = \Delta_M^{\rm code}(k,x)^2 P(k) = \Delta_M^{\rm code}(k,x)^2\frac{2\pi^2}{k^3} \left(\frac{k}{k_{\rm pivot}}\right)^{n_s1}$$
Polarisation and gravitational waves
Where does polarization enter the Bolzmann formalism and how do we get the equations needed to compute its predictions? Recall when we wrote down the distribution function for photons $f_\gamma$ and later we computed stuff like the energy density from $E = \frac{g_\gamma}{(2\pi)^2}\int E f_\gamma d^3p$ where $g_\gamma = 2$ is the number of polarizaton states. What we implicitly did there was to assume that there is no net polarization both initially and that none is generated in the evolution so that each of the two polarization states are equally common. The former is true (or atleast we have no reason to think otherwise), however the latter is not. In general we should write $f_\gamma = f_\gamma^{(1)} + f_\gamma^{(2)}$ where each of the two polarization states gets a perturbation $\Theta^{(i)}$. The total temperature perturbation is just $\Theta = \Theta^{(1)} + \Theta^{(2)}$. The collision term for Thompson scattering does generate a net polarization if there is a nonzero quadrupole moment $\Theta_2$ so in general $\Theta^{(1)} \not= \Theta^{(2)}$. Previous in the course when we wrote down the Boltzmann equation for the polarization without justification this is how we would compute it. The equations we presented for polarization is (very roughly) for $\Theta_P = \frac{1}{2}(\Theta^{(1)}  \Theta^{(2)})$. This was just to give you some idea on how polarization enters in the mathematical formalism we have gone through. Doing the actual calculations is beyond what we will go through in this course (though PhD students have to implement this in the numerical project and discuss the results) and we will instead focus on the physical understanding below. You can find more details about this in Wayne Hu's lecture notes. This is also covered in Chapter 10.5 in Dodelson.
To completely describe a radiation field we in general needs four numbers  the socalled Stokes parameters. These are the temperature $T$, the linear polarization $Q$ and $U$ along two different directions, and circular polarization $V$. $T$ and $V$ are rotationally invariant and can therefore be expanded in spherical harmonics. $Q$ and $U$, on the other hand, transform under rotations in the plane perpendicular to the direction of the photons. It turns out that the linear combination $Q\pm iU$ transforms in a particularly simple way and we can define rotational invariant quantities $E$ and $B$  the socalled $E$mode and $B$mode polarization patterns. When we are only considering scalar perturbations, we can choose a coordinate system for each Fourier mode where $U=0$ and $Q=\Theta_P$ for which $B = 0$ and $E$ is given by $\Theta_P$. The nice thing about this is that scalar perturbations only produces one particular polarization pattern and the other pattern  the $B$mode patttern  can be produced by two things 1) tensor perturbations, i.e. gravitational waves and 2) by gravitational lensing of $E$ modes, but this is mainly on small scales. Thus a large scales $B$mode pattern is considered strong evidence for gravitational waves (and inflation which produces them).
In our treatment of the collision term we ignored the angular dependence. The full expression for the differential crosssection is $$\frac{d\sigma}{d\Omega} = \frac{3}{8\pi}\hat{E}'\cdot \hat{E}^2\sigma_T$$ where $\hat{E}'$ and $\hat{E}$ are the incoming and outgoing directions of the polarization vector (electric field). This direcional dependence will mix the two polarization states and is what can generate a net polarization. The Boltzmann equation gives us the following evolution equation for the polarization $\Theta_P$ $$\frac{\partial \Theta_P}{\partial \eta} + ik\mu \Theta_P = \frac{\partial \tau}{\partial \eta}[\Theta_P + \frac{1}{2}(1P_2(\mu))\Pi]$$ where $\Pi = \Theta_2 + \Theta_{P0} + \Theta_{P2}$. One thing to notice from this straight away is that we have no $\Phi,\Psi$ terms to source it. And this is as expected: a net polarization is a consequence of Thompson scattering and not a gravitational effect. The only thing that connects this equation to the rest of the EinsteinBolzmann system we have derived is the presence of $\Theta_2$  the temperature quadrupole. If $\Theta_2=0$ and $ \Theta_{P} \approx 0$ initially then we will have $ \Theta_{P} \approx 0$ for all times so no net polarization is ever generated. We therefore need a temperature quadrupole to generate polarization.
This course deals mainly with scalar perturbations. Vector perturbations have no know source in the early Universe and even if it did it would quickly decay away, but tensor perturbations are relevant for polarization. To describe tensor perturbations we can choose a basis with $\vec{k}$ pointing in the $\vec{z}$ direction and then the metric perturbations can be written $$\delta g_{\mu\nu} = \pmatrix{0 & 0 & 0 & 0 \\ 0 & h_+ & h_x & 0 \\ 0 & h_x & h_x & 0 \\ 0 & 0 & 0 & 0 }$$ where $h_+,h_x$ correspond to the two polarization states. This can then be plugged into the Einstein equations (the $\delta G^{1}_1  \delta G^{2}_2$ component) which leads to $$\frac{d^2 h}{d\eta^2} + 2\mathcal{H} \frac{d h}{d\eta} + k^2 h = 0$$ In general the right hand side will contain matter sources (anisotropic stress of the photon/neutrino tensor perturbations), but they are negligible. This is a damped waveequation describing waves moving with velocity $c$. The expansion of the Universe stretches the gravitational waves and dampes the amplitude. The solution can be written. $$h = h^{\rm ini}\frac{\sin(k\eta)}{k\eta}$$ so a mode is at the present time damped by a factor $\frac{1}{k\eta_0}$. The Boltzmann equation gives us evolution equation for the tensor perturbations in the photon distribution. These will be similar as in the case of scalar perturbations, but instead of being sourced by $\Phi,\Psi$ in the scalar case its now sourced by $h$. Tensor perturbations in the photon distribution adds to both the CMB temperature and polarization powerspectrum, i.e. $$C_\ell = 4\pi\int \frac{dk}{k}[\Theta_\ell^{\rm scalar}^2A_s(k/k_{\rm pivot})^{n_s1} +\Theta_\ell^{\rm tensor}^2A_t (k/k_{\rm pivot})^{n_t1}]$$ for temperature and similar for E and B modes. Since $h$ quickly decays away for small scale modes (the ones that enter the horizon early) the effect of tensor perturbations will only be generated on the largest scales in our Universe. The contribution to the temperature powerspectrum is small and in the cosmic variance dominated regime, but there is a key signature in the Bmode polarization powerspectrum peaking at $\ell \sim 100$ which also contains a reionization bump at $\ell \lesssim 10$. We don't know of any other known physics that creates such a signature so its in some sense a smoking gun for primordial gravitaitonal waves (or some new exotic physics at least). The size of this bump depends on how much gravitational waves are set up during inflation and is parametrized by the socalled tensortoscalar ratio $r = A_t / A_s$. There is another mechanism for generating Bmodes. The treatment of the CMB anisotropies that we have done do not include the effect of gravitational lensing  the bending of light caused by the presence of big structures in our Universe. Gravitational lensing smooths out the high $\ell$ peaks in the temperature powerspectrum and will rotate some of the Emodes into Bmodes. The lensing Bmodes will therefore be similar in shape to the Emodes just with a much smaller amplitude. Importantly this contribution is small for small $\ell$ where the gravitational wave signal  if present  is located. So far the only Bmodes we have detected are those generated by gravitational lensing (which are very interesting in their own right), but this gravitational wave signature is the holy grail of current and near future CMB experiments. The best constraints on $r$ from the temperature powerspectrum today is $r\lesssim 0.1$, but unless $r$ is much smaller than $10^{4}10^{3}$ there are good chances of detecting it in the near future. It is very challenging though as we are talking about a signature that is at most a few tens of nanoKelvin compared to a few milliKelvin for the temperature anisotropies and its made even harder by the fact that photons emitted in our own galaxy acts as a foreground and can mimic this signal unless the observations have been carefully cleaned. The BICEP collaboration did announce a detection of this signal a few years ago ("We got it, five sigma at $r$ of point two!"), however it quickly became clear that what they were seeing was not gravitational waves but polarization coming from "dust" particles in our galaxy.
Physical understanding of the observables
We will now try to understand all the features in the CMB and matter powerspectrum based on the physics we know is going on and the equations we have derived. From this we will also discuss how varying each cosmological parameter changes the observables and how this can be used togeather with observations to tell us detailed information about our Universe.
For getting a deeper physical intuition about the CMB and how different physical processes manifests themselves in the powerspectra there are no better online resource than Wayne Hu's CMB tutorials so I suggest you also check that out (better than what I will do here  and I will probably borrow a lot of that material as its very good!).
For seeing how the CMB and matter power spectra change with varying cosmological parameters see this great animation by Max Tegmark (click on the parameter of choice to see the effect).
For a more analytical approach to the evolution of perturbations and powerspectra see the great papers by Hu & Sugiyama, Hu, Sugiyama & Silk and Eisenstein & Hu ([1], [2], [3], [4], [5]) These papers are great for getting a deeper understanding of the features in the powerspectra and the results in these papers can also be used to check that the results we get from our code are reasonable.
The CMB powerspectrum
What the line of sight integrals tell us
Recall the line of sight integral solution to the temperature multipoles $$\Theta_\ell^{\rm today}(k) = \int_{x_{\rm ini}}^{0}dx\left[\tilde{g}[\Theta_0 + \Psi + \frac{\Pi}{4}] + \frac{\partial (\Psi\Phi)}{\partial x}e^{\tau}  \frac{1}{k}\frac{d}{dx}(\mathcal{H}\tilde{g}v_b) + \frac{3}{4k^2}\frac{d}{dx}(\mathcal{H}\frac{d}{dx}(\mathcal{H}\tilde{g}\Pi))\right]j_\ell(k(\eta_0\eta))$$ Lets try to understand the CMB from this equation. The dominant term is the first one, the socalled SachsWolfe term, so lets start by studying this one. As we saw earlier in this course the visibility function sharply peaks at recombination so a rough approximation is $\tilde{g} \approx \delta(xx_{\rm rec})$ and in the line of sight integral this gives us $$\Theta_\ell^{\rm today}(k) \approx [\Theta_0 + \Psi + \frac{\Pi}{4}]_{\rm rec}\cdot j_\ell(k(\eta_0\eta_{\rm rec})) \approx [\Theta_0 + \Psi]_{\rm rec}\cdot j_\ell(k\eta_0)$$ since $\eta_{\rm rec} \ll \eta_0$ and the final term is very small. Recall photons in a gravitational well gets a gravitational redshift given by $\Psi$ so the term in the brackets is nothing but an effective photon temperature (with a tiny quadrupolar correction). What this tells us is that the CMB anisotropies we observe today comes from temperature inhomogenities that were present at recombination and froze in when the Universe became transparent. The last factor $j_\ell(k(\eta_0\eta_{\rm rec}))$ comes from the fact that these inhomogenities are freestreamed to the current time and then projected on a sphere. The other terms that we ignored in this simplified discussion represents additional physical effects: the second term is the integrated SachsWolfe effect. Photons might move down through gravitational potentials that change in time. If a gravitational well has deepened when the photon has been traveling through it the photon will loose extra energy when climbing out. Next recall that baryons and photons are tightly coupled early on so $v_b \sim v_\gamma$ so the third term is nothing but a Doppler term. The last one I don't know a simple physical explanation for.
The final part we need to understand is how features in $\Theta_\ell(k)$ gets mapped into the powerspectrum. We know that $$C_\ell = \frac{2}{\pi}\int k^2 P(k)\Theta_\ell^2\frac{dk}{k}$$ The spherical bessel function $j_\ell(x)$ has a maximum around $x\sim \ell$ and since $\Theta_\ell \propto j_\ell(k\eta_0)$ the value of $C_\ell$ will very roughly be proportional to $\Theta_\ell(k = \ell / \eta_0)$. Don't take this too literally  its a rough approximation  but what this argument is telling us is that a feature on some scale $k$ will get mapped into a feature on an angular scale $\ell \sim k \eta_0$. Thus we can understand the basic structure of the CMB powerspectrum by studying how the effective temperature perturbation $(\Theta_0 + \Psi)(k,t)$ evolves up until recombination. This is what we will look at next.
Acoustic oscillations
From the line of sight integral we see that we need to know the value of the effective temperature perturbation at recombination $$(\Theta_0+\Psi)(k, t = t_{\rm rec})$$ to understand the CMB spectrum. This is the aim here. Start with the continuity equation and Euler equation $$\frac{\partial \Theta_0}{\partial \eta} = k\Theta_1  \frac{\partial \Phi}{\partial \eta}$$ $$\frac{\partial \Theta_1}{\partial \eta} = \frac{k}{3}(\Theta_0  2\Theta_2 + \Psi) + \frac{\partial \tau}{\partial \eta}(\Theta_1 + \frac{v_b}{3})$$ Take the $\eta$derivative of the first equation and use the second one to simply and we end up with $$\frac{\partial^2 (\Theta_0 + \Psi)}{\partial \eta^2} + \frac{k^2}{3}(\Theta_0+\Psi) = F$$ where $$F = \frac{k^2}{3}(2\Theta_2)  k\frac{\partial \tau}{\partial \eta}(\Theta_1 + \frac{v_b}{3}) + \frac{\partial^2 (\Psi\Phi)}{\partial \eta^2}$$ This is the equation of a simple driven harmonic oscillator where $F$ is the driving force.
Lets try to understand this equation a bit better. First of all we know why $\Psi$ is present in this equation: the energy of photons change as we move in or out of gravitational potentials. We also know that $\Theta_0$ is nothing but the overdensity of photons (i.e. $\delta_\gamma = 4\Theta_0$). If we now go to realspace then the equation, in the absence of the driving term, can therefore by rewritten as $$\frac{\partial^2 \delta_\gamma}{\partial \eta^2}  c_s^2\nabla^2\delta_\gamma = \frac{4}{3}\nabla^2\Psi$$ This is nothing but a simple waveequation describing waves moving with velocity $c_s = 1/\sqrt{3}$ as we would expect. The right hand side is just the gravitational force (which from the Poisson equation is $\simeq \frac{4}{3}\cdot 4\pi G a^2\overline{\rho}_\gamma \delta_\gamma$ when radiation dominates the energy budget) and the $c_s^2$ term is the pressure force. The rough picture is that gravitational force will try to create bigger overdensities and the pressure force will resist this. We shall get back to how to describe the evolution of the perturbations in real space later on, but to understand all the fine details of this equation and how it maps onto the CMB powerspectrum its much more convenient to continue in Fourier space and analyze the harmonic oscillator.
We will study this is vaying levels of difficulty starting with a free oscillator, then adding in the effects of baryons (the $\frac{d\tau}{d\eta}$ term in $F$), then adding in the effects of driving from decaying potentials (the $\frac{\partial^2 (\Psi\Phi)}{\partial \eta^2}$ term in $F$) and finally the effect of photon diffusion (also contained in the $\frac{d\tau}{d\eta}$ term in $F$).
Review: Driven harmonic oscillator
To understand the evolution of $\Theta_0 + \Psi$ we need some knowledge about the harmonic oscillator. The harmonic oscillator is arguably the most important differential equation in all of physics  it pops up everywhere from classical physics to quantum field theory  and you are hopefuly familiar with it. The simplest case where we see it is when we have a ball attached to a spring. Hooks law then says that the restoring force is $F = kx$ and Newtons law then gives us $$m\ddot{x} + kx = 0$$ which can also be written $$\ddot{x} + w_0^2 x = 0$$ where $w_0^2 = k/m$. The solution is a simple oscillation $$x(t) = A\cos(w_0 t) + B\sin(w_0 t)$$ where $A = x(0)$ and $B = \dot{x}(0)/w_0$. This can also be written on the simpler form $$x(t) = C\sin(w_0 t + \phi)$$ where $C = \sqrt{A^2+B^2}$ and $\sin\phi = A/C$.
Adding in friction:
If the surface the ball rolls on has friction then this can be modelled as $$\ddot{x} + 2\xi\dot{x} + w_0^2 x = 0$$ Why is this a friction term? Multiply the equation above by $\dot{x}$ and it becomes $$\frac{d}{dt}(\frac{\dot{x}^2}{2} + \frac{w_0^2 x^2}{2}) = 2\xi\dot{x}^2$$ The left hand side is just the sum of the kinetic and potential energy so the equation says that $\frac{dE}{dt} = 2\xi\dot{x}^2 \leq 0$ when $\xi\geq 0$. Thus this term drains energy from the system which is just what friction does. If the damping is not too strong then $$x(t) = e^{\xi t}\sin(wt + \phi)$$ where $w = \sqrt{w_0^2  \xi^2}$. We see two main effects: 1) the frequency will be slightly reduced and 2) friction leads to an exponential decay of the amplitude of the oscillations.
Adding in a constant driving force:
If we have an external force acting on the oscillator then the equation of motion is (ignoring friction) $$\ddot{x} + w_0^2 x = F$$ The simplest case would be a constant force $F = g$, which we for example get if we suspend our ball on a spring from the ceiling such that its also affected by gravity. In this case we can write the equation as $$\ddot{x}_{\rm eff} + w_0^2 x_{\rm eff} = 0$$ where $x_{\rm eff} = x + g/w_0^2$ so the solution is simply $$x(t) = \frac{g}{w_0^2} + C\sin(w_0 t + \phi)$$ We see that we still have normal oscillations, but the zero point is shifted as the resting length of the spring is longer due to gravity pulling on it. Thus the oscillations are no longer symmetric around $x=0$. This effect will be very relevant in the discussion below and will help up explain the height of the peaks seen in the CMB powerspectrum.
Adding in a driving force:
Another interesting case is if we have an oscillatory driving force $F = \sin(wt)$. If the frequency $w$ lines up with the natural frequency of the oscillator $w_0$ then we can have a resonance. The steady state solution in this case is $$x(t) = \frac{1}{w_0^2w^2}\sin(wt+\phi) $$ for which the amplitude increases dramaticaly as $w\to w_0$ (if we add in some friction we only get a finite increase in the amplitude and not a divergence as in this simple example). This effect will be relevant for our oscillator as decay of gravitational potentials in the early Universe acts as a driving force in phase with the oscillations (though it dies out when reaching zero so it will not be as dramatic increase as here).
The free oscillator
If we could ignore the whole driving force $F$ then the solution would simply be $$\Theta_0 + \Psi = (\Theta_0 + \Psi)_{\rm ini}\cos(ks) + \frac{1}{kc_s}(\Theta_0 + \Psi)^\prime_{\rm ini}\sin(ks)$$ where $s = \int_0^\eta c_s d\eta$ and $c_s$ is the sound speed (in this case $c_s^2 = \frac{1}{3}$ and $s = \eta / \sqrt{3}$). Note that $s$ is nothing but the distance a sound wave have travelled since the big bang. With the usual adiabatic initial conditions that we are assuming then the last term above is zero and $(\Theta_0 + \Psi)_{\rm ini} = \frac{\Psi_{\rm ini}}{2}$ and we are left with $$\Theta_0 + \Psi = \frac{\Psi_{\rm ini}}{2}\cos(ks)$$ You can see these oscillations in Figure 7. In realspace the solution would be $(\Theta_0 + \Psi)(x,\eta) = f(xc_s\eta)$, i.e. travelling waves. For a given $k$, the effective temperature in fourier space is initially outside the horizon and will stay frozen (here $ks \ll 1$ so $\cos(ks) \approx 1$) and then start to oscillate when it enters the horizon as $k\sim \mathcal{H}$. The modes $k$ for which the temperature perturbation is at a maximum at recombination will then correspond to a peak in the powerspectrum at $\ell \sim k /\eta_0$. The maxima of $\cos(x) = 1$ is $x = n\pi$ so the peaks of the powerspectrum will be at $$\ell_{\rm peak} \sim n\frac{\eta_0}{r_s}$$ for $n=1,2,3,\ldots$ where $r_s = s(\eta=\eta_{\rm rec})$ is the soundhorizon at recombination. Note that there are a few $O(1)$ constants flying around here that I have ignored for simplicity. What about the troughs? Those correpond to modes for which the temperature perturbation is close to zero at recombination. Since $\cos(x) = 0$ for $x = \frac{\pi}{2} + n\pi$ this gives us $$\ell_{\rm trough} \sim (n + \frac{1}{2})\frac{\eta_0}{r_s}$$ This gives us the basic picture of the acoustic oscillations we see in the CMB powerspectrum and this is illustrated in Figure 5.
Baryon loading
We now want to add in the effect of baryons. To do this we start with the Euler equation for photons and baryons which reads (using $v_\gamma = 3\Theta_1$) $$\frac{dv_\gamma}{d\eta} = k\Theta_0 + 2\Theta_2  k\Psi + \frac{d\tau}{d\eta}[v_\gammav_b]$$ $$\frac{dv_b}{d\eta} = kv_b  k\Psi  R\frac{d\tau}{d\eta}[v_\gammav_b]$$ where $R = \frac{\Omega_b}{\Omega_\gamma}$ from which it follows that $$\frac{d(v_\gamma + Rv_b)}{d\eta} = k\Theta_0 + 2\Theta_2  k(1+R)\Psi$$ where the interaction term have dropped out since the left hand side is related to the change of the total momentum of the photon baryon fluid (this is the same thing we did to derive the interaction term for baryons from knowing it for photons). Now assuming that we can neglect the timevariation of $R$ and that we have tight coupling such that $v_b = v_\gamma$ and $\Theta_2 = 0$ then $$\frac{dv_\gamma}{d\eta} = \frac{k}{1+R}\Theta_0  k\Psi$$ This can be combined with the continuity equation for photons $\frac{d\Theta_0}{d\eta} = \frac{k}{3}v_\gamma  \frac{d\Phi}{d\eta}$ to give us $$\frac{d^2(\Theta_0+(1+R)\Psi)}{d\eta^2} + k^2c_s^2(\Theta_0+(1+R)\Psi) = 0$$ where we have also assumed that the derivative of $\Phi,\Psi$ can be neglected (we'll get to this in the next section). It's also instructive to see how this comes about from the original oscillator equation. The calculations above imply that $$k\frac{d\tau}{d\eta}(\Theta_1 + \frac{v_b}{3}) \approx k^2\frac{R}{1+R}\Theta_0$$ so it comes from the photonbaryon momentum exchange term that we have in the driving force $F$. We see that the effect of baryons is to lower the sound speed of the fluid $$c_s^2 = \frac{1}{3(1+R)} \lt \frac{1}{3}$$ and from the solution (again under the same approximations as above) $$\Theta_0 + (1+R)\Psi = \left(\Theta_0 + (1+R)\Psi\right)_{\rm ini}\cos(ks)$$ we get $$(\Theta_0 + \Psi)_{\rm rec} = (\Theta_0 + (1+R)\Psi)_{\rm ini}\cos(ks)  (R\Psi)_{\rm rec}$$ which shows the second effect: the oscillations are no longer symmetric around zero, but offset by $(R\Psi)_{\rm rec}$. This has direct implications for the CMB peaks (remember that these are determined by $\Theta_0+\Psi_{\rm rec}^2$). The even peaks, corresponding to $\cos(ks_{\rm rec}) = +1$, have a smaller amplitude while the odd peaks, $\cos(ks_{\rm rec}) = 1$, are enhanced. The relative height of the first three peaks can be used to tell us how much baryons and dark matter there is in our Universe. For example, as we shall see later, the fact that observations show that the third peak is higher than the second peak gives us strong evidence for nonbaryonic (i.e. dark) matter.
Radiation driving
If we look at the driving force $F$ this has three terms: the quadrupole term $\Theta_2$ (which is typically very small and only relevant for very precise calculations so we continute to ignore it here), the photonbaryon momentum transfer term $\propto \frac{d\tau}{d\tau}$ which we saw lead to the reduction of the sound speed in the previous section and finally we have the term dealing with timevariation of the gravitational potentials $\frac{d^2(\Psi\Phi)}{d\eta^2}$. This last term acts as a driving force to our oscillator. In the radiation dominated era's the gravitational potentials decays as they enter the horizon (radiation has pressure and therefore cannot cluster enough to fight the expansion of the Universe and keep the potentials up). Our oscillator starts its first compression phase as soon as it enters the horizon and when it turns around and starts it rarification phase the gravitational potentials are gone so it does not have to fight gravity on its way out and this leads to it getting a much higher amplitude. The effect of this is to drive up the ampltitude by a factor of $45$. This enhacement only happens for scales that enter the horizon before matterradiation equality $k \gt 0.01(\Omega_Mh^2/0.13) /$ Mpc corresponding to $\ell \gtrsim 100$ in the CMB powerspectrum for our Universe. You can see this driving effect in Figure 7: as the potential decays the amplitude is driven up.
Diffusion damping
In the discussion so far we have assumed that photons and baryons are tightly coupled and move together as a single fluid. This is only a first approximation that is perfect in the limit where interactions happen instantly. In reality the photons travel a finite distance between scatterings. The mean free path for Thomposon scattering is $\lambda = \frac{1}{n_e\sigma_T}$. In a Hubble time $H^{1}$ a photon will scatter $N = \frac{n_e\sigma_T}{H}$ times (this is just $\frac{d\tau}{d\log a}$ which gives a new interpretation of this function). After each collision the photon will continue in some random direction so the photons perform a random walk and the effect is that diffusing photons travels from hot regions of space to cold ones, equalising the temperatures of these regions. A well known result in the theory of random walks is that with $N$ steps of distance $\Delta x$ the mean distance traveled from the starting point is $\sqrt{N}\Delta x$. The mean distance the photons travel in a Hubble time is therefore of order $$D = \frac{1}{n_e\sigma_T} \cdot \sqrt{\frac{n_e\sigma_T}{H}} = \frac{1}{\sqrt{n_e\sigma_T H}} = \frac{1}{\sqrt{ \frac{d\tau}{dt} H}}$$ We expect that photon temperature perturbations on scales smaller than $D$ will be washed out by the diffusing photons. For the perturbations we solve in Fourier space this corresponds to perturbations on small scales $k \gtrsim \frac{2\pi}{D}$ which again will map onto the powerspectrum as a damping of the power for $\ell \gtrsim \ell_{\rm damping} \sim 800$. One can go ahead and estimate more accurately how big this damping is and how it depends on the cosmological parameters (see Dodelson for this), but just from looking at the expression for $D$ we can see that the damping scale should depend on 1) the baryon density $\Omega_b h^2$, 2) the helium abundance $Y_p$ (though $n_e$) and 3) the matter density parameter $\Omega_M h^2$ (though $H$). It's also useful to note that the diffusion length $D = 2\pi / k_D$ at decoupling is nothing but the "thickness" of the last scattering surface, i.e. the width of the visibility function.
How exactly would diffusion manifest itself in the evolution of the Fourier modes? We can understand this from looking at the diffusion equation: $$\dot{u} = F\nabla^2 u$$ where $F$ is the socalled diffusion coefficient. In Fourier space this becomes $$\dot{u} = Fk^2 u \implies u \propto = e^{Fk^2 t} = e^{\left(\frac{k}{k_D(t)}\right)^2}$$ where $k_D(t) = (Ft)^{1/2}$ is the characteristic scale for which diffusion is efficiently washing away the modes.
The random walk argument above allows us to get a rough estimate for the characteristic scale $k_D = \frac{2\pi}{D}$, but one can also do this more accurately by carefully analysing the photon and baryon equations (see Dodelson) and this leads to the effect of damping reducing the amplitude of the perturbations by a factor $D(k)$: $$(\Theta_0 + \Psi)_{\rm rec} = (\Theta_0 + \Psi)^{\rm nodamping}_{\rm rec}D(k)$$ where $$D(k) = e^{(k/k_D(\eta))^2}$$ just as we expected from the diffusion equation and the characteristic scale is given by $$k_D^{2} = \int_0^\eta\frac{1}{6(1+R)n_e\sigma_T a}\left[\frac{R^2}{1+R} + \frac{8}{9}\right]$$ You can see this damping in Figure 7.
Analytical expression including all effects
Putting it all together one can show (see also Dodelson) that in the tighly coupled limit the full solution can to a very good approximation be written (but deriving this is not in the curriculum) $$ \begin{align} (\Theta_0+\Psi)(\eta,k) &= [(\Theta_0+\Psi)_{\rm ini}\cos(ks(\eta)) \\ &+ \frac{k}{\sqrt{3}}\int_0^\eta d\eta'(\Phi(\eta')  \Psi(\eta'))\sin(ks(\eta) ks(\eta'))]D_{\rm damp}\\ \Theta_1(\eta,k) &= [\frac{1}{\sqrt{3}}(\Theta_0+\Psi)_{\rm ini}\sin(ks(\eta)) \\ & \frac{k}{3}\int_0^\eta d\eta'(\Phi(\eta')  \Psi(\eta'))\cos(ks(\eta) ks(\eta'))]D_{\rm damp} \end{align} $$ where $D_{\rm damp}(k,\eta) = e^{k^2/k_D^2(\eta)}$ with $$k_D^{2}(\eta) = \int_0^\eta\frac{c_s^2}{2\frac{d\tau}{d\eta}}\left[3c_s^2 R^2 + \frac{8}{9}\right]$$ where $R(\eta) = \frac{\Omega_b}{\Omega_\gamma}$ and $c_s^2(\eta) = \frac{1}{3(1+R)}$. This approximate solution is good to $\sim 10\%$! The line of sight integral tells us that $$\Theta_\ell^{\rm today} \approx (\Theta_0+\Psi)j_\ell(k\eta_0) + 3\Theta_1(j_{\ell1}(k\eta_0)  \frac{\ell+1}{k\eta_0}j_\ell(k\eta_0)) + \int_0^{\eta_0}d\eta[e^{\tau}\frac{d(\Psi\Phi)}{d\eta}]j_\ell(k\eta_0k\eta)$$
Summary
The line of sight integral tells us that the leading contribution to the temperature multipoles is given by $$\Theta_\ell \approx (\Theta_0+\Psi)_{\rm rec}j_\ell(k\eta_0)$$ and since the spherical bessel function peaks for $k\eta_0 \sim \ell$ and $C_\ell$ being an integral of $\Theta_\ell^2$ we have that $C_\ell$ is (very roughly) proportional to $(\Theta_0+\Psi)(\eta=\eta_{\rm rec}, k = \ell/\eta_0)^2$. We then showed that the effective temperature perturbation $\Theta_0+\Psi$ satisfy the driven oscillator equation $$\frac{\partial^2 (\Theta_0 + \Psi)}{\partial \eta^2} + \frac{k^2}{3}(\Theta_0+\Psi) = F$$ where $$F = \frac{k^2}{3}(2\Theta_2)  k\frac{\partial \tau}{\partial \eta}(\Theta_1 + \frac{v_b}{3}) + \frac{\partial^2 (\Psi\Phi)}{\partial \eta^2}$$ is a driving force. Ignoring the driving force gave us $$(\Theta_0+\Psi)_{\rm rec} = \frac{\Psi_{\rm ini}}{2}\cos(ks_{\rm rec})$$ where $s_{\rm rec}$ is the soundhorizon  the distance sound waves in the plasma have travelled since the Big Bang. This again tell us that the peaks of the CMB powerspectrum corresponds to modes that were at a maximum when they were frozen in at the last scattering surface. There are three important corrections to this:
 Radiation driving  Decay of the gravitational potentials in the radiation dominated era acts as a driving force enhancing the amplitude of the oscillations and thereby the CMB peaks.
 Baryon loading  The baryons and photons move as a single fluid so this lowers the sound speed slightly (which slightly changes the position of the peaks) and makes the oscillations asymmetric which enhances odd peaks relative to even peaks.
 Diffusion damping  On small scales photon diffusion washes out perturbations and leads to a damping of the CMB powerspectrum at large $\ell$.
You might wonder: why is the value of $C_\ell$ at the trough's not zero (or close to zero) if the temperature perturbation is zero at recombination? First of all remember that the spherial bessel function does not just contribute around its peak so the mapping $k\eta_0 \to \ell$ is just a rough approximation and $\ell$ will in general get a contribution from larger and smaller modes. Secondly remember there are other terms in the line of sight integral we have not talked about yet. The second most important term is the Doppler term. This is related to the photon velocity (well the baryon velocity, but in tight coupling they are the same). From the line of sight integral the SachsWolfe term, the Doppler term and the ISW term gives us $$\Theta_\ell \approx (\Theta_0+\Psi)j_\ell(k\eta_0) + 3\Theta_1(j_{\ell1}(k\eta_0)  \frac{\ell+1}{k\eta_0}j_\ell(k\eta_0)) + \int_0^{\eta_0}d\eta[e^{\tau}\frac{d(\Psi\Phi)}{d\eta}]j_\ell(k\eta_0k\eta)$$ To understand the second term we can use the continuity equation to get $$\Theta_1 = \frac{1}{k}\frac{d(\Theta_0 + \Phi)}{d\eta} = \frac{1}{k}\frac{d(\Theta_0 + \Psi)}{d\eta} + \frac{1}{k}\frac{d(\Psi  \Phi)}{d\eta}$$ Ignoring the last term we see that if $\Theta_0+\Psi \propto \cos(ks)$ then the photon velocity will be $\propto \sin(ks)$. When $\cos(ks) = 0$ we have $\sin(ks) = \pm 1$ thus when the leading term in the line of sight integral is zero the next to leading term is at its maximum. This implies the troughs will be lifted.
Cosmological parameter dependence of the CMB powerspectra
Remember this picture from the Milestone I text?
We are now finally at a stage where we can understand all of this in detail. We will below go through the various parts of the CMB powerspectrum and talk about how different physical effects and cosmological parameters leaves their imprints on it.
SachsWolfe plateau ($\ell \lesssim 20$):
The largest scales in the observable Universe (corresponding to the smallest $\ell$) have not entered the horizon before decoupling and the Doppler term is irrelevant so (ignoring the ISW term for now) $$\Theta_\ell \approx (\Theta_0 + \Psi)_{\rm ini}j_\ell(k\eta_0)$$ The CMB powerspectrum is then $$ \begin{align} C_\ell &= 4\pi \int \frac{dk}{k} \Theta_\ell^2A_s (k/k_{\rm pivot})^{n_s1} \\ &= 4\pi A_s (\Theta_0 + \Psi)_{\rm ini}^2 \int \frac{dk}{k} j_\ell^2(k\eta_0) (k/k_{\rm pivot})^{n_s1} \\ &= 4\pi A_s (\Theta_0 + \Psi)_{\rm ini}^2\frac{1}{(k_{\rm pivot}\eta_0)^{n_s1}} \int_0^\infty dx\, j_\ell^2(x) x^{n_s2} \end{align} $$ For the case of a scaleinvariant powerspectrum $n_s=1$ and the integral above is $\frac{1}{2\ell(\ell+1)}$ (this is the reason why the CMB powerspectrum is plotted as $\ell(\ell+1)C_\ell$ instead of just $C_\ell$) so $$\ell(\ell+1)C_\ell/(2\pi) = A_s (\Theta_0 + \Psi)_{\rm ini}^2 = {\rm constant}$$ This is called the SachsWolfe plateau. The constant itself is a measure of the primordial amplitude $A_s$ which again tells us something about the dynamic of inflation (if we know $A_s$ and the tensortoscalar ratio $r$ we would directly know the energy scale of inflation $H_{\rm inflation}$ as $H_{\rm inflation}^2 = A_s r / (16 G)$). If $n_s\not = 1$ then this would induce a tilt $\ell(\ell+1)C_\ell \propto \ell^{n_s1}$. Since $n_s \approx 0.96 \lt 1$ in our Universe it tils up at low $\ell$ and down for high $\ell$. Note that both the spectral index and the primordial amplitude makes it mark on the whole CMB spectrum, but as we will later see the effect of the amplitude is degenerate with the optical depth (which can be broken by adding in polarization data) and the tilt is degenerate with anything that changes the damping tail. We have ignored the latetime ISW effect above. As dark energy starts to dominate the energy budget the Universe starts to accelerate and expansion beats gravitational collapse and gravitational potentials, which stayed constant in the matter era, starts to decay. We get a similar effect if the Universe has a significant curvature component. Increasing each of these components (i.e. $\Omega_{K0}$ and/or $\Omega_{\Lambda 0}$) further tilts up the powerspectrum for the smallest $\ell$'s as can be seen in Figure 7.2 below. We have not yet talked about tensor perturbations, but we will see that these will also add in a tiny bit of power on the largest scales depending on how large their amplitude, paramerrized by the tensortoscalar ratio $r$, is. The problem with the low $\ell$'s is that observationally we have big cosmic variance. What the SachsWolfe plateu does clearly tell us is that $A_s \sim 10^{9}$ or equivalently that the density fluctuations set up by inflation are of the order of $10^{5}$ (which also validates our use of perturbation theory).
First peak ($\ell \sim 200$):
As we showed in the previous section, the location of the first peak is in a flat model determined by $kr_s = \pi$ together with $\ell \sim k(\eta_0\eta_{\rm rec})$ where $r_s = s(\eta_{\rm rec})$ is the sound horizon at decoupling, thus $$\ell \sim \pi \frac{\eta_0  \eta_{\rm rec}}{r_s}$$ or phrased in terms of the angular size $\theta$ ($\ell \sim \pi / \theta $) $$\theta = \frac{r_s}{\eta_0  \eta_{\rm rec}}$$ The right hand side is the ratio of the distance sounds waves have traveled since the Big Bang to the comoving distance to the last scattering surface. If we have curvature this changes to $$\theta = \frac{r_s}{D_A(a_{\rm rec})/a_{\rm rec}}$$ where $D_A$ is the angular diameter distance (the distance to the last scattering surface when interpreted in a flat Euclidean geometry). The sound horizon depends on $\Omega_{M0}h^2$ and any possible extra relativistic degrees of freedom in the early Universe $N_{\rm eff}$ while the angular diameter distance depends on $\Omega_{M0}h^2$, $\Omega_{\Lambda 0} h^2$ and $\Omega_{K0}h^2$. If we add curvature then the spots in a CMB map will appear smaller or larger then what they really are (see Figure 7.1) and shifting $C_\ell$ to the left/right (see Figure 7.2). Dark energy has a similar effect as it changes the distance to the last scattering surface. The dependence on $\Omega_{M0}h^2$ and $N_{\rm eff}$ in the sound horizon $r_s$ is less dramatic. Observations from Planck have measured the angular size of the first peak to great precision $$100\theta = 1.04097 \pm 0.00046 \implies \theta = 0.59643^\circ \pm 0.00026^\circ$$ corresponding to a $0.05\%$ measurement. This again places strong constraints on the curvature and dark energy content of our Universe and the best constraints today .
Figure 7.1: The effect of curvature on the CMB. Images taken from CMB Map Lab.
Figure 7.2: Effect of changing the curvature and dark energy density parameters on the CMB powerspectrum. $1\Omega_M$ in the figure is $\Omega_\Lambda$ and $\Omega_K$ respectively. Image taken from Wayne Hu.
The above just deals with the peak position, but we can also measure its height. The height of the first peak is determined by several effects. First of all we have radiation driving and baryon loading which acts on the SW and Doppler term. This is very sensitive to $\Omega_{b0}h^2$ and $\Omega_{\rm CDM 0}h^2$. Increasing the amount of baryons $\Omega_{b0}h^2$ increases the baryon loading effect so the first peak is enhanced. Increasing the amount of CDM we have that matter radiation equality happens earlier in the universe so the radiation driving effect on the first peak (which only happens for modes that entered before equality) will be reduced. We also have a constribution from the ISW term coming from the decay of potentials after decoupling. This is called the early ISW effect to separate it from the late time ISW effect we talked about above. It adds in phase with the monopole and increase the height of the CMB powerspectrum as we lower $\Omega_{M0}h^2$.
Second and third peak:
As we add in more baryons (increase $\Omega_{b0}h^2$) the baryon loading effect gets stronger and the relative height of the odd peaks gets enhanced relative to the odd peaks. If we add more matter (baryons and dark matter), i.e. increase $\Omega_{M0}h^2$, we change matterradiation equality and thereby lowers radiation driving effect (remember this only acts in the radition dominated regime). Both of these effects can be seen in Figure 7.3. We also see the position of the peak change: this is due to changing the distance to the last scattering surface (more matter lowers the angular diameter distance) and the sound horizon (more baryons lowers the sound speed which lowers the sound horizon). From the relative peak height of the first three peaks we can accurately weigh the amount of baryons and cold dark matter in our Universe. The observed peak heights (the fact that the third peak is not much higher than the second peak) give clear evidence for nonbaryonic dark matter in our Universe. The best constraints from Planck alone for these parameters are $$\Omega_{b0}h^2 = 0.02237 \pm 0.00015$$ $$\Omega_{\rm CDM 0}h^2 = 0.1200\pm 0.0012$$ and for the total matter density $$\Omega_{M0} = 0.3153 \pm 0.007$$
Figure 7.3: How the first three peaks change as we adjust $\Omega_{M0}h^2$ and $\Omega_{b0}h^2$. Image taken from Wayne Hu.
Damping tail ($\ell \gtrsim 800$):
The amount of damping is determined by the diffusion damping scale $k_D^{1}$. Decreasing the amount of baryons $\Omega_{b0}h^2$ decreases the number density of free electrons which again increases the diffusion length and leads to more damping. Increasing the total amount of matter $\Omega_{M0}h^2$ we increase the expansion rate at recombination which again increases the diffusion length and leads to more damping. This can be seen in Figure 7.4. The damping tail is often said to be a consistency check of the CMB powerspectrum: we can measure the baryon and total matter content from the first three peaks and this then tells us exactly what the diffusion damping scale is so we know the amount of damping we would expect there to be. However it is interesting in its own right as it allows us to constrain other parameters and also new physics. For example if we have extra relativistic degrees of freedom in the early Universe, something we can parametrize with the $N_{\rm eff}$ parameter (which is $3.046$ for just having neutrinos). Increasing $N_{\rm eff}$ increases the expansion rate at recombination. This also changes the peak position (the sound horizon gets larger) so if we keep the position of the first peak fixed we end up with less damping. Another parameter we haven't talked about so much is the primordial helium abundance $Y_p\approx 0.245$. This is a prediction of big bang nucleosyntesis, but if new physics is at play in this era it could be different. Increasing $Y_p$ decreases the number density of free electrons which again increases the diffusion length and leads to more damping. Thus $Y_p$ and $N_{\rm eff}$ are degenerate (and anticorrelated). Another parameter that is important for the damping tail is the spectral index. Recall that $C_\ell \propto \ell^{n_s1}$ thus even if $n_s1\approx 0.04$ is small it acts as a damping effect for $n_s \lt 1$ and the effects on the damping tail can be large.
Figure 7.4: How the damping tail change as we adjust $\Omega_{M0}h^2$ and $\Omega_{b0}h^2$. Image taken from Wayne Hu.
Figure 7.5: How the CMB powerspectrum changes with varying $N_{\rm eff}$ as we keep the location of the first peak fixed by adjusting $h$. Image taken from Baumann.
Figure 7.6: Constraints on $Y_p$ and $N_{\rm eff}$ from Planck data showing the degeneracy of these two parameters. The data is so far consistent with the standard BBN prediction and no exotic relativistic species in the early Universe ($N_{\rm eff} = 3.046$). Image taken from Planck.
Reionization:
At late times when stars have formed they emit high energy radiation which reionizes hydrogen and helium atoms. When the Universe reionizes photons can start to scatter and this again washes away some of the CMB fluctuations. The fluctuations will be suppressed by $e^{\tau_{\rm reion}}$ where $\tau_{\rm reion}$ is the optical depth at reionization whose value depends on what redshift $z_{\rm reion}$ the Universe gets reionized at. The effect on the CMB temperature powerspectrum is therefore roughly captured by $$C_\ell \propto e^{2\tau_{\rm reion}}$$ for all subhorizon modes. Thus the whole spectrum apart from the largest scales (corresponding to modes that entered the horizon close to the present time) will be suppressed. This effect is degenerate with the primordial ampltiude so the overall amplitude of the CMB powerspectrum is really a measure of $A_s e^{2\tau_{\rm reion}}$.
Mathematically we can see this as follows. If we imagine starting integrating $\frac{d\tau}{dt}$ from the early Universe going forward in time: one time with and one time without reionisation using the same initial condition. Prior to the reionization time we have $\tau^{\rm with} = \tau^{\rm without}$ as $X_e$ is the same before. Now if the initial conditions are chosen such that $\tau^{\rm without}(t_{\rm today}) = 0$ then we would have $\tau^{\rm with}(t_{\rm today}) = \tau_{\rm reion}$. When we now normalise the solution such that $\tau^{\rm with}(t_{\rm today}) = 0$ as it should be then we see that $$\tau^{\rm with}(t) = \tau^{\rm without}(t) + \tau_{\rm reion}$$ and for the visibility function $$g^{\rm with}(t) = g^{\rm without}(t) e^{ \tau_{\rm reion}}$$ prior to reionisation. Thus all the terms in the line of sight integral is multiplied by $e^{ \tau_{\rm reion}}$ in the times prior to reionization which leads $C_\ell$ to be suppressed by $e^{2\tau_{\rm reion}}$. The modes that have not entered the horizon prior to reionisation, corresponding to the low $\ell$'s, will also get a contribution from the visibility function peaking at reionization and the net effect is that the smallest $\ell$'s are not affected much by reionization.
This combination is inferred from Planck to $0.5\%$ accuracy: $$10^{10}A_s e^{2\tau_{\rm reion}} = 18.84 \pm 0.12$$ We can break this degeneracy with large scale polarization data or with gravitational lensing (see below).
Figure 7.7: Around redshift $z\sim 612$ the Universe gets reionized which forces photons to scatter a bit on their way to us washing away some of the CMB anisotropies and generating mmore large scale polarization. Image taken from Haemmerle et al. 2020.
Gravitational lensing:
Gravitational lensing in any direction is sensitive to the amount of matter along the line of sight from us to the last scattering surface. Gravitational lensing has the effect of slightly transfering power across angular scales, such that the peaks will be slightly lower and troughs are slightly higher the more lensed they are. In other words lensing smooths out sharp features in the powerspectrum and the stronger the smoothing effect is, the stronger the lensing is. Strong lensing again implies large matter fluctuation and therefore a large value of $A_s$. By measuring how much the peaks are smoothed out, we get a direct and second measurement of the primordial amplitude, independent of the optical depth, which can be used to break the $A_s,\tau$ degeneracy. We don't take gravitational lensing into account, but one can include the effects of lensing (see appendix of Milestone IV page for more info) to the CMB powerspectra. What is often done in practice is to measure how much the peaks have been smoothed (one can in practice do this by measuring the four point correlation function) and use this to undo the lensing effect giving us back the unlensed CMB spectrum. Doing this allows us to indirectly measure a quantity related to the matter powerspectrum called the lensing potential $C_\ell^\Psi$. This can be used to constrain cosmological parameters from the CMB alone that affect the latetime expansion ($\Omega_{\Lambda 0}$) and geometry of the Universe ($\Omega_{K0}$), and the growth of structures ($\Omega_{M0}$)  parameters that have degenerate effects in the primary CMB anisotropies.
Figure 7.8: Above: Illustration of how CMB photons are deflected by the gravitational lensing effect of massive cosmic structures as they travel across the Universe. Copyright: ESA and the Planck Collaboration. Below: The effect of lensing on the temperature (left), Emode (middle) and Bmode (right in green; compared to Emode in red and gravitational wave Bmodes in blue) powerspectra. Figure take from Bartelmann and Schneider 2001 and Lewis and Challinor 2006
Emode polarization spectrum:
The line of sight integral for the Emode polarization source gives that the leading contribuion is $\Theta^E_\ell \propto (\Theta_2)_{\rm rec}j_\ell(k\eta_0)$. In tight coupling we have $$\Theta_2 = \frac{8}{15}\frac{k}{\frac{d\tau}{d\eta}}\Theta_1$$ which is suppressed relative to the dipole by the huge optical debth. What this tells us directly, and which you can see in Figure X ($\pi_\gamma \propto \Theta_2$), is that the temperature quadrupole is very small unless $k$ is very large. It also tells us that $\Theta_2$ is in phase with the dipole and therefore perfectly out of phase with the monopole. Physically the quadrupole is generated by diffusion of photons close to the last scattering surface. Recall the diffusion scale corresponds to $\ell_{\rm damping}\sim 800$ so just from this simplistic discussion we can figure out roughly how the Emode powerspectrum $C^{EE}_\ell$ should look like: it will have peaks where the temperature powerspectrum have troughs, it should be suppressed for small $\ell$ before peaking around $\ell \sim 800$ and have a damping tail for much larger $\ell$.
However polarization is not only generated in the early Universe. When the Universe reionizes at late times we have a sizable large scale quadrupole that will generate polarization on the largest scales. This can be seen as a "bump" at low $\ell \lesssim 20$ in $C_\ell^{EE}$. The amplitude of this bump is roughly $\propto \tau^2$ and its shape depends on the details of reionization (like how fast it happened). This can be used to constrain the free electron fraction $X_e(z)$. Measurements from Planck tells us that the optical depth is $$\tau_{\rm reion} = 0.054 \pm 0.007$$ a $1\%$ measurement and corresponds to reionization at a redshift $$z_{\rm reion} = 7.7\pm 0.7$$ This measurement also allows us to break the $A_s,\tau$ degeneracy and the best constraints from Planck are $$10^{10}A_s = 20.92 \pm 0.34$$
Bmode polarization spectrum:
The high $\ell$ Bmode powerspectrum comes from gravitational lensing of Emodes and will consequently have a similar shape as the Emode spectrum just with a much smaller amplitude as gravitational lensing is a very small effect. At low $\ell$ we have the signature from gravitational waves. This peaks at around $\ell \sim 100$ and also has a reionization bump at $\ell \lesssim 20$. This signal has not yet been detected, but if it would then we would learn about the tensortoscalar ratio $r$ which together with our knowledge of $A_s$ tells us about the energy scale of inflation (the Hubble factor $H_{\rm inflation}$ when inflation happened). It will also provide an even stronger case for inflation as the mechanism for providing the seeds of structure and as a solution to the horizon and flatness problems of the big bang model.
Figure 7.9: How the CMB powerspectrum changee as we adjust the optical depth at reionization $\tau_{\rm reion}$ (or equivalently the redshift $z_{\rm reion}$ when reionization happened) and the tensortoscalar ratio $r = A_t / A_s$. You can see the Bmode caused by gravitational waves pop up in white as $r$ is increased. Image taken from Wayne Hu. NB: the right plot shows varying $\tau$ and $r$ at the same time, for just $r$ see see this animation instead (which also allows you to play around with other parameters).
Matter powerspectrum
The shape of the powerspectrum is dictated by two things 1) the primordial powerspectrum and 2) how modes of different scales evolve relative to each other. To understand the shape of this we need to understand how the density perturbation evolve in different regimes: outside and inside the horizon in the matter and radiation era.
Growth of matter perturbations
Lets start with the continuity and Euler equation for the dark matter perturbations $$\frac{d\delta_{\rm CDM}}{d\eta} = kv_{\rm CDM}  3\frac{d\Phi}{d\eta}$$ $$\frac{dv_{\rm CDM}}{d\eta} = \mathcal{H}v_{\rm CDM}  k\Psi$$ The baryon perturbations satisfy the same equations after photon decouping and we will in the following work with the total matter contrast $\delta_M = \frac{\Omega_{\rm CDM 0}\delta_{\rm CDM} + \Omega_{b0}\delta_b}{\Omega_{M0}}$ (which also satisfy the equaions above). Taking the derivative of the first equation and using the second we get $$\frac{d^2\delta_{\rm M}}{d\eta^2} + \mathcal{H}\frac{d\delta_{\rm M}}{d\eta} = k^2\Psi + 3\left(\mathcal{H}\frac{d\Phi}{d\eta}  \frac{d^2\Phi}{d\eta^2}\right)$$ On small scales $k\gg \mathcal{H}$ the last term on the right hand side can be neglected and the Poisson equation tells us that $$k^2\Psi = 4\pi G \overline{\rho}_M a^2 \delta_M$$ when radiation perturbations can be neglected. With this the equation above reduces to $$\frac{d^2\delta_M}{d\eta^2} + \mathcal{H}\frac{d\delta_M}{d\eta} = 4\pi G\overline{\rho}_M a^2\delta_M$$ To understand this is a bit better lets go to realspace where this equation can be written on the suggestive form $$\frac{d^2\delta_M}{d\eta^2} + \mathcal{H}\frac{d\delta_M}{d\eta}  c_{sM}^2\nabla^2\delta_M = 4\pi G\overline{\rho}_M a^2\delta_M$$ where we have introduced the soundspeed $c_{sM}^2 = 0$. This is yet another waveequation very similar to the one we say for photons. The driving force here is now just the gravitatational force. The big difference is that whereas photons have pressure (for which $c_s^2 = 1/3$), nonrelativistic matter is practically pressureless ($c_{sM}^2 = 0$) so the pressure force is not present and instead of a continuous fight between gravity and pressure leading to oscillations we will instead get growth.
How the perturbations grow depends on if we are in the radiation dominated era or the matter era and inside or outside the horizon. See Section 5.2.3 in Baumann or Chapter 7 in Dodelson for how to derive this. Working with the gaugeinvariant density perturbation $$\Delta = \delta  \frac{\mathcal{3H}}{k}v$$ is more convenient so we will do this here ($\Delta$ will satisfy the same equation as given above on subhorizon scales). The results can be summarized as follows:

Radiation era:
 Modes outside the horizon: $\Delta_M \propto a^2$
 Modes inside the horizon: $\Delta_M \propto \log(a)$ (i.e. roughly constant)

Matter era:
 Modes outside the horizon: $\Delta_M \propto a$
 Modes inside the horizon: $\Delta_M \propto a$
For the following recall that the matter powerspectrum is (with the way we normalize our perturbations initially) $$P(k,t) = \Delta_M(k,t)^2 \cdot A_s \left(\frac{k}{k_{\rm pivot}}\right)^{n_s1}$$
A mode enter the horizon when $k \sim \mathcal{H}$. The largest scales (small $k$) correspond to modes that enter late in the matter era (or haven't entered yet). The smallest scales (large $k$) corresponds to modes that entered early in the radiation era. Due to the fact that perturbations hardly grow inside the horizon in the radiation era the small scale powerspectrum will be suppressed relative to the largescale powerspectrum. The largest scale that does not get any suppression corresponds to those that enter exactly when the matter era starts $k_{\rm eq} = \mathcal{H}(a_{\rm eq})$ where $a_{\rm eq} = \Omega_{R0} / \Omega_{M0}$ (where $\Omega_{R0} = \Omega_{\gamma 0} + \Omega_{\nu 0}$) is the scalefactor for matterradiation equality. This represents the peak of the matter powerspectrum. At larger scales it grows at $k^{n_s}$ (so roughly $\propto k$) while at smaller scales $\Delta_M$ is suppressed by roughly $(k_{\rm eq}/k)^2$ and consequently the powerspectrum is suppressed by $(k_{\rm eq}/k)^4$ and therefore it goes down as $k^{3}$. See the figure above. This explains the basic shape.
To understand the more fine features (e.g. the small oscillations seen in Figure 2) its useful to go back and look at this in realspace instead.
Perturbations in real space
Our discussion have so far been in Fourier space, but lets look at how the perturbations evolve in realspace. To simplify it, lets consider a single perturbation (a sharp peak) with dark matter, photons, baryons and massless neutrinos.
How do we compute how this evolves? Well we already know how it evolves in fourier space $\delta_i(\vec{k},t)$ and if we take the initial profile $\delta_i(\vec{k},t^{\rm ini}) = 1$ (the fourier transform of a deltafunction $\delta^{(3)}(\vec{x})$ in real space) then we can find the profile $\delta_i(\vec{x},t)$ in real space by just doing a fourier transform $$\delta_i(\vec{x},t) = \int \frac{d^3k}{(2\pi)^3}e^{i\vec{k}\cdot \vec{x}}\delta_i(\vec{k},t) = \int_0^\infty \frac{k^2dk}{2\pi^2} \frac{\sin(kr)}{kr} \delta_i(k,t)$$ where $r = \vec{x}$. The system is initially (and therefore stays) spherically symmetric so the integral becomes as simple 1D integral (a Henkel transform that can here easily be rewritten as a 1D Fourier transform). This can easily be solved numerically by direct integration, as an ODE or performing a fourier transform using, say, the FFTW library (one can also use the socalled FFTLog algorithm). In the movie below we show the results of such a calculation.
The initial peak will attract matter from the surroundings so it grows in time (and brodens). However photons and neutrinos have pressure which will produce a sound wave moving outwards. Photons and baryons are tightly coupled and they move together as one fluid so the baryons are dragged along for the ride. Neutrinos have long been decoupled so the perturbations travel with sound speed $c/\sqrt{3} \gt c/\sqrt{3(1+R)}$ so they will therefore be slightly ahead of the photons and baryons (this acts as a gravitational drag on the baryons  an effect that has been measured recently in galaxy surveys and used to constrain the number of neutrinos. Likewise it also acts to shift the peak positions in the CMB angular powerspectrum).
When we get to recombination $z\sim 1100$ the photons and baryons finally decouple from each other and photons travel outwards by itself. When this happens we have a spherical shell of matter at the socalled soundhorizon $r_s \sim 150$ Mpc (the distance this sound wave have travelled since the big bang). This shell  the BAO peak  remains in place and dark matter will also start to cluster around the excess of matter here. This leaves us with a socalled standardruler  a distance we can compute theoretically and can also measure from the correlation function of matter. Just as the first peak of the CMB is determined by the sound horizon at decoupling $r_s$ and the angular diameter distance to the last scattering surface the BAO peak is determined by the sound horizon in the baryon drag era $r_{\rm drag}$ (which is just slightly larger than $r_s$) and the angular diameter distance to the redshift where we have our galaxy sample which we use to measure the peak in. With $r_{\rm drag}$ from the CMB a BAO galaxy survey is therefore able to measure the Hubble parameter at different redshifts. Because of this and the fact that its very robust (apposed to the full correlation function / powerspectrum which is complicated by the fact that we measure galaxies and not dark matter directly) the standard ruler is one of the key things that we want to measure in galaxy surveys.
Now this was only one peak and the Universe is surely much more complicated than this. Thats true, but remember the evolution of perturbations is linear so the true Universe can be thought of as a sum over many such peaks as illustrated in the figure above. This implies that this signal is imprinted in the statistical observables: the two point correlation function (which is close to what is shown in the video above) and the matter powerspectrum. So what does this imply about the finestructure of the matter powerspectrum? Remember that a spherical shell in real space ($\delta(rr_*)$) corresponds to oscillations ($e^{ir_*k}$) in Fourierspace (do the fourier transform to see this). Thus the matter power spectrum will have osciallations imprinted in it (which are similar to the acoustic oscillations we see in the CMB powerspectrum). They are not as visible as most of the matter is dark matter, but you can see them in Figure 2.
Summary
The CMB powerspectra $\ell(\ell+1)C_\ell$:
 TT has a (SachsWolfe) plateau for low $\ell \lesssim 30$ whose amplitude tells us about inflation.
 TT has peaks corresponding to acustic oscillations that froze in at their maximum excursion at decoupling. There are thre main effects that modify this: baryon loading, radiation driving and diffusion damping.
 Decay of potentials after recombination increases the height of the first peak and potentials decay again when the Universe starts to accelerate when dark energy takes over leading to an increase in the low $\ell$ powerspectrum. This is the socalled ISW effect (early time ISW and late time ISW respectively).
 The location of the first peak in TT is determined by the soundhorizon at decoupling to the angular diameter distance to the last scattering surface. It can therefore tell us about the geometry of our Universe (how curved is it) and how much dark energy there is.
 The relative height of the first three peaks in TT tells us about the amount of baryons and dark matter in our Universe.
 Diffusion of photons close to deoupling washes away perturbation and gives us a damping tail. This acts as a consistency check and allows us to search for new exotic physics in the early Universe.
 Emode polarization is generated by the temperature quadrupole which is generated by photon diffusion close to decoupling.
 Emode spectra grows from zero up to a maximum at $\ell \sim 800$. Large scale Emodes are also generated by reionization leading to a bump at $\ell \lesssim 30$.
 Emode spectra has peaks where the temperature spectra has throughs. This acts as a consistency check of the whole picture of the CMB as being caused by acustic oscillations.
 Reionization leads to a reduction of the amplitude of the CMB spectra $\propto A_se^{2\tau_{\rm reion}}$. This degeneracy can be broken by measuring the reionization bump in the Emode powerspectrum.
 Bmodes are generated by two things: 1) gravitational lensing of Emodes and 2) caused by gravitational waves from inflation. The former has been measured while the latter is the holy grail of CMB cosmology and will hopefully be measured soon.
 Gravitational lensing smooths out the peaks. How smoothed they are allows us to measure the amount of matter along lines of sight from us to last scattering and tells us something about how much matter has clustered on different scales. This can again be used to "undo" the lensing giving us unlensed spectra.
The matter powerspectrum:
 Grows as $P \propto k^{n_s}$ on large scales
 Has a peak close to the equality scale $k_{\rm eq} = \mathcal{H}(a_{\rm eq}) \sim 10^{2} h/\text{Mpc}$ (wavelength of $\sim 400$ Mpc/h) where $a_{\rm eq} = \Omega_{R0}/\Omega_{m0} \sim 3500$ (where $\Omega_{R0} = \Omega_{\gamma 0} + \Omega_{\nu 0}$) is the scale factor for matterradiation equality.
 Decays (roughly) as $P \propto k^{n_s4} \sim k^{3}$ on small scales.
 Has oscillations due to the fact that baryons were dragged along with the photons in the early Universe leading to an excess of matter at separations $r_{\rm drag} \sim 150$ Mpc (the soundhorizon in the drag era which is roughly equal to $r_s$). These oscillations corresponds to a peak at $r = r_{\rm drag}$ in the two point correlation function and this distance is a standard ruler that we can measure in galaxy surveys.