Paper Explainer: Applying Liouville's Theorem to Gaia Data

Cartoon of the structure of dark matter objects. From Buckley and Peter

Cartoon of the structure of dark matter objects. From Buckley and Peter

This is an explainer for my recent paper with David Hogg and Adrian Price-Whelan. This is a very different kind of paper for me, as evidenced by the fact that it is coming out on arXiv’s astro-ph (astrophysics) list and not even cross-listed to hep-ph (high energy phenomenology). In the end, the goal of the research that produced this paper is to learn about dark matter, but this paper by itself barely mentions the subject. There is a connection though.

One of my research interests of late (as described in detail in my paper with Annika Peter) is trying to learn about the particle physics of dark matter by studying gravitationally bound systems of dark matter in the Universe, and seeing if their properties are consistent with a “vanilla” type of dark matter without significant non-gravitational interactions, or if there is some evidence of new dark matter physics within those objects.

The type of dark matter system I think has a lot of promise are the smallest dark matter structures: collections of dark matter much smaller than the Milky Way. We know of some of these objects: dwarf galaxies, but what about even smaller ones?

The reason we don’t know much about such objects is that, as you decrease the amount of dark matter in a galaxy, you also decrease the amount of normal matter. Eventually, there isn’t enough gas around to make stars, or at least, not that many stars. Galaxies an order of magnitude less massive than the smallest known dwarf galaxies might only have ${\cal O}(10-100)$ stars; smaller ones might have none.

That makes the smallest star-containing galaxies very dim, and hard to find if they’re in “the field” (far from the Milky Way). But if they are close enough to the Earth for us to see, then the gravitational tides of the Galaxy will start pulling the dwarf galaxy apart, so what we would see is less a gravitationally-bound object, and more a shredded “stream” of stars.

Indeed, we suspect that our Milky Way was built from such ripped-up dwarf galaxies; finding the evidence for the “tidal debris” of the galaxies consumed by the Milky Way could tell us about the structure of dark matter over cosmological history. These streams are not theoretical objects: we know of many streams of stars that come from either dark-matter dominated dwarf galaxies, or tidally-disrupted balls of stars called globular clusters (which are not believed to contain significant amounts of dark matter). And we’re finding even more, thanks to the Gaia mission (about which more later). I’d like to know what these objects can tell us about the physics of dark matter. One thing I’d like to know is which streams came from dwarf galaxies, and what was the mass of the dwarf galaxy before it was tidally disrupted.

But again, such objects are torn apart, no longer gravitationally bound. And that presents a big problem for measuring the mass of a stream. All our methods of directly measuring the mass of astronomical objects rely on those objects being self-gravitating: the stars (or galaxies, or planets or whatever) need to be orbiting in their own static (or nearly static) gravitational potential for a long time. Streams are orbiting in the gravity of the Milky Way, and are changing over time. None of the typical techniques will work to measure their mass. The best you can do is compare the properties of the stars in a stream with known objects, and make an estimate that way. I’d like something more direct.

The usual techniques to measure mass use conserved quantities for the system, things like energy or angular momentum. The problem with the streams is that they are exchanging energy and momentum with the larger Galaxy, so these are not useful conserved quantity. So I need something else.

There is something that is conserved during the merger of a dwarf galaxy and the Milky Way (indeed, conserved in many dynamic systems). Something called the phase-space density. Phase space is the set of coordinates that describe where a point (or a star) is and how it’s moving. In our Universe, there are six numbers you need: the position (3 numbers) and the velocity (another 3 numbers). If you specify these 6 numbers, you’ve given the phase-space location. If you have a collection of points (like the stars in a cluster), you have a bunch of phase-space coordinates. You can then describe the phase-space density: the number of points per length cubed per speed cubed. There will be positions and speeds with lots of stars; these have high phase-space density, and places where there are few stars. These have low phase-space density.

Now here’s the cool bit: under most dynamic evolution, phase-space density is conserved. This is called Liouville’s Theorem. That is, if there is a dense region of phase space (i.e., a region with lots of stars), then after being shredded by the Galaxy, there will still be a region with lots of stars. Another region of phase-space (another set of locations and velocities) perhaps, but still: there’s something conserved.

Simulation of a stream forming from a Globular Cluster in the Milky Way (galaxy center marked with Gold star).

Simulation of a stream forming from a Globular Cluster in the Milky Way (galaxy center marked with Gold star).

In particular, a cluster being disrupted by the Galaxy will become extended in position-space (it’s turning into a stream, after all). But all the stars will be moving in the same direction (around the Galaxy on nearly the same orbit), so they become much more compact in velocity-space. Result: conservation of phase-space density.

So this suggests a way to measure the mass of a cluster of stars, even if it has been ripped apart by the Galaxy. Measure the 6D location of each star in the cluster or stream in phase space. Determine the density in phase space. Then, compare that to the predicted phase-space density distribution of stars in a (simulated) cluster that before tidal disruption, and figure out what mass (and other structural parameters, like the radius) of the cluster most matches the observed densities. Simple: mass measurement of systems that aren’t in equilibrium.

Of course, there are a few obstacles. The first is that you have to measure the phase-space locations of a bunch of stars. Which means you need to know the position and velocities of each star. That’s hard. I don’t know if you know this, but stars are kind of far away. That makes it hard to judge the distance to a star. Further, stars are moving at hundreds of km/s (relative to the Earth), but due to their immense distance, it isn’t easy to see that motion. So how do you figure out their velocities?

The solution is the Gaia satellite. Gaia is a European Space Agency mission whose goal is measuring the positions and velocities of the nearest billion stars. It does this by repeatedly measuring the angular location of the stars on the sky, and looking for the drift of the star over time. That gives the “proper motion” (motion across the plane of the sky), and also the parallax (the change in the apparent position due to the different location of the Gaia satellite as it orbits around the Sun every year). That allows you to get the 3D position and 3D velocities for stars. To get the required accuracy, the Gaia satellite is capable of measuring a star’s position on the sky to within a few microarcseconds, which is the equivalent of measuring the diameter of a human hair in San Francisco while standing in New York City.

In reality, I’m oversimplifying: Gaia will get full 6D phase space information with decent errors only for the closest or brightest stars. In this first paper, I’ll work around that issue. Long term, I think it should be possible to combine data sets with other star surveys or use only the brightest tracer stars to measure phase-space positions of stars.

The next problem is measuring the phase-space density. This is non-trivial, and there isn’t really a great way to do it. I’m using a code called EnBid, which is a very cool piece of code designed for measuring $N$-body simulations of galaxies. It takes a list of 6D phase-space points and subdivides them recursively, giving a density estimate. It requires a bit of baby-sitting to find the right splitting parameters that give the right densities, but it works better than anything else I found right now. But then there are two more serious problems, and trying to overcome them is what this whole paper is all about. Basically, to extract good estimates of the densities, I need to deal with the fact that

  1. Gaia measurements aren’t perfect for streams of stars, and
  2. the best way to measure density requires knowing the orbit. But knowing the orbit requires the potential of the Galaxy, which we don’t know.

The first problem is the problem of measurement errors. Gaia, like all experiments, is not perfect: you can never measure anything perfectly, and so Gaia reports both the best-estimate for position and velocity, but also the estimate of how wrong those measurements are. Usually the error in velocity dominates, since velocity is measured by measuring positions over and over, so the errors accumulate.

M4 Globular Cluster, NASA/ESA Hubble

M4 Globular Cluster, NASA/ESA Hubble

What this does is make the cluster of stars more diffuse. Mismeasurement takes things that are clumped close together in position and velocity, and smears them out. I've been describing errors as the equivalent of taking a box of stars, and shaking it. If the errors were real changes in the star's phase-space coordinates, the cluster would be "hotter" than it was before the errors were applied. That is, it injects entropy into a system.

Preferentially, this makes all the stars look like they have less density than they really do. You can understand this via the analogy with entropy: it is statistically unlikely for the errors to move a star in towards the center of the phase-space overdensity that corresponds to a cluster; it is much more likely that a random error will move the star away from everyone else, because there are just more directions that move the star "out" rather than "in."

This entropy injection and the reduction of phase-space density if a huge problem if I want to use the distribution of phase-space density to estimate the mass. You can see this in simulation. In the paper I work with a specific collection of stars: the M4 cluster, the closest globular cluster to Earth. It's a ball of stars with a mass of around 100,000 $M_\odot$. It isn't a tidally disrupted object, but that means I can worry only about the measurement error issue, without the additional complication that I'll address next. M4 is 2000 pc away (about 6200 lightyears). This means that Gaia doesn't have great measurement of the distance and radial velocity (the speed towards or away from the Earth) for individual stars, so I have to work in 4D phase space. This is a minor complication for spherically symmetric objects, but I mention it for completeness.

Anyway, if I simulate a M4-like globular cluster without any errors, I show in the first slide of the slideshop what the distribution of stars with respect to radius and speed would look like. You can see a sharp edge: for each radius, there's a maximum speed a star can be moving before it would be gravitationally unbound and escape the cluster. So if you wanted to measure the mass of the cluster directly, you'd use that maximum velocity as a function of radius to get the best possible measurement. But for me, that would be cheating, since I want a method that doesn't rely on the system being gravitationally bound (as I want to apply it to tidally disrupted systems eventually).

In the next slide, I show what happens if you add in realistic Gaia measurement errors and also include stars that aren't gravitationally bound to the cluster, but just happen to be passing through (or in front of or behind) the cluster and so appear to be part of the system. You can see the "heating" of the cluster now: stars at a given radius appear to be moving much faster than they should be. Critically, for my purposes, these errors also greatly reduce the average phase-space density of the stars, destroying my measurement technique.

So the first thing to try is just take the well-measured stars: those stars that Gaia believes have small errors (this is calculated by the Gaia mission by considering how long they looked at each star). The 3rd slide shows the simulated cluster after this cut on the error is applied. You can see that I've mostly restored the sharp edge in the $r-v$ diagram, but there are still stars that have way too small densities, and are moving way too fast to be bound to the cluster. These are mostly the foreground stars, it turns out. I can throw those out by placing a second cut on phase-space density, which implies there might be a cool way to determine cluster membership using phase-space density. Something to come back to in the future, perhaps.

So I have these (simulated) stars, and their phase-space densities. Can I measure the mass and density profile of the simulated cluster using them? It turns out, no, not yet. What I can try is to say "here's a set of measured phase-space densities. What is the probability that this set was drawn from a cluster of a specific mass and radius parameter?" and then say that the mass and radius that maximizes this probability is the right answer. But even though I'm only looking at the best-measured stars, the entropy-injection from the mismeasurement still biases the measurement. I find a mass that is much larger than the real mass of the simulated cluster. So I'm still not correcting for errors in the right way.

But this isn't using the fact that mismeasurement tends to bias the phase-space densities to lower values, rather than higher. So I do a somewhat hacky trick (and I'd like to come up with a better approach in the future). In the first calculation, I assume that the phase-space density for each star has some error distributed evenly around the best-measured value: maybe the real density is higher than the measured, and maybe it is lower. To do better, I assume that the density is between the central value and the upper limit only: I throw out the possibility that the density is really lower than what I measured, because statistically, errors almost never decrease the density for a gravitationally bound object like a globular cluster. And what do you know: in simulation, this works: I get the mass bang-on.

So now let's try that with real data. As a particle physicist, this was one of the really fun things of working in astronomy for this project: I get to play with real data, rather than being firewalled off from it. Granted, the Gaia data is a lot simplier to work with than the LHC data, so it's apples to oranges, but there's something very satisfying with manipulating data that you know represents something concrete out there in the Universe.

First, compare the real data (in the $r-v$ scatter plot) with my simulation. It looks pretty similar, so I can maybe trust that my approach that works in simulation might work in real data. There is one key difference, which is a bit subtle to see. If you look at the real data after I cut on stars with large errors, you'll see more stars at small radius with higher velocities than appear in the simulation. This is because, at the center of the cluster, there are a lot of stars very close together, and that makes it harder for Gaia to measure their velocities as well as they expect. This is called "crowding." If I wanted, I could probably come up with a better model of the cluster to include this in my simulation, but for the purposes of this paper, I will just point out that it exists, and move on.

I then measure the phase-space densities of each star using EnBid, and take the asymmetric errors (central value to upper value), and then calculate the probability that these densities could have come from a cluster with a specific mass and radius. Here's what I find.

Measurement of mass and King Radius for the M4 globular cluster using Gaia Data. Best-fit value of phase-space density technique is shown with black star, accepted best-fit value for the cluster is shown with a gold star.

Measurement of mass and King Radius for the M4 globular cluster using Gaia Data. Best-fit value of phase-space density technique is shown with black star, accepted best-fit value for the cluster is shown with a gold star.

I'm close to the correct mass (though a bit low than the currently accepted value extracted from more traditional techniques, which is the result of crowding). I get the radius parameter wrong though, again this is mostly due to crowding. Both could be corrected for by additional tweaks to the simulation, but the point of this exercise is not to get the best measurement of the M4 globular cluster, it is to show that measurement errors don't prevent me from using conservation of phase-space density to measure the mass of a collection of stars. Which I think this does.

Again, the issue is that errors inject entropy, and entropy tends to move stars from regions of high phase-space density to lower values. By picking only the best-measured stars, and then using the fact that the shift in density is asymmetry (high to low, never low to high), I can get a decent estimate of the true density distribution. There are likely better ways to go about this than the technique I'm using in this paper, which I'm investigating now and I think have some really cool potential applications.


So now to the next problem. I'm not really interested in globular clusters - I'm a dark matter physicist after all. I'm even less interested in globular clusters that are nice and spherical and not tidally disrupted. I want streams of stars, because streams of stars are all the remnants of dwarf galaxies I'm going to see in the Gaia data.

But streams present a problem, and that problem is getting any measure of the phase-space density. The problem is that estimating phase-space density is hard even for a nice spherical object like M4. If you take those stars and spread them out in a line over 10 kpc in the Galaxy, measuring the phase-space density accurately becomes next to impossible. The problem as far as I can tell is just a matter of dissimiliar scales: the streams are very long but also very narrow, and EnBid and every other technique I tried fell apart.

But that's a problem with the coordinates I've chosen to work in: position and velocity. It is possible to define the location of a star in the Galaxy using another set of coordinates, which also have the property that phase-space density is conserved over time, just as with the more familar position-velocity ones.

These coordinates are called actions and phase angles, which are calculated for an orbit (in this case, the orbit of a star around the Galaxy). There are three actions, which can be thought of as generalizations of angular momentum (one component for each of the three spatial directions), and there are three phase angles (which are basically where the star is in the orbit, and cycle around from 0 to $2\pi$ as the star completes an orbit. In these action-angle coordinates, I can get $\textsc{EnBid}$ to give good estimates of the phase-space density, largely because the actions are constant (angular momentum is conserved), and the angles are in a finite box ($0--2\pi$).

However, observant readers will see a problem. Gaia measures the location and velocity of a star. To get an orbit from that, you need to know the gravitational pull of the Galaxy within which the star is moving: you need to know the Milky Way's gravitational potential. We don't actually know that, at least, not perfectly. So without a potential, I can't build action-angles, and I can't measure the phase-space density to determine the mass of a stream. A conundrum.

But again, entropy comes to the result. Which is nice of entropy, as usually it is just slowly killing us.

A stream of stars is a very low entropy configuration: the odds of all these stars moving in nearly the same direction spread out nicely in a line? Very improbable. The only way that can happen is if you start with a nice compact collection of stars (like in a dwarf galaxy or globular cluster) and then let them get disrupted by the Galaxy.

Indeed, if you disrupted that collection of stars in any other potential other than the right one, you would never find exactly the same stream. Which suggests a way to find the correct Milky Way potential from a stream of stars. Basically, if you have a collection of stars measured in a stream by Gaia, the potential in which they are orbiting will minimize the entropy of their phase-space densities, as measured in action/angle space. Any incorrect potential will tend to drive stars the stream away from each other in as they orbit; only the true potential will keep them all close together.

So, the idea is as follows:

  1. Take the position-velocity phase-space coordinates for stars in a stream. Assuming a Milky Way potential, calculate their orbits.
  2. From this orbit, calculate the actions and phase-space angles for each star.
  3. From the collection of action/angles, calculate the phase-space density of each star.
  4. Repeat this process for a large number of variations of the assumed Milky Way potential. The correct potential is the one that minimizes the entropy of the collection of densities for the stars.
Orbits of stars from a globular cluster (initial positions in red) as it is tidally disrupted in a simulation of the Milky Way. Blue dots are the locations of the stars after 500 million years. The future orbits of the stars over the next 500 millio…

Orbits of stars from a globular cluster (initial positions in red) as it is tidally disrupted in a simulation of the Milky Way. Blue dots are the locations of the stars after 500 million years. The future orbits of the stars over the next 500 million years (in the assumed Milky Way potential) are shown in gold. 10 sets of orbits assuming incorrect orbits are shown in black.

Entropy S of the phase-space densities of stars in a simulated stream for 1000 iterations of the Milky Way POtential versus the Kullback-Leiblier test statistic (a measure of how similar the Iterated potential is to the real potential). Note that th…

Entropy S of the phase-space densities of stars in a simulated stream for 1000 iterations of the Milky Way POtential versus the Kullback-Leiblier test statistic (a measure of how similar the Iterated potential is to the real potential). Note that the real potential has the minimum entropy, and that there is a trend of more similar potentials having lower entropies.

I'm showing here what this would look like in position-space. You have a stream of stars (the blue dots in the figure). You can evolve those stars in a future orbit using the correct Milky Way potential (gold lines), and calculate action/angles and then densities from these orbits. You can then see how "compact" that density distribution is (this is the entropy). You can then try that again with a bunch of incorrect assumptions for the Milky Way potential - getting wrong orbits and wrong densities. Each of those sets of densities will be less compact - that is, higher entropy - than the correct answer using the real potential.

Thus, by varying the Milky Way potential, you can not only recover the true density distribution of the stream (and thus measure the mass of the object that gave birth to the stream), but you might also be able to measure the gravitational potential of the Milky Way. And that's pretty cool, and something I'm very interested in trying in real data, as for this paper I stuck to simulation (there are a lot of issues with getting accurate 6D phase-space coordinates for Gaia streams, but it is something I'm working on right now).

So that's why I spent a very long time working on a pure astrophysics problem: because I want to be able to measure the mass and structure of the objects that created the stellar streams we are seeing in the Milky Way. And, as I discovered, I might want to use those streams to measure the distribution of dark matter within the Milky Way, because, after all, the Milky Way potential is due in no small part to the dark matter in the Galaxy.