add old post about random points

This commit is contained in:
Joe Ardent 2025-07-21 16:10:15 -07:00
parent c3499eadf1
commit e518080e12
20 changed files with 514 additions and 3 deletions

+++
title = "Right and wrong ways to pick random points inside a sphere"
slug = "not-rand-but-rand-y"
date = "2018-07-22"
updated = "2018-11-29"
[taxonomies]
tags = [
"fun",
"rust",
"raytracing",
"statistics",
"profiling",
"random"
]
+++
## Last things first
I'm going to cut to the chase: I could not find a way to choose
uniformly distributed random points inside the volume of a unit sphere that was
faster than picking one in the 8-unit cube, testing to see if it was inside the
unit sphere, and trying again if it wasn't -- but I did come real close. That's
getting ahead of myself, though. Before we come to the stunning conclusion,
there are many rabbit holes and digressions awaiting.
## First things first
A couple months ago, I started working my way through a nice little book called
*[Ray Tracing in One
Weekend](https://in1weekend.blogspot.com/2016/01/ray-tracing-in-one-weekend.html)*,
which all the cool kids are doing, it seems. Which makes sense, because making
pretty pictures is fun.
Anyway, around midway through, you get to the point where you're rendering some
spheres that are supposed to act as [diffuse
reflectors](https://en.wikipedia.org/wiki/Diffuse_reflection), like a subtly
fuzzy mirror[^1]. This involves performing a task that is, in slightly technical
words, randomly choosing a point from a uniformly distributed set of points inside
the volume of the unit sphere (the sphere with a radius of 1). The book
describes a method of doing so that involves picking three uniformly distributed
points inside the 8-unit cube (whose corners go from *(-1, -1, -1)* to *(+1, +1,
+1)*),
![8-unit cube](./8unit_cube.png)
testing to see if the point so described has a length less than 1.0,
returning it if it does, tossing it and trying again if it doesn't. In
pseudo-Python code, it looks like this:
``` python
while True:
    point = Point(random(-1, 1), random(-1, 1), random(-1, 1))
    if length(point) < 1:
        return point  # victory!
    else:
        pass  # do nothing; pick a new random point on the next roll
              # through the loop
```
This seems like it might be wasteful, because you don't know how many times
you'll have to try generating a random point before you luck out by making a
valid one. But, you can come up with a pretty confident guess based on the
relative volumes. We know that the 8-unit cube is 8 cubic units in volume, since
its sides are 2 units long (from -1 to +1). Also, the name is a dead
giveaway. It's obviously larger than the unit sphere, because the latter is
contained within the former, shown pinkly below:
![unit sphere](./unit_sphere.png)
The sphere touches the cube at the center of each cubic face.
In three dimensions, the volume of a sphere is ```4/3
π r^3```, or "four-thirds times Pi times the radius cubed". By definition, the
unit sphere's radius is one, leaving us with 4/3 π, or about 4.1888. So if you
had a four-dimensional dart you were throwing at an 8-unit three-dimensional
cubic dartboard with an equal chance of hitting any particular single point inside
it, your chance of hitting one inside the unit sphere is (4.1888)/8 or about
52%. Meaning that on average, you have to try just a smidge less than twice
before you get lucky. That also means that we're basically generating six random
numbers each time.
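Those numbers are easy to check empirically. Here's a small, self-contained
Python sketch (using the standard library directly instead of the imaginary
`Point` and `random` helpers from the pseudocode above; the function name and
sample count are my own choices) that estimates the acceptance rate and the
average number of uniform random draws per accepted point:

``` python
import random

def estimate_rejection_cost(samples=200_000, seed=1):
    """Estimate how often a uniform point in the 8-unit cube lands inside
    the unit sphere, and how many uniform draws each kept point costs."""
    rng = random.Random(seed)
    draws = 0
    for _ in range(samples):
        # Draw points until one lands inside the sphere, counting every draw.
        while True:
            x, y, z = (rng.uniform(-1, 1) for _ in range(3))
            draws += 3
            if x * x + y * y + z * z < 1.0:
                break
    points_tried = draws / 3
    return samples / points_tried, draws / samples

rate, draws_per_point = estimate_rejection_cost()
# The math says acceptance should be about pi/6 = 0.5236,
# so about 3 * 6/pi = 5.73 uniform draws per accepted point.
print(f"acceptance ~ {rate:.3f}, uniform draws per point ~ {draws_per_point:.2f}")
```

Run it and you should see the acceptance hovering right around 52%, with a
smidge under six uniform draws per kept point.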
I wanted to know if it were possible to randomly select a valid point *by
construction*, meaning, at the end of the process, I'd know that the point was
chosen from the uniformly-distributed set of random points inside the unit
sphere and could be used without having to check it. If you've been using the
book's method to implement the raytracer and you've done it right, at the end of
Chapter 7 you get an image that looks like this:
![matte gray spheres](./chapter7.png)
Technically, this is an image of a small ball sitting on top of a larger ball
that's made of the same material. They're both matte gray, and are just
reflecting the image of the sky from the background.
## The first wrong way
The first obvious but wrong way is to choose a random point *(x, y, z)*, where x, y,
and z are uniformly distributed random numbers, *normalize* its length to 1 in
order to place it on the surface of the unit sphere, then *scale* it by some
random factor *r* that is less than 1 (but more than 0). In pseudo-Python code,
``` python
point = Point(random(-1, 1),
              random(-1, 1),
              random(-1, 1))
point = point / length(point)  # normalize length to 1
point = point * random(0, 1)   # randomly scale it to the interior of the unit sphere
return point
```
There are actually two things wrong with this, but first, a brief digression
into some math-ish stuff.
### Length of a point?
Do you remember how to calculate the length of a right triangle's hypotenuse with Pythagoras'
Theorem? You know, *a-squared plus b-squared equals c-squared*?
![right triangle](./right_triangle.png)
What if we replaced *a* with *x* and interpreted it as a coordinate on the
horizontal axis, with an origin at 0? Similarly, replace *b* with *y*, and think
of it as a 0-origined coordinate on a vertical axis. If you now have not a
triangle but instead a two-dimensional point at the [Cartesian
coordinate](https://www.mathsisfun.com/data/cartesian-coordinates.html) *(x, y)*, what is
*c*? It's now the *length* of the point.
![length of the point (x, y)](./2d_point_length.png)
The length here is the square root of 25 plus 16, or about 6.4.
That's all well and good, you may say, but what about a point in THREE
dimensions?? It turns out the same trick works, you just do
![Pythagoras in 3D](./pythagoras_length.png)
![Length of a point in 3D](./3d_point_length.png)
So if we have a point *(3, 1, 2)*, its length is the square root of 3^2 + 1^2 +
2^2 which is the square root of fourteen, which is about 3.74. What if we wanted
to make it so that it was pointing in the same direction from the origin, but
the length was 1? That's called *normalizing*, and is done by dividing each
coordinate by the length. Below, the point P'[^2] has the same direction as point P,
but the length of P' is 1.
![normalized 3D point](./3d_point_normalized.png)
Now, we could have chosen to multiply each component coordinate by a number
other than the reciprocal of the length of P to make P'. No matter what number
we chose (as long as it was greater than zero), it would still point in the same
direction as P, but its length would be something else. This operation, keeping
the direction but changing the length by multiplying each coordinate by some
number, is called *scaling*. Normalizing is just scaling by the reciprocal of
the length.
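All of that arithmetic is easy to play with directly. A minimal sketch, using
plain tuples instead of a `Point` class (a simplification of mine), with the
*(3, 1, 2)* example from above:

``` python
import math

def length(p):
    """Euclidean length of a 3D point: sqrt(x^2 + y^2 + z^2)."""
    x, y, z = p
    return math.sqrt(x * x + y * y + z * z)

def scale(p, s):
    """Multiply each coordinate by s; keeps the direction, changes the length."""
    return tuple(c * s for c in p)

def normalize(p):
    """Scale by the reciprocal of the length, giving a length of 1."""
    return scale(p, 1.0 / length(p))

P = (3, 1, 2)
print(length(P))             # sqrt(14), about 3.74
print(length(normalize(P)))  # 1.0, up to floating-point error
```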
## OK, back to being wrong about constructing random points
So, the first obvious way is to pick a random, uniformly distributed point
inside the 8-unit cube, change its length so that it lies on the surface of the
unit sphere, then scale it by some random amount so that it's somewhere in the
interior of the sphere. Here's what it looks like when you do that:
![first wrong image](./chapter7-normed_cube_point.png)
Hmm.... close... Let's see our reference, correct image again:
![matte gray spheres](./chapter7.png)
The reference image has a much softer shadow under the smaller ball -- its edge
is less distinctly defined -- than the image using the first wrong technique
for generating random points. The experimental shadow also seems to be smaller
than the reference one, but it's hard to tell for sure given how diffusely the
correct shadow dissipates. Still, something is obviously wrong.[^3]
## The first obviously wrong thing about the first obvious wrong way
Let's look again at the unit sphere inside the 8-unit cube:
![unit sphere](./unit_sphere.png)
Taking a moment to become unblinded by the brilliance of the diagram, we can see
that if we're taking a truly random point inside the cube, and if the point
happens to not be inside the unit sphere (which, as you know, happens about half
the time), the invalid point is a lot more likely to be in the direction of
one of the eight corners than near the center of a cubic face, because
there's a lot more non-sphere space there. So what would it mean, on average,
for us to not throw those points away, but instead normalize them so that they
have a length of 1 and are on the surface of the unit sphere? You'd have eight
"hot spots" on the sphere's surface where points would be more likely to be
found, and your surface distribution would not be uniform.
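You can see those hot spots numerically without rendering anything. This sketch
(the cap size and sample count are arbitrary choices of mine) normalizes uniform
cube points onto the sphere's surface and counts how many land in a small cap
around a corner direction versus an equally-sized cap around a face-center
direction; if the surface distribution were uniform, the two counts would come
out about equal:

``` python
import math
import random

def cap_counts(samples=200_000, cos_cap=0.99, seed=2):
    """Normalize uniform cube points onto the sphere's surface, then count
    hits inside equal-sized caps around a corner direction and a face
    direction. A uniform surface distribution would give equal counts."""
    rng = random.Random(seed)
    s3 = 1.0 / math.sqrt(3.0)
    corner = (s3, s3, s3)   # toward the (+1, +1, +1) corner
    face = (1.0, 0.0, 0.0)  # toward the center of a cube face
    corner_hits = face_hits = 0
    for _ in range(samples):
        x, y, z = (rng.uniform(-1, 1) for _ in range(3))
        n = math.sqrt(x * x + y * y + z * z)
        x, y, z = x / n, y / n, z / n  # push onto the sphere's surface
        if x * corner[0] + y * corner[1] + z * corner[2] > cos_cap:
            corner_hits += 1
        if x * face[0] > cos_cap:
            face_hits += 1
    return corner_hits, face_hits

corner_hits, face_hits = cap_counts()
print(corner_hits, face_hits)  # corner cap gets several times more hits
```

On my runs the corner cap catches roughly five times as many points as the face
cap, which is exactly the "hot spot" non-uniformity described above.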
### You keep saying "uniformly distributed"; what does that mean?
I've been meaning to take a brief aside to talk about what I mean by "uniformly
distributed", or, "random". In everyday speech, people usually take "random" to
mean, "each distinct event in a set of related events has an equal probability
of happening." Examples are rolls of fair dice or flipping a fair coin; ones and
sixes and fives and fours and twos and threes should each come up 1/6 the time,
heads and tails should be 50/50. When this happens, you say that the values are
*uniformly distributed*, or they have a *uniform distribution*.
Just because we usually want to talk about random events that happen with a
uniform distribution doesn't mean that non-uniform distributions are
unimportant. One of the most important and useful distributions is the [Normal
Distribution](https://en.wikipedia.org/wiki/Normal_distribution), also known as
the "Gaussian Distribution"; it's the *Standard* Normal Distribution when its
expected value is 0 and its [standard
deviation](https://en.wikipedia.org/wiki/Standard_deviation) is 1. It looks like
this:
![normal distribution](./Standard_deviation_diagram.svg.png)
*taken from the wikipedia page on "standard deviation"*
A good way of thinking about distributions is to pretend they're a
one-dimensional (numberline) dartboard, and you throw a bunch of darts at them;
the area under the curve over a given interval is then the chance that a dart
lands within that interval. When the distribution is uniform, the curve is a
flat horizontal line at some height, and each point on the numberline-dartboard
is equally likely to be hit by a dart. If your dart throws were normally
distributed, you'd be hitting near the middle most often (the curve peaks there,
at a height of about 0.4); you'd hit within the interval -1 to +1 about 68% of
the time, and within -3 to +3 more than 99% of the time. There's technically no
theoretical limit
to the magnitude of a normally distributed random value. Your dart could land at
like -1,000,000, but the odds are really, really, *really* low.
### The Normal Distribution to the rescue!
Early on in my search for methods for choosing a random point inside a sphere,
I'd seen a fancy Javascript math font formula on like math.stackexchange.com[^4]
that looked something like this,
![formula for constructing uniform random points inside a
sphere](./too_soon_normals.png)
*note: this formula doesn't make total sense so don't worry too hard about the
details; it's meant to reflect the fact of my own confusion about it making it
hard to remember it accurately*
but it was too soon in my search, so I'd not become familiar with the
distinction between differently distributed sets of random values, and all I
took from it immediately was that they were scaling a normalized point. Later,
after grasping that it was incorrect to just normalize points whose coordinates were
uniformly distributed, I read somewhere else[^5] something like, "Three
independently selected Gaussian variables, when normalized, will be uniformly
distributed on the unit sphere,"[^6] and it clicked what those ".. = N"s meant: they
were random values sampled from the Normal Distribution. I quickly updated my
code to something like the following:
``` python
point = Point(random(0, 1, gaussian=True),  # first arg is expected val,
                                            # second is standard deviation
              random(0, 1, gaussian=True),
              random(0, 1, gaussian=True))
point = point / length(point)  # normalize length to 1
point = point * random(0, 1)   # randomly scale it to the interior of the unit sphere
return point
```
As you can see, it's basically exactly the same as the previous code, except our
initial *x*, *y*, and *z* component coordinates are random values with a
Gaussian distribution, instead of a uniform distribution. I eagerly re-ran the
render, and it produced the following image:
![gaussian variables incorrectly
scaled](./chapter7-incorrectly_scaled_point.png)
Which.. looks... a LOT like the previous one, which was wrong. If anything, the
shadow is darker and sharper than the previous wrong one, so it's even MORE
wrong now. Which brings us to
## The second wrong thing about all previous wrong things
Those of you who are, unlike me, **not** infested with brain worms may have been
wondering to yourselves, *What the heck was up with the "1/3" next to the "U" in
that awesomely hand-written formula from just a second ago? Shouldn't there be a
term in the code that reflects whatever that might mean?* You would be right to
wonder that. That means, "take the cube root of a uniform random number".
Because of the brain worms, though, I did not at first make that
connection. However, luckily for me, at the time I was trying to figure all this
out, I was hanging out with my friends [Allen](http://www.allenhemberger.com/)
and [Mike](https://twitter.com/mfrederickson), two incredibly creative, smart,
and curious people who had started to be taken in by this problem I was
having. Both of them are highly skilled and talented professional visual effects
artists, and have good eyes for seeing when images are wrong, so this was right
up their alley.
Mike suggested taking a look directly at the random points being generated, so I
changed my program to also print out each randomly generated point in a format
that Mike could import into
[Houdini](https://www.sidefx.com/products/houdini-fx/) in order to visualize
them. When he did, it was super obvious that the points were incorrectly distributed,
because they were mostly clustered very close to the center of the sphere,
instead of being evenly spread out throughout the volume of it.
Incredibly, even this wasn't enough for me to connect the dots. I wound up going
on a convoluted journey through trying to understand how to generate the right
probability distribution to compensate for the fact that as distance from the
center increases, the amount of available space drastically increases. Increases
as the cube of the radius. If only there were an inverse operation for cubing
something...
I don't know what it finally was, but the whole, "volume goes up as radius
cubed" and "U to the 1/3 power" and probably Mike saying, "What about the
cube root?" gestalt made me realize: I needed to take the cube root of the
uniformly distributed random scaling factor! I changed the code to do that, and
handed Mike a new set of data to visualize. He put together this spiffy
animation showing the effect of taking the cube root of the random number and
then scaling the points with that, instead of just scaling them by a uniform
random number (ignore the captions embedded in the image and just notice how the
distribution changes):
![animation of points being
redistributed](./sphere_points3.gif)
You can see pretty clearly how the points are clustered near the origin, where
there are a lot fewer possible places to be, vs. when using the cube root. Taking the
cube root of the uniformly distributed random number between 0 and 1 "pushes"
it away from the origin; check out a graph of the cube root of *x* between 0 and
1:
![graph of cube root between 0 and
1](./cbrt_function.png)
This shows that if you randomly got, say, 0.2, it would get pushed to near
0.6. But if you got something close to 1, it won't go past 1. In other words,
this changes the distribution from uniform to a power-law distribution; thanks
to [this
commenter](https://lobste.rs/s/bkjqjc/right_wrong_ways_pick_random_points#c_lieynp)
on lobste.rs!
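Here's a quick numeric sanity check of that "push" (my own illustration, not
from the book): for a uniform distribution over the ball's volume, the fraction
of points within radius *r* of the center should be *r*-cubed -- so only 12.5%
of points should lie inside radius 0.5, even though that's half the full
radius. Scaling by the cube root of a uniform number gives exactly that;
scaling by the raw uniform number doesn't even come close:

``` python
import random

def fraction_within(radius, use_cbrt=True, samples=200_000, seed=3):
    """Fraction of random radii that land within `radius`, when the scale
    factor is the cube root of a uniform number (right) or the raw
    uniform number (wrong)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        u = rng.random()
        r = u ** (1.0 / 3.0) if use_cbrt else u
        if r < radius:
            hits += 1
    return hits / samples

print(fraction_within(0.5, use_cbrt=True))   # ~0.125: matches r^3, correct
print(fraction_within(0.5, use_cbrt=False))  # ~0.5: way too many points near the center
```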
Anyway, the proof, of course, is in the pudding (which is weird; how did it get
there?). When I ran the render with the updated code, it produced
the following pretty picture, which we all agreed looked right:
![correctly rendered
image](./chapter7-correctly_scaled_point.png)
Time to celebrate!
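For reference, here's the whole corrected method in one place, written this
time as runnable standard-library Python instead of pseudocode (the function
name is mine): three Gaussian coordinates give a uniform direction, normalizing
puts the point on the surface, and scaling by the cube root of a uniform number
fills the volume evenly.

``` python
import math
import random

def random_point_in_unit_sphere(rng=random):
    """Uniformly distributed point inside the unit sphere, by construction:
    no generate-and-test loop, every point produced is valid."""
    # Three independent standard Gaussian coordinates: uniform direction.
    x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    n = math.sqrt(x * x + y * y + z * z)
    # Cube root of a uniform number: uniform over the *volume*, not the radius.
    r = rng.random() ** (1.0 / 3.0)
    return (x * r / n, y * r / n, z * r / n)

p = random_point_in_unit_sphere()
print(p)  # always has length less than 1, no retries needed
```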
## Not so fast
Although this new method of randomly choosing points within the unit sphere with
a uniform distribution seemed to be correct, I noticed that my renders were a
little slower. By which I mean about 50% slower, so like 13 seconds instead
of 9. This wasn't yet unacceptably slow in any absolute sense, but one of the
reasons I wanted to try to do this by construction was to do it less
wastefully. In general with programming, that means, "faster". Taking 150% of
the time was not less wasteful. But where was the extra work being done?
I knew that generating a Gaussian random number would be more work than a
uniformly distributed one, but didn't think it would be THAT much more. It turns
out that it takes almost twice as long to make a Gaussian one as a uniform one on
my system[^7]. With the textbook method of generating and testing to see if you should
keep it, you'll have to generate almost six uniformly distributed random
numbers. With our constructive method, we have to generate three Gaussian
values, plus one uniform one, for an effective cost of seven uniform random
number generations. Hmm, so, not really a win there, random-number-generation-wise.
The only other expensive thing we're doing is taking a cube root[^8], and
indeed, almost half the time of running the new code is spent performing that
operation! I suspected this might be the case; I figured that the square
root would be directly supported by the CPU's hardware, but the cube root
would have to be done "in software", which means "slowly". But I knew that
there are some tricks you can pull if you're willing to trade accuracy for
speed.
A [quick and dirty implementation of "fast cube
root"](http://www.hackersdelight.org/hdcodetxt/acbrt.c.txt) later, and I got the
following image:
![correct image with fast cube
root](./chapter7-fast_cube_root.png)
which looked pretty good! And after benchmarking, it was... ***ALMOST*** as fast
as the original method from the book. Still more wasteful, and technically less
accurate (the name of the page I got the algorithm from is "approximate cube
root", and the function is most inaccurate in the region near zero, sometimes
wrong by over 300%).
## Lessons learned
This really has been a long and twisted voyage of distractions for me. For as
long as this post is, it's really only scratching the surface on most of the
material, and omitting quite a bit of other stuff entirely. Things I've left
out:
* going into detail on the fast cube root; I had a bunch of graphs of things
like the absolute and percentage error values for it vs. the native "correct"
cube root, and will probably attempt to improve my implementation in the
future to work better for values close to zero and write that up.
* Going into detail on how to generate random Gaussian numbers. The most common
way that I see referenced is called the [Box-Muller
Transform](https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform), and it
requires first generating two uniform random values then doing some
trigonometry with them, which would be a very expensive operation for a
computer to do. But there are [clever ways](https://arxiv.org/abs/1403.6870)
to generate them that are almost as fast as generating uniformly distributed
ones; [the random number library I'm
using](https://docs.rs/rand/0.5.4/rand/distributions/struct.StandardNormal.html)
is pretty clever in that way. There's still room for improvement, though!
* Details on using ```cargo bench``` to benchmark my algorithms, which is
  built into the tooling for [Rust](https://www.rust-lang.org/en-US/), the hip new-ish
language from Mozilla, the web browser people. This would be of interest to
programming nerds like me.
* You can generalize spheres to more than three dimensions; for example, in 4D,
it's called a [glome](https://en.wikipedia.org/wiki/3-sphere). As the number
of dimensions increases, it becomes less and less likely that a randomly
selected point from the enclosing cube will be inside the sphere, so the
"rejection sampling" method of the text becomes slower and slower compared to
the various constructive techniques.
* Yet MORE ways to do this task, and how I might be able to get a constructive
technique to actually be faster, and hot-rodding software in general.
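For the curious, the basic Box-Muller transform mentioned in that list looks
like this (a straightforward textbook sketch, not the clever ziggurat-style
method the Rust library uses): two uniform values in (0, 1] become two
independent standard Gaussian values via a log, a square root, and some trig,
which is exactly the expensive part.

``` python
import math
import random

def box_muller(rng=random):
    """Turn two uniform values into two independent standard Gaussian
    (mean 0, standard deviation 1) values."""
    u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is always defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

print(box_muller())  # a pair of Gaussian samples
```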
Some of these will come up in the near future, in more blog posts. Some will
just have to be taken up by someone else, or never at all (or already
exhaustively covered by smarter people than I, of course). Most of all, though,
I've been surprised and delighted by how rich and deep this problem is, and I
hope some of that has come through here.
I also want to give huge thanks to Allen and Mike, who were both extremely
supportive of cracking this nut in person and over email. They said figuring
this out should lead to the mother of all blog posts, so I hope this isn't a
disappointment. No pressure all around!
## da code
Not that it's of that much interest to most, but of course, my code is freely
available. See https://github.com/nebkor/weekend-raytracer/ for the repository,
and take a look at
https://github.com/nebkor/weekend-raytracer/blob/random_points_blog_post/src/lib.rs
to see the implementations of the different techniques, right and wrong, from
this post.
Eventually, I'll be parallelizing the code, and possibly attempting to use the
GPU. But before then, I need to finish the last couple chapters of this book,
and maybe chase fewer rabbits until I've done that :)
-------------------------------------------------------------------------------
[^1]: A more common name for this is that it's a Lambertian surface or
material. See https://en.wikipedia.org/wiki/Lambertian_reflectance
[^2]: The little single-quote mark after *P* in *P'* is pronounced "prime", so
the whole thing is pronounced "pee prime". "Prime" is meant to indicate
that the thing so annotated is derived directly from some original thing; in
our case, *P*. I don't know why the word "prime" is chosen to mean that;
something like "secondary" would make more semantic sense, but here we are,
stuck in Hell.
[^3]: This may sound weird, but one of the most fun things about writing
software that makes images (or, as I and most others like to say, "pretty
pictures") is that when you have a bug, it shows up in a way that you might
not realize is wrong, or it produces an unexpected but pleasing image,
etc. It's fun to look at the picture, try to reason about what must have
happened to make it incorrect, then try to fix and re-render to see if you
were right.
[^4]: If you want to feel like an ignorant moron, try googling for stuff like
"random point on sphere" and "probability distribution". I promise you, it
made about as much sense to me as to you, assuming it makes very little
sense to you on first glance. Look at this shit from
http://mathworld.wolfram.com/SpherePointPicking.html
![quaternion transformation
obvs](./simple_quaternion.png)
I mean come ON with that. I'm trying real hard here to explain everything so
that even if you already know it, you'll still be entertained, and if you
don't, you won't run away screaming because it's all incomprehensible jargon.
[^5]: OK, looks like it's the Mathworld.wolfram link from above, because right
below that quaternion nonsense there's
![normalized gaussian
point](./mathworld_normed_point.png)
which is pretty much exactly what I remember reading.
[^6]: Getting into the reason WHY three Gaussian random values will be uniformly
distributed on the surface of the unit sphere when normalized is a rabbit
hole I did not go deeply down, but I think it would be another whole thing
if I did. The magic phrase is "rotational invariance", if you'd like to look
into it.
[^7]: How do I know how long it takes on my system to produce a Gaussian random
value vs. a uniform one? I [wrote a tiny program that generates random
values, either Gaussian or
uniform](https://github.com/nebkor/misfit_toys/tree/master/gauss_the_answer),
and timed how long it took to generate a few billion values of each
kind. On my system, that program will take about five seconds to
generate and sum a billion Gaussian values, while doing the same for uniform
values takes less than three. Also, it turns out that the "fast" Gaussian
value generator technique is, like the fast way to pick a random point in
the sphere, to generate a random value, check to see if it's a valid
Gaussian one, and guess again. It seems that it takes a couple rounds of
that on average no matter what.
[^8]: The act of normalizing the point requires a square root, but most
processors have pretty fast ways of calculating those, and as with the cube
root, there are hacks to trade accuracy for speed if desired. But my
processor, a 64-bit Intel CPU, has direct support for taking a square root,
so any hack to trade accuracy for speed is likely to simply be worse in both
respects.

View file

@ -274,10 +274,9 @@

----
[^julid-package]: The Rust crate *package's*
    [name](https://git.kittencollective.com/nebkor/julid-rs/src/commit/e333ea52637c9fe4db60cfec3603c7d60e70ecab/Cargo.toml#L2)
    is "julid-rs"; that's the name you add to your `Cargo.toml` file, that's how it's listed on
    [crates.io](https://crates.io/crates/julid-rs), etc. The crate's *library*
    [name](https://git.kittencollective.com/nebkor/julid-rs/src/commit/e333ea52637c9fe4db60cfec3603c7d60e70ecab/Cargo.toml#L32)
    is just "julid"; that's how you refer to it in a `use` statement in your Rust program.
[^httm]: Remember in *Hot Tub Time Machine*, where Rob Cordry's character, "Lew", decides to stay in