+++
title = "A One-Part Serialized Mystery, Part 2: The Benchmarks"
slug = "one-part-serialized-mystery-part-2"
date = "2023-07-15"
[taxonomies]
tags = ["software", "rnd", "proclamation", "upscm", "rust", "sqlite"]
+++
# A one-part serial mystery post-hoc prequel

get-go, so I decided it couldn't hurt to switch to them before I had any actual

And that was correct: it didn't hurt performance, but it also didn't help much either. I've spent a
bunch of time now doing comparative benchmarks between ULIDs and UUIDs, and as I explain below, the
anticipated space savings did not materialize, and the initial speed-ups I got were merely
augmenting what was already more than fast enough into slightly more fasterer than that. Of course,
of course, and as always, the real treasure was the friends we made along the way etc., etc. So come
along on a brief journey of discovery!

# Bottom Line Up Front

With sqlite and my final table schema, the size and speed differences are negligible.

ULIDs have a slight edge over UUIDv4s when used as primary keys, but the best primary keys are
simple integers if you can get away with it. With my final DB schema and import/benchmarking code,
there was no difference in terms of time taken or space used when using ULIDs vs UUIDs as primary
keys.

However, with my initial database layout and import code, ULIDs resulted in about 5% less space and
took only about 2/3rds as much time as when using UUIDs (5.7 vs 9.8 seconds). The same space and

My benchmark is pretty simple: starting from an empty database, do the following

1. for each user, randomly select around 100 movies from the 10,000 available and put them on their list of
things to watch

Only that last part is significant, and is where I got my [timing
information](https://gitlab.com/nebkor/ww/-/blob/897fd993ceaf9c77433d44f8d68009eb466ac3aa/src/bin/import_users.rs#L47-58)
from.
The table that keeps track of what users want to watch was defined[^not-final-form] like this:

this project. Any time I need a database, my first reach is for SQLite:

And, it's extremely performant. When using the [WAL journal mode](https://www.sqlite.org/wal.html)
and the [recommended durability setting](https://www.sqlite.org/pragma.html#pragma_synchronous) for
WAL mode, along with all other production-appropriate settings, I got almost 20,000 *writes* per
second[^nothing is that slow]. There were multiple concurrent writers, and each write was a
transaction that inserted about 100 rows at a time. I had [retry
logic](https://gitlab.com/nebkor/ww/-/blob/4c44aa12b081c777c82192755ac85d1fe0f5bdca/src/bin/import_users.rs#L143-145)
in case a transaction failed due to the DB being locked by another writer, but that never happened:
each write was just too fast.
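For illustration, that configuration looks roughly like the following sketch (in Python's stdlib `sqlite3` for brevity rather than the project's actual Rust code; the table name and batch size here are made up):

``` python
import os
import sqlite3
import tempfile

# WAL mode persists in the database file, so use a file-backed DB, not :memory:.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
conn = sqlite3.connect(path)

# WAL journaling plus synchronous=NORMAL is the recommended pairing for
# write-heavy workloads; the journal_mode pragma reports the mode it set.
mode = conn.execute("PRAGMA journal_mode = WAL;").fetchone()[0]
conn.execute("PRAGMA synchronous = NORMAL;")

# Each write is one transaction inserting a batch of ~100 rows.
conn.execute("create table if not exists quests (user blob, watch blob);")
with conn:  # commits (or rolls back) the whole batch as one transaction
    conn.executemany(
        "insert into quests values (?, ?);",
        [(os.urandom(16), os.urandom(16)) for _ in range(100)],
    )

print(mode)  # "wal"
```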

that link:

<div class="caption">sorry what was that about secondary indices i didn't quite catch that</div>
HALF the disk space, *and* TWICE as fast?? Yes, sign me up, please!
## Sorry, the best I can do is all the disk space

Imagine my surprise when it took nearly 20% longer to run, and the total size on

larger. Using random UUIDs was even slower, so there's still a relative speed win for ULIDs, but it
was still an overall loss to go without the rowid. Maybe it was time to think outside the box?
## Schema husbandry
I had several goals with this whole benchmarking endeavor. One, of course, was to get performance
data on ULIDs vs. UUIDs, at the very least so that I could write about it when I had publicly said I
would. But another, and actually-more-important goal, was to optimize the design of my database and
software, especially as it came to size on disk (my most-potentially-scarce computing resource;
network and CPU are not problems until you get *very* large, and you would have long ago
bottlenecked on storage if you weren't careful).
So it was Cool and Fine to take advantage of the new capabilities that ULIDs offered if those new
capabilities resulted in better resource use. Every table in my original, UUID-based schema had had
a `created_at` column, stored as a 64-bit signed offset from the [UNIX
epoch](https://en.wikipedia.org/wiki/Unix_time). Because ULIDs encode their creation time, I could
remove that column from every table that used ULIDs as their primary key. [Doing
so](https://gitlab.com/nebkor/ww/-/commit/5782651aa691125f11a80e241f14c681dda7a7c1) dropped the
overall DB size by 5-10% compared to UUID-based tables with a `created_at` column. This advantage
was unique to ULIDs as opposed to UUIDv4s, and so using the latter with a schema that excluded a
"created at" column was giving an unrealistic edge to UUIDs, but for my benchmarks, I was interested
in isolating their effect on index sizes, so it was OK.
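To make that concrete: the first ten characters of a ULID are just a Crockford-base32 encoding of a 48-bit millisecond timestamp, so a `created_at` can always be recovered from the ID itself. A small sketch (illustrative Python, not the project's Rust code):

``` python
import time

# Crockford base32 alphabet, per the ULID spec (https://github.com/ulid/spec).
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_time(ms: int) -> str:
    """Encode a 48-bit millisecond timestamp as the 10-character ULID prefix."""
    chars = []
    for _ in range(10):
        chars.append(ALPHABET[ms & 0x1F])  # take 5 bits at a time
        ms >>= 5
    return "".join(reversed(chars))

def decode_time(ulid: str) -> int:
    """Recover the creation time (ms since the epoch) from a ULID string."""
    ms = 0
    for c in ulid[:10]:
        ms = (ms << 5) | ALPHABET.index(c)
    return ms

now_ms = int(time.time() * 1000)
prefix = encode_time(now_ms)
print(decode_time(prefix) == now_ms)  # True: the ID itself carries created_at
```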
I also realized that for the `watch_quests` table, no explicit ID needed to be added; there were
already two `UNIQUE` constraints for each row that would together uniquely identify that row: the
ID of the user that wanted to watch something, and the ID of the thing they wanted to watch. Primary
keys don't need to be a single column; when two or more columns in a table are used as a primary
key, it's called a "composite key". You may recall from the "when should you use `without rowid`"
section that composite keys were one such situation where it may be beneficial. Surely this would
help!
``` sql
create table if not exists witch_watch (
    witch blob not null,
    watch blob not null,
    [...]
    primary key (witch, watch)
) without rowid;
```
<div class="caption">"witch" and "watch" are still foreign keys</div>
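As a quick sanity check on how composite keys behave, SQLite rejects a second row with the same `(witch, watch)` pair outright (a toy sketch using Python's stdlib `sqlite3`, with only the key columns kept):

``` python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table if not exists witch_watch (
        witch blob not null,
        watch blob not null,
        primary key (witch, watch)
    ) without rowid;
""")

witch, watch = b"\x01" * 16, b"\x02" * 16
conn.execute("insert into witch_watch values (?, ?);", (witch, watch))
try:
    # Inserting the same (witch, watch) pair again violates the composite PK.
    conn.execute("insert into witch_watch values (?, ?);", (witch, watch))
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(duplicate_allowed)  # False
```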
And, it did, a little. I also took a more critical eye to that table as a whole, and realized I
could [tidy up the
DB](https://gitlab.com/nebkor/ww/-/commit/0e016552ab6c66d5fdd82704b6277bd857c94188?view=parallel#f1043d50a0244c34e4d056fe96659145d03b549b_34_34)
a little more, and remove one more redundant field; this helped a little bit, too.
But overall, things were still looking like ULIDs had no real inherent advantage over UUIDs in the
context of clustered indexes, given the schema I was using, when it came to disk space. For sure,
ULIDs continued to enjoy an advantage in insertion speed, but as I tightened up my code for
inserting these values for this benchmark, the marginal advantage there kept shrinking. Ultimately,
this advantage disappeared entirely as I made the schema and code more optimal, but that's getting
slightly ahead of things. I had to this point achieved almost the final form, but one more change
had to be made.
# At last, I've reached my final form

or UUID primary key, the indexes looked like, eg, this:

``` text
16-byte blob -> 16-byte blob
```
<div class="caption">left side is a user id or watch id, and right side is the id of a row in the
quests table</div>
But, in the case that there was a `rowid` primary key in the `watch_quests` table, the index entries
for, eg, `user` to "watch quest" would look like:
``` text
16-byte blob -> 8-byte number (rowid)
```
The astute among you may note that 8 is only half of 16, and if you recall that there are two
secondary indexes that look like that, the total number of secondary index bytes is 64 in the
`without rowid` case, and only 48 in the case that there is a rowid.
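Spelled out as a toy calculation (ignoring SQLite's per-record varints and page overhead, and assuming the single 16-byte blob primary key described above):

``` python
ID_BYTES = 16     # a ULID or UUID stored as a blob
ROWID_BYTES = 8   # SQLite's 64-bit integer rowid

# Two secondary indexes (user -> quest, watch -> quest) per quest row.
without_rowid = 2 * (ID_BYTES + ID_BYTES)   # entries point at the 16-byte key
with_rowid = 2 * (ID_BYTES + ROWID_BYTES)   # entries point at the 8-byte rowid

print(without_rowid, with_rowid)  # 64 48
```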
There's also a bit of cautious wisdom about performance implications of the implementation that
backs the `without rowid` tables:
> WITHOUT ROWID tables are implemented using ordinary B-Trees with content stored on both leaves and
> intermediate nodes. Storing content in intermediate nodes causes each intermediate node entry to
> take up more space on the page and thus reduces the fan-out, increasing the search cost.
The fan-out when using `without rowid` was about 20% lower than when using the rowids, and it seems
like this was slowing things down.
Thinking on it some more, there's no real reason to give this table a distinct and robust
identity for its rows; the real identity is carried by its combination of `(user, watch)` columns,
but even then, the value of distinct identity for these rows is low. If that's the case, which it
is, then why give it an explicit primary key at all? The program and the users don't need to worry
about the primary key for that table. It would also eliminate an entire index (an
automatically-generated "primary key to rowid" index), resulting in the ultimate space savings.
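That automatic index is easy to see in `sqlite_master`: a rowid table with an explicit blob primary key gets a hidden `sqlite_autoindex_…` entry to map keys to rowids, while a table with no explicit primary key gets none (again sketched in Python's stdlib `sqlite3`, with throwaway table names):

``` python
import sqlite3

conn = sqlite3.connect(":memory:")

# Explicit blob primary key on an ordinary rowid table: SQLite must maintain
# an automatic "primary key -> rowid" index behind the scenes.
conn.execute("create table with_pk (id blob primary key, v int);")

# No explicit primary key: rows are addressed by rowid alone, no extra index.
conn.execute("create table no_pk (user blob not null, watch blob not null);")

def index_names(table: str) -> list:
    rows = conn.execute(
        "select name from sqlite_master where type = 'index' and tbl_name = ?;",
        (table,),
    ).fetchall()
    return [name for (name,) in rows]

print(index_names("with_pk"))  # ['sqlite_autoindex_with_pk_1']
print(index_names("no_pk"))    # []
```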
So, [that's what I
did](https://gitlab.com/nebkor/ww/-/commit/2c7990ff09106fa2a9ec30974bbc377b44082082):
``` sql
-- table of what people want to watch
create table if not exists watch_quests (
    user blob not null,
    watch blob not null,
    priority int, -- 1-5 how much do you want to watch it
    public boolean not null default true,
    watched boolean not null default false,
    when_watched int,
    created_at int not null default (unixepoch()),
    last_updated int not null default (unixepoch()),
    foreign key (user) references users (id) on delete cascade on update no action,
    foreign key (watch) references watches (id) on delete cascade on update no action
);

create index if not exists quests_user_dex on watch_quests (user);
create index if not exists quests_watch_dex on watch_quests (watch);
```
There's the full and final schema.
In the default benchmark, with 1,000 users each saving about 100 things to watch, that schema change
dropped the total size on disk about 25% (from 17 megabytes to 13), and the percentage of the total
database consumed by the indexes of the `watch_quests` table went from 51% to 43% (that means the
indexes went from being about 8.6MB to 5.6MB, 35% less than when using a composite primary key).
Using the implicit rowid with ULIDs:
``` text
*** Indices of table WATCH_QUESTS *********************************************

Percentage of total database...................... 43.3%
Number of entries................................. 199296
Average fanout.................................... 106.00
```
It also dropped the total time to insert the 100k records from >6 seconds to just 5; I ran the
benchmark multiple times and got the same results, then tried running it with 2,000 users saving 200
movies (4x the previous benchmark), and the results held uncannily:
``` text
$ cargo run --release --bin import_users -- -d ~/movies.db -u 2000 -m 200
[...]
Added 398119 quests in 20.818506 seconds
```
<div class="caption">20k writes/second, baby</div>
Just for kicks, I tried it with UUID-based IDs, and the time and space characteristics were finally
completely indistinguishable. This pleased me; real-world perf with ULIDs would be better than with
UUIDs with a production schema that included `created_at` columns, and UUIDs would obligate columns
like that if you wanted to keep track of, you know, when things were created. Ironically, by moving
to implicit integer rowid primary keys for the `watch_quests` table, I had to make sure that there
was a `created_at` column for it. Still a win, though!
## Next steps with IDs
This project is supposed to be more than just a testbed for learning about databases and web
frameworks and sortable unique identifiers; it's supposed to be an actual thing that my wife and I
can use for ourselves and with our friends. I even made a snazzy logo!
![what to watch][logo]
The gods, it seems, have other plans.
Namely, it bothers me that ID generation is not done inside the database itself. Aside from being a
generally bad idea, this led to at least one frustrating debug session where I was inserting one ID
but reporting back another. SQLite doesn't have native support for this, but it does have good
native support for [loading shared libraries as plugins](https://www.sqlite.org/loadext.html) in
order to add functionality to it, and so my next step is to write one of those, and remove the ID
generation logic from the application.
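The extension API itself is C-level, but the shape of the idea can be previewed with an application-defined SQL function; in Python's `sqlite3`, `create_function` stands in for what a loadable extension would register (the `make_ulid` here is a hypothetical toy, not the project's actual generator):

``` python
import os
import sqlite3
import time

def make_ulid() -> bytes:
    """Toy in-database ID generator: a 48-bit millisecond timestamp followed
    by 80 random bits, packed into a 16-byte blob."""
    ms = int(time.time() * 1000)
    return ms.to_bytes(6, "big") + os.urandom(10)

conn = sqlite3.connect(":memory:")
conn.create_function("ulid", 0, make_ulid)

conn.execute("create table users (id blob primary key, name text);")
# The application no longer has to mint (or correctly report) the ID itself.
conn.execute("insert into users (id, name) values (ulid(), 'somebody');")

(new_id,) = conn.execute("select id from users;").fetchone()
print(len(new_id))  # 16
```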
Doing so would also allow me to address an underlying error in the way the application generates
them. The [ULID spec](https://github.com/ulid/spec) contains the following note about IDs generated
within the same millisecond:
> When generating a ULID within the same millisecond, we can provide some guarantees regarding sort
> order. Namely, if the same millisecond is detected, the random component is incremented by 1 bit
> in the least significant bit position (with carrying).
I don't do that[^sequential ids], because doing so requires a single ID factory, and I don't want to
have to thread that through the web app backend code. On the other hand, I *do* want to have a
single ID factory inside the database, which an extension plugin would provide.
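A single factory implementing the spec's same-millisecond rule might look like this sketch (illustrative Python; the real thing would live in the Rust extension):

``` python
import os
import time

class UlidFactory:
    """Mints 128-bit ULIDs as integers; within one millisecond it increments
    the 80-bit random component (with carry) instead of re-randomizing."""

    def __init__(self):
        self.last_ms = -1
        self.rand = 0

    def next_id(self) -> int:
        ms = int(time.time() * 1000)
        if ms == self.last_ms:
            self.rand += 1  # same millisecond: bump the random component
        else:
            self.last_ms = ms
            self.rand = int.from_bytes(os.urandom(10), "big")
        # (A production version would error if self.rand overflowed 80 bits.)
        return (self.last_ms << 80) | self.rand

factory = UlidFactory()
ids = [factory.next_id() for _ in range(1000)]
print(ids == sorted(ids))  # True: sortable even within a single millisecond
```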
Then I'll get back to the web app.
# Thanks and goodbye
OK, well, here we are, at the end of yet another three-thousand-word meeting that could have been an
email; sorry about that, and thanks for sticking with it until the end! As usual, it was hard to not
just keep adding more commentary and footnotes and explication, and I give myself a 'C+' there, at
best. At least there are only four footnotes.
Still, I read and watched a lot of different things in the course of doing this work. Obviously the
SQLite project was critical, and every time I need to consult their documentation, I appreciate it
more (aside from the software itself, of course!). Towards the end of this work, right as I was
starting to write this post, I discovered this [series of
videos](https://www.youtube.com/playlist?list=PLWENznQwkAoxww-cDEfIJ-uuPDfFwbeiJ) about SQLite, from
[Mycelial](https://github.com/mycelial), who are "a maker of local-first software development
libraries". I'm a huge fan of [local-first software](https://www.inkandswitch.com/local-first/), and
one of the reasons I initially chose SQLite was for its suitability for that paradigm. Thank you,
SQLite and Mycelial!
Good bye :)
----
[^random-users]: I did the classic "open `/usr/share/dict/words` and randomly select a couple things
to stick together" method of username generation, which results in gems like
"Hershey_motivations84" and "italicizes_creaminesss54". This is old-skool generative content.
[^not-final-form]: The original schema was defined some time ago, and it took me a while to get to
the point where I was actually writing code that used it. In the course of doing the benchmarks,
and even in the course of writing this post, I've made changes in response to things I learned
from the benchmarks and to things I realized by thinking more about it and reading more docs.
[^nothing is that slow]: At one of my previous jobs, there was a rather important internal service,
written in Python and talking to a PostgreSQL backend, that would basically completely fall over
if more than 100 or so requests per second were made to it. Its introduction to
mission-criticality had pre-dated my time there, and when it had first been deployed, the
demands upon it had been more modest. But it was now a problem, and I and a teammate put aside
some time to pluck some low-hanging fruit. A colleague on a peer team, who was that team's tech
lead and truly a beast of a programmer, said that he thought that the reason it could handle
only 100 requests/second was that "Python is slow." This shocked me; Python is not that
slow. PostgreSQL is not that slow. Nothing is that slow, especially in an enterprise environment
where you're just slinging data around via API; if it's that slow, you're doing it wrong. What
he said haunts me to this very day. Anyway, we tweaked the slowest query in the API callchain a
smidge and sped it up by a few factors; we left a ton of perf on the floor there still, but
c'est la vie.
[an_image]: /images/programmers_creed.jpg "some kinda image idunno"
[^sequential ids]: At one point, I was worried that because all the entries in my benchmark were
being created at close to 20 per millisecond, that the resulting IDs would be essentially
random, so I forced the IDs to be sequential. This wound up being a red herring.
[logo]: ./what2watch_logo.png "what to watch logo; an eyeball filled with static, and with a red iris, looking down at you"