checkpoint
This commit is contained in:
parent
568352597b
commit
7955a3aa69
1 changed files with 235 additions and 0 deletions
235
content/rnd/ulid_benchmarks/index.md
Normal file
+++
title = "A One-Part Serialized Mystery, Part 2: The Benchmarks"
slug = "one-part-serialized-mystery-part-2"
date = "2023-07-09"
[taxonomies]
tags = ["software", "rnd", "proclamation", "upscm", "rust", "macros"]
+++
# A one-part serial mystery post-hoc prequel

I [wrote recently](/rnd/one-part-serialized-mystery) about switching the types of the primary keys in
the database for an [in-progress web app](https://gitlab.com/nebkor/ww) I'm building. At that time,
I'd not yet done any benchmarking, but had reason to believe that using [sortable primary
keys](https://github.com/ulid/spec) would yield some possibly-significant gains in performance, in
both time and space. I'd also read accounts of regret that databases had not used ULIDs (instead of
[UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random))) from the
get-go, so I decided it couldn't hurt to switch to them before I had any actual data in my DB.

And that was correct: it didn't hurt performance, but it also didn't help much either. I've spent a
bunch of time now doing comparative benchmarks between ULIDs and UUIDs, and as I explain below, the
anticipated space savings did not materialize, and the speed-up is merely augmenting what was
already more than fast enough into slightly more faster than that. Of course, of course, and as
always, the real treasure was the friends we made along the way etc., etc. So come along on a brief
journey of discovery!
# Bottom Line Up Front

With SQLite and my final table schema, the size and speed differences between ULIDs and UUIDs are
negligible.

TODO MOR STUFFF

However, with my initial database layout and import code, ULIDs used about 5% less space and
took a bit under 60% as much time as UUIDs (5.7 vs. 9.8 seconds). The same relative space and
time advantages held whether or not [`without rowid`](https://www.sqlite.org/withoutrowid.html) was
specified on table creation, which was counter to expectation, though I now understand why; I'll
explain at the end.
# It's a setup

My benchmark is pretty simple: starting from an empty database, do the following things:

1. insert 10,000 randomly chosen movies (title and year of release, from between 1965 and 2023) into
   the database
1. create 1,000 random users[^random-users]
1. for each user, randomly select around 100 movies from the 10,000 available and put them on their list of
   things to watch

Only that last part is significant, and is where I got my timing information from.
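The third step above can be sketched roughly as follows. This is a minimal, std-only illustration and not the actual import code: the `XorShift64` generator and the `pick_movies` helper are hypothetical names, a real importer would more likely use the `rand` crate, and the sketch picks exactly 100 movies per user where the benchmark picks "around" 100.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// (An assumption for illustration only.)
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Roughly uniform value in 0..bound (modulo bias is ignored in this sketch).
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Pick `k` distinct movie indices out of `n` with a partial Fisher-Yates shuffle.
fn pick_movies(rng: &mut XorShift64, n: usize, k: usize) -> Vec<usize> {
    let mut pool: Vec<usize> = (0..n).collect();
    let k = k.min(n);
    for i in 0..k {
        let j = i + rng.below(n - i);
        pool.swap(i, j);
    }
    pool.truncate(k);
    pool
}

fn main() {
    let mut rng = XorShift64(0x2545_F491_4F6C_DD1D); // any non-zero seed works
    let (movies, users, per_user) = (10_000, 1_000, 100);
    let mut total = 0;
    for _ in 0..users {
        let watchlist = pick_movies(&mut rng, movies, per_user);
        // A real run would insert these rows in a single transaction per user.
        total += watchlist.len();
    }
    println!("{} watchlist rows to insert", total);
}
```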
|
||||
|
||||
The table that keeps track of what users want to watch was defined[^not-final-form] like this:

``` sql
create table if not exists witch_watch (
  id blob not null primary key,
  witch blob not null, -- "user"
  watch blob not null, -- "thing to watch"
  [...]
  foreign key (witch) references witches (id) on delete cascade on update no action,
  foreign key (watch) references watches (id) on delete cascade on update no action
);
[...]
create index if not exists ww_witch_dex on witch_watch (witch);
create index if not exists ww_watch_dex on witch_watch (watch);
```

The kinds of queries I'm trying to optimize with those indices are "what movies does a certain user
want to watch?" and "what users want to watch a certain movie?". The IDs are 16-byte blobs; an
entire row in the table is less than 100 bytes.
## A digression on SQLite and performance

I've mentioned once or twice before that I'm using [SQLite](https://www.sqlite.org/index.html) for
this project. Any time I need a database, my first reach is for SQLite:

* the database is a single file, along with a couple temp files that live alongside it, simplifying
  management
* there's no network involved between the client and the database; a connection to the database is
  a pointer to an object that lives in the same process as the host program; this means that read
  queries return data in just a [few
  *microseconds*](https://www.youtube.com/watch?v=qPfAQY_RahA)
* it scales vertically extremely well; it can handle database sizes of many terabytes
* it's one of the most widely-installed pieces of software in the world; there's at least one
  SQLite database on every smartphone, and there's a robust ecosystem of [useful
  extensions](https://litestream.io/) and other bits of complementary code freely available

And, it's extremely performant. When using the [WAL journal mode](https://www.sqlite.org/wal.html)
and the [recommended durability setting](https://www.sqlite.org/pragma.html#pragma_synchronous) for
WAL mode, along with all other production-appropriate settings, I got almost 20,000 row *writes* per
second[^nothing is that slow]. There were multiple concurrent writers, and each transaction inserted about
100 rows at a time. I had [retry
logic](https://gitlab.com/nebkor/ww/-/blob/4c44aa12b081c777c82192755ac85d1fe0f5bdca/src/bin/import_users.rs#L143-145)
in case a transaction failed due to the DB being locked by another writer, but that never happened:
each write was just too fast.
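For readers who haven't written that kind of retry logic, here is a generic sketch of the idea. It is not the code at the link above: `WriteError` and `with_retry` are hypothetical names, and the backoff schedule is an arbitrary assumption.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type standing in for whatever the database layer returns.
#[derive(Debug, PartialEq)]
enum WriteError {
    Busy,          // another writer holds the lock; worth retrying
    Other(String), // anything else; give up immediately
}

/// Run `op` up to `max_attempts` times, sleeping a little longer after each `Busy`.
fn with_retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, WriteError>,
) -> Result<T, WriteError> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            Err(WriteError::Busy) if attempt < max_attempts => {
                sleep(Duration::from_millis(5 * u64::from(attempt)));
            }
            // Success, a non-retryable error, or the final `Busy`: hand it back.
            other => return other,
        }
    }
}

fn main() {
    // Simulate a transaction that finds the database locked twice, then succeeds.
    let mut calls = 0;
    let result = with_retry(5, || {
        calls += 1;
        if calls < 3 {
            Err(WriteError::Busy)
        } else {
            Ok(100) // pretend 100 rows were inserted
        }
    });
    println!("{:?} after {} calls", result, calls);
}
```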
# Over-indexing on sortability

The reason I had hoped that ULIDs would help with keeping the sizes of the indexes down was the
possibility of using [clustered
indexes](https://www.sqlite.org/withoutrowid.html#benefits_of_without_rowid_tables). To paraphrase
that link:

> In an ordinary SQLite table, the PRIMARY KEY is really just a UNIQUE index. The key used to look
> up records on disk is the rowid. [...]any other kind of PRIMARY KEYs, including "INT PRIMARY KEY"
> are just unique indexes in an ordinary rowid table.
>
> ...
>
> Consider querying this table to find the number of occurrences of the word "xsync":
> SELECT cnt FROM wordcount WHERE word='xsync';
>
> This query first has to search the index B-Tree looking for any entry that contains the matching
> value for "word". When an entry is found in the index, the rowid is extracted and used to search
> the main table. Then the "cnt" value is read out of the main table and returned. Hence, two
> separate binary searches are required to fulfill the request.
>
> A WITHOUT ROWID table uses a different data design for the equivalent table. [in those tables],
> there is only a single B-Tree... Because there is only a single B-Tree, the text of the "word"
> column is only stored once in the database. Furthermore, querying the "cnt" value for a specific
> "word" only involves a single binary search into the main B-Tree, since the "cnt" value can be
> retrieved directly from the record found by that first search and without the need to do a second
> binary search on the rowid.
>
> Thus, in some cases, a WITHOUT ROWID table can use about half the amount of disk space and can
> operate nearly twice as fast. Of course, in a real-world schema, there will typically be secondary
> indices and/or UNIQUE constraints, and the situation is more complicated. But even then, there can
> often be space and performance advantages to using WITHOUT ROWID on tables that have non-integer
> or composite PRIMARY KEYs.

<div class="caption">sorry what was that about secondary indices i didn't quite catch that</div>

HALF the disk space *and* TWICE as fast?? Yes, sign me up, please!
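The two-searches-versus-one difference described in that quote can be mimicked with in-memory B-trees. This is only a toy model built on Rust's `BTreeMap`, not SQLite's actual on-disk format, and the struct and method names are made up for illustration.

```rust
use std::collections::BTreeMap;

/// Ordinary rowid table: rows live in a B-tree keyed by rowid, and the
/// primary key is a separate unique index mapping key -> rowid.
struct RowidTable {
    pk_index: BTreeMap<String, i64>,
    rows: BTreeMap<i64, (String, u32)>, // rowid -> (word, cnt); the word is stored twice
}

/// WITHOUT ROWID table: a single B-tree keyed directly by the primary key.
struct WithoutRowidTable {
    rows: BTreeMap<String, u32>, // word -> cnt
}

impl RowidTable {
    fn cnt(&self, word: &str) -> Option<u32> {
        let rowid = self.pk_index.get(word)?; // first search: the index B-tree
        self.rows.get(rowid).map(|(_, cnt)| *cnt) // second search: the table B-tree
    }
}

impl WithoutRowidTable {
    fn cnt(&self, word: &str) -> Option<u32> {
        self.rows.get(word).copied() // the only search
    }
}

fn main() {
    let mut with_rowid = RowidTable {
        pk_index: BTreeMap::new(),
        rows: BTreeMap::new(),
    };
    with_rowid.pk_index.insert("xsync".to_string(), 1);
    with_rowid.rows.insert(1, ("xsync".to_string(), 3));

    let mut without_rowid = WithoutRowidTable { rows: BTreeMap::new() };
    without_rowid.rows.insert("xsync".to_string(), 3);

    println!("rowid table:         {:?}", with_rowid.cnt("xsync"));
    println!("without rowid table: {:?}", without_rowid.cnt("xsync"));
}
```

Both lookups return the same count; the model only shows where the extra search and the duplicated key come from.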
## Sorry, the best I can do is all the disk space

There are some [guidelines](https://www.sqlite.org/withoutrowid.html#when_to_use_without_rowid)
about when to use `without rowid`:

> The WITHOUT ROWID optimization is likely to be helpful for tables that have non-integer or
> composite (multi-column) PRIMARY KEYs and that do not store large strings or BLOBs.
>
> [...]
>
> WITHOUT ROWID tables work best when individual rows are not too large. A good rule-of-thumb is
> that the average size of a single row in a WITHOUT ROWID table should be less than about 1/20th
> the size of a database page. That means that rows should not contain more than ... about 200 bytes
> each for 4KiB page size.

As I mentioned, each row in that table was less than 100 bytes, so comfortably within the given
heuristic. In order to test this out, all I had to do was change the table creation statement to:

``` sql
create table if not exists witch_watch (
  id blob not null primary key,
  witch blob not null, -- "user"
  watch blob not null, -- "thing to watch"
  [...]
  foreign key (witch) references witches (id) on delete cascade on update no action,
  foreign key (watch) references watches (id) on delete cascade on update no action
) without rowid;
```

So I did.

Imagine my surprise when it took nearly 20% longer to run, and the total size on disk was nearly 5%
larger. Using random UUIDs was even slower, so there's still a relative speed win for ULIDs, but it
was still an overall loss to go without the rowid. Maybe it was time to think outside the box?
## Schema pruning

I had several goals with this whole benchmarking endeavor. One, of course, was to get data on ULIDs
vs. UUIDs in terms of performance, at the very least so that I could write about it when I publicly
said I would. But another, actually more important, goal was to optimize the design of my
database and software, especially as it came to size on disk (my most-potentially-scarce computing
resource; network and CPU are not problems until you get *very* large, and you would have long ago
bottlenecked on disk size if you weren't careful).

So it was Cool and Fine to take advantage of the new capabilities that ULIDs offered if those new
capabilities resulted in better resource use. Every table in my original, UUID-based schema had had
a `created_at` column, stored as a 64-bit signed offset from the [UNIX
epoch](https://en.wikipedia.org/wiki/Unix_time). Because ULIDs encode their creation time, I could
remove that column from every table that used a ULID as its primary key. Doing so dropped the
overall DB size by 5-10% compared to UUID-based tables with a `created_at` column.
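To make "ULIDs encode their creation time" concrete: per the ULID spec, the first 48 bits of the 128-bit ID are a big-endian count of milliseconds since the UNIX epoch, and the remaining 80 bits are random. The sketch below assumes the ID is stored as the raw 16 bytes; `make_id` and `created_at_ms` are hypothetical helpers rather than functions from any particular crate, and the random bytes are passed in to keep the example std-only.

```rust
/// Assemble a 16-byte ULID-style ID: a 48-bit big-endian millisecond timestamp
/// followed by 80 bits of randomness (supplied by the caller in this sketch).
fn make_id(unix_ms: u64, random: [u8; 10]) -> [u8; 16] {
    let mut id = [0u8; 16];
    id[..6].copy_from_slice(&unix_ms.to_be_bytes()[2..]); // low 48 bits of the timestamp
    id[6..].copy_from_slice(&random);
    id
}

/// Recover the creation time, in milliseconds since the UNIX epoch, from an ID.
fn created_at_ms(id: &[u8; 16]) -> u64 {
    let mut buf = [0u8; 8];
    buf[2..].copy_from_slice(&id[..6]);
    u64::from_be_bytes(buf)
}

fn main() {
    let earlier = make_id(1_688_860_800_000, [0xff; 10]);
    let later = make_id(1_688_860_800_001, [0x00; 10]);

    // The timestamp survives the round trip, so no separate `created_at` column is needed.
    println!("created at {} ms", created_at_ms(&earlier));

    // Byte-wise comparison orders IDs by creation time first, whatever the random part is.
    println!("earlier < later: {}", earlier < later);
}
```

Because the timestamp occupies the most significant bytes, comparing the blobs byte by byte sorts rows by creation time, which is the "sortable" property the rest of the post leans on.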
|
||||
|
||||
But I also realized that for the `watch_quests` table, no explicit
# At last, I've reached my final form

In the course of writing this post, I had a minor epiphany, which is that the reason for the
regressed performance when using `without rowid` was that the secondary indices needed to point to
the entries in the table, using the primary key of the table as the target. So when there was a ULID
or UUID primary key, the indexes looked like, e.g., this:

``` text
16-byte blob -> 16-byte blob
```
<div class="caption">left side is, e.g., user id, and right side is id of a row in the quests table</div>
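A back-of-the-envelope way to see why that hurts: in a `without rowid` table each secondary-index entry carries the indexed column plus the full primary key, while in a rowid table it carries the indexed column plus an integer rowid. The sketch below only counts those payload bytes. It ignores record headers and page overhead, and the rowid widths passed in are assumptions (SQLite stores integers in 8 bytes or fewer), so treat the numbers as rough.

```rust
const ID_BYTES: usize = 16; // a ULID or UUID stored as a blob

/// Payload bytes per secondary-index entry when the table is WITHOUT ROWID:
/// the indexed column plus the full primary key it points at.
fn entry_without_rowid() -> usize {
    ID_BYTES + ID_BYTES
}

/// Payload bytes per entry when the index points at an integer rowid instead.
/// `rowid_bytes` is an assumption supplied by the caller (at most 8).
fn entry_with_rowid(rowid_bytes: usize) -> usize {
    ID_BYTES + rowid_bytes
}

fn main() {
    println!("without rowid:          {} bytes/entry", entry_without_rowid());
    println!("rowid (3-byte integer): {} bytes/entry", entry_with_rowid(3));
    println!("rowid (8-byte integer): {} bytes/entry", entry_with_rowid(8));
}
```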
|
||||
|
||||
|
||||
|
||||
Using an implicit rowid with ULIDs:

``` text
*** Indices of table WATCH_QUESTS *********************************************

Percentage of total database...................... 43.3%
Number of entries................................. 199296
Average fanout.................................... 106.00
```

``` text
$ cargo run --release --bin import_users -- -d ~/movies.db -u 2000 -m 200
[...]
Added 398119 quests in 20.818506 seconds
```
<div class="caption">20k writes/second, baby</div>
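The arithmetic behind that caption, using only the numbers from the run above (the function name is just for illustration):

```rust
/// Average insert rate over the whole run.
fn rows_per_second(rows: u64, seconds: f64) -> f64 {
    rows as f64 / seconds
}

fn main() {
    let rows = 398_119; // "Added 398119 quests"
    let rate = rows_per_second(rows, 20.818506);
    println!("{:.0} rows/second", rate);

    // With `-u 2000`, that is also roughly 200 rows per user.
    println!("{:.1} rows/user", rows as f64 / 2000.0);
}
```

That works out to a little over 19,000 rows per second, which is where "20k writes/second" comes from.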
|
||||
|
||||
Size on disk is about 76% of the previous size (13M vs. 17M).
|
||||
|
||||
|
||||
|
||||
----
|
||||
|
||||
[^random-users]: I did the classic "open `/usr/share/dict/words` and randomly select a couple things
    to stick together" method of username generation, which results in gems like
    "Hershey_motivations84" and "italicizes_creaminesss54". This is old-skool generative AI.

[^not-final-form]: The original schema was defined some time ago, and it took me a while to get to
    the point where I was actually writing code that used it. In the course of doing the benchmarks,
    and even in the course of writing this post, I've made changes in response to things I learned
    from the benchmarks and to things I realized by thinking more about it and reading more docs.
|
||||
|
||||
[^nothing is that slow]: old job python 100 reqs/sec fall down
|
||||
|
||||
[an_image]: /images/programmers_creed.jpg "some kinda image idunno"