checkpoint

2023-07-14 18:10:48 -07:00 · 2023-07-14 18:10:48 -07:00 · 7955a3aa69
commit 7955a3aa69
parent 568352597b
1 changed files with 235 additions and 0 deletions
--- a/content/rnd/ulid_benchmarks/index.md
+++ b/content/rnd/ulid_benchmarks/index.md
@@ -0,0 +1,235 @@
+++
+title = "A One-Part Serialized Mystery, Part 2: The Benchmarks"
+slug = "one-part-serialized-mystery-part-2"
+date = "2023-07-09"
+[taxonomies]
+tags = ["software", "rnd", "proclamation", "upscm", "rust", "macros"]
+++
+
+# A one-part serial mystery post-hoc prequel
+
+I [wrote recently](/rnd/one-part-serialized-mystery) about switching the types of the primary keys in
+the database for an [in-progress web app](https://gitlab.com/nebkor/ww) I'm building. At that time,
+I'd not yet done any benchmarking, but had reason to believe that using [sortable primary
+keys](https://github.com/ulid/spec) would yield some possibly-significant gains in performance, in
+both time and space. I'd also read accounts of regret that databases had not used ULIDs (instead of
+[UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random))) from the
+get-go, so I decided it couldn't hurt to switch to them before I had any actual data in my DB.
+
+And that was correct: it didn't hurt performance, but it also didn't help much either. I've spent a
+bunch of time now doing comparative benchmarks between ULIDs and UUIDs, and as I explain below, the
+anticipated space savings did not materialize, and the speed-up merely turns what was already more
+than fast enough into something slightly faster still. Of course, of course, and as
+always, the real treasure was the friends we made along the way etc., etc. So come along on a brief
+journey of discovery!
+
+# Bottom Line Up Front
+
+With SQLite and my final table schema, the size and speed differences are negligible.
+
+TODO MOR STUFFF
+
+
+However, with my initial database layout and import code, ULIDs used about 5% less space and took
+only about 60% as much time as UUIDs (5.7 vs. 9.8 seconds). The same relative results held whether
+or not [`without rowid`](https://www.sqlite.org/withoutrowid.html) was specified on table creation,
+which was counter to expectation, though I now understand why; I'll explain at the end.
+
+# It's a setup
+
+My benchmark is pretty simple: starting from an empty database, do the following things:
+
+ 1. insert 10,000 randomly chosen movies (title and year of release, from between 1965 and 2023) into
+   the database
+ 1. create 1,000 random users[^random-users]
+ 1. for each user, randomly select around 100 movies from the 10,000 available and put them on their list of
+   things to watch
+
+Only that last part is significant, and is where I got my timing information from.
+
+The table that keeps track of what users want to watch was defined[^not-final-form] like this:
+
+``` sql
+create table if not exists witch_watch (
+  id blob not null primary key,
+  witch blob not null, -- "user"
+  watch blob not null, -- "thing to watch"
+  [...]
+  foreign key (witch) references witches (id) on delete cascade on update no action,
+  foreign key (watch) references watches (id) on delete cascade on update no action
+);
+[...]
+create index if not exists ww_witch_dex on witch_watch (witch);
+create index if not exists ww_watch_dex on witch_watch (watch);
+```
+
+The kinds of queries I'm trying to optimize with those indices are "what movies does a certain
+user want to watch?" and "what users want to watch a certain movie?". The IDs are 16-byte blobs; an
+entire row in the table is less than 100 bytes.
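
One way to check that those indices actually serve the two queries is to ask SQLite for its query plan. This is a minimal sketch in Python (standard-library `sqlite3`, rather than the project's Rust), assuming stripped-down `witches` and `watches` tables that hold only an `id` column; the elided `[...]` columns are simply left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumption: minimal stand-ins for the referenced tables.
    create table witches (id blob not null primary key);
    create table watches (id blob not null primary key);
    create table witch_watch (
      id blob not null primary key,
      witch blob not null, -- "user"
      watch blob not null, -- "thing to watch"
      foreign key (witch) references witches (id) on delete cascade on update no action,
      foreign key (watch) references watches (id) on delete cascade on update no action
    );
    create index ww_witch_dex on witch_watch (witch);
    create index ww_watch_dex on witch_watch (watch);
    """
)


def plan(sql: str) -> str:
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("explain query plan " + sql, (bytes(16),)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


# "What movies does a certain user want to watch?"
by_user = plan("select watch from witch_watch where witch = ?")
# "What users want to watch a certain movie?"
by_movie = plan("select witch from witch_watch where watch = ?")
print(by_user)
print(by_movie)
```

The exact wording of the plan differs between SQLite versions, but each line should name the index that matches the column in the `where` clause.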
+
+## A digression on SQLite and performance
+
+I've mentioned once or twice before that I'm using [SQLite](https://www.sqlite.org/index.html) for
+this project. Any time I need a database, my first reach is for SQLite:
+
+ * the database is a single file, along with a couple temp files that live alongside it, simplifying
+   management
+ * there's no network involved between the client and the database; a connection to the database is
+   a pointer to an object that lives in the same process as the host program; this means that read
+   queries return data in just a [few
+   *microseconds*](https://www.youtube.com/watch?v=qPfAQY_RahA)
+ * it scales vertically extremely well; it can handle database sizes of many terabytes
+ * it's one of the most widely-installed pieces of software in the world; there's at least one
+   SQLite database on every smartphone, and there's a robust ecosystem of [useful
+   extensions](https://litestream.io/) and other bits of complementary code freely available
+
+And, it's extremely performant. When using the [WAL journal mode](https://www.sqlite.org/wal.html)
+and the [recommended durability setting](https://www.sqlite.org/pragma.html#pragma_synchronous) for
+WAL mode, along with all other production-appropriate settings, I got almost 20,000 row *writes* per
+second[^nothing is that slow]. There were multiple concurrent writers, and each transaction inserted
+about 100 rows at a time. I had [retry
+logic](https://gitlab.com/nebkor/ww/-/blob/4c44aa12b081c777c82192755ac85d1fe0f5bdca/src/bin/import_users.rs#L143-145)
+in case a transaction failed due to the DB being locked by another writer, but that never happened:
+each write was just too fast.
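
To make that setup concrete, here is a minimal sketch in Python of a batched write with a retry on lock. It is not the linked Rust import code. It assumes `synchronous = normal` is the durability setting meant above, and the `quests` table, the `open_db` and `insert_batch` helpers, and the retry count are all hypothetical names and values chosen for illustration.

```python
import os
import sqlite3
import tempfile
import time


def open_db(path: str) -> sqlite3.Connection:
    # isolation_level=None: we issue begin/commit ourselves.
    conn = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    conn.execute("pragma journal_mode = wal")
    # Assumption: NORMAL is the durability setting the post refers to.
    conn.execute("pragma synchronous = normal")
    return conn


def insert_batch(conn: sqlite3.Connection, rows, retries: int = 5) -> int:
    """Insert all rows in one transaction; return how many retries were needed."""
    for attempt in range(retries):
        try:
            conn.execute("begin immediate")
            conn.executemany("insert into quests (witch, watch) values (?, ?)", rows)
            conn.execute("commit")
            return attempt
        except sqlite3.OperationalError as err:
            if conn.in_transaction:
                conn.execute("rollback")
            # Only retry when another writer holds the lock.
            if "locked" not in str(err) and "busy" not in str(err):
                raise
            time.sleep(0.01 * (attempt + 1))
    raise RuntimeError(f"gave up after {retries} attempts")


path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = open_db(path)
conn.execute("create table quests (witch blob not null, watch blob not null)")

# One transaction of about 100 rows, like each write in the benchmark.
rows = [(os.urandom(16), os.urandom(16)) for _ in range(100)]
attempts_needed = insert_batch(conn, rows)

journal_mode = conn.execute("pragma journal_mode").fetchone()[0]
row_count = conn.execute("select count(*) from quests").fetchone()[0]
print(journal_mode, row_count, attempts_needed)
```

With a single writer, as here, the retry branch never runs; it only matters once several connections compete for the write lock.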
+
+# Over-indexing on sortability
+
+The reason I had hoped that ULIDs would help with keeping the sizes of the indexes down was the
+possibility of using [clustered
+indexes](https://www.sqlite.org/withoutrowid.html#benefits_of_without_rowid_tables). To quote
+selectively from that link:
+
+> In an ordinary SQLite table, the PRIMARY KEY is really just a UNIQUE index. The key used to look
+> up records on disk is the rowid. [...]any other kind of PRIMARY KEYs, including "INT PRIMARY KEY"
+> are just unique indexes in an ordinary rowid table.
+>
+> ...
+>
+> Consider querying this table to find the number of occurrences of the word "xsync".:
+>
+>     SELECT cnt FROM wordcount WHERE word='xsync';
+>
+> This query first has to search the index B-Tree looking for any entry that contains the matching
+> value for "word". When an entry is found in the index, the rowid is extracted and used to search
+> the main table. Then the "cnt" value is read out of the main table and returned. Hence, two
+> separate binary searches are required to fulfill the request.
+>
+> A WITHOUT ROWID table uses a different data design for the equivalent table. [in those tables],
+> there is only a single B-Tree...  Because there is only a single B-Tree, the text of the "word"
+> column is only stored once in the database. Furthermore, querying the "cnt" value for a specific
+> "word" only involves a single binary search into the main B-Tree, since the "cnt" value can be
+> retrieved directly from the record found by that first search and without the need to do a second
+> binary search on the rowid.
+>
+> Thus, in some cases, a WITHOUT ROWID table can use about half the amount of disk space and can
+> operate nearly twice as fast. Of course, in a real-world schema, there will typically be secondary
+> indices and/or UNIQUE constraints, and the situation is more complicated. But even then, there can
+> often be space and performance advantages to using WITHOUT ROWID on tables that have non-integer
+> or composite PRIMARY KEYs.
+
+<div class="caption">sorry what was that about secondary indices i didn't quite catch that</div>
+
+HALF the disk space *and* TWICE as fast?? Yes, sign me up, please!
+
+## Sorry, the best I can do is all the disk space
+
+There are some [guidelines](https://www.sqlite.org/withoutrowid.html#when_to_use_without_rowid)
+about when to use `without rowid`:
+
+> The WITHOUT ROWID optimization is likely to be helpful for tables that have non-integer or
+> composite (multi-column) PRIMARY KEYs and that do not store large strings or BLOBs.
+>
+> [...]
+>
+> WITHOUT ROWID tables work best when individual rows are not too large. A good rule-of-thumb is
+> that the average size of a single row in a WITHOUT ROWID table should be less than about 1/20th
+> the size of a database page. That means that rows should not contain more than ... about 200 bytes
+> each for 4KiB page size.
+
+As I mentioned, each row in that table was less than 100 bytes, so comfortably within the given
+heuristic. In order to test this out, all I had to do was change the table creation statement to:
+
+``` sql
+create table if not exists witch_watch (
+  id blob not null primary key,
+  witch blob not null, -- "user"
+  watch blob not null, -- "thing to watch"
+  [...]
+  foreign key (witch) references witches (id) on delete cascade on update no action,
+  foreign key (watch) references watches (id) on delete cascade on update no action
+) without rowid;
+```
+
+So I did.
+
+Imagine my surprise when it took nearly 20% longer to run, and the total size on disk was nearly 5%
+larger. Using random UUIDs was even slower, so there's still a relative speed win for ULIDs, but it
+was still an overall loss to go without the rowid. Maybe it was time to think outside the box?
+
+## Schema pruning
+
+I had several goals with this whole benchmarking endeavor. One, of course, was to get data on ULIDs
+vs. UUIDs in terms of performance, at the very least so that I could write about it when I publicly
+said I would. But another, and actually-more-important goal, was to optimize the design of my
+database and software, especially as it came to size on disk (my most-potentially-scarce computing
+resource; network and CPU are not problems until you get *very* large, and you would have long ago
+bottlenecked on disk size if you weren't careful).
+
+So it was Cool and Fine to take advantage of the new capabilities that ULIDs offered if those new
+capabilities resulted in better resource use. Every table in my original, UUID-based schema had had
+a `created_at` column, stored as a 64-bit signed offset from the [UNIX
+epoch](https://en.wikipedia.org/wiki/Unix_time). Because ULIDs encode their creation time, I could
+remove that column from every table that used ULIDs as their primary key. Doing so dropped the
+overall DB size by 5-10% compared to UUID-based tables with a `created_at` column.
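
For illustration, here is a minimal sketch in Python of how a creation time can be read back out of a ULID stored as a 16-byte blob. Per the ULID spec, the first 48 bits are a big-endian count of milliseconds since the UNIX epoch and the remaining 80 bits are random. The `new_ulid` and `created_at` helpers are hypothetical names for this example, not functions from the project or from any ULID library.

```python
import os
import time
from datetime import datetime, timezone


def new_ulid(now_ms=None) -> bytes:
    """Build a 16-byte ULID: 48-bit millisecond timestamp, then 80 random bits."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    return now_ms.to_bytes(6, "big") + os.urandom(10)


def created_at(ulid: bytes) -> datetime:
    """Recover the creation time from the first 6 bytes of a ULID."""
    ms = int.from_bytes(ulid[:6], "big")
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


moment = datetime(2023, 7, 15, 1, 10, 48, tzinfo=timezone.utc)
moment_ms = int(moment.timestamp() * 1000)

older = new_ulid(moment_ms)
newer = new_ulid(moment_ms + 1000)

print(created_at(older).isoformat())
# Byte-wise comparison follows creation order, which is what makes ULIDs sortable.
print(older < newer)
```

Because the timestamp sits in the most significant bytes, comparing two blobs byte by byte also compares their creation times, so a separate `created_at` column carries no extra information.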
+
+But I also realized that for the `watch_quests` table, no explicit
+
+# At last, I've reached my final form
+
+In the course of writing this post, I had a minor epiphany, which is that the reason for the
+regressed performance when using `without rowid` was that the secondary indices needed to point to
+the entries in the table, using the 16-byte primary key of the table as the target rather than a
+compact integer rowid. So when there was a ULID or UUID primary key, the indexes looked like, e.g.,
+this:
+
+``` text
+16-byte blob -> 16-byte blob
+```
+<div class="caption">left side is, eg, user id, and right side is id of a row in the quests table</div>
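
The effect can be measured directly. This is a minimal sketch in Python, assuming a simplified `quests` table with just the three ID columns (a hypothetical stand-in for the real schema): it loads the same rows into a rowid table and a `without rowid` table, each with two secondary indices, and reports the resulting file sizes. In the rowid table each secondary index entry maps a 16-byte blob to an integer rowid; in the `without rowid` table it maps a 16-byte blob to the 16-byte primary key.

```python
import os
import sqlite3
import tempfile

# Assumption: a simplified table with only the three ID columns.
SCHEMA = """
create table quests (
  id blob not null primary key,
  witch blob not null,
  watch blob not null
){suffix};
create index quests_witch_dex on quests (witch);
create index quests_watch_dex on quests (watch);
"""


def build(suffix: str, rows) -> int:
    """Create the table with the given suffix, load rows, return file size in bytes."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA.format(suffix=suffix))
    with conn:  # one transaction for the whole load
        conn.executemany("insert into quests values (?, ?, ?)", rows)
    conn.close()
    return os.path.getsize(path)


users = [os.urandom(16) for _ in range(100)]
movies = [os.urandom(16) for _ in range(1000)]
# Random 16-byte primary keys, i.e. the UUID-like worst case for insertion order.
rows = [(os.urandom(16), user, movie) for user in users for movie in movies[:50]]

with_rowid = build("", rows)
without_rowid = build(" without rowid", rows)
print(f"rowid table: {with_rowid} bytes; without rowid: {without_rowid} bytes")
```

The exact sizes depend on the SQLite version, page size, and insertion order, so no expected output is shown; the point is only that the two layouts can be compared on identical data.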
+
+
+
+Using the implicit rowid with ULIDs:
+
+``` text
+*** Indices of table WATCH_QUESTS *********************************************
+
+Percentage of total database......................  43.3%
+Number of entries................................. 199296
+Average fanout.................................... 106.00
+```
+
+``` text
+$ cargo run --release --bin import_users -- -d ~/movies.db -u 2000 -m 200
+[...]
+Added 398119 quests in 20.818506 seconds
+```
+<div class="caption">20k writes/second, baby</div>
+
+Size on disk is about 75% of the previous size (13M vs. 17M).
+
+
+
+----
+
+[^random-users]: I did the classic "open `/usr/share/dict/words` and randomly select a couple things
+    to stick together" method of username generation, which results in gems like
+    "Hershey_motivations84" and "italicizes_creaminesss54". This is old-skool generative AI.
+
+[^not-final-form]: The original schema was defined some time ago, and it took me a while to get to
+    the point where I was actually writing code that used it. In the course of doing the benchmarks,
+    and even in the course of writing this post, I've made changes in response to things I learned
+    from the benchmarks and to things I realized by thinking more about it and reading more docs.
+
+[^nothing is that slow]: old job python 100 reqs/sec fall down
+
+[an_image]: /images/programmers_creed.jpg "some kinda image idunno"