update benchmark post with actual final schema

Joe Ardent 2023-07-21 16:02:03 -07:00
parent b8b81131a9
commit 62e0ccaaa8
2 changed files with 15 additions and 2 deletions

@@ -2,6 +2,7 @@
title = "A One-Part Serialized Mystery, Part 2: The Benchmarks"
slug = "one-part-serialized-mystery-part-2"
date = "2023-07-15"
updated = "2023-07-21"
[taxonomies]
tags = ["software", "rnd", "proclamation", "upscm", "rust", "sqlite"]
+++
@@ -289,7 +290,6 @@ dropped the total size on disk about 25% (from 17 megabytes to 13), and the perc
database consumed by the indexes of the `watch_quests` table went from 51% to 43% (that means the
indexes went from being about 8.6MB to 5.6MB, 35% less than when using a composite primary key).
``` text
*** Indices of table WATCH_QUESTS *********************************************
@@ -315,6 +315,19 @@ like that if you wanted to keep track of, you know, when things were created. Ir
to implicit integer rowid primary keys for the `watch_quests` table, I had to make sure that there
was a `created_at` column for it. Still a win, though!
### *UPDATE (2023-07-21)!*
Something I realized with the "final" schema is that you could have duplicate `(user, watch)` rows,
since the only unique field was the `rowid`. I didn't want this. So, rather than create a `unique
index on watch_quests (user, watch)`, I [just
added](https://gitlab.com/nebkor/ww/-/commit/c685dc1a6b08d9ff6bafc72582acb539651a350c) a `primary
key (user, watch)`.
If that looks familiar, good eye! Doing this brings the disk usage back up to 17MB in the baseline
benchmark, but the insert rate is still the same. In the grand scheme of things, this is still not a
lot of data, so I'll take it anyway.
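
To make the effect of that change concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than the project's actual Rust code. The table below is a simplified, hypothetical stand-in for `watch_quests`: the column types, the `created_at` default, and the `try_insert` helper are assumptions made for illustration, not the real schema. It shows a second insert of the same `(user, watch)` pair being rejected, and lists the index that SQLite creates automatically to enforce a composite primary key on an ordinary rowid table, which is a plausible explanation for the disk usage going back up.

```python
import sqlite3

# Hypothetical, simplified stand-in for the real `watch_quests` table.
# Column types and the `created_at` default are assumptions for illustration.
SCHEMA = """
create table watch_quests (
  user blob not null,
  watch blob not null,
  created_at text not null default current_timestamp,
  primary key (user, watch)
);
"""


def try_insert(conn, user, watch):
    """Return True if the row was inserted, False if it was a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "insert into watch_quests (user, watch) values (?, ?)",
                (user, watch),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when the (user, watch) uniqueness constraint is violated.
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

results = [
    try_insert(conn, b"user-1", b"watch-1"),  # new pair
    try_insert(conn, b"user-1", b"watch-1"),  # same pair again
    try_insert(conn, b"user-1", b"watch-2"),  # same user, different watch
]
row_count = conn.execute("select count(*) from watch_quests").fetchone()[0]

# Each row of `pragma index_list` is (seq, name, unique, origin, partial).
indexes = conn.execute("pragma index_list('watch_quests')").fetchall()
index_names = [row[1] for row in indexes]

print(results, row_count, index_names)
```

The duplicate insert fails with an `IntegrityError` instead of quietly adding a second row, and `index_names` holds the automatically created index that backs the primary key.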
## Next steps with IDs
This project is supposed to be more than just a testbed for learning about databases and web

@@ -1 +1 @@
Subproject commit eb02b7d3c18a397fe5baa394b50fe2c199208dbe
Subproject commit 54e1c70c93ad5fe261a00ddf697856c621d8fc87