update git urls
parent aeb93c29af
commit c3499eadf1
4 changed files with 36 additions and 36 deletions
@@ -2,7 +2,7 @@
 title = "A One-Part Serialized Mystery"
 slug = "one-part-serialized-mystery"
 date = "2023-06-29"
-updated = "2023-07-29"
+updated = "2025-07-21"
 [taxonomies]
 tags = ["software", "rnd", "proclamation", "upscm", "rust", "ulid", "sqlite"]
 +++
@@ -12,8 +12,8 @@ tags = ["software", "rnd", "proclamation", "upscm", "rust", "ulid", "sqlite"]
 I recently spent a couple days moving from [one type of universally unique
 identifier](https://commons.apache.org/sandbox/commons-id/uuid.html) to a [different
 one](https://github.com/ulid/spec), for an in-progress [database-backed
-web-app](https://gitlab.com/nebkor/ww). The [initial
-work](https://gitlab.com/nebkor/ww/-/commit/be96100237da56313a583be6da3dc27a4371e29d#f69082f7433f159d627269b207abdaf2ad52b24c)
+web-app](https://git.kittencollective.com/nebkor/what2watch). The [initial
+work](https://git.kittencollective.com/nebkor/what2watch/src/commit/be96100237da56313a583be6da3dc27a4371e29d/src/ids.rs)
 didn't take very long, but debugging the [serialization and
 deserialization](https://en.wikipedia.org/wiki/Serialization) of the new IDs took another day and a
 half, and in the end, the alleged mystery of why it wasn't working was a red herring due to my own
@@ -65,7 +65,7 @@ representation and efficiency!
 And at first, that's what I did. The [external library](https://docs.rs/sqlx/latest/sqlx/) I'm using
 to interface with my database automatically writes UUIDs as a sequence of sixteen bytes, if you
 specified the type in the database[^sqlite-dataclasses] as "[blob](https://www.sqlite.org/datatype3.html)", which [I
-did](https://gitlab.com/nebkor/ww/-/commit/65a32f1f20df6c572580d796e1044bce807fd3b6#f1043d50a0244c34e4d056fe96659145d03b549b_0_5).
+did](https://git.kittencollective.com/nebkor/what2watch/src/commit/65a32f1f20df6c572580d796e1044bce807fd3b6/migrations/20230426221940_init.up.sql).
 
 But then I saw a [blog post](https://shopify.engineering/building-resilient-payment-systems) where
 the following tidbit was mentioned:
@@ -403,7 +403,7 @@ method in my deserialization code.
 <div class = "caption">fine, fine, i see the light</div>
 
 You can see that
-[here](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L194-216)
+[here](https://git.kittencollective.com/nebkor/what2watch/src/commit/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L194-L215)
 if you'd like, but I'll actually come back to it in a second. The important part was that my logins
 were working again; time to party!
 
@@ -415,13 +415,13 @@ day, I dove back into it.
 
 
 All my serialization code was calling a method called
-[`bytes()`](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L18),
+[`bytes()`](https://git.kittencollective.com/nebkor/what2watch/src/commit/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L18),
 which simply called another method that would return an array of 16 bytes, in big-endian order, so
 it could go into the database and be sortable, as discussed.
 
 But all[^actually_not_all] my *deserialization* code was constructing the IDs as [though the bytes
 were
-*little*-endian](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L212). Which
+*little*-endian](https://git.kittencollective.com/nebkor/what2watch/src/commit/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L212). Which
 lead me to ask:
 
 what the fuck?
@@ -437,9 +437,9 @@ then were "backwards" coming out and "had to be" cast using little-endian constr
 What had actually happened is that as long as there was agreement about what order to use for reconstructing the
 ID from the bytes, it didn't matter if it was big or little-endian, it just had to be the same on
 both the
-[SQLx](https://gitlab.com/nebkor/ww/-/commit/84d70336d39293294fd47b4cf115c70091552c11#ce34dd57be10530addc52a3273548f2b8d3b8a9b_106_105)
+[SQLx](https://git.kittencollective.com/nebkor/what2watch/src/commit/4211ead59edc008e65aca2ed69e9f87de26e37b2/src/db_id.rs#L101-L107)
 side and on the
-[Serde](https://gitlab.com/nebkor/ww/-/commit/84d70336d39293294fd47b4cf115c70091552c11#ce34dd57be10530addc52a3273548f2b8d3b8a9b_210_209)
+[Serde](https://git.kittencollective.com/nebkor/what2watch/src/commit/4211ead59edc008e65aca2ed69e9f87de26e37b2/src/db_id.rs#L209)
 side. This is also irrespective of the order they were written out in, but again, the two sides must
 agree on the convention used. Inside the Serde method, I had added some debug printing of the bytes
 it was getting, and they were in little-endian order. What I had not realized is that that was
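The endianness point made in the hunk above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the code from `db_id.rs`: the `Id` wrapper and its `from_be` and `from_le` helpers are hypothetical names, and only the standard library's `u128` byte conversions are used. It shows that a round trip works whenever writer and reader agree on byte order, and that only the big-endian form sorts byte-wise the same way the numbers do.

```rust
/// Hypothetical stand-in for the post's ID type: a plain 128-bit value.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Id(u128);

impl Id {
    /// Serialize as 16 big-endian bytes, so byte-wise order matches numeric order.
    fn bytes(self) -> [u8; 16] {
        self.0.to_be_bytes()
    }

    /// The matching inverse of `bytes()`.
    fn from_be(bytes: [u8; 16]) -> Self {
        Id(u128::from_be_bytes(bytes))
    }

    /// A mismatched inverse: interprets the same bytes as little-endian.
    fn from_le(bytes: [u8; 16]) -> Self {
        Id(u128::from_le_bytes(bytes))
    }
}

fn main() {
    let id = Id(0x0123_4567_89ab_cdef_0011_2233_4455_6677);

    // Big-endian out, big-endian in: the ID survives the round trip.
    assert_eq!(Id::from_be(id.bytes()), id);

    // Big-endian out, little-endian in: the bytes come back reversed.
    assert_ne!(Id::from_le(id.bytes()), id);

    // Little-endian on both sides also round-trips, since both sides agree.
    assert_eq!(Id::from_le(id.0.to_le_bytes()), id);

    // Only the big-endian bytes compare the same way the numbers do.
    let (a, b) = (Id(1), Id(256));
    assert!(a < b && a.bytes() < b.bytes());
    assert!(a.0.to_le_bytes() > b.0.to_le_bytes());

    println!("round trips agree when both sides use the same byte order");
}
```

The last two assertions are why big-endian storage matters for sortable keys: a little-endian blob would still round-trip, but the database would order it differently from the IDs themselves.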
@@ -472,7 +472,7 @@ issues. Collaboration is a great technique for navigating these situations, and
 focus a bit more on enabling that[^solo-yolo-dev].
 
 In the course of debugging this issue, I tried to get more insight via
-[testing](https://gitlab.com/nebkor/ww/-/commit/656e6dceedf0d86e2805e000c9821e931958a920#ce34dd57be10530addc52a3273548f2b8d3b8a9b_143_251),
+[testing](https://git.kittencollective.com/nebkor/what2watch/src/commit/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L231-L251)
 and though that helped a little, it was not nearly enough; the problem was that I misunderstood how
 something worked, not that I had mistakenly implemented something I was comfortable with. Tests
 aren't a substitute for understanding!
@@ -521,10 +521,10 @@ is no longer an exercise in eye-glaze-control. Maybe this has helped you with th
 [^actually_not_all]: Upon further review, I discovered that the only methods that were constructing
 with little-endian order were the SQLx `decode()` method, and the Serde `visit_seq()` method,
 which were also the only ones that were being called at all. The
-[`visit_bytes()`](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L152)
+[`visit_bytes()`](https://git.kittencollective.com/nebkor/what2watch/src/commit/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L152)
 and `visit_byte_buf()` methods, that I had thought were so important, were correctly treating
 the bytes as big-endian, but were simply never actually used. I fixed [in the next
-commit](https://gitlab.com/nebkor/ww/-/commit/84d70336d39293294fd47b4cf115c70091552c11#ce34dd57be10530addc52a3273548f2b8d3b8a9b)
+commit](https://git.kittencollective.com/nebkor/what2watch/commit/84d70336d39293294fd47b4cf115c70091552c11#diff-ce34dd57be10530addc52a3273548f2b8d3b8a9b)
 
 [^solo-yolo-dev]: I've described my current practices as "solo-yolo", which has its plusses and
 minuses, as you may imagine.
@@ -2,7 +2,7 @@
 title = "A One-Part Serialized Mystery, Part 2: The Benchmarks"
 slug = "one-part-serialized-mystery-part-2"
 date = "2023-07-15"
-updated = "2023-07-29"
+updated = "2025-07-21"
 [taxonomies]
 tags = ["software", "rnd", "proclamation", "upscm", "rust", "sqlite", "ulid"]
 +++
@@ -10,7 +10,7 @@ tags = ["software", "rnd", "proclamation", "upscm", "rust", "sqlite", "ulid"]
 # A one-part serial mystery post-hoc prequel
 
 I [wrote recently](/rnd/one-part-serialized-mystery) about switching the types of the primary keys in
-the database for an [in-progress web app](https://gitlab.com/nebkor/ww) I'm building. At that time,
+the database for an [in-progress web app](https://git.kittencollective.com/nebkor/what2watch) I'm building. At that time,
 I'd not yet done any benchmarking, but had reason to believe that using [sortable primary
 keys](https://github.com/ulid/spec) would yield some possibly-significant gains in performance, in
 both time and space. I'd also read accounts of regret that databases had not used ULIDs (instead of
@@ -47,9 +47,7 @@ My benchmark is pretty simple: starting from an empty database, do the following
 1. for each user, randomly select around 100 movies from the 10,000 available and put them on their list of
 things to watch
 
-Only that last part is significant, and is where I got my [timing
-information](https://gitlab.com/nebkor/ww/-/blob/897fd993ceaf9c77433d44f8d68009eb466ac3aa/src/bin/import_users.rs#L47-58)
-from.
+Only that last part is significant; the first two steps are basically instantaneous.
 
 The table that keeps track of what users want to watch was defined[^not-final-form] like this:
 
@@ -92,7 +90,7 @@ and the [recommended durability setting](https://www.sqlite.org/pragma.html#prag
 WAL mode, along with all other production-appropriate settings, I got almost 20,000 *writes* per
 second[^nothing is that slow]. There were multiple concurrent writers, and each write was a
 transaction that inserted about 100 rows at a time. I had [retry
-logic](https://gitlab.com/nebkor/ww/-/blob/4c44aa12b081c777c82192755ac85d1fe0f5bdca/src/bin/import_users.rs#L143-145)
+logic](https://git.kittencollective.com/nebkor/what2watch/src/commit/4c44aa12b081c777c82192755ac85d1fe0f5bdca/src/bin/import_users.rs#L134-L148)
 in case a transaction failed due to the DB being locked by another writer, but that never happened:
 each write was just too fast.
 
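For readers who don't follow the link, the retry idea described in the hunk above might look roughly like the following. This is a sketch under stated assumptions and not the logic in `import_users.rs`: `WriteError` and `with_retry` are hypothetical names, the "transaction" is a plain closure standing in for a database write, and the linear backoff is an arbitrary choice.

```rust
use std::{thread, time::Duration};

/// Hypothetical error type: `Locked` models "the DB is locked by another writer".
#[derive(Debug, PartialEq)]
enum WriteError {
    Locked,
    Other(String),
}

/// Run `op`, retrying only when it fails with `WriteError::Locked`,
/// and giving up after `max_attempts` tries.
fn with_retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, WriteError>,
) -> Result<T, WriteError> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            // Locked and attempts remain: back off briefly, then try again.
            Err(WriteError::Locked) if attempt < max_attempts => {
                thread::sleep(Duration::from_millis(5 * attempt as u64));
            }
            // Success, a non-retryable error, or out of attempts: hand it back.
            other => return other,
        }
    }
}

fn main() {
    // Simulate a writer that finds the DB locked twice before succeeding.
    let mut calls = 0;
    let result = with_retry(5, || {
        calls += 1;
        if calls < 3 {
            Err(WriteError::Locked)
        } else {
            Ok(calls)
        }
    });
    assert_eq!(result, Ok(3));
    println!("succeeded after {calls} attempts");
}
```

Retrying only on the "locked" case keeps genuine failures visible: any `Other` error is returned on the first attempt instead of being retried.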
@@ -183,7 +181,7 @@ capabilities resulted in better resource use. Every table in my original, UUID-b
 a `created_at` column, stored as a 64-bit signed offset from the [UNIX
 epoch](https://en.wikipedia.org/wiki/Unix_time). Because ULIDs encode their creation time, I could
 remove that column from every table that used ULIDs as their primary key. [Doing
-so](https://gitlab.com/nebkor/ww/-/commit/5782651aa691125f11a80e241f14c681dda7a7c1) dropped the
+so](https://git.kittencollective.com/nebkor/what2watch/commit/5782651aa691125f11a80e241f14c681dda7a7c1) dropped the
 overall DB size by 5-10% compared to UUID-based tables with a `created_at` column. This advantage
 was unique to ULIDs as opposed to UUIDv4s, and so using the latter with a schema that excludude a
 "created at" column was giving an unrealistic edge to UUIDs, but for my benchmarks, I was interested
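The reason a ULID can replace a `created_at` column is that, per the ULID spec, the top 48 bits of the 128-bit value are a millisecond UNIX timestamp and the remaining 80 bits are random. A minimal sketch of that layout, assuming the ID is held as a `u128`; the function names `ulid_from_parts` and `created_at_ms` are hypothetical and do not come from the post's code or from any ULID crate:

```rust
/// Number of low bits holding randomness in a ULID; the top 48 bits are the timestamp.
const RANDOM_BITS: u32 = 80;

/// Pack a millisecond timestamp and some random bits into a ULID-shaped u128.
/// Both inputs are masked so they cannot spill into each other's field.
fn ulid_from_parts(timestamp_ms: u64, random: u128) -> u128 {
    let ts = (timestamp_ms as u128) & ((1u128 << 48) - 1);
    let rand = random & ((1u128 << RANDOM_BITS) - 1);
    (ts << RANDOM_BITS) | rand
}

/// Recover the creation time, in milliseconds since the UNIX epoch.
fn created_at_ms(ulid: u128) -> u64 {
    (ulid >> RANDOM_BITS) as u64
}

fn main() {
    let id = ulid_from_parts(1_690_000_000_000, 0xdead_beef);

    // The timestamp can be read straight back out of the ID.
    assert_eq!(created_at_ms(id), 1_690_000_000_000);

    // A later timestamp always sorts after an earlier one, whatever the random bits are.
    assert!(ulid_from_parts(1, u128::MAX) < ulid_from_parts(2, 0));

    println!("created at {} ms", created_at_ms(id));
}
```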
@@ -209,7 +207,7 @@ create table if not exists witch_watch (
 
 And, it did, a little. I also took a more critical eye to that table as a whole, and realized I
 could [tidy up the
-DB](https://gitlab.com/nebkor/ww/-/commit/0e016552ab6c66d5fdd82704b6277bd857c94188?view=parallel#f1043d50a0244c34e4d056fe96659145d03b549b_34_34)
+DB](https://git.kittencollective.com/nebkor/what2watch/commit/0e016552ab6c66d5fdd82704b6277bd857c94188?view=parallel#diff-f1043d50a0244c34e4d056fe96659145d03b549b)
 a little more, and remove one more redundant field; this helped a little bit, too.
 
 But overall, things were still looking like ULIDs had no real inherent advantage over UUIDs in the
@@ -262,7 +260,7 @@ about the primary key for that table. It would also eliminate an entire index (a
 automatically-generated "primary key to rowid" index), resulting in the ultimate space savings.
 
 So, [that's what I
-did](https://gitlab.com/nebkor/ww/-/commit/2c7990ff09106fa2a9ec30974bbc377b44082082):
+did](https://git.kittencollective.com/nebkor/what2watch/commit/2c7990ff09106fa2a9ec30974bbc377b44082082):
 
 ``` sql
 -- table of what people want to watch
@@ -320,7 +318,7 @@ was a `created_at` column for it. Still a win, though!
 Something I realized with the "final" schema is that you could have duplicate rows, since the only
 unique field was the `rowid`. I didn't want this. So, rather than create a `unique index on
 watch_quests (user, watch)`, I [just
-added](https://gitlab.com/nebkor/ww/-/commit/c685dc1a6b08d9ff6bafc72582acb539651a350c) a `primary
+added](https://git.kittencollective.com/nebkor/what2watch/commit/c685dc1a6b08d9ff6bafc72582acb539651a350c) a `primary
 key (user, watch)`.
 
 If that looks familiar, good eye! Doing this brings the disk usage back up to 17MB in the baseline
@@ -2,6 +2,7 @@
 title = "Presenting Julids, another fine sundry from Nebcorp Heavy Industries and Sundries"
 slug = "presenting-julids"
 date = "2023-07-31"
+updated = "2025-07-21"
 [taxonomies]
 tags = ["software", "sundry", "proclamation", "sqlite", "rust", "ulid", "julid"]
 +++
@@ -9,7 +10,7 @@ tags = ["software", "sundry", "proclamation", "sqlite", "rust", "ulid", "julid"]
 # Presenting Julids
 Nebcorp Heavy Industries and Sundries, long the world leader in sundries, is proud to announce the
 public launch of the official identifier type for all Nebcorp companies' assets and database
-entries, [Julids](https://gitlab.com/nebkor/julid). Julids are globally unique sortable identifiers,
+entries, [Julids](https://git.kittencollective.com/nebkor/julid-rs). Julids are globally unique sortable identifiers,
 backwards-compatible with [ULIDs](https://github.com/ulid/spec), *but better*.
 
 Inside your Rust program, simply add `julid-rs`[^julid-package] to your project's `Cargo.toml` file, and use it
@@ -100,7 +101,7 @@ you already have one.
 
 The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust
 project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown above. There's
-a rudimentary [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example
+a rudimentary [benchmark](https://git.kittencollective.com/nebkor/julid-rs/src/branch/main/benches/simple.rs) example
 in the repo, which I'll talk more about below. But the primary use case for me was as a loadable
 SQLite extension, as I [previously
 wrote](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids). Both are covered in the
@@ -125,7 +126,7 @@ The extension, when loaded into SQLite, provides the following functions:
 
 If you want to use it as a SQLite extension:
 
-* clone the [repo](https://gitlab.com/nebkor/julid)
+* clone the [repo](https://git.kittencollective.com/nebkor/julid-rs)
 * build it with `cargo build --features plugin` (this builds the SQLite extension)
 * copy the resulting `libjulid.[so|dylib|whatevs]` to some place where you can...
 * load it into SQLite with `.load /path/to/libjulid` as shown at the top
@@ -159,7 +160,7 @@ create table if not exists watches (
 ```
 
 and then [some
-code](https://gitlab.com/nebkor/ww/-/blob/cc14c30fcfbd6cdaecd85d0ba629154d098b4be9/src/import_utils.rs#L92-126)
+code](https://git.kittencollective.com/nebkor/what2watch/src/commit/72ca947cf6092e7d9719e0780ab37e3f498b99b0/src/import_utils.rs#L15-L27)
 that inserted rows into that table like
 
 ``` sql
@@ -237,7 +238,8 @@ Like Marge, I just think they're neat! We're not the only ones; here are just so
 * [UUIDv7](https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-01.html#name-uuidv7-layout-and-bit-order);
 these are *very* similar to Julids; the primary difference is that the lower 62 bits are left up
 to the implementation, rather than always containing pseudorandom bits as in Julids (which use
-the lower 64 bits for that, instead of UUIDv7's 62)
+the lower 64 bits for that, instead of UUIDv7's 62) -- UPDATE! Julids are now able to
+[interconvert with UUIDv7s](https://git.kittencollective.com/nebkor/julid-rs/src/commit/e333ea52637c9fe4db60cfec3603c7d60e70ecab/src/uuid.rs).
 * [Snowflake ID](https://en.wikipedia.org/wiki/Snowflake_ID), developed by Twitter in 2010; these
 are 63-bit identifiers (so they fit in a signed 64-bit number), where the top 41 bits are a
 millisecond timestamp, the next 10 bits are a machine identifier[^twitter machine count], and the
@@ -246,7 +248,7 @@ Like Marge, I just think they're neat! We're not the only ones; here are just so
 
 and I'm sure the list can go on.
 
-I wanted to use them in my SQLite-backed [web app](https://gitlab.com/nebkor/ww), in order to fix
+I wanted to use them in my SQLite-backed [web app](https://git.kittencollective.com/nebkor/what2watch), in order to fix
 some deficiencies in ULIDs and the way I was using them, as [I said
 before](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids):
 
@@ -272,10 +274,10 @@ those crates! Feel free to steal code from me any time!
 ----
 
 [^julid-package]: The Rust crate *package's*
-[name](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/Cargo.toml#L2)
+[name](https://git.kittencollective.com/nebkor/julid-rs/src/commit/e333ea52637c9fe4db60cfec3603c7d60e70ecab/Cargo.toml#L2)
 is "julid-rs"; that's the name you add to your `Cargo.toml` file, that's how it's listed on
 [crates.io](https://crates.io/crates/julid-rs), etc. The crate's *library*
-[name](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/Cargo.toml#L24)
+[name](https://git.kittencollective.com/nebkor/julid-rs/src/commit/e333ea52637c9fe4db60cfec3603c7d60e70ecab/Cargo.toml#L32)
 is just "julid"; that's how you refer to it in a `use` statement in your Rust program.
 
 [^httm]: Remember in *Hot Tub Time Machine*, where Rob Cordry's character, "Lew", decides to stay in
@@ -291,7 +293,7 @@ those crates! Feel free to steal code from me any time!
 
 [^monotonic]: At least, they will still have a total order if they're all generated within the same
 process in the same way; the code uses a [64-bit atomic
-integer](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12)
+integer](https://git.kittencollective.com/nebkor/julid-rs/src/commit/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12)
 to ensure that IDs generated within the same millisecond have incremented counters, but that
 atomic counter is not global; calling `Julid::new()` in Rust and `select julid_new()` in SQLite
 would be as though they were generated on different machines. I just make sure to only generate
@@ -2,7 +2,7 @@
 title = "Shit-code and Other Performance Arts"
 slug = "shit-code-and-performance-art"
 date = "2023-02-08"
-updated = "2023-02-09"
+updated = "2025-07-21"
 [taxonomies]
 tags = ["software", "art", "sundry", "proclamation", "chaos"]
 [extra]
@@ -42,7 +42,7 @@ and using the font used by the alien in *Predator*
 ![get to the choppah][katabastird_predator]
 
 But by far its greatest feature is an undocumented option, `-A`, that will play an [airhorn
-salvo](https://gitlab.com/nebkor/katabastird/-/blob/4ccc2e4738df3f9d3af520e2d3875200534f4f6f/resources/airhorn_alarm.mp3)
+salvo](https://git.kittencollective.com/nebkor/katabastird/src/commit/4ccc2e4738df3f9d3af520e2d3875200534f4f6f/resources/airhorn_alarm.mp3)
 when it's done. This option is visible in the program's help text, but it's not described.
 
 Truly honestly, this is not a great program. Once it's launched, it only understands two keyboard
@@ -103,7 +103,7 @@ OPTIONS:
 with millisecond precision.
 ```
 
-The [README](https://github.com/nebkor/randical/blob/main/README.md) contains some examples of using
+The [README](https://git.kittencollective.com/nebkor/randical#readme) contains some examples of using
 it to do various things, like simulate a fair coin toss, or an *unfair* coin toss, or "a *Sliding
 Doors*-style garden of forking paths alternate timeline for Ferris Bueller's presence or absence on
 that fateful day."
@@ -241,7 +241,7 @@ requirements about semver[^smegver].
 ## goldver
 
 When I version software for public consumption, I tend to use a scheme I call
-"[goldver](https://gitlab.com/nebkor/katabastird/-/blob/main/VERSIONING.md)", short for "Golden
+"[goldver](https://git.kittencollective.com/nebkor/katabastird/src/branch/main/VERSIONING.md)", short for "Golden
 Versioning". It works like this:
 
 > When projects are versioned with goldver, the first version is "1". Note that it is not "1.0", or,
@@ -264,7 +264,7 @@ software. It was Windows 95 and then Windows 2000; obviously there was a lot of
 about arguing about the whether or not this is a "patch release" or a "minor release" or a "major
 change". There are no downstream dependents who need to make sure they don't accidentally upgrade to
 the latest release. If someone wants to update it, they know what they're getting into, and they do
-it in an inherently manual way.
+it in an inherently manual way.
 
 ## chaos license
 