From 51ff082cbf34093a231f4e48745a8b2f3cfc73b6 Mon Sep 17 00:00:00 2001
From: Joe Ardent
Date: Sun, 30 Jul 2023 13:11:29 -0700
Subject: [PATCH] ready to publish

---
 content/sundries/presenting-julids/index.md | 62 ++++++++++++---------
 1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/content/sundries/presenting-julids/index.md b/content/sundries/presenting-julids/index.md
index 715b84b..e1d895f 100644
--- a/content/sundries/presenting-julids/index.md
+++ b/content/sundries/presenting-julids/index.md
@@ -1,5 +1,5 @@
 +++
-title = "Presenting Julids, another fine sundry by Nebcorp Heavy Industries and Sundries"
+title = "Presenting Julids, another fine sundry from Nebcorp Heavy Industries and Sundries"
 slug = "presenting-julids"
 date = "2023-07-31"
 [taxonomies]
@@ -53,9 +53,12 @@ sqlite> select julid_counter(julid_new());
 0
 ```
 
+Intrigued? Confused? Disgusted? Enraged?? Well, read on!
+
 ## Julids vs ULIDs
 
-Julids are a drop-in replacement for ULIDs; all Julids are valid ULIDs, but not all ULIDs are valid Julids.
+Julids are a drop-in replacement for ULIDs: all Julids are valid ULIDs, but not all ULIDs are valid
+Julids.
 
 Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed
 they do:
@@ -63,7 +66,7 @@ they do:
 * they are 128-bits long
 * they are lexicographically sortable
 * they encode their creation time as the number of milliseconds since the [UNIX
-  epoch](https://en.wikipedia.org/wiki/Unix_time)
+  epoch](https://en.wikipedia.org/wiki/Unix_time) in their top 48 bits
 * their string representation is a 26-character [base-32
   Crockford](https://en.wikipedia.org/wiki/Base32) encoding of their big-endian bytes
 * IDs created within the same millisecond are still meant to sort in their order of creation
@@ -85,8 +88,8 @@ To address these shortcomings, Julids (Joe's ULIDs) have the following structure
 ![Julid bit structure](./julid.svg)
 
 As with ULIDs, the 48 most-significant bits encode the time of creation. Unlike ULIDs, the next 16
-most-significant bits are not random: they're a monotonic counter for IDs created within the same
-millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 IDs
+most-significant bits are not random[^counter idea]: they're a monotonic counter for IDs created
+within the same millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 IDs
 intra-millisecond creations, after which, IDs in that same millisecond will not have an intrinsic
 total order (the random bits will still be different, so you shouldn't have collisions). My PC,
 which is no slouch, can only generate about 20,000 per millisecond, so hopefully this is not an
@@ -95,12 +98,11 @@ you already have one.
 
 # How to use
 
-As noted, the Julid crate can be used in two different ways: as a regular Rust library, declared
-in your Rust project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown
-above. There's a rudimentary
-[benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example in the repo,
-which I'll talk more about below. But the primary use case for me was as a loadable SQLite
-extension, as I [previously
+The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust
+project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown above. There's
+a rudimentary [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example
+in the repo, which I'll talk more about below. But the primary use case for me was as a loadable
+SQLite extension, as I [previously
 wrote](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids). Both are covered in the
 [documentation](https://docs.rs/julid-rs/latest/julid/), but let's go over them here, starting with
 the extension.
@@ -116,8 +118,8 @@ The extension, when loaded into SQLite, provides the following functions:
 * `julid_counter(julid)`: show the value of this julid's monotonic counter
 * `julid_sortable(julid)`: return the 64-bit concatenation of the timestamp and counter
 * `julid_string(julid)`: show the [base-32 Crockford](https://en.wikipedia.org/wiki/Base32)
-  encoding of this julid; the raw bytes won't be valid UTF-8, so use this or the built-in `hex()`
-  function to `select` a human-readable representation
+  encoding of this julid; the raw bytes of Julids won't be valid UTF-8, so use this or the built-in
+  `hex()` function to `select` a human-readable representation
 
 ### Building and loading
 
@@ -148,10 +150,9 @@ create table if not exists watches (
   id blob not null primary key default (julid_new()),
   kind int not null, -- enum for movie or tv show or whatev
   title text not null,
-  metadata_url text, -- possible url for imdb or other metadata-esque site to show the user
   length int,
   release_date int,
-  added_by blob not null, -- ID of the user that added it
+  added_by blob not null,
   last_updated int not null default (unixepoch()),
   foreign key (added_by) references users (id)
 );
@@ -177,7 +178,8 @@ a simple benchmark in the examples folder of the repo, the important parts of wh
 use julid::Julid;
 
 fn main() {
-    [....]
+    /* snip some stuff */
+
     let start = Instant::now();
     for _ in 0..num {
         v.push(Julid::new());
@@ -213,14 +215,14 @@ timestamp as a [`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime
 `created_at(&self)` method to `Julid`.
 
 Something to note: don't enable the `plugin` feature in your Cargo.toml if you're using this crate
-inside your Rust application, *especially* if you're also loading it as an extension in SQLite in
+inside your Rust application, especially if you're *also* loading it as an extension in SQLite in
 your application. You'll get a long and confusing runtime panic due to there being multiple
 entrypoints defined with the same name.
 
 # Why Julids?
 
-The astute may have noticed that this is the third time I've written about globally unique
-sortable IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
+The astute may have noticed that this is the third time I've written about globally unique sortable
+IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
 here](/rnd/one-part-serialized-mystery-part-2)). What's, uh... what's up with that?
 
 ![marge just thinks they're neat][marge ids]
@@ -255,17 +257,17 @@ before](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids):
 > and so my next step is to write one of those, and remove the ID generation logic from the
 > application.
 
-Now that I've accomplished all I've set out to, is this the last time I'll time I'll be writing at
-length about these things? It's hard to say for sure, but signs point to "yes". I hope you've found
-them at least a little interesting!
+Now that I've accomplished all that I've set out to do, is this the last time I'll be
+writing at length about these things? It's hard to say for sure, but signs point to "yes". I hope
+you've found them at least a little interesting!
 
 # Thanks
 
-This crate wouldn't have been possible without a lot of inspiration (and a little shameless
-stealing) from the [ulid-rs](https://github.com/dylanhart/ulid-rs) crate. For the loadable
-extension, the [sqlite-loadable-rs](https://github.com/asg017/sqlite-loadable-rs) crate made it
-*extremely* easy to write; what I thought would take a couple days instead took a couple
-hours. Thank you, authors of those crates! Feel free to steal from this project!
+This project wouldn't have happened without a lot of inspiration (and a little shameless stealing)
+from the [ulid-rs](https://github.com/dylanhart/ulid-rs) crate. For the loadable extension, the
+[sqlite-loadable-rs](https://github.com/asg017/sqlite-loadable-rs) crate made it *extremely* easy to
+write; what I thought would take a couple days instead took a couple hours. Thank you, authors of
+those crates! Feel free to steal code from me any time!
 
 ----
 
@@ -276,6 +278,12 @@ hours. Thank you, authors of those crates! Feel free to steal from this project!
     [name](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/Cargo.toml#L24)
     is just "julid"; that's how you refer to it in a `use` statement in your Rust program.
 
+[^counter idea]: Sticking the counter bits after the timestamp bits was stolen from
+    , though they use only 15 bits
+    for the counter, due to each character in the string encoding representing five bits, and using
+    three whole characters for the counter. That gives them one more random bit than Julids, and
+    lowers the number of available unique intra-millisecond IDs in the same process to 32,768.
+
 [^monotonic]: At least, they will still have a total order if they're all generated within the same
     process in the same way; the code uses a [64-bit atomic
     integer](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12)