ready to publish

Joe Ardent 2023-07-30 13:11:29 -07:00
parent ca1c3ea2d9
commit 51ff082cbf

@@ -1,5 +1,5 @@
+++ +++
title = "Presenting Julids, another fine sundry by Nebcorp Heavy Industries and Sundries" title = "Presenting Julids, another fine sundry from Nebcorp Heavy Industries and Sundries"
slug = "presenting-julids" slug = "presenting-julids"
date = "2023-07-31" date = "2023-07-31"
[taxonomies] [taxonomies]
@@ -53,9 +53,12 @@ sqlite> select julid_counter(julid_new());
0 0
``` ```
Intrigued? Confused? Disgusted? Enraged?? Well, read on!
## Julids vs ULIDs ## Julids vs ULIDs
Julids are a drop-in replacement for ULIDs; all Julids are valid ULIDs, but not all ULIDs are valid Julids. Julids are a drop-in replacement for ULIDs: all Julids are valid ULIDs, but not all ULIDs are valid
Julids.
Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed
they do: they do:
@@ -63,7 +66,7 @@ they do:
* they are 128 bits long * they are 128 bits long
* they are lexicographically sortable * they are lexicographically sortable
* they encode their creation time as the number of milliseconds since the [UNIX * they encode their creation time as the number of milliseconds since the [UNIX
epoch](https://en.wikipedia.org/wiki/Unix_time) epoch](https://en.wikipedia.org/wiki/Unix_time) in their top 48 bits
* their string representation is a 26-character [base-32 * their string representation is a 26-character [base-32
Crockford](https://en.wikipedia.org/wiki/Base32) encoding of their big-endian bytes Crockford](https://en.wikipedia.org/wiki/Base32) encoding of their big-endian bytes
* IDs created within the same millisecond are still meant to sort in their order of creation * IDs created within the same millisecond are still meant to sort in their order of creation
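
The shared layout in the list above can be sketched in a few lines of Rust. This is only an illustration that treats an ID as a bare `u128`; the names `timestamp_ms` and `encode_crockford` are made up for this example and are not the API of the julid or ulid-rs crates.

```rust
/// Crockford base-32 alphabet: digits and uppercase letters, minus I, L, O, and U.
const CROCKFORD: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Milliseconds since the UNIX epoch, read from the top 48 bits of the ID.
/// (Hypothetical helper, not a function from any real crate.)
fn timestamp_ms(id: u128) -> u64 {
    // Dropping the low 80 bits leaves exactly the 48 timestamp bits.
    (id >> 80) as u64
}

/// Encode all 128 bits as 26 characters, most-significant bits first.
/// 26 * 5 = 130, so the first character only carries the top 3 bits.
/// (Hypothetical helper, not a function from any real crate.)
fn encode_crockford(id: u128) -> String {
    (0..26u32)
        .map(|i| {
            let shift = 125 - 5 * i;
            CROCKFORD[((id >> shift) & 0x1f) as usize] as char
        })
        .collect()
}
```

Since the encoding runs from the most-significant bits down and the alphabet is in ascending ASCII order, comparing two encoded strings gives the same answer as comparing the underlying 128-bit integers, which is where the lexicographic sortability comes from.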
@@ -85,8 +88,8 @@ To address these shortcomings, Julids (Joe's ULIDs) have the following structure
![Julid bit structure](./julid.svg) ![Julid bit structure](./julid.svg)
As with ULIDs, the 48 most-significant bits encode the time of creation. Unlike ULIDs, the next 16 As with ULIDs, the 48 most-significant bits encode the time of creation. Unlike ULIDs, the next 16
most-significant bits are not random: they're a monotonic counter for IDs created within the same most-significant bits are not random[^counter idea]: they're a monotonic counter for IDs created
millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 within the same millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536
intra-millisecond creations, after which IDs in that same millisecond will not have an intrinsic intra-millisecond creations, after which IDs in that same millisecond will not have an intrinsic
total order (the random bits will still be different, so you shouldn't have collisions). My PC, total order (the random bits will still be different, so you shouldn't have collisions). My PC,
which is no slouch, can only generate about 20,000 per millisecond, so hopefully this is not an which is no slouch, can only generate about 20,000 per millisecond, so hopefully this is not an
@@ -95,12 +98,11 @@ you already have one.
# How to use # How to use
As noted, the Julid crate can be used in two different ways: as a regular Rust library, declared The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust
in your Rust project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown above. There's
above. There's a rudimentary a rudimentary [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example
[benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example in the repo, in the repo, which I'll talk more about below. But the primary use case for me was as a loadable
which I'll talk more about below. But the primary use case for me was as a loadable SQLite SQLite extension, as I [previously
extension, as I [previously
wrote](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids). Both are covered in the wrote](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids). Both are covered in the
[documentation](https://docs.rs/julid-rs/latest/julid/), but let's go over them here, starting with [documentation](https://docs.rs/julid-rs/latest/julid/), but let's go over them here, starting with
the extension. the extension.
@@ -116,8 +118,8 @@ The extension, when loaded into SQLite, provides the following functions:
* `julid_counter(julid)`: show the value of this julid's monotonic counter * `julid_counter(julid)`: show the value of this julid's monotonic counter
* `julid_sortable(julid)`: return the 64-bit concatenation of the timestamp and counter * `julid_sortable(julid)`: return the 64-bit concatenation of the timestamp and counter
* `julid_string(julid)`: show the [base-32 Crockford](https://en.wikipedia.org/wiki/Base32) * `julid_string(julid)`: show the [base-32 Crockford](https://en.wikipedia.org/wiki/Base32)
encoding of this julid; the raw bytes won't be valid UTF-8, so use this or the built-in `hex()` encoding of this julid; the raw bytes of Julids won't be valid UTF-8, so use this or the built-in
function to `select` a human-readable representation `hex()` function to `select` a human-readable representation
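
To make the counter and sortable values in the list above concrete, here's a rough Rust sketch of how they could be pulled out of a Julid held as a `u128`. It assumes the layout the post describes (48 bits of timestamp, then a 16-bit counter, then random bits); the function names `from_parts`, `sortable`, and `counter` are invented for illustration and are not the extension's or the crate's actual implementation.

```rust
/// Assemble an ID from its three fields (hypothetical helper for this sketch).
/// Layout assumed from the post: 48-bit timestamp, 16-bit counter, 64 random bits.
fn from_parts(timestamp_ms: u64, counter: u16, random: u64) -> u128 {
    // Mask the timestamp to 48 bits so it can't spill out of its field.
    let ts = (timestamp_ms as u128) & 0xFFFF_FFFF_FFFF;
    (ts << 80) | ((counter as u128) << 64) | (random as u128)
}

/// The top 64 bits: timestamp and counter concatenated.
fn sortable(julid: u128) -> u64 {
    (julid >> 64) as u64
}

/// The 16 bits directly below the timestamp.
fn counter(julid: u128) -> u16 {
    // The truncating cast keeps only the low 16 bits of the top half.
    (julid >> 64) as u16
}
```

Under that assumption, two IDs from the same millisecond compare by counter before their random bits are ever consulted, so the 64-bit sortable value alone is enough to order them.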
### Building and loading ### Building and loading
@@ -148,10 +150,9 @@ create table if not exists watches (
id blob not null primary key default (julid_new()), id blob not null primary key default (julid_new()),
kind int not null, -- enum for movie or tv show or whatev kind int not null, -- enum for movie or tv show or whatev
title text not null, title text not null,
metadata_url text, -- possible url for imdb or other metadata-esque site to show the user
length int, length int,
release_date int, release_date int,
added_by blob not null, -- ID of the user that added it added_by blob not null,
last_updated int not null default (unixepoch()), last_updated int not null default (unixepoch()),
foreign key (added_by) references users (id) foreign key (added_by) references users (id)
); );
@@ -177,7 +178,8 @@ a simple benchmark in the examples folder of the repo, the important parts of wh
use julid::Julid; use julid::Julid;
fn main() { fn main() {
[....] /* snip some stuff */
let start = Instant::now(); let start = Instant::now();
for _ in 0..num { for _ in 0..num {
v.push(Julid::new()); v.push(Julid::new());
@@ -213,14 +215,14 @@ timestamp as a [`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime
`created_at(&self)` method to `Julid`. `created_at(&self)` method to `Julid`.
Something to note: don't enable the `plugin` feature in your Cargo.toml if you're using this crate Something to note: don't enable the `plugin` feature in your Cargo.toml if you're using this crate
inside your Rust application, *especially* if you're also loading it as an extension in SQLite in inside your Rust application, especially if you're *also* loading it as an extension in SQLite in
your application. You'll get a long and confusing runtime panic due to there being multiple your application. You'll get a long and confusing runtime panic due to there being multiple
entrypoints defined with the same name. entrypoints defined with the same name.
# Why Julids? # Why Julids?
The astute may have noticed that this is the third time I've written about globally unique The astute may have noticed that this is the third time I've written about globally unique sortable
sortable IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
here](/rnd/one-part-serialized-mystery-part-2)). What's, uh... what's up with that? here](/rnd/one-part-serialized-mystery-part-2)). What's, uh... what's up with that?
![marge just thinks they're neat][marge ids] ![marge just thinks they're neat][marge ids]
@@ -255,17 +257,17 @@ before](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids):
> and so my next step is to write one of those, and remove the ID generation logic from the > and so my next step is to write one of those, and remove the ID generation logic from the
> application. > application.
Now that I've accomplished all I've set out to, is this the last time I'll be writing at Now that I've accomplished all that I've set out to do, is this the last time I'll be
length about these things? It's hard to say for sure, but signs point to "yes". I hope you've found writing at length about these things? It's hard to say for sure, but signs point to "yes". I hope
them at least a little interesting! you've found them at least a little interesting!
# Thanks # Thanks
This crate wouldn't have been possible without a lot of inspiration (and a little shameless This project wouldn't have happened without a lot of inspiration (and a little shameless stealing)
stealing) from the [ulid-rs](https://github.com/dylanhart/ulid-rs) crate. For the loadable from the [ulid-rs](https://github.com/dylanhart/ulid-rs) crate. For the loadable extension, the
extension, the [sqlite-loadable-rs](https://github.com/asg017/sqlite-loadable-rs) crate made it [sqlite-loadable-rs](https://github.com/asg017/sqlite-loadable-rs) crate made it *extremely* easy to
*extremely* easy to write; what I thought would take a couple days instead took a couple write; what I thought would take a couple days instead took a couple hours. Thank you, authors of
hours. Thank you, authors of those crates! Feel free to steal from this project! those crates! Feel free to steal code from me any time!
---- ----
@@ -276,6 +278,12 @@ hours. Thank you, authors of those crates! Feel free to steal from this project!
[name](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/Cargo.toml#L24) [name](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/Cargo.toml#L24)
is just "julid"; that's how you refer to it in a `use` statement in your Rust program. is just "julid"; that's how you refer to it in a `use` statement in your Rust program.
[^counter idea]: Sticking the counter bits after the timestamp bits was stolen from
<https://github.com/ahawker/ulid/issues/306#issuecomment-451850395>, though they use only 15 bits
for the counter, due to each character in the string encoding representing five bits, and using
three whole characters for the counter. That gives them one more random bit than Julids, and
lowers the number of available unique intra-millisecond IDs in the same process to 32,768.
[^monotonic]: At least, they will still have a total order if they're all generated within the same [^monotonic]: At least, they will still have a total order if they're all generated within the same
process in the same way; the code uses a [64-bit atomic process in the same way; the code uses a [64-bit atomic
integer](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12) integer](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12)