|
|
|
@ -1,5 +1,5 @@
|
|
|
|
|
+++
|
|
|
|
|
title = "Presenting Julids, another fine sundry from Nebcorp Heavy Industries and Sundries"
|
|
|
|
|
title = "Presenting Julids, another fine sundry by Nebcorp Heavy Industries and Sundries"
|
|
|
|
|
slug = "presenting-julids"
|
|
|
|
|
date = "2023-07-31"
|
|
|
|
|
[taxonomies]
|
|
|
|
@ -53,12 +53,9 @@ sqlite> select julid_counter(julid_new());
|
|
|
|
|
0
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Intrigued? Confused? Disgusted? Enraged?? Well, read on!
|
|
|
|
|
|
|
|
|
|
## Julids vs ULIDs
|
|
|
|
|
|
|
|
|
|
Julids are a drop-in replacement for ULIDs: all Julids are valid ULIDs, but not all ULIDs are valid
|
|
|
|
|
Julids.
|
|
|
|
|
Julids are a drop-in replacement for ULIDs; all Julids are valid ULIDs, but not all ULIDs are valid Julids.
|
|
|
|
|
|
|
|
|
|
Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed
|
|
|
|
|
they do:
|
|
|
|
@ -66,7 +63,7 @@ they do:
|
|
|
|
|
* they are 128 bits long
|
|
|
|
|
* they are lexicographically sortable
|
|
|
|
|
* they encode their creation time as the number of milliseconds since the [UNIX
|
|
|
|
|
epoch](https://en.wikipedia.org/wiki/Unix_time) in their top 48 bits
|
|
|
|
|
epoch](https://en.wikipedia.org/wiki/Unix_time)
|
|
|
|
|
* their string representation is a 26-character [base-32
|
|
|
|
|
Crockford](https://en.wikipedia.org/wiki/Base32) encoding of their big-endian bytes
|
|
|
|
|
* IDs created within the same millisecond are still meant to sort in their order of creation
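To make the string-representation point concrete, here is a minimal sketch of how a 128-bit value maps to 26 Crockford base-32 characters. The function name `crockford_encode` is made up for illustration and is not taken from the Julid or ULID crates. It assumes the standard Crockford alphabet, which is the digits plus uppercase letters without I, L, O, and U.

```rust
// Crockford base-32 alphabet: digits, then uppercase letters minus I, L, O, U.
const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Hypothetical helper: encode a 128-bit ID as 26 base-32 characters.
/// 26 characters hold 130 bits, so the first character only carries 3 bits.
fn crockford_encode(value: u128) -> String {
    let mut out = [b'0'; 26];
    let mut v = value;
    // Fill from the least-significant end, five bits per character.
    for slot in out.iter_mut().rev() {
        *slot = ALPHABET[(v & 0x1f) as usize];
        v >>= 5;
    }
    out.iter().map(|&b| b as char).collect()
}

fn main() {
    let a = crockford_encode(1);
    let b = crockford_encode(2);
    println!("{a}\n{b}");
    // The alphabet is in ascending ASCII order and the width is fixed,
    // so string order matches numeric order.
    assert!(a < b);
}
```

Because every encoded ID has the same length and the alphabet is in ascending ASCII order, comparing the strings gives the same result as comparing the underlying numbers. That is what makes the IDs lexicographically sortable.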
|
|
|
|
@ -83,13 +80,13 @@ guess a new possibly-valid ULID simply by incrementing an already-known one. And
|
|
|
|
|
that sorting will need to read all the way to the end of the ULID for IDs created in the same
|
|
|
|
|
millisecond.
|
|
|
|
|
|
|
|
|
|
To address these shortcomings, Julids (Joe's[^httm] ULIDs) have the following structure:
|
|
|
|
|
To address these shortcomings, Julids (Joe's ULIDs) have the following structure:
|
|
|
|
|
|
|
|
|
|
![Julid bit structure](./julid.svg)
|
|
|
|
|
|
|
|
|
|
As with ULIDs, the 48 most-significant bits encode the time of creation. Unlike ULIDs, the next 16
|
|
|
|
|
most-significant bits are not random[^counter idea]: they're a monotonic counter for IDs created
|
|
|
|
|
within the same millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 IDs
|
|
|
|
|
most-significant bits are not random: they're a monotonic counter for IDs created within the same
|
|
|
|
|
millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 IDs
|
|
|
|
|
created within a single millisecond, after which IDs in that same millisecond will not have an intrinsic
|
|
|
|
|
total order (the random bits will still be different, so you shouldn't have collisions). My PC,
|
|
|
|
|
which is no slouch, can only generate about 20,000 per millisecond, so hopefully this is not an
|
|
|
|
@ -98,11 +95,12 @@ you already have one.
|
|
|
|
|
|
|
|
|
|
# How to use
|
|
|
|
|
|
|
|
|
|
The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust
|
|
|
|
|
project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown above. There's
|
|
|
|
|
a rudimentary [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example
|
|
|
|
|
in the repo, which I'll talk more about below. But the primary use case for me was as a loadable
|
|
|
|
|
SQLite extension, as I [previously
|
|
|
|
|
As noted, the Julid crate can be used in two different ways: as a regular Rust library, declared
|
|
|
|
|
in your Rust project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown
|
|
|
|
|
above. There's a rudimentary
|
|
|
|
|
[benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example in the repo,
|
|
|
|
|
which I'll talk more about below. But the primary use case for me was as a loadable SQLite
|
|
|
|
|
extension, as I [previously
|
|
|
|
|
wrote](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids). Both are covered in the
|
|
|
|
|
[documentation](https://docs.rs/julid-rs/latest/julid/), but let's go over them here, starting with
|
|
|
|
|
the extension.
|
|
|
|
@ -118,8 +116,8 @@ The extension, when loaded into SQLite, provides the following functions:
|
|
|
|
|
* `julid_counter(julid)`: show the value of this julid's monotonic counter
|
|
|
|
|
* `julid_sortable(julid)`: return the 64-bit concatenation of the timestamp and counter
|
|
|
|
|
* `julid_string(julid)`: show the [base-32 Crockford](https://en.wikipedia.org/wiki/Base32)
|
|
|
|
|
encoding of this julid; the raw bytes of Julids won't be valid UTF-8, so use this or the built-in
|
|
|
|
|
`hex()` function to `select` a human-readable representation
|
|
|
|
|
encoding of this julid; the raw bytes won't be valid UTF-8, so use this or the built-in `hex()`
|
|
|
|
|
function to `select` a human-readable representation
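As a rough illustration of what the counter and sortable functions in the list above compute, here is a sketch in plain Rust using only bit operations on a `u128`. It assumes the layout described earlier: 48 bits of millisecond timestamp, then 16 bits of counter, then 64 random bits. The helper names (`compose`, `timestamp_ms`, `counter`, `sortable`) are hypothetical and are not the crate's actual API.

```rust
/// Hypothetical: build an ID from its three parts (48 + 16 + 64 bits).
fn compose(ts_ms: u64, counter: u16, random: u64) -> u128 {
    let ts = (ts_ms as u128) & 0xFFFF_FFFF_FFFF; // keep only the low 48 bits
    (ts << 80) | ((counter as u128) << 64) | (random as u128)
}

/// Top 48 bits: milliseconds since the UNIX epoch.
fn timestamp_ms(id: u128) -> u64 {
    (id >> 80) as u64
}

/// Next 16 bits: the intra-millisecond monotonic counter.
fn counter(id: u128) -> u16 {
    ((id >> 64) & 0xFFFF) as u16
}

/// Top 64 bits: timestamp and counter concatenated.
fn sortable(id: u128) -> u64 {
    (id >> 64) as u64
}

fn main() {
    let id = compose(1_690_761_600_000, 3, 0xDEAD_BEEF);
    println!("timestamp: {} ms", timestamp_ms(id));
    println!("counter:   {}", counter(id));
    println!("sortable:  {:#018x}", sortable(id));
}
```

Under this layout, two IDs can be ordered by comparing only their top 64 bits, provided the counter has not saturated. The random bits never need to be read.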
|
|
|
|
|
|
|
|
|
|
### Building and loading
|
|
|
|
|
|
|
|
|
@ -150,9 +148,10 @@ create table if not exists watches (
|
|
|
|
|
id blob not null primary key default (julid_new()),
|
|
|
|
|
kind int not null, -- enum for movie or tv show or whatev
|
|
|
|
|
title text not null,
|
|
|
|
|
metadata_url text, -- possible url for imdb or other metadata-esque site to show the user
|
|
|
|
|
length int,
|
|
|
|
|
release_date int,
|
|
|
|
|
added_by blob not null,
|
|
|
|
|
added_by blob not null, -- ID of the user that added it
|
|
|
|
|
last_updated int not null default (unixepoch()),
|
|
|
|
|
foreign key (added_by) references users (id)
|
|
|
|
|
);
|
|
|
|
@ -178,8 +177,7 @@ a simple benchmark in the examples folder of the repo, the important parts of wh
|
|
|
|
|
use julid::Julid;
|
|
|
|
|
|
|
|
|
|
fn main() {
|
|
|
|
|
/* snip some stuff */
|
|
|
|
|
|
|
|
|
|
[....]
|
|
|
|
|
let start = Instant::now();
|
|
|
|
|
for _ in 0..num {
|
|
|
|
|
v.push(Julid::new());
|
|
|
|
@ -215,14 +213,14 @@ timestamp as a [`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime
|
|
|
|
|
`created_at(&self)` method to `Julid`.
|
|
|
|
|
|
|
|
|
|
Something to note: don't enable the `plugin` feature in your Cargo.toml if you're using this crate
|
|
|
|
|
inside your Rust application, especially if you're *also* loading it as an extension in SQLite in
|
|
|
|
|
inside your Rust application, *especially* if you're also loading it as an extension in SQLite in
|
|
|
|
|
your application. You'll get a long and confusing runtime panic due to there being multiple
|
|
|
|
|
entrypoints defined with the same name.
|
|
|
|
|
|
|
|
|
|
# Why Julids?
|
|
|
|
|
|
|
|
|
|
The astute may have noticed that this is the third time I've written about globally unique sortable
|
|
|
|
|
IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
|
|
|
|
|
The astute may have noticed that this is the third time I've written about globally unique
|
|
|
|
|
sortable IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
|
|
|
|
|
here](/rnd/one-part-serialized-mystery-part-2)). What's, uh... what's up with that?
|
|
|
|
|
|
|
|
|
|
![marge just thinks they're neat][marge ids]
|
|
|
|
@ -257,17 +255,17 @@ before](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids):
|
|
|
|
|
> and so my next step is to write one of those, and remove the ID generation logic from the
|
|
|
|
|
> application.
|
|
|
|
|
|
|
|
|
|
Now that I've accomplished all that I've set out to do, is this the last time I'll be
|
|
|
|
|
writing at length about these things? It's hard to say for sure, but signs point to "yes". I hope
|
|
|
|
|
you've found them at least a little interesting!
|
|
|
|
|
Now that I've accomplished all I've set out to, is this the last time I'll be writing at
|
|
|
|
|
length about these things? It's hard to say for sure, but signs point to "yes". I hope you've found
|
|
|
|
|
them at least a little interesting!
|
|
|
|
|
|
|
|
|
|
# Thanks
|
|
|
|
|
|
|
|
|
|
This project wouldn't have happened without a lot of inspiration (and a little shameless stealing)
|
|
|
|
|
from the [ulid-rs](https://github.com/dylanhart/ulid-rs) crate. For the loadable extension, the
|
|
|
|
|
[sqlite-loadable-rs](https://github.com/asg017/sqlite-loadable-rs) crate made it *extremely* easy to
|
|
|
|
|
write; what I thought would take a couple days instead took a couple hours. Thank you, authors of
|
|
|
|
|
those crates! Feel free to steal code from me any time!
|
|
|
|
|
This crate wouldn't have been possible without a lot of inspiration (and a little shameless
|
|
|
|
|
stealing) from the [ulid-rs](https://github.com/dylanhart/ulid-rs) crate. For the loadable
|
|
|
|
|
extension, the [sqlite-loadable-rs](https://github.com/asg017/sqlite-loadable-rs) crate made it
|
|
|
|
|
*extremely* easy to write; what I thought would take a couple days instead took a couple
|
|
|
|
|
hours. Thank you, authors of those crates! Feel free to steal from this project!
|
|
|
|
|
|
|
|
|
|
----
|
|
|
|
|
|
|
|
|
@ -278,24 +276,13 @@ those crates! Feel free to steal code from me any time!
|
|
|
|
|
[name](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/Cargo.toml#L24)
|
|
|
|
|
is just "julid"; that's how you refer to it in a `use` statement in your Rust program.
|
|
|
|
|
|
|
|
|
|
[^httm]: Remember in *Hot Tub Time Machine*, where Rob Corddry's character, "Lew", decides to stay in
|
|
|
|
|
the past and use his future-knowledge to amass wealth and power, and he makes his own versions
|
|
|
|
|
of things that were done in his past, like forming a glam rock band called "Mötley Lew", and a
|
|
|
|
|
search engine called "Loogle", etc.?
|
|
|
|
|
|
|
|
|
|
[^counter idea]: Putting the counter bits after the timestamp bits was stolen from
|
|
|
|
|
<https://github.com/ahawker/ulid/issues/306#issuecomment-451850395>, though they use only 15 bits
|
|
|
|
|
for the counter, due to each character in the string encoding representing five bits, and using
|
|
|
|
|
three whole characters for the counter. That gives them one more random bit than Julids, and
|
|
|
|
|
lowers the number of available unique intra-millisecond IDs in the same process to 32,768.
|
|
|
|
|
|
|
|
|
|
[^monotonic]: At least, they will still have a total order if they're all generated within the same
|
|
|
|
|
process in the same way; the code uses a [64-bit atomic
|
|
|
|
|
integer](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12)
|
|
|
|
|
to ensure that IDs generated within the same millisecond have incremented counters, but that
|
|
|
|
|
atomic counter is not global; calling `Julid::new()` in Rust and `select julid_new()` in SQLite
|
|
|
|
|
would be as though they were generated on different machines. I just make sure to only generate
|
|
|
|
|
them inside the DB.
|
|
|
|
|
will not be aware of each other's counters. I just make sure to only generate them inside the
|
|
|
|
|
DB.
|
|
|
|
|
|
|
|
|
|
[^my computer]: According to the output of `lscpu`, my computer has an "AMD Ryzen 9 3900X 12-Core
|
|
|
|
|
Processor", running between 2.2 and 4.6 GHz. It's no slouch!
|
|
|
|
|