tweakitty-tweak
parent f7aa8840bd
commit 5f03effbf3
1 changed files with 42 additions and 38 deletions
@@ -7,10 +7,10 @@ tags = ["software", "sundry", "proclamation", "sqlite", "rust", "ulid", "julid"]
 +++

 # Presenting Julids
-Nebcorp Heavy Industries and Sundries, long a world leader in sundries, is proud to present the
-official globally unique sortable identifier type for all Nebcorp HIAS', and all Nebcorp companies'
-database entities, [Julids](https://gitlab.com/nebkor/julid). Julids are globally unique sortable
-identifiers, backwards-compatible with [ULIDs](https://github.com/ulid/spec).
+Nebcorp Heavy Industries and Sundries, long the world leader in sundries, is proud to announce the
+public launch of the official identifier type for all Nebcorp companies' assets and database
+entries, [Julids](https://gitlab.com/nebkor/julid). Julids are globally unique sortable identifiers,
+backwards-compatible with [ULIDs](https://github.com/ulid/spec), but better.

 Inside your Rust program, simply add `julid-rs` to your project's `Cargo.toml` file, and use it
 like:

@@ -27,8 +27,8 @@ fn main() {
 Such a program would output something like:

 ``` text
-[main.rs:2] id.created_at() = 2023-07-29T20:21:50.009Z
-[main.rs:2] id.as_string() = "01H6HN10SS00020YT344XMGA3C"
+[main.rs:5] id.created_at() = 2023-07-29T20:21:50.009Z
+[main.rs:5] id.as_string() = "01H6HN10SS00020YT344XMGA3C"
 ```

 However, it can also be built as a [loadable extension](https://www.sqlite.org/loadext.html) for

@@ -69,11 +69,11 @@ they do:
 * IDs created within the same millisecond are still meant to sort in their order of creation

 Julids and ULIDs have different ways to implement that last piece. If you look at the layout of bits
-in a ULID, they look like this:
+in a ULID, you see:

 ![ULID bit structure](./ulid.svg)

-According to the ULID spec, for ULIDs created in the same millisecond, the least-significant bit
+According to the ULID spec, for ULIDs created within the same millisecond, the least-significant bit
 should be incremented for each new ID. Since that portion of the ULID is random, that means you may
 not be able to increment it without spilling into the timestamp portion. Likewise, it's easy to
 guess a new possibly-valid ULID simply by incrementing an already-known one. And finally, this means

@@ -86,18 +86,18 @@ To address these shortcomings, Julids (Joe's ULIDs) have the following structure

 As with ULIDs, the 48 most-significant bits encode the time of creation. Unlike ULIDs, the next 16
 most-significant bits are not random: they're a monotonic counter for IDs created within the same
-millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 IDs intra-millisecond creations,
-after which, IDs in that same millisecond will not have an intrinsic total order (the random bits
-will still be different, so you shouldn't have collisions). My PC, which is no slouch, can only
-generate about 20,000 per millisecond, so hopefully this is not an issue! Because the random bits
-are always fresh, it's not possible to easily guess a valid Julid if you already have a different
-valid one.
+millisecond[^monotonic]. Since it's only 16 bits, it will saturate after 65,536 IDs
+intra-millisecond creations, after which, IDs in that same millisecond will not have an intrinsic
+total order (the random bits will still be different, so you shouldn't have collisions). My PC,
+which is no slouch, can only generate about 20,000 per millisecond, so hopefully this is not an
+issue! Because the random bits are always fresh, it's not possible to easily guess a valid Julid if
+you already have one.

 # How to use

-As mentioned, the Julid crate can be used in two different ways: as a regular Rust library, declared
-in your Rust project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as also
-shown above. There's a rudimentary
+As noted, the Julid crate can be used in two different ways: as a regular Rust library, declared
+in your Rust project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown
+above. There's a rudimentary
 [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example in the repo,
 which I'll talk more about below. But the primary use case for me was as a loadable SQLite
 extension, as I [previously
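
Note on the hunk above: the layout it describes (48 timestamp bits, then a 16-bit counter, then random bits) can be sketched with plain bit-shifting on a `u128`. This is only an illustration, assuming the remaining 64 of the 128 bits are the random portion. The names `pack`, `timestamp_ms`, `counter`, and `random` are hypothetical helpers and not the `julid-rs` API.

```rust
/// Hypothetical helper (not part of julid-rs): pack the three fields into 128 bits.
fn pack(timestamp_ms: u64, counter: u16, random: u64) -> u128 {
    // Keep only the low 48 bits of the millisecond timestamp.
    let ts = (timestamp_ms as u128) & 0xFFFF_FFFF_FFFF;
    (ts << 80) | ((counter as u128) << 64) | (random as u128)
}

/// The 48 most-significant bits: creation time in milliseconds.
fn timestamp_ms(id: u128) -> u64 {
    (id >> 80) as u64
}

/// The next 16 bits: the intra-millisecond counter.
fn counter(id: u128) -> u16 {
    ((id >> 64) & 0xFFFF) as u16
}

/// The low 64 bits: assumed here to be the random portion.
fn random(id: u128) -> u64 {
    id as u64
}

fn main() {
    // Two IDs in the same millisecond; the random bits of `a` are larger than
    // those of `b`, but the counter sits above them and decides the order.
    let a = pack(1_690_662_110_009, 0, 0xDEAD_BEEF);
    let b = pack(1_690_662_110_009, 1, 0x0000_0001);
    assert!(a < b);
    assert_eq!(timestamp_ms(a), 1_690_662_110_009);
    assert_eq!(counter(b), 1);
    assert_eq!(random(a), 0xDEAD_BEEF);
    println!("{a:032x}\n{b:032x}");
}
```

Because the counter occupies more significant bits than the random portion, comparing the raw 128-bit values sorts same-millisecond IDs by creation order until the counter saturates.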

@@ -158,8 +158,8 @@ create table if not exists watches (
 ```

 and then [some
-code](https://gitlab.com/nebkor/ww/-/blob/main/src/import_utils.rs?ref_type=heads#L92-126) that
-inserted rows into that table like
+code](https://gitlab.com/nebkor/ww/-/blob/cc14c30fcfbd6cdaecd85d0ba629154d098b4be9/src/import_utils.rs#L92-126)
+that inserted rows into that table like

 ``` sql
 insert into watches (kind, title, length, release_date, added_by) values (?,?,?,?,?)

@@ -217,16 +217,10 @@ inside your Rust application, *especially* if you're also loading it as an exten
 your application. You'll get a long and confusing runtime panic due to there being multiple
 entrypoints defined with the same name.

-## Safety
-There is one `unsafe fn` in this project, `sqlite_julid_init()`, and it is only built for the
-`plugin` feature. The reason for it is that it's interacting with foreign code (SQLite itself) via
-the C interface, which is inherently unsafe. If you are not building the plugin, there is no
-`unsafe` code.
-
 # Why Julids?

-The astute may note that this is the third time I've written recently about globally unique sortable
-IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
+The astute may have noticed that this is the third time I've written about globally unique
+sortable IDs ([here is part one](/rnd/one-part-serialized-mystery), and [part two is
 here](/rnd/one-part-serialized-mystery-part-2)). What's, uh... what's up with that?

 ![marge just thinks they're neat][marge ids]

@@ -244,16 +238,24 @@ Like Marge says, I just think they're neat! I'm not the only one; here are just
 the lower 64 bits for that, instead of UUIDv7's 62)
 * [Snowflake ID](https://en.wikipedia.org/wiki/Snowflake_ID), developed by Twitter in 2010; these
 are 63-bit identifiers (so they fit in a signed 64-bit number), where the top 41 bits are a
-millisecond timestamp, the next 10 bits are a machine identifier[^twitter machine count], and the last 12 bits are for an
-intra-millisecond sequence counter (what Julid calls a "monotonic counter")
+millisecond timestamp, the next 10 bits are a machine identifier[^twitter machine count], and the
+last 12 bits are for an intra-millisecond sequence counter (what Julid calls a "monotonic
+counter"); unlike all the other IDs discussed, there are no random bits

 and I'm sure the list can go on.

-As for what I wanted them for, I wanted to use them in my Rust and SQLite-based [web
-app](https://gitlab.com/nebkor/ww), in order to fix some deficiencies in ULIDs, as discussed. Now I
-have no unshaved yaks to distract me from getting back to that.
+I wanted to use them in my SQLite-backed [web app](https://gitlab.com/nebkor/ww), in order to fix
+some deficiencies in ULIDs and the way I was using them, as [I said
+before](/rnd/one-part-serialized-mystery-part-2/#next-steps-with-ids):

-So, is this the last I'll time I'll be writing at length about these things? It's hard to say for
+> [...] it bothers me that ID generation is not done inside the database itself. Aside from being
+> a generally bad idea, this lead to at least one frustrating debug session where I was inserting
+> one ID but reporting back another. SQLite doesn't have native support for this, but it does have
+> good native support for loading shared libraries as plugins in order to add functionality to it,
+> and so my next step is to write one of those, and remove the ID generation logic from the
+> application.
+
+So, is this the last time I'll time I'll be writing at length about these things? It's hard to say for
 sure, but signs point to "yes". I hope you've found them at least a little interesting!

 # Thanks
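
Note on the hunk above: the Snowflake layout in the list item (41 timestamp bits, 10 machine bits, 12 sequence bits, 63 bits in total) can be sketched the same way. This is a minimal illustration of the bit arithmetic only. The function name `snowflake` and its parameters are hypothetical, and the timestamp is assumed to be whatever millisecond count the service chooses to use.

```rust
/// Hypothetical helper: pack a Snowflake-style ID from its three fields.
/// The result always fits in a non-negative i64 because only 63 bits are used.
fn snowflake(timestamp_ms: u64, machine: u16, sequence: u16) -> i64 {
    let ts = timestamp_ms & ((1 << 41) - 1); // low 41 bits of the timestamp
    let m = (machine as u64) & 0x3FF; // 10-bit machine identifier
    let s = (sequence as u64) & 0xFFF; // 12-bit intra-millisecond sequence
    ((ts << 22) | (m << 12) | s) as i64
}

fn main() {
    // Same millisecond and machine: the sequence number decides the order.
    let a = snowflake(1_000, 7, 0);
    let b = snowflake(1_000, 7, 1);
    assert!(a < b);
    assert!(a >= 0);
    println!("{a} {b}");
}
```

Every bit here is determined by the time, the machine, and the sequence, which matches the hunk's remark that this scheme has no random bits.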

@@ -267,12 +269,14 @@ hours. Thank you, authors of those crates! Feel free to steal from this project!
 ----

 [^monotonic]: At least, they will still have a total order if they're all generated within the same
-process in the same way; the crate and extension use an atomic u64 to ensure that IDs generated
-within the same millisecond have incremented counters, but that atomic counter is not global, so
-calling `Julid::new()` in Rust and `select julid_new()` in SQLite will not be aware of each
-others' counters.
+process in the same way; the code uses a [64-bit atomic
+integer](https://gitlab.com/nebkor/julid/-/blob/2484d5156bde82a91dcc106410ed56ee0a5c1e07/src/julid.rs#L11-12)
+to ensure that IDs generated within the same millisecond have incremented counters, but that
+atomic counter is not global; calling `Julid::new()` in Rust and `select julid_new()` in SQLite
+will not be aware of each others' counters. I just make sure to only generate them inside the
+DB.

-[^my computer]: According to the output of `lscpu`, my computer is an "AMD Ryzen 9 3900X 12-Core
+[^my computer]: According to the output of `lscpu`, my computer has an "AMD Ryzen 9 3900X 12-Core
 Processor", running between 2.2 and 4.6 GHz. It's no slouch!

 [^twitter machine count]: There are only ten bits for the machine ID, which means there are only
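
Note on the `[^monotonic]` footnote in the hunk above: one way a single 64-bit atomic could drive the counter is to store the upper half of the most recent ID (timestamp plus counter) and update it atomically. The sketch below is an assumption about how such a scheme might work, not the crate's actual code. The names `LAST_HIGH`, `step`, and `next_high` are hypothetical, and clock adjustments are ignored.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical process-wide state: the upper 64 bits of the most recent ID,
// i.e. a 48-bit millisecond timestamp followed by the 16-bit counter.
static LAST_HIGH: AtomicU64 = AtomicU64::new(0);

/// Pure transition: given the previous upper half and the current time,
/// compute the upper half for the next ID.
fn step(last: u64, now_ms: u64) -> u64 {
    let fresh = (now_ms & 0xFFFF_FFFF_FFFF) << 16;
    if last >> 16 != fresh >> 16 {
        fresh // different millisecond: the counter restarts at zero
    } else if last & 0xFFFF == 0xFFFF {
        last // counter saturated: no further ordering within this millisecond
    } else {
        last + 1 // same millisecond: bump the counter
    }
}

/// Atomically advance the shared state and return the new upper half.
fn next_high(now_ms: u64) -> u64 {
    let prev = LAST_HIGH
        .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |last| {
            Some(step(last, now_ms))
        })
        .expect("the closure always returns Some");
    // fetch_update hands back the previous value, so reapply the transition.
    step(prev, now_ms)
}

fn main() {
    let a = next_high(1_000);
    let b = next_high(1_000);
    let c = next_high(1_001);
    assert!(a < b && b < c);
    assert_eq!(b & 0xFFFF, 1); // second ID in the same millisecond
    assert_eq!(c & 0xFFFF, 0); // new millisecond resets the counter
}
```

Since the static lives inside one process (or one loaded copy of the library), two independently loaded copies would each keep their own counter, which is consistent with the footnote's warning about mixing `Julid::new()` and `select julid_new()`.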