diff --git a/README.md b/README.md
index 881fef2..18847bc 100644
--- a/README.md
+++ b/README.md
@@ -29,18 +29,21 @@ Docs.rs:

 ## A slightly deeper look

-Julids are ULID-backwards-compatible (that is, all Julids are valid ULIDs, but not all ULIDs are
-Julids) identifiers with the following properties:
+Julids are a drop-in replacement for ULIDs; all Julids are valid ULIDs, but not all ULIDs are valid Julids.
+
+Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed
+they do:

  * they are 128-bits long
  * they are lexicographically sortable
  * they encode their creation time as the number of milliseconds since the [UNIX
    epoch](https://en.wikipedia.org/wiki/Unix_time)
- * IDs created within the same millisecond will still sort in their order of creation, due to the
-   presence of a 16-bit monotonic counter, placed immediately after the creation time bits
+ * their string representation is a 26-character [base-32
+   Crockford](https://en.wikipedia.org/wiki/Base32) encoding of their big-endian bytes
+ * IDs created within the same millisecond are still meant to sort in their order of creation

-It's that last thing that makes them distinctive. ULIDs have the following structure, from most to
-least-significant bit:
+Julids and ULIDs have different ways to implement that last piece. If you look at the layout of bits
+in a ULID, you see:

 ![ULID bit structure](./ulid.svg)

@@ -63,19 +66,30 @@ will still be different, so you shouldn't have collisions). My PC, which is no s
 generate about 20,000 per millisecond, so hopefully this is not an issue! Because the random bits
 are always fresh, it's not possible to easily guess a valid Julid if you already know one.

-# SQLite extension
+# How to use
+
+The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust
+project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown above.
+There's a rudimentary [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example
+in the repo that shows off most of the Rust API. But the primary use case for me was as a loadable
+SQLite extension. Both are covered in the [documentation](https://docs.rs/julid-rs/latest/julid/),
+but let's go over them here, starting with the extension.
+
+## Inside SQLite as a loadable extension

 The extension, when loaded into SQLite, provides the following functions:

- * `julid_new()`: create a new Julid and return it as a `blob`
+ * `julid_new()`: create a new Julid and return it as a 16-byte
+   [blob](https://www.sqlite.org/datatype3.html#storage_classes_and_datatypes)
  * `julid_seconds(julid)`: get the number seconds (as a 64-bit float) since the UNIX epoch that this
-   julid was created
+   julid was created (convenient for passing to the builtin `datetime()` function)
  * `julid_counter(julid)`: show the value of this julid's monotonic counter
  * `julid_sortable(julid)`: return the 64-bit concatenation of the timestamp and counter
  * `julid_string(julid)`: show the [base-32 Crockford](https://en.wikipedia.org/wiki/Base32)
-   encoding of this julid
+   encoding of this julid; the raw bytes of Julids won't be valid UTF-8, so use this or the built-in
+   `hex()` function to `select` a human-readable representation

-## Building and loading
+### Building and loading

 If you want to use it as a SQLite extension:

@@ -96,36 +110,85 @@ create table users (

 and you've got a first-class ticket straight to Julid City, baby!

-# Rust crate
+For a table created like:

-Of course, you can also use it outside of a database; the `Julid` type is publicly exported, and
-you can do like such as:
+``` sql
+-- table of things to watch
+create table if not exists watches (
+  id blob not null primary key default (julid_new()),
+  kind int not null, -- enum for movie or tv show or whatev
+  title text not null, -- this has a secondary index
+  length int,
+  release_date int,
+  added_by blob not null,
+  last_updated int not null default (unixepoch()),
+  foreign key (added_by) references users (id)
+);
+```
+
+and then [some
+code](https://gitlab.com/nebkor/ww/-/blob/cc14c30fcfbd6cdaecd85d0ba629154d098b4be9/src/import_utils.rs#L92-126)
+that inserted rows into that table like
+
+``` sql
+insert into watches (kind, title, length, release_date, added_by) values (?,?,?,?,?)
+```
+
+where the wildcards get bound in a loop with unique values and the Julid `id` field is
+generated by the extension for each row, I get over 100,000 insertions/second when using a
+file-backed DB in WAL mode and `NORMAL` durability settings.
+
+## Inside a Rust program
+
+Of course, you can also use it outside of a database; the `Julid` type is publicly exported. There's
+a simple benchmark in the examples folder of the repo, the important parts of which look like:

 ``` rust
 use julid::Julid;

 fn main() {
-    let id = Julid::new();
-    dbg!(id.timestamp(), id.counter(), id.sortable(), id.as_string());
-}
+    [....]
+    let start = Instant::now();
+    for _ in 0..num {
+        v.push(Julid::new());
+    }
+    let end = Instant::now();
+    let dur = (end - start).as_micros();
+
+    for id in v.iter() {
+        eprintln!(
+            "{id}: created_at {}; counter: {}; sortable: {}",
+            id.created_at(),
+            id.counter(),
+            id.sortable()
+        );
+    }
+    println!("{num} Julids generated in {dur}us");
 ```

-after adding it to your project's dependencies (eg, `cargo add julid-rs`; the package name is
-"julid-rs", but the library name as used in your `use` statements is just "julid").
By default, it -will also include trait implementations for using Julids with -[SQLx](https://github.com/launchbadge/sqlx), and serializing/deserializing with -[Serde](https://serde.rs/), via the `sqlx` and `serde` features, respectively. One final default -optional feature, `chrono`, uses the Chrono crate to return the timestamp as a -[`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime.html) by adding a -`created_at(&self)` method to `Julid`. See the simple -[example](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) in the repo. +If you were to run it on a computer like mine (AMD Ryzen 9 3900X, 12-core, 2.2-4.6 GHz), you might +see something like this: + +``` text +$ cargo run --example=benchmark --release -- -n 30000 2> /dev/null +30000 Julids generated in 1240us +``` + +That's about 24,000 IDs/millisecond; 24 *MILLION* per second! + +The default optional Cargo features include implementations of traits for getting Julids into and +out of SQLite with [SQLx](https://github.com/launchbadge/sqlx), and for generally +serializing/deserializing with [Serde](https://serde.rs/), via the `sqlx` and `serde` features, +respectively. One final default optional feature, `chrono`, uses the Chrono crate to return the +timestamp as a [`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime.html) by adding a +`created_at(&self)` method to `Julid`. Something to note: don't enable the `plugin` feature in your Cargo.toml if you're using this crate inside your Rust application, *especially* if you're also loading it as an extension in SQLite in your application. You'll get a long and confusing runtime panic due to there being multiple entrypoints defined with the same name. -## Safety +### Safety There is one `unsafe fn` in this project, `sqlite_julid_init()`, and it is only built for the `plugin` feature. The reason for it is that it's interacting with foreign code (SQLite itself) via the C interface, which is inherently unsafe. 
If you are not building the plugin, there is no