buff the readme from the blog post

This commit is contained in:
Joe Ardent 2023-07-30 11:27:43 -07:00
parent 66b6a9f868
commit fbe01d5770
1 changed files with 90 additions and 27 deletions

115
README.md
View File

@ -29,18 +29,21 @@ Docs.rs: <https://docs.rs/julid-rs/latest/julid/>
## A slightly deeper look
Julids are ULID-backwards-compatible (that is, all Julids are valid ULIDs, but not all ULIDs are
Julids) identifiers with the following properties:
Julids are a drop-in replacement for ULIDs; all Julids are valid ULIDs, but not all ULIDs are valid Julids.
Given their compatibility relationship, Julids and ULIDs must have quite a bit in common, and indeed
they do:
* they are 128-bits long
* they are lexicographically sortable
* they encode their creation time as the number of milliseconds since the [UNIX
epoch](https://en.wikipedia.org/wiki/Unix_time)
* IDs created within the same millisecond will still sort in their order of creation, due to the
presence of a 16-bit monotonic counter, placed immediately after the creation time bits
* their string representation is a 26-character [base-32
Crockford](https://en.wikipedia.org/wiki/Base32) encoding of their big-endian bytes
* IDs created within the same millisecond are still meant to sort in their order of creation
It's that last thing that makes them distinctive. ULIDs have the following structure, from most to
least-significant bit:
Julids and ULIDs have different ways to implement that last piece. If you look at the layout of bits
in a ULID, you see:
![ULID bit structure](./ulid.svg)
@ -63,19 +66,30 @@ will still be different, so you shouldn't have collisions). My PC, which is no s
generate about 20,000 per millisecond, so hopefully this is not an issue! Because the random bits
are always fresh, it's not possible to easily guess a valid Julid if you already know one.
# SQLite extension
# How to use
The Julid crate can be used in two different ways: as a regular Rust library, declared in your Rust
project's `Cargo.toml` file (say, by running `cargo add julid-rs`), and used as shown above. There's
a rudimentary [benchmark](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) example
in the repo that shows off most of the Rust API. But the primary use case for me was as a loadable
SQLite extension. Both are covered in the [documentation](https://docs.rs/julid-rs/latest/julid/),
but let's go over them here, starting with the extension.
## Inside SQLite as a loadable extension
The extension, when loaded into SQLite, provides the following functions:
* `julid_new()`: create a new Julid and return it as a `blob`
* `julid_new()`: create a new Julid and return it as a 16-byte
[blob](https://www.sqlite.org/datatype3.html#storage_classes_and_datatypes)
* `julid_seconds(julid)`: get the number seconds (as a 64-bit float) since the UNIX epoch that this
julid was created
julid was created (convenient for passing to the builtin `datetime()` function)
* `julid_counter(julid)`: show the value of this julid's monotonic counter
* `julid_sortable(julid)`: return the 64-bit concatenation of the timestamp and counter
* `julid_string(julid)`: show the [base-32 Crockford](https://en.wikipedia.org/wiki/Base32)
encoding of this julid
encoding of this julid; the raw bytes of Julids won't be valid UTF-8, so use this or the built-in
`hex()` function to `select` a human-readable representation
## Building and loading
### Building and loading
If you want to use it as a SQLite extension:
@ -96,36 +110,85 @@ create table users (
and you've got a first-class ticket straight to Julid City, baby!
# Rust crate
For a table created like:
Of course, you can also use it outside of a database; the `Julid` type is publicly exported, and
you can do like such as:
``` sql
-- table of things to watch
create table if not exists watches (
id blob not null primary key default (julid_new()),
kind int not null, -- enum for movie or tv show or whatev
title text not null, -- this has a secondary index
length int,
release_date int,
added_by blob not null,
last_updated int not null default (unixepoch()),
foreign key (added_by) references users (id)
);
```
and then [some
code](https://gitlab.com/nebkor/ww/-/blob/cc14c30fcfbd6cdaecd85d0ba629154d098b4be9/src/import_utils.rs#L92-126)
that inserted rows into that table like
``` sql
insert into watches (kind, title, length, release_date, added_by) values (?,?,?,?,?)
```
where the wildcards get bound in a loop with unique values and the Julid `id` field is
generated by the extension for each row, I get over 100,000 insertions/second when using a
file-backed DB in WAL mode and `NORMAL` durability settings.
## Inside a Rust program
Of course, you can also use it outside of a database; the `Julid` type is publicly exported. There's
a simple benchmark in the examples folder of the repo, the important parts of which look like:
``` rust
use julid::Julid;
fn main() {
let id = Julid::new();
dbg!(id.timestamp(), id.counter(), id.sortable(), id.as_string());
[....]
let start = Instant::now();
for _ in 0..num {
v.push(Julid::new());
}
let end = Instant::now();
let dur = (end - start).as_micros();
for id in v.iter() {
eprintln!(
"{id}: created_at {}; counter: {}; sortable: {}",
id.created_at(),
id.counter(),
id.sortable()
);
}
println!("{num} Julids generated in {dur}us");
```
after adding it to your project's dependencies (eg, `cargo add julid-rs`; the package name is
"julid-rs", but the library name as used in your `use` statements is just "julid"). By default, it
will also include trait implementations for using Julids with
[SQLx](https://github.com/launchbadge/sqlx), and serializing/deserializing with
[Serde](https://serde.rs/), via the `sqlx` and `serde` features, respectively. One final default
optional feature, `chrono`, uses the Chrono crate to return the timestamp as a
[`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime.html) by adding a
`created_at(&self)` method to `Julid`. See the simple
[example](https://gitlab.com/nebkor/julid/-/blob/main/examples/benchmark.rs) in the repo.
If you were to run it on a computer like mine (AMD Ryzen 9 3900X, 12-core, 2.2-4.6 GHz), you might
see something like this:
``` text
$ cargo run --example=benchmark --release -- -n 30000 2> /dev/null
30000 Julids generated in 1240us
```
That's about 24,000 IDs/millisecond; 24 *MILLION* per second!
The default optional Cargo features include implementations of traits for getting Julids into and
out of SQLite with [SQLx](https://github.com/launchbadge/sqlx), and for generally
serializing/deserializing with [Serde](https://serde.rs/), via the `sqlx` and `serde` features,
respectively. One final default optional feature, `chrono`, uses the Chrono crate to return the
timestamp as a [`DateTime`](https://docs.rs/chrono/latest/chrono/struct.DateTime.html) by adding a
`created_at(&self)` method to `Julid`.
Something to note: don't enable the `plugin` feature in your Cargo.toml if you're using this crate
inside your Rust application, *especially* if you're also loading it as an extension in SQLite in
your application. You'll get a long and confusing runtime panic due to there being multiple
entrypoints defined with the same name.
## Safety
### Safety
There is one `unsafe fn` in this project, `sqlite_julid_init()`, and it is only built for the
`plugin` feature. The reason for it is that it's interacting with foreign code (SQLite itself) via
the C interface, which is inherently unsafe. If you are not building the plugin, there is no