diff --git a/content/rnd/a_serialized_mystery/index.md b/content/rnd/a_serialized_mystery/index.md
index 23256e8..168db89 100644
--- a/content/rnd/a_serialized_mystery/index.md
+++ b/content/rnd/a_serialized_mystery/index.md
@@ -188,7 +188,7 @@ impl Decode<'_, Sqlite> for DbId {
     fn decode(value: SqliteValueRef<'_>) -> Result {
         let bytes = <&[u8] as Decode>::decode(value)?;
         let bytes: [u8; 16] = bytes.try_into().unwrap_or_default();
-        Ok(u128::from_be_bytes(bytes).into())
+        Ok(u128::from_ne_bytes(bytes).into())
     }
 }
 ```
@@ -313,25 +313,29 @@ not storing them in string from. Fundamentally, the ULID was a simple [128-bit p
 integer](https://doc.rust-lang.org/std/primitive.u128.html), capable of holding values between 0
 and 340,282,366,920,938,463,463,374,607,431,768,211,455.
 
-But there's a problem: we're storing our ID in the database as a sequence of 16 bytes. I was asking
+But there's a problem: I was storing the ID in the database as a sequence of 16 bytes. I was asking
 for those bytes in "native endian", which in my case, meant "little endian". If you're not familiar
-with endianness, there are two varieties: big, and little. "Big" makes the most sense; if you see a
-number like "512", it's big-endian; the end is the part that's left-most, and "big" means that it is
-the most-significant-digit. This is the same as what westerners think of as "normal" numbers. In the
-number "512", the "most significant digit" is `5`, which correspends to `500`, which is added to the
-next-most-significant digit, `1`, corresponding to `10`, which is added to the next-most-significant
-digit, which is also the least-most-significant-digit, which is `2`, which is just `2`, giving us
-the full number `512`.
+with endianness, there are two varieties: big, and little. "Big" makes the most sense for a lot of
+people; if you see a number like "512", it's big-endian; the end is the part that's left-most, and
+"big" means that it is the most-significant-digit. This is the same as what westerners think of as
+"normal" numbers. In the number "512", the "most significant digit" is `5`, which corresponds to
+`500`, which is added to the next-most-significant digit, `1`, corresponding to `10`, which is added
+to the next-most-significant digit, which is also the least-significant-digit, which is `2`,
+which is just `2`, giving us the full number `512`.
 
 If we put the least-significant-digit first, we'd write the number `512` as "215"; the order when
 written out would be reversed. This means that the lexicographic sort of `512, 521` would have
 "125" come before "215", which is backwards.
 
-Little-endiannes is like that. If a multibyte value is on a little-endian system, the least-significant
-bytes will come first, and the sorting would be non-numeric.
+Little-endianness is like that. If a multibyte numeric value is on a little-endian system, the
+least-significant bytes will come first, and a lexicographic sorting of those bytes would be
+non-numeric.
 
-Unfortunaly, my computer is based on the Intel x86 instruction set, which means that it represents
-numbers in "little endian" form. This means that
+The solution, though, is simple: just write them out in big-endian order! This was literally a
+one-line change in the code, to switch from `to_ne_bytes()` ("ne" for "native endian") to
+`to_be_bytes()`.
+
+Boom. Sorted.
 
 ## The actual problem
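
To make the endianness point above concrete, here is a minimal standalone Rust sketch (an illustration added alongside the patch, not part of it): big-endian bytes compare the same way the numbers do, while little-endian bytes can compare backwards.

```rust
fn main() {
    let small: u128 = 1; // 0x...0001
    let big: u128 = 256; // 0x...0100

    // Big-endian puts the most-significant byte first, so a byte-wise
    // (lexicographic) comparison of the arrays agrees with numeric order.
    assert!(small.to_be_bytes() < big.to_be_bytes());

    // Little-endian puts the least-significant byte first: 256 serializes
    // as [0x00, 0x01, 0x00, ...] and 1 as [0x01, 0x00, 0x00, ...], so the
    // bytes for 256 sort *before* the bytes for 1, which is backwards.
    assert!(big.to_le_bytes() < small.to_le_bytes());
}
```

This is exactly why the one-line switch from `to_ne_bytes()` to `to_be_bytes()` described above restores the lexicographic sort order of the stored IDs.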