checkpoint; recurse

2023-06-28 16:51:56 -07:00 · 2023-06-28 16:51:56 -07:00 · a30f2188ca
commit a30f2188ca
parent 64d5597802
2 changed files with 109 additions and 6 deletions
--- a/content/rnd/a_serialized_mystery/index.md
+++ b/content/rnd/a_serialized_mystery/index.md
@ -28,7 +28,7 @@ Most large distributed programs that people interact with daily via HTTP are, in
 facade for some kind of database. Facebook? That's a database. Gmail? That's a database.
 ![that's a database][thats_a_database]
-<center><span class="caption">wikipedia? that's a database.</span></center>
+<div class="caption">wikipedia? that's a database.</div>
 In most databases, each entry ("row") has a field that acts as a [primary
 key](https://en.wikipedia.org/wiki/Primary_key), used to uniquely identify that row inside the table
@ -333,17 +333,104 @@ non-numeric.
 The solution, though, is simple: just write them out in big-endian order! This was literally a
 one-line change in the code, to switch from `to_ne_bytes()` ("ne" for "native endian") to
-`to_be_bytes()`.
+`to_be_bytes()`. I confirmed that the bytes written into were being written in the correct
 lexicographic order:
-Boom. Sorted.
+``` sql
 sqlite> select hex(id), username from users order by id asc;
 018903CDDCAAB0C6872A4509F396D388|first_user
 018903D0E591525EA42202FF461AA5FA|second_user
 ```
 Note the first six characters are the same, for these two users created some time apart[^ulid-timestamps].
 Boom. "Sorted".
 ## The actual problem
-there was no visitor for seqs, which is what json byte arrays are, and what async_sessions was doing.
+Except that the logins were still broken; it wasn't just the test. What was even stranger is that
 with advanced debugging techniques[^advanced-debugging], I confirmed that the login *was*
 working. By which I mean, when the user submitted a login request, the function that handled the
 request was:
-## Wait, why isn't it broken?
+ - correctly confirming password match
 - retrieving the user from the database
-oh, it's not
+The second thing was required for the first. It was even creating a session in the session table:
 ``` sql
 sqlite> select * from async_sessions;
 ..|..|{"id":"ZY...","expiry":"...","data":{"_user_id":"[1,137,3,205,220,170,176,198,135,42,69,9,243,150,211,136]","_auth_id":"\"oM..."}}
 ```
 I noticed that the ID was present in the session entry, but as what looked like an array of decimal
 values. The less not-astute among you may have noticed that the session table seemed to be using
 JSON to store information. This wasn't my code, but it was easy enough to find the
 [culprit](https://github.com/http-rs/async-session/blob/d28cef30c7da38f52639b3d60fc8cf4489c92830/src/session.rs#L214):
 ``` rust
    pub fn insert(&mut self, key: &str, value: impl Serialize) -> Result<(), serde_json::Error> {
        self.insert_raw(key, serde_json::to_string(&value)?);
        Ok(())
    }
 ```
 This was in the [external library](https://docs.rs/async-session/latest/async_session/) I was using
 to provide cookie-based sessions for my web app, and was transitively invoked when I called the
 `login()` method in my own code. Someone else was serializing my IDs, in a way I hadn't anticipated!
 The way that Serde decides what code to call is based on its [data
 model](https://serde.rs/data-model.html). And wouldn't you know it, the following words are right
 there, hiding in plain sight, as they had been all along:
 > When deserializing a data structure from some format, the Deserialize implementation for the data
 > structure is responsible for mapping the data structure into the Serde data model by passing to
 > the Deserializer a Visitor implementation that can receive the various types of the data model...
 >
 > [...]
 >
 > * seq
 >   - A variably sized heterogeneous sequence of values, for example Vec<T> or HashSet<T>. ...
 >
 > [...]
 >
 > The flexibility around mapping into the Serde data model is profound and powerful. When
 > implementing Serialize and Deserialize, be aware of the broader context of your type that may make
 > the most instinctive mapping not the best choice.
 Well, when you put it that way, I can't help but understand: I needed to implement a `visit_seq()`
 method in my deserialization code.
 ![fine, fine, I see the light][see_the_light]
 <div class = "caption">fine, fine, i see the light</div>
 You can see that
 [here](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L194-216)
 if you'd like, but I'll actually come back to it in a second. The important part was that my logins
 were working again; time to party!
 ## Wait, why *isn't* it broken?
 I'd just spent the day banging my head against this problem, and so when everything worked again, I
 committed and pushed the change and signed off. But something was still bothering me, and the next
 day, I dove back into it.
 All my serialization code was calling a method called
 [`bytes()`](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L18),
 which simply called `another method that would return an array of 16 bytes, in big-endian order, so
 it could go into the database and be sortable, as discussed.
 But all[^actually_not_all] my *deserialization* code was constructing the IDs as [though the bytes
 were
 *little*-endian](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L212). Which
 lead me to ask:
 what the fuck?
 Like, everything was *working*. Why did I need to construct from a different byte order? I felt like
 I was losing my mind, so I reached out to the [Recurse Center](https://www.recurse.com) community
 and presented my case.
 # Lessons learned
@ -382,5 +469,21 @@ automated tests aren't enough
 [^confusing-yes]: Is this confusing? Yes, 100%, it is not just you. Don't get discouraged.
 [^ulid-timestamps]: The 7 most-significant bytes make up the timestamp in a ULID, which in the hex
    dump form pasted there would be the first fourteen characters, since each byte is two hex
    digits.
 [^advanced-debugging]: "adding `dbg!()` statements in the code"
 [^actually_not_all]: Upon further review, I discovered that the only methods that were constructing
    with little-endian order were the SQLx `decode()` method, and the Serde `visit_seq()` method,
    which were also the only ones that were being called at all. The
    [`visit_bytes()`](https://gitlab.com/nebkor/ww/-/blob/656e6dceedf0d86e2805e000c9821e931958a920/src/db_id.rs#L152)
    and `visit_byte_buf()` methods, that I had thought were so important, were correctly treating
    the bytes as big-endian, but were simply never actually used. I fixed [in the next
    commit](https://gitlab.com/nebkor/ww/-/commit/84d70336d39293294fd47b4cf115c70091552c11#ce34dd57be10530addc52a3273548f2b8d3b8a9b)
 [thats_a_database]: ./thats_a_database.png "that's a database"
 [see_the_light]: ./seen_the_light.png
--- a/content/rnd/a_serialized_mystery/seen_the_light.png
+++ b/content/rnd/a_serialized_mystery/seen_the_light.png