hitman/README.md
2024-03-29 08:59:14 -07:00

Hitman counts your hits, man.

This is a simple webpage hit/visit counter service. To run it in development, copy the provided env.example file to .env. By default, Hitman looks for a database file called .hitman.db in your home directory. You can let Hitman create it for you, or you can create it yourself with the sqlx db create command (install it by running cargo install sqlx-cli; see https://crates.io/crates/sqlx-cli ). If you create it yourself, don't forget to also run sqlx migrate run to create the tables. This project uses SQLx's compile-time SQL checking macros, so your database needs to have the tables for the SQL checks to work.

How does it work?

Register a hit by sending a GET request to the /hit/:page endpoint, where :page is a unique and persistent identifier for the page; on my blog, I'm using the Zola post slug as the id. This bit of HTML + JS shows it in action:

    <p>There have been <span id="allhits">no</span> views of this page</p>

    <script defer>
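        // The span whose text is replaced with the hit count.
        const hits = document.getElementById('allhits');
        // The /hit endpoint registers the hit and responds with the latest count.
        fetch('http://localhost:5000/hit/index.html').then((resp) => {
            return resp.text();
        }).then((data) => {
            // textContent avoids interpreting the response as HTML.
            hits.textContent = data;
        });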
    </script>

In this example, the :page is "index.html". The /hit endpoint registers the hit and then returns the latest hit count.

The index.html file in this repo has the above code in it; if you serve it with

    python3 -m http.server 3000 & cargo run

and then visit http://localhost:3000, you should see 1 hit if this is the first time you're trying it out. Reloading won't increment the count until the hour changes and you visit again, or until you kill and restart Hitman.
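
One way to picture why a reload doesn't bump the count is a set of already-counted visits. What follows is only a minimal sketch under stated assumptions, not Hitman's actual code: `HitCounter` and its fields are hypothetical names, the storage is in memory rather than in the database, and `visit_key` stands for whatever per-visitor, per-page, per-hour value the service derives.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory stand-in for Hitman's storage (illustration only).
#[derive(Default)]
struct HitCounter {
    /// Visit keys already counted; one key stands for one visitor, page, and hour.
    seen: HashSet<u64>,
    /// Running total of counted hits per page.
    counts: HashMap<String, u64>,
}

impl HitCounter {
    /// Registers a hit and returns the latest count for `page`.
    /// The count only grows the first time a given `visit_key` is seen.
    fn hit(&mut self, page: &str, visit_key: u64) -> u64 {
        let count = self.counts.entry(page.to_string()).or_insert(0);
        if self.seen.insert(visit_key) {
            *count += 1;
        }
        *count
    }
}
```

With this sketch, calling `hit("index.html", key)` twice with the same key returns 1 both times. A different key, as you would get after the hour changes or after a restart, makes the next call return 2.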

Privacy?

The IP from the request is hashed with the date, hour of day, :page, and a random 64-bit number that gets regenerated every time the service is restarted and is never disclosed. This does two things:

  1. ensures that hit counts are limited to one per hour per IP per page;
  2. ensures that nobody can recover an IP from a stored hash by taking the page and time and then hashing all four billion possible IPv4 addresses until one matches.
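
The hashing described above can be sketched roughly as follows. This is an illustration under loud assumptions, not the service's real implementation: the function names `new_secret` and `visit_key` are hypothetical, the date is passed as a plain string, and the standard library's `DefaultHasher` is used only to keep the example dependency-free. It is not a cryptographic hash, and the source doesn't say which hash Hitman actually uses.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, Hash, Hasher};
use std::net::IpAddr;

/// Hypothetical helper: a fresh 64-bit secret for each process start.
/// Assumption: it is derived from std's randomly seeded `RandomState`
/// so the sketch needs no external crates.
fn new_secret() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Hypothetical helper: mixes the IP, date, hour of day, page, and secret
/// into a single value. The same inputs always give the same key, so a
/// repeat visit within the hour can be recognized without storing the IP.
fn visit_key(ip: IpAddr, date: &str, hour: u8, page: &str, secret: u64) -> u64 {
    // Assumption: `DefaultHasher` is a stand-in, not a cryptographic hash.
    let mut hasher = DefaultHasher::new();
    ip.hash(&mut hasher);
    date.hash(&mut hasher);
    hour.hash(&mut hasher);
    page.hash(&mut hasher);
    secret.hash(&mut hasher);
    hasher.finish()
}
```

In this sketch the secret would be generated once at startup with `new_secret()` and then passed to every `visit_key` call. Because the secret goes into the hash, someone who knows the page and time still can't reproduce a key by trying every IP address, and a restart changes every key.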

There is no need to put up a tracking consent form because nothing is being tracked.

Security?

Well, you need to give it a specific origin that is allowed to connect. Is this enough?