
# Hitman counts your hits, man.

This is a simple webpage hit/visit counter service. To run it in development, copy the provided
`env.example` file to `.env`. By default, Hitman looks for a database file called `.hitman.db` in
your home directory. You can let Hitman create it for you, or you can use the `sqlx db create`
command (get it by running `cargo install sqlx-cli`; see https://crates.io/crates/sqlx-cli ); if you
do that, don't forget to also run `sqlx migrate run` to create the tables. This project uses SQLx's
compile-time SQL checking macros, so your DB needs to have the tables in place for the SQL checks to
work.


## How does it work?

You need to register the hit by doing a GET to the `/hit/:page` endpoint, where `:page` is a unique
and persistent identifier for the page; on my blog, I'm using the Zola post slug as the id. This bit
of HTML + JS shows it in action:

``` html
    <p>There have been <span id="allhits">no</span> views of this page</p>

    <script defer>
        const hits = document.getElementById('allhits');
        fetch('http://localhost:5000/hit/index.html').then((resp) => {
            if (resp.ok) {
                return resp.text();
            } else {
                return "I don't even know how many"
            }
        }).then((data) => {
            hits.innerHTML = data;
        });
    </script>
```

In this example, the `:page` is "index.html". The `/hit` endpoint registers the hit and then
returns the latest count of hits.

The `index.html` file in this repo has the above code in it; if you serve it with

`python3 -m http.server 3000 & cargo run`

and then visit http://localhost:3000, you should see that there is 1 hit, if this is the first time
you're trying it out. Reloading won't increment the count until the hour changes and you visit
again, or until you kill and restart Hitman.

If you see a log message like `rejecting invalid slug index.html`,
you'll need to add the allowed slugs into the `slugs` table:

``` sql
insert into slugs (slug) values ('index.html'), ('user');
```

See the note on security below.

### Privacy

The IP from the request is hashed with the date, hour of day, `:page`, and a random 64-bit number
that gets regenerated every time the service is restarted and is never disclosed. This does two
things:

 1. ensures that hit counts are limited to one per hour per IP per page;
 2. ensures that you can't work back to an IP from a hash by fixing the page and time and then just
    trying all four billion or so possible IPv4 addresses to find the match.

There is no need to put up a tracking consent form because nothing is being tracked.
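
The hashing scheme above can be sketched roughly as follows. This is an illustration only, not
Hitman's actual code: `visit_key`, `HitCounter`, and the in-memory sets are hypothetical names, and
the standard library's `DefaultHasher` stands in for whatever hash function the service really uses
(it is not a cryptographic hash).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};
use std::net::IpAddr;

/// Hypothetical helper: mix the IP with the date, hour of day, page, and a
/// per-process random salt into one opaque key.
/// ASSUMPTION: `DefaultHasher` is a stand-in, not the hash Hitman uses.
pub fn visit_key(ip: IpAddr, date: &str, hour: u8, page: &str, salt: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    ip.hash(&mut hasher);
    date.hash(&mut hasher);
    hour.hash(&mut hasher);
    page.hash(&mut hasher);
    salt.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory counter; the real service keeps its data in the
/// database instead.
pub struct HitCounter {
    salt: u64,                    // regenerated on every restart, never disclosed
    seen: HashSet<u64>,           // visit keys already counted
    counts: HashMap<String, u64>, // hits per page
}

impl HitCounter {
    pub fn new(salt: u64) -> Self {
        Self { salt, seen: HashSet::new(), counts: HashMap::new() }
    }

    /// Registers a hit and returns the latest count for `page`. A repeat
    /// visit from the same IP in the same hour produces the same key, so it
    /// is not counted again.
    pub fn hit(&mut self, ip: IpAddr, date: &str, hour: u8, page: &str) -> u64 {
        let key = visit_key(ip, date, hour, page, self.salt);
        let count = self.counts.entry(page.to_string()).or_insert(0);
        if self.seen.insert(key) {
            *count += 1;
        }
        *count
    }
}
```

With this sketch, two calls to `hit` with the same IP, date, hour, and page leave the count
unchanged, while a call in the next hour increments it. Because the salt is part of the key, the
same visit hashes differently after a restart, which is also why restarting resets the one-per-hour
limit.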

### Security?

Well, you need to give it a specific origin that is allowed to connect; this isn't really enough,
though. To mitigate the potential for abuse, the code that registers a hit checks against a set of
allowed slugs. Any time you add a new page to your site, you'll need to update the `slugs` table.
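
As a rough sketch of that check, assuming the allowed slugs have already been loaded from the
`slugs` table into a set: `check_slug` is a hypothetical name, and the real handler may be
structured quite differently.

```rust
use std::collections::HashSet;

/// Hypothetical allowlist check: accept the slug only if it is in the set
/// loaded from the `slugs` table, otherwise return the message to log.
pub fn check_slug<'a>(allowed: &HashSet<String>, slug: &'a str) -> Result<&'a str, String> {
    if allowed.contains(slug) {
        Ok(slug)
    } else {
        Err(format!("rejecting invalid slug {slug}"))
    }
}
```

A request for a slug that isn't in the set is refused before any hit is recorded, so a stranger
can't fill the database with made-up page names.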