# Overview

The goal of this experiment is to determine what database to use for Pique.
Normally, we can just go with a tried-and-true option, like PostgreSQL. However,
a few factors are working against us here:

- The goal is under 50 ms for every page load (on the server side).
- PostgreSQL uses different storage for values over 2 kB (compressed), which can
lead to much slower reads if fetching them goes to disk.
- We'll be storing documents up to 1 MB each in the database. This is just text
content and does *not* include resources like images.

This combination may make Postgres an unsuitable choice! It has proven slow in
the past: at a previous job, someone put large JSON blobs (multiple kB) into a
column, and queries were over 100 ms when those columns were involved. I don't
know how much of that was a Postgres limitation and how much was our particular
schema and hardware, so I want to find out!
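
Whatever the cause there, the compressed-storage behavior itself is easy to
observe directly. Here is a minimal sketch, assuming a reachable Postgres
instance and the psycopg2 driver (the connection string is a placeholder):

```python
import psycopg2

# Placeholder connection string; point it at any scratch database.
conn = psycopg2.connect("dbname=scratch")
cur = conn.cursor()
cur.execute("CREATE TEMP TABLE toast_demo (v TEXT)")
cur.execute("INSERT INTO toast_demo VALUES (repeat('x', 10000))")
# pg_column_size reports the stored (possibly compressed) size, while
# octet_length reports the logical size; a large gap between the two
# means the value was compressed rather than stored inline as-is.
cur.execute("SELECT pg_column_size(v), octet_length(v) FROM toast_demo")
print(cur.fetchone())
```

This shows the storage effect, but not the read-latency cost; measuring that
cost is what the benchmark below is for.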

# Experiment design

I'm going to run a benchmark on three databases: Postgres, MariaDB, and SQLite.
The benchmark will start by loading a batch of text documents into each
database, then perform random reads and measure the time of each read. Memory
and CPU limits will be set on the non-embedded databases.
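
To make that concrete, here is a minimal sketch of the two phases against the
embedded case, SQLite via Python's standard library. The Postgres and MariaDB
runs would reuse the same structure with their own client connections, and the
schema, document count, and sizes here are placeholder assumptions rather than
the experiment's real parameters:

```python
import random
import sqlite3
import string
import time

DOC_COUNT = 1_000   # placeholder; the real count models years of growth
DOC_SIZE = 100_000  # placeholder; real documents range up to 1 MB
READS = 10_000

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

# Phase 1: load randomly generated text documents.
for _ in range(DOC_COUNT):
    body = "".join(random.choices(string.ascii_letters + " ", k=DOC_SIZE))
    conn.execute("INSERT INTO documents (body) VALUES (?)", (body,))
conn.commit()

# Phase 2: random reads, timing each read individually.
timings = []
for _ in range(READS):
    doc_id = random.randint(1, DOC_COUNT)
    start = time.perf_counter()
    conn.execute("SELECT body FROM documents WHERE id = ?", (doc_id,)).fetchone()
    timings.append(time.perf_counter() - start)

timings.sort()
print(f"p50 {timings[len(timings) // 2] * 1_000:.3f} ms, "
      f"p99 {timings[int(len(timings) * 0.99)] * 1_000:.3f} ms")
```

An in-memory connection keeps the sketch self-contained; the real runs would use
an on-disk database so that reads can actually hit the disk.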

The text documents will be generated randomly, at a size and count chosen to
approximate what Pique's data will probably reach after a few years. Our
experiment is not particularly valid if its conclusions only hold for a year.
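
As one hedged sketch of that generation step: the 1 MB cap comes from the
requirements above, while the long-tailed size distribution and its parameters
are placeholders of mine:

```python
import random
import string

MAX_SIZE = 1_000_000  # documents are capped at 1 MB of text content

def make_document() -> str:
    """Generate one random text document with a long-tailed size."""
    # Placeholder distribution: mean around 20 kB, a 500-byte floor, and an
    # occasional document approaching the 1 MB cap.
    size = min(int(random.expovariate(1 / 20_000)) + 500, MAX_SIZE)
    return "".join(random.choices(string.ascii_letters + " \n", k=size))
```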

To sample, we will pick random IDs in the range (0, count), since IDs are
assigned as monotonically increasing integers by the databases we have chosen.

# Results