Building a web archive with ArchiveBox

By jeff


Link rot is a very real thing, and something that I'm noticing more and more. In 2020 I was optimistic, with so many friends and people I followed on different social media platforms starting blogs. Only a couple of years later the sites are offline, even though the people are still around.

The only reason I noticed it happening was that, around the same time, I spun up a FreshRSS server and started following RSS feeds instead of the algorithm that was turning anger into hate at an accelerating rate.

For the last year or so I have been running a local VM with ArchiveBox, but managing VMs is a pain and I'm starting to grok Docker, so I thought I would put together a personal web archive that I can use in parallel with the Web Archive.

Step 0 - Preparation

To get started I set up a folder on my bulk storage volume (I will write a post about my Synology-based web server at some point). I have two volumes, one SSD-based and the other on spinning disks. As the spinning disks die I will be replacing them with SSDs too, so for now this site will be very slow.

Make sure that the folder has proper permissions. I gave the SYSTEM user read/write access so the Docker container can do its thing.
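
If you have shell access to the NAS, a quick write test can confirm the folder is usable before you deploy anything. This is only a sketch: the ARCHIVE_DIR default below is a placeholder, not the folder used in this post, and the test only checks access for whichever user runs it, which may not be the user the container ends up running as.

```shell
# Placeholder path: point ARCHIVE_DIR at the folder you created.
ARCHIVE_DIR="${ARCHIVE_DIR:-./archivebox-data}"

# Create the folder if it is missing and show its owner and mode.
mkdir -p "$ARCHIVE_DIR"
ls -ld "$ARCHIVE_DIR"

# Try to create and remove a scratch file to confirm write access.
if touch "$ARCHIVE_DIR/.write-test" 2>/dev/null; then
  rm -f "$ARCHIVE_DIR/.write-test"
  WRITABLE=yes
else
  WRITABLE=no
fi
echo "writable: $WRITABLE"
```

If this prints "writable: no", fix the folder permissions before moving on.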

Step 1 - The install

Next I opened Portainer and grabbed the "official" docker-compose file and copy/pasted it as a stack.

partial screenshot of the docker-compose.yaml

My tweaks

  • I already have something running on port 8000, so I changed it to 8001
  • I increased the maximum media size to something HUGE
  • I mapped the volume I created above.
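
As a rough sketch, the three tweaks above could look something like this in the compose file. It is not the exact file from the screenshot: the host path and the size value are placeholders, and the service name, image line, container port 8000 and /data mount point are assumed to match the compose file that ArchiveBox publishes, so keep whatever your copy already has for those.

```yaml
# Sketch only: anything marked "placeholder" is an assumption, not the value used in this post.
services:
  archivebox:
    image: archivebox/archivebox:latest   # assumed; keep the image line from the official file
    ports:
      - "8001:8000"                       # host port 8001 -> container port 8000
    environment:
      - MEDIA_MAX_SIZE=5000m              # placeholder; raises the media download size limit
    volumes:
      - /volume2/archivebox:/data         # placeholder host folder -> the container's /data
```

Only the host side of the port and volume mappings (the part before the colon) changes; the container side stays as the official file defines it.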

Step 2 - Getting ready to set up

Deploy the stack and wait until it has started.

Now, ArchiveBox WILL NOT RUN AS ROOT, and you can't log in until you have created a superuser, so first you need to change the archivebox user's password in the terminal.

Synology makes this pretty easy. Create a new terminal to get started.

Screenshot of the synology docker terminal page

Once there, confirm which user you are with whoami and then change the password for the archivebox user:

passwd archivebox

Creating the superuser

Now that you know the password for archivebox, you can switch to that user with:

su archivebox

Then confirm that you are in the correct folder with ls, and run the archivebox command:

archivebox manage createsuperuser

I kept this simple and used archivebox as the superuser name, but it can be anything.

When you are done the terminal will look something like this:

Screenshot of terminal when creating archivebox superuser

Step 3 - Log in and start archiving

Now you can either run ArchiveBox commands from the terminal, or log into the web app with the superuser account and get started.
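
For the terminal route, a minimal sketch might look like the following. It wraps ArchiveBox's add subcommand in a small helper. The helper name archive_url and the example URL are made up for illustration, and it assumes you are still in the container terminal as the archivebox user, in the data folder.

```shell
# Hypothetical helper: archive a single URL with the ArchiveBox CLI.
archive_url() {
  # Refuse to run without a URL argument.
  if [ -z "$1" ]; then
    echo "usage: archive_url URL" >&2
    return 2
  fi
  # The CLI is only available inside the container.
  if ! command -v archivebox >/dev/null 2>&1; then
    echo "archivebox not found; run this inside the container" >&2
    return 1
  fi
  # Hand the URL to ArchiveBox to snapshot.
  archivebox add "$1"
}
```

Calling archive_url 'https://example.com' should then add a snapshot that shows up in the web app alongside anything added through the browser.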

screenshot of the archivebox login

As time allows I will post links to the archives I make, post some tips I've learned, etc. This is what an