Web authentication using the filesystem

Have you ever tried to password protect a file on the web using only the filesystem? Here are two very similar ways to do it.

A metal punch card like sheet with individual characters in the left two columns and a grid for marking

Hah, so we were just talking in a Discord server about some cursed way to implement a key-value store. There’s a... I’m not sure what posts are called on Bluesky, but there’s a post on Bluesky about it:

なぜ (@why.bsky.team)
Looking for a new key-value store? Try a billion files in a folder. Your filesystem won’t mind. A billion files in a folder will just work. Don’t worry about it. A billion files, in one folder. Live your life. You don’t need a fancy piece of software, you have XFS.

You want files. Keys are the filenames, values are whatever the contents of the files are.

If you know the filename, you can access the content. If you don’t know the filename, you can’t access the content. You can definitely make use of this to make sure only “authenticated” people access certain files. I put that in quotes, because authentication here just means they have a password. There’s no database involved, there’s no user account, there are no RBAC policies, there’s no SSO. There’s just a file store, and a web form that asks for a password. If you know the password, you get the file. If you don’t know the password, you don’t get the file. Simple.
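The key-value idea on its own fits in a few lines. This is only a sketch: `put` and `get` are hypothetical names, and it assumes every key is already a valid filename with no path separators in it.

```python
from pathlib import Path
from typing import Optional


def put(folder: Path, key: str, value: bytes) -> None:
    """Store a value: the key is the filename, the value is the file content."""
    (folder / key).write_bytes(value)


def get(folder: Path, key: str) -> Optional[bytes]:
    """Return the value for a key, or None if no file by that name exists."""
    path = folder / key
    return path.read_bytes() if path.is_file() else None
```

Knowing the key is the only way to get the value back, which is the property the rest of this post leans on.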

So how does this work?

Storing the file

When you upload a file, you also give it a password. Let’s say the file is superimportant_financial_report_v4.final.final.really_final.pdf, and you choose a passphrase: Correct-Horse-Battery-Staple-3882$. The upload handler receives both pieces of information, hashes the password using bcrypt or a similar password hashing function, then base64 encodes the resulting hash so it can be used in a filename (a URL-safe base64 variant is the safer choice, since standard base64 can emit a /), giving you this string:

JDJ5JDEwJFFWTjVTcVdhSC5LYTVPVXJKZWxTS08wMFYwZEZTUTFsa09mRVVCL3JLbmxmQmdvcHpmYzFx

Then the storage engine saves the file in its folder under the following name:

superimportant_financial_report_v4.final.final.really_final.pdf.....JDJ5JDEwJFFWTjVTcVdhSC5LYTVPVXJKZWxTS08wMFYwZEZTUTFsa09mRVVCL3JLbmxmQmdvcHpmYzFx

Original filename, ..... separator, and the hash of the password. The separator can be anything as long as it’s constant, valid in a filename, and unlikely to appear in the original filename.
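The storing step could look roughly like the sketch below. It is not the post's PHP: Python's standard library has no bcrypt, so this swaps in salted PBKDF2-HMAC-SHA256 as a stand-in. The names `hash_password`, `store_file`, `ITERATIONS` and `SALT_LEN`, and the token layout (salt followed by derived key), are assumptions made for illustration.

```python
import base64
import hashlib
import os
from pathlib import Path

SEPARATOR = "....."   # constant, valid in a filename, unlikely in real names
ITERATIONS = 200_000  # assumed work factor for this sketch
SALT_LEN = 16         # assumed salt size in bytes


def hash_password(password: str) -> str:
    """Return a filename-safe token holding a random salt plus the derived key."""
    salt = os.urandom(SALT_LEN)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # URL-safe base64 never produces "/", which a filename cannot contain.
    return base64.urlsafe_b64encode(salt + key).decode("ascii")


def store_file(storage_dir: Path, original_name: str, content: bytes, password: str) -> Path:
    """Save content as <original name><separator><password token>."""
    target = storage_dir / (original_name + SEPARATOR + hash_password(password))
    target.write_bytes(content)
    return target
```

Because the salt is random, storing the same file with the same password twice gives two different filenames.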

Accessing the file

Let’s say you clicked on a link that would access this file, and now you’re presented with a password field. You put in Correct-Horse-Battery-Staple-3882$, and click “Download.”

Storage knows which file you’re looking for, so it grabs the one whose filename matches, extracts the hash, base64 decodes it, and passes the password you provided and the stored hash to the password verify function of whatever language it’s written in. If that returns true, you’re served the file with the original filename; if not, you get a 403.
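A rough sketch of that lookup follows. It assumes the part of the filename after the separator is URL-safe base64 of a 16-byte salt followed by a PBKDF2-HMAC-SHA256 key, which is a stand-in for bcrypt since Python's standard library does not ship it. `verify_password` and `fetch` are hypothetical names, and the status codes are returned as plain integers rather than real HTTP responses.

```python
import base64
import hashlib
import hmac
from pathlib import Path
from typing import Optional, Tuple

SEPARATOR = "....."
ITERATIONS = 200_000  # assumed; must match the value used when storing
SALT_LEN = 16         # assumed salt size in bytes


def verify_password(token: str, password: str) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    raw = base64.urlsafe_b64decode(token)
    salt, expected = raw[:SALT_LEN], raw[SALT_LEN:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


def fetch(storage_dir: Path, requested_name: str, password: str) -> Tuple[int, Optional[bytes]]:
    """Return (200, content), (403, None) on a wrong password, or (404, None)."""
    for path in storage_dir.iterdir():
        # Split on the last separator so the original name is compared exactly.
        name, sep, token = path.name.rpartition(SEPARATOR)
        if sep and name == requested_name:
            if verify_password(token, password):
                return 200, path.read_bytes()
            return 403, None
    return 404, None
```

Note that this has to scan the directory to find the matching name, because the salted hash cannot be recomputed from the password alone.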

Okay, but this is still too much work

Is there an easier way?

Yes, kinda. You need to give up variability and security though. The cool thing about the hashing functions (in PHP 8.3 it’s password_hash and password_verify) is that they automatically generate a salt for you. A salt is some additional data to make the hashes unpredictable, but verifiable.

If you’re happy not having a salt in your passwords, you can use message digest functions, which are deterministic: the same input always produces the same output.

Using the password above, if I use the hash PHP function with the sha512 hashing algorithm, I will always get the following output:

3163e2a11c5054eeb8513ce4278997978b9d3a0a6444c90798c33d7becc282bf501e7dab7c04a93749c46bdcc97641f4b1acef4bcdc8ddab5954a22d1acf2486
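The same determinism is easy to see outside PHP. Here is a small Python sketch using `hashlib.sha512`; `digest_password` is a hypothetical helper name, and the sketch assumes the password is encoded as UTF-8 before hashing.

```python
import hashlib


def digest_password(password: str) -> str:
    """Return the hex SHA-512 digest of a password. No salt is involved."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


first = digest_password("Correct-Horse-Battery-Staple-3882$")
second = digest_password("Correct-Horse-Battery-Staple-3882$")

print(first == second)  # True: same input, same digest, every time
print(len(first))       # 128 hex characters for a 512-bit digest
```

Hex output only uses 0-9 and a-f, so it is safe to put in a filename without any further encoding.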

Then we save the file as normal, except without the separator:

superimportant_financial_report_v4.final.final.really_final.pdf3163e2a11c5054eeb8513ce4278997978b9d3a0a6444c90798c33d7becc282bf501e7dab7c04a93749c46bdcc97641f4b1acef4bcdc8ddab5954a22d1acf2486

When requesting the file with the password, all we really need to do is concatenate the filename and the hash of the provided password. If there’s a file by that name, we can serve it up; otherwise it’s a 404 instead of a 403.
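As a sketch of this simpler variant, the whole thing reduces to building one path and checking whether it exists. `protected_name`, `store` and `fetch` are hypothetical names, and stripping directory components from the requested name is an extra precaution added here, not something the description above calls for.

```python
import hashlib
from pathlib import Path
from typing import Optional


def protected_name(original_name: str, password: str) -> str:
    """Original filename followed directly by the hex SHA-512 of the password."""
    digest = hashlib.sha512(password.encode("utf-8")).hexdigest()
    # Keep only the final path component so "../" tricks cannot escape the folder.
    return Path(original_name).name + digest


def store(storage_dir: Path, original_name: str, content: bytes, password: str) -> Path:
    target = storage_dir / protected_name(original_name, password)
    target.write_bytes(content)
    return target


def fetch(storage_dir: Path, requested_name: str, password: str) -> Optional[bytes]:
    """Return the content, or None (a 404) for a wrong password or unknown file."""
    path = storage_dir / protected_name(requested_name, password)
    return path.read_bytes() if path.is_file() else None
```

There is no directory scan and no verify call: a wrong password simply produces a name that does not exist.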

So what’s the catch?

This is simple, but it does not protect against people who can read your directory listing, so you need to configure your storage to prevent that from happening. Hash collisions are unlikely. Logs will contain the password hashes because they’re part of the filenames, which is usually not a good idea.

This is more of a curiosity than a recommendation for you to do.

Photo by Jon Hodl on Unsplash