How to create versions of the same file in the database and on disk, like Basecamp's document manager


If you've used Basecamp, you'll know its document manager page (you can also find a demo here). I like the feature that creates a new version when the same file is uploaded twice (the user can optionally choose to save the upload as a new version). How do you think the files are maintained in the database as records and on disk as files? From my inspection, it seems the files are saved as follows:

Structure

[user_folder]
  [project_name]
    [file_name]
      [V1/V2/V3]

For example, consider the following:

Example

  1. the user name is "mak",

  2. the project name is "izord", and

  3. the file name is "base.html".

If the file has 2 versions, the directory structure is:

[mak]
  [izord]
    [base]
      [V1]
        base.html
      [V2]
        base.html

I might be wrong in my assumptions, which is why I am asking here.
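
To make the assumed layout concrete, here is a minimal Python sketch of how it could be produced on a local disk. The function name `save_version` and the `root` argument are hypothetical, and this only illustrates my guess at the layout, not how Basecamp actually stores files.

```python
from pathlib import Path


def save_version(root, user, project, file_name, data):
    """Write data to <root>/<user>/<project>/<file stem>/V<n>/<file_name>.

    <n> is one more than the highest existing V<n> folder for this file.
    Hypothetical helper: it mirrors the layout guessed above and is not
    safe against two uploads running at the same time.
    """
    file_dir = Path(root) / user / project / Path(file_name).stem

    # Collect the version numbers that already exist for this file.
    existing = []
    if file_dir.is_dir():
        for entry in file_dir.iterdir():
            suffix = entry.name[1:]
            if entry.is_dir() and entry.name.startswith("V") and suffix.isdigit():
                existing.append(int(suffix))

    version = max(existing, default=0) + 1
    target = file_dir / f"V{version}" / file_name
    target.parent.mkdir(parents=True)
    target.write_bytes(data)
    return target
```

Calling `save_version` twice for "mak", "izord" and "base.html" would create the `V1` and `V2` folders shown in the example.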

Question

  1. How do you think I can organize files on disk in this situation?
  2. Do you have a ready-made table structure for saving file details?
  3. How does one sort files from the table alphabetically, like in the demo video? Do they maintain some kind of separate table for that?

Best answer

First of all, I'm pretty sure they store files in S3, so take this into account.

As for database structure, I'd have it something like

Account:      ID
Project:      ID, AccountID
File:         ID, ProjectID, Name, ContentType, CurrentRevision
FileRevision: ID, FileID, ContentLength, AmazonS3Key, CreatedAt, CreatedBy
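
As a rough illustration, the tables above could be sketched with Python's built-in `sqlite3` module as follows. The snake_case names, column types, the `UNIQUE` constraint and the `list_files` helper are my assumptions; only the tables and columns themselves come from the outline above. The helper also shows that alphabetical listing needs nothing more than an `ORDER BY`, so no extra table is required for sorting.

```python
import sqlite3

# Tables and columns follow the outline above; types and constraints are assumed.
SCHEMA = """
CREATE TABLE account (
    id INTEGER PRIMARY KEY
);
CREATE TABLE project (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES account(id)
);
CREATE TABLE file (
    id               INTEGER PRIMARY KEY,
    project_id       INTEGER NOT NULL REFERENCES project(id),
    name             TEXT    NOT NULL,
    content_type     TEXT    NOT NULL,
    current_revision INTEGER NOT NULL DEFAULT 0,
    UNIQUE (project_id, name)
);
CREATE TABLE file_revision (
    id             INTEGER PRIMARY KEY,
    file_id        INTEGER NOT NULL REFERENCES file(id),
    content_length INTEGER NOT NULL,
    amazon_s3_key  TEXT    NOT NULL,
    created_at     TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    created_by     TEXT    NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open a database and create the four tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def list_files(conn, project_id):
    """Return a project's file names in case-insensitive alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM file WHERE project_id = ? "
        "ORDER BY name COLLATE NOCASE",
        (project_id,),
    )
    return [name for (name,) in rows]
```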

and then store each file under <bucket>/<account-id>/<project-id>/<file-name> "directory" (there are no actual directories in S3, but they are emulated with forward slashes). File content goes inside an appropriate directory:

<bucket>/<account-id>/<project-id>/<file-name>
  v1
  v2
  ...
  v13

Uploading a new revision of a file is simply a matter of adding a v<File.CurrentRevision + 1> entry in S3.
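
A minimal sketch of that step in Python, assuming the key layout described above. The function names are hypothetical, the key is relative to the bucket, and the actual upload call and the database update of File.CurrentRevision are left out.

```python
def revision_key(account_id, project_id, file_name, revision):
    """Build the object key (relative to the bucket) for one revision."""
    return f"{account_id}/{project_id}/{file_name}/v{revision}"


def next_revision_key(account_id, project_id, file_name, current_revision):
    """Return (new revision number, object key) for the next upload.

    current_revision is File.CurrentRevision as stored in the database.
    """
    new_revision = current_revision + 1
    key = revision_key(account_id, project_id, file_name, new_revision)
    return new_revision, key
```

After the object has been stored under the returned key, the File row would be updated with the new revision number and a FileRevision row inserted, ideally in a single transaction so that two simultaneous uploads cannot claim the same number.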

Also note that disclosing IDs of accounts and projects might not be a good idea, so I'd go with hashes of those: <bucket>/<md5(account-id + salt)>/<md5(project-id + salt)>/<file-name>
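
The hashed variant might look like this in Python, using `hashlib` from the standard library. The function names and the way the salt is concatenated are assumptions for illustration; MD5 is used only because it is what I named above.

```python
import hashlib


def hashed_segment(value, salt):
    """Return the hex MD5 digest of str(value) + salt."""
    return hashlib.md5(f"{value}{salt}".encode("utf-8")).hexdigest()


def hashed_prefix(account_id, project_id, file_name, salt):
    """Build <md5(account-id + salt)>/<md5(project-id + salt)>/<file-name>."""
    return "/".join(
        [
            hashed_segment(account_id, salt),
            hashed_segment(project_id, salt),
            file_name,
        ]
    )
```

The salt has to stay secret and stable: changing it would change every path, so existing objects could no longer be found.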