I have a remote git repo with more than 100 commits. I have a copy of those files locally, with some changes. I initialized a git repo locally and added the remote as origin, but now I am facing some problems. The local repo is only initialized: there are no commits yet and no files staged. I have some important changes locally which are not present in the remote. The problems:
- I can't pull remote because that will overwrite the existing local files
- I can't commit all local changes and push because they don't share the same commit history
- I can't force push either, because that will delete the commit history in remote
What I am trying to achieve is to pull all commit history from remote, and then merge the local commit into the latest commit on remote.
Please note (in case this info also helps): This is a dotfiles git repo. On my previous machine, I set up the local git repo using the command `git --work-tree=/home/user --git-dir=.git/ init`. On my new machine, instead of setting up this repo again and pulling the changes, I copied the config files and other dotfiles from my old machine and made a few changes. Now, on the new machine, I used the same command and ran into the problems mentioned above. Both remotely and locally I also have `~/.mozilla/firefox` backed up, so it contains binary (or similar) files which are neither readable nor writable, but which have been changed while using Firefox, including .db files. So for those files, I just wanna keep the local ones and ignore the remote ones.
How can I approach this?
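On your first point, that pulling will overwrite the existing local files: Right. So don't do that.
On your second point, that you can't commit all local changes and push because the histories differ: You never commit changes in the first place. This statement starts with an incorrect assumption! That's where we'll fix things.
On your third point, that a force push will delete the commit history in the remote: This part is right, so we won't do that.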
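Now, before we go about fixing #2 above, let's get into the part about the Firefox files under `~/.mozilla/firefox`, where you want to keep the local copies and ignore the remote ones: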
This is ... difficult, and also illustrative: it shows the difference between a backup system (which should save and restore these databases) and a version control system (which should not). Git is a version control system, not a backup system; as such, you really don't want to store these databases in it. But you already did, so you're stuck with that. There is no good Git solution to this issue. Consider redoing everything so that these databases are excluded from your version control. (Whether you want to include or exclude browser bookmarks is a separate question, but note that these are typically imported and exported as XML or modified HTML or some such, and Git's merge algorithms perform poorly with these file formats.)
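If you do decide to exclude these databases, one way to stop tracking them without deleting the local copies is sketched below. It runs in a throwaway sandbox rather than a real home directory, and the file names, contents, and identity variables are assumptions made up for illustration; only `git rm --cached` and `.gitignore` are the actual technique.

```shell
set -eu
# Assumed identity so that git commit works in a bare-bones environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT
cd "$sandbox"

# Stand-in for a dotfiles repository that already tracks a Firefox database.
git init -q
mkdir -p .mozilla/firefox/profile
printf 'pretend this is binary\n' > .mozilla/firefox/profile/example.db
printf 'alias ll="ls -l"\n' > .bashrc
git add .
git commit -q -m "Initial dotfiles, databases included"

# Stop tracking the Firefox directory but keep the files on disk.
git rm -r -q --cached .mozilla/firefox
printf '/.mozilla/firefox/\n' >> .gitignore
git add .gitignore
git commit -q -m "Stop tracking Firefox profile data"

git ls-files    # lists the tracked files; nothing under .mozilla remains
```

This only affects new commits: the databases stay in the existing history, which is why the caveat above still applies.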
With that caveat (that these version-controlled databases are going to be a problem and there is very little you can do about it), let's go on to items 1 and 2 above. You have been led astray by learning the `git pull` command. It's not inherently bad, but `git pull` is composed of two more-basic commands:
- `git fetch`, which you do need to use; followed by
- a second Git command (`git merge` by default), which you must not use here.
Knowing that `git pull` = `git fetch` + second Git command, and what each of these two commands does, would have gotten you a lot closer to your answer. All you need to do is:
1. Run `git fetch` to obtain all the commits from the remote named `origin` (i.e., run `git fetch` or `git fetch origin`).
2. Get "on" a branch whose current commit is the commit you want as the parent of your new commit.
3. Run `git commit`, so the difference between the parent of the new commit, and the new commit itself, will be the differences between the files in each of those two commits.
(In other words, that's where "changes" come from. Git does not store changes. Git stores snapshots. But the snapshot-diff duality [1] [2] means we can work with changes whenever we like.)
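The three steps can be sketched end to end in a throwaway sandbox. This is a minimal illustration under assumptions, not a recipe to paste into a real home directory: the `remote` and `local` directories, the `.bashrc` contents, the commit messages, and the identity variables are all made up, and the remote is assumed to have a branch named `master`.

```shell
set -eu
# Assumed identity so that git commit works in a bare-bones environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT

# Stand-in for the existing remote, with one commit on master.
git init -q "$sandbox/remote"
cd "$sandbox/remote"
git symbolic-ref HEAD refs/heads/master
printf 'remote version\n' > .bashrc
git add .bashrc
git commit -q -m "remote history"

# Stand-in for the new machine: files copied over and edited, repo only initialized.
mkdir "$sandbox/local"
cd "$sandbox/local"
printf 'local version\n' > .bashrc
git init -q
git remote add origin "$sandbox/remote"

# Step 1: obtain the commits; the working tree is not touched.
git fetch -q origin

# Step 2: get "on" a branch that points at the remote's latest commit,
# then load that commit into the index without touching the working tree.
git checkout -q -b main
git branch -q main origin/master
git reset -q

git status --short    # shows .bashrc as modified, not deleted

# Step 3: commit the local files on top of the remote history.
git add .bashrc
git commit -q -m "Changes made on the new machine"
```

Because the new commit's parent is the remote's latest commit, a later push is an ordinary fast-forward and needs no force.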
Step 2 is the hard part. The "normal" way to "get on a branch" is to use `git switch` (or, for Git versions predating 2.23, `git checkout`), but this asks Git to overwrite your working tree files. As these files are not (yet) in any commit(s), you definitely do not want to do this. You have a new repository in which you have not yet run `git fetch` and have not run `git add`. I've reproduced this condition:
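A state like this can be set up in a throwaway sandbox along the following lines. This is a sketch rather than the original session: the directory names `t` and `tt` follow the text, while the `seed` repository, the `.bashrc` contents, and the identity variables are assumptions made up for illustration.

```shell
set -eu
# Assumed identity so that git commit works in a bare-bones environment.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT

# Build a bare repository "t" that already has history, via a scratch "seed" repo.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
printf 'remote version\n' > .bashrc
git add .bashrc
git commit -q -m "remote history"
git clone -q --bare "$sandbox/seed" "$sandbox/t"

# The new, empty repository "tt": files exist, but nothing is fetched or staged.
mkdir "$sandbox/tt"
cd "$sandbox/tt"
printf 'local version\n' > .bashrc
git init -q
git remote add origin ../t

git status --short    # prints "?? .bashrc": the file is untracked
git branch --all      # prints nothing: no branches, no remote-tracking names
```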
(The `../t` bare repository here is a repository full of dinky test files and other random stuff from old stackoverflow answers.)
So, no branches, no commits; let's run `git fetch` and populate the repository with commits.
Note that while I'm still on my own `master` branch, my `master` branch does not exist. I can change the name of this unborn branch with `git checkout -b` or `git switch -c` or, if my Git is new enough, `git branch -m` (move, i.e., rename, branch). It's up to you what branch name you want to use here. For illustration I'll switch mine to `main`. Then I will create it based on the upstream `master`, which I now have in my `tt` repository as `origin/master`.
Note how it now looks as though I've deleted every file. This is because the index (aka staging area) remains empty. I have to `git add` files to populate it, or run commands such as `git restore -S` to copy files from the current or `HEAD` commit into Git's index, or both.
The current commit is now `11ae6ca18f6325c858f1e3ea2b7e6a045666336d`. That's the commit I specified when I ran `git branch main` to create the name `main`. I did that by writing `origin/master`, but note that this resolves to that same hash ID: `origin/master` means commit `11ae6ca18f6325c858f1e3ea2b7e6a045666336d`. Branch names like `main` and remote-tracking names like `origin/master` are just ways we have Git remember hash IDs for us.
If I want, I can now run `git reset`, which does a `--mixed` reset, which means it moves the current branch name to the commit I specify. I'll use the default, which is `HEAD`, which is the current commit specified by the name `main`, which now holds the same commit hash ID as `origin/master`, which is the commit I'd like to "append to" after all. (That's the commit I chose with `git branch main origin/master`!) Then, having "moved" the current branch `main` from `11ae6ca18f6325c858f1e3ea2b7e6a045666336d` to `11ae6ca18f6325c858f1e3ea2b7e6a045666336d` (i.e., not having moved at all), `git reset --mixed` will read into Git's index all of the files from the commit I moved to. So now no files will be staged for deletion. Instead, the index / staging-area now matches the current commit, and `git status` reports on the difference between the index and my working tree (mine is empty, yours won't be). (The `git status` list is the same as the `git reset` list here, and again consists of every file in Git's index, which is now every file in the current or `HEAD` commit.)
If I didn't want to fill Git's index like this, I could run `git rm -r --cached .` or (a bit of a special case hack) `git read-tree --empty` now. But in general this is what you want to do:
- Run `git fetch`.
- Make your branch name point to the desired commit (`origin/whatever`), so that your next commit will use this commit as its parent.
You can, if you like, set up your new branch as a different branch (not `main` or `master` or whatever), and you can set it up without an upstream using `git branch --no-track newbr origin/master` for instance, or you can remove the upstream later with `git branch --unset-upstream`. You can `git restore -S` (but not `-W`) and `git reset --mixed` (but not `--hard`) if you like. These are all just fiddling around the edges: the fundamentals you want are those in the bullet points above this paragraph.
On a completely different note: dotfiles repositories
I like the idea of storing (some / many / most of) my "dot-files" in a repository. What I don't like is having a `.git` repository in my home directory, where those dot-files live. So what I did was write an overly fancy program: I put my committed dot-files into a repository and then have the program install them into place, mostly with symlinks wherever that works. This lets me pick and choose which dot files actually get saved and hence work around problems like the Firefox binary databases.
Mine is messy and highly imperfect and I have not fussed with it for a few years at this point. It's probably not a great starting point for anyone else. But I think the general idea is sound enough: don't store the dot-files in a Git repository, store prototype dot-files that get copied or symlinked or whatever. Maintain the prototypes, not the actual files, so that you can accommodate quirks as needed. In other words, add a level of indirection.
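To make the indirection idea concrete, here is a much-simplified sketch of a symlinking installer. It is not the program described above: the `dotfiles` and `home` directories, the file names, and the skip-if-a-real-file-exists rule are all assumptions chosen for illustration, and everything runs inside a throwaway sandbox rather than a real home directory.

```shell
set -eu
sandbox=$(mktemp -d)
trap 'rm -rf "$sandbox"' EXIT

proto_dir="$sandbox/dotfiles"   # hypothetical repository of prototype files
fake_home="$sandbox/home"       # hypothetical stand-in for $HOME
mkdir -p "$proto_dir" "$fake_home"

# Prototypes are stored without the leading dot.
printf 'set -o vi\n' > "$proto_dir/bashrc"
printf 'syntax on\n' > "$proto_dir/vimrc"

# A real file that is already in place and should not be clobbered.
printf 'kept as is\n' > "$fake_home/.vimrc"

for proto in "$proto_dir"/*; do
    target="$fake_home/.$(basename "$proto")"
    # Leave existing regular files alone; only create or refresh symlinks.
    if [ -e "$target" ] && [ ! -L "$target" ]; then
        echo "skipping $target: a regular file is already there" >&2
        continue
    fi
    ln -sf "$proto" "$target"
done
```

Files that should never be versioned, such as the Firefox databases, simply have no prototype and are never touched.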