Here's my scenario: When I initially created my Mercurial repo, I used hg add to add all *.pl, *.sh, and *.sql scripts to the repo. I later learned how to use the .hgignore file to exclude other files from the repo. One of the files I needed to exclude was a *.sql file that is generated by a script, so it is essentially a data file that changes every time the script that produces it runs; thus, I added it explicitly to the .hgignore file a few revisions ago.
Today, I want to update to a prior revision before this *.sql file was added to the .hgignore, so that I can create a branch off of it. However, when I try to update the working directory to this prior version, I get the following error:
a.sql: untracked file differs
abort: untracked files in working directory differ from files in requested revision
I know that one way I could get around this problem is to delete the file before trying to update to the prior revision, either by manually deleting it or using hg update --clean --check.
That may work in this particular case, since the file is auto-generated by a script each time, and so I don't care about the data that is currently in it.
However, I'm trying to find out how people generally handle this situation safely when they decide to ignore a set of files (for example, data files that aren't auto-generated) and later need to return to a revision from before those files were ignored, especially if they want to keep the most current content of those files while still being able to view earlier revisions of the files that Mercurial is actively tracking.
I've also considered backing up the files, but I think that is only a reasonable solution if this is a one-off case. If you want to be able to hg update to previous revisions frequently and at a whim, it becomes quite tedious to back up the data each time before you update to an earlier revision (it also doesn't guarantee that others won't delete the data that isn't being tracked in the repo).
Thanks for the help.
It depends.
If you have exclusive control over this repository, and have the practical ability to require everyone to re-clone from it, then you can use hg convert to exclude the files from the old revisions. This is by far the cleanest option, but it will change the revision identifiers (hashes) for those revisions and all of their topological descendants. This is why everyone has to re-clone; their old clones will not interact properly with the new repository.
If you can't do that, you can copy the files somewhere else (you do have backups already, right?), clobber the originals with the old versions, and then restore them from your copy. This has to be done whenever you check the files out, so it is definitely suboptimal. You may be able to make this slightly easier by keeping the files outside the repository and checking in symlinks to the files, but you'll still have to fix up the symlinks whenever you check out an old version.
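For the hg convert route, a minimal sketch might look like the following. It assumes the file to drop is the a.sql from the question; filter_repo and the repository paths you pass to it are hypothetical names used only for illustration. The convert extension ships with Mercurial but is not enabled by default, so the sketch turns it on for the single command.

```shell
# Sketch only. a.sql is the file from the question; filter_repo and the
# repository paths passed to it are hypothetical.

# The filemap tells hg convert which paths to leave out of every revision.
printf 'exclude a.sql\n' > filemap.txt

# Convert the repository at $1 into a new repository at $2, applying the filemap.
# --config enables the bundled convert extension for this one command.
filter_repo() {
    hg --config extensions.convert= convert --filemap filemap.txt "$1" "$2"
}
```

Running something like filter_repo ../repo ../repo-filtered writes a new repository and leaves the original alone, so you can inspect the result before asking anyone to re-clone.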
However, what you describe is not the normal use case for Mercurial. Typically, untracked files are autogenerated, or at least able to be regenerated from tracked files. The operating assumption is that untracked files are not important and can be discarded at any time. Mercurial doesn't actually do this, because that would be rude, but neither does it make any special effort to preserve them when (for example) you make a bundle of the repository.
If you need to deal with versioning of object files, it is typical to store them in a separate artifact repository or some other system. This can be more difficult to manage because you have to reunite the binaries with the source code when you do a build. But it is much more robust than keeping the binaries loose in the repository and hoping they won't get accidentally overwritten or deleted.
Another option is to collapse the binary to text and then place the text under version control. This is always possible (e.g. take a hexdump) but may or may not be practical or reasonable, depending on the file format. For a compressed file format (e.g. tarballs, most image files, etc.) the hexdump is not going to be any easier to 3-way merge than the original binary, so there's little point in it. Similarly, if the binary is huge, the hexdump will be huge too. On the other hand, if a binary is compiled from source code, it is entirely normal to store the source and discard the binary. For something structured like an SQLite database, you might try storing an SQL script which will generate the database. For a zip file or tarball, store the contents. And so on. All of these things can be regenerated using make or a similar tool whenever you check things out, and you can automate this with a repo hook.
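As a rough illustration of the hook idea, the sketch below appends an update hook to a repository's .hg/hgrc. The helper install_regen_hook and the hook name "regen" are hypothetical, and running plain make assumes there is a Makefile at the repository root that rebuilds the generated files.

```shell
# Sketch only: install_regen_hook and the hook name "regen" are hypothetical,
# and "make" assumes a Makefile at the repository root rebuilds the generated files.

# Append an update hook to the hgrc of the repository at $1.
install_regen_hook() {
    cat >> "$1/.hg/hgrc" <<'EOF'
[hooks]
# Mercurial runs "update" hooks after the working directory has been updated.
update.regen = make
EOF
}
```

Because .hg/hgrc is not copied when a repository is cloned, each clone would need to run something like install_regen_hook . once for itself.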