Permanently Removing Files From a Git Repository
What do you do if you accidentally check in some large or confidential files to a git repository and you need to remove all traces of them?
You can run `git reset --hard [commit id]^` to move HEAD to one revision before those files were checked in. They no longer show up in `git log`. They’re gone, right?
Resetting will remove the files from the history of the current branch, but they could still be referenced by other branches, and the commit stays in the reflog until its entries expire (with default settings, 30 days for entries no longer reachable from the branch tip and 90 days otherwise, and only once garbage collection actually runs). Even once you’ve removed all references to that commit, it’s still going to sit around as a dangling commit in the .git/objects directory, taking up space.
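To see this for yourself, here is a minimal sketch that reproduces the situation in a throwaway repository. The file names, commit messages, and the `bad` and `obj_type` variables are made up for illustration; nothing here touches a real repository.

```shell
#!/bin/sh
set -e

# Work in a temporary repository so nothing real is affected.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "initial commit"

# Hypothetical confidential file checked in by mistake.
echo "hunter2" > secret.txt
git add secret.txt
git commit -q -m "oops: add secret"
bad=$(git rev-parse HEAD)          # the commit we want gone

# Move HEAD to one revision before the bad commit.
git reset -q --hard "$bad^"

git log --oneline                  # the bad commit is no longer listed
git reflog                         # but the reflog still records it
git branch -a --contains "$bad"    # lists any branches that still include it

# The commit object itself is still in the object database.
obj_type=$(git cat-file -t "$bad")
echo "$obj_type"                   # prints "commit"
```

In a real repository you would skip the setup and run only the last few commands against the ID of the commit you are worried about.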
The story is the same if you go back and use `git commit --amend` to remove the files, since the original commit will still exist in the objects directory.
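If other people clone your repository, they may still get the dangling commit, depending on how they clone: a local clone copies or hard-links the whole objects directory, for example, whereas a clone over the network generally transfers only the objects reachable from your branches and tags. However, if they’ve already cloned and they pull, they won’t get the file.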
Here’s what you need to do to truly remove all traces of that file:
1) Remove all references to that commit from any branch or tag. This can be tricky. Look through the history of all of your branches for that commit (`git branch -a --contains [commit id]` lists the branches that include it) and remove it.
2) Remove all references in the reflog: `git reflog expire --expire="0 days" --all`
(being able to remove just the offending commit from the reflog would be great, but I couldn’t get any variation of `git reflog delete` to work for me)
3) You can now see that the commit is dangling by running `git fsck`. If that doesn’t print that the commit is dangling, go back to step 1 and try again.
4) Clean up the dangling commits by running `git prune`.
The files should now be gone from your objects directory. To be extra sure, you can run `git fsck --verbose` and look at all of the commit IDs that it scans to make sure the commit has been completely removed.
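The four steps can be sketched end to end as follows, again in a throwaway repository so it is safe to run. This is an illustration under a few assumptions, not a drop-in script: the file names and the `bad`, `blob`, `dangling`, and `result` variables are made up, step 1 is simply a `git reset --hard` because the demo has only one branch, and the reflog is expired with `--expire=now`, a commonly used spelling, rather than "0 days".

```shell
#!/bin/sh
set -e

# Throwaway repository containing one good commit and one bad one.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "initial commit"

echo "hunter2" > secret.txt        # hypothetical confidential file
git add secret.txt
git commit -q -m "oops: add secret"
bad=$(git rev-parse HEAD)
blob=$(git rev-parse "$bad:secret.txt")

# Step 1: remove the commit from every branch (only one branch here).
git reset -q --hard "$bad^"

# Step 2: drop every reflog entry, including the ones pointing at $bad.
git reflog expire --expire=now --all

# Step 3: confirm that the commit is now reported as dangling.
dangling=$(git fsck 2>/dev/null | grep "dangling commit $bad" || true)
echo "$dangling"

# Step 4: delete unreachable loose objects.
git prune

# Verify that both the commit and the file contents are gone.
if git cat-file -e "$bad" 2>/dev/null || git cat-file -e "$blob" 2>/dev/null; then
    result="still present"
else
    result="removed"
fi
echo "$result"
```

One caveat: `git prune` only deletes loose objects. In an older repository where the offending objects have already been packed, something like `git gc --prune=now` is likely needed in place of step 4.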