2

I'm trying to replace CRLF (Windows line endings) with LF (Unix line endings) in all commits.

I've found that you can use this config:

git config --global core.autocrlf input

But from my understanding, this will replace CRLF with LF only in future commits, not in the commits that are already in my repo.

adamlowlife
  • 3
    Duplicate of this question on SO's sister site: https://superuser.com/q/293941/14517 – Heinzi Jan 29 '19 at 14:00
  • 2
    Generally speaking, you should avoid massive rewrites of your Git history like this. Exceptions might include if you are the only one working on that repo (e.g. an independent project). Otherwise, if your branches are shared among several developers, they will all basically have to gut everything and clone again. – Tim Biegeleisen Jan 29 '19 at 14:01
  • Possible duplicate of [How to replace crlf with lf in a single file](https://stackoverflow.com/questions/27810758/how-to-replace-crlf-with-lf-in-a-single-file) – dimwittedanimal Jan 29 '19 at 14:01

2 Answers

3

It is possible to change the history in your repo so that it appears LF was always used; but it is not a simple process, and there are costs to doing it (which may or may not matter depending on how your repo is used). You might want to take a closer look at why you want all commits to have LF, and see if there's a simpler way to get the result you want.

(For instance, if you're working on a system that uses LF line endings and just want the files normalized to that standard whenever you check them out, you might be able to accomplish that with filters, which is basically the same idea as what autocrlf does for Windows users when the repo contains (at least some) LF line endings. This is not as pure a solution, but it avoids the disruptions of a history rewrite, so if it meets your needs it may be a better solution in practice.)

If for whatever reason you really do need to change the line endings as they're stored in history, then you have to rewrite the history. That is, you have to replace all of the commits that currently have CRLF line endings with new commits that have the line endings replaced.

That means all commit ID (hash) values will change, so if you use those for anything (like release documentation, or whatever), it would either need to be updated or would be obsolete. It also means that any other clones of the repo (on build servers, or in use by other devs) would need to be updated. The basic issue (and a method of solving it when rewriting a branch's history) can be found in the git rebase docs under "recovering from upstream rebase", but for a sweeping history rewrite it's better to have everyone push all changes to a central repo, then discard all clones, then rewrite the central repo's history, then have everyone re-clone. If that's not practical in your circumstances, then I strongly advise against attempting a rewrite, because if someone recovers incorrectly after the rewrite (for example, by merging or pushing their old history back in), the rewrite could be undone.

With that said - if you decide to do it, how?

If the history is small and simple, you could possibly use rebase. For example, given a small, linear history, you could say git rebase -i, change all commits' commands to edit, and find and fix the line endings in all files for each commit. This can get tedious really quickly, though, and if you have any 'real' amount of history you'll need something more automated.

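In the most general case, you could use git filter-branch. The easiest way is with a tree-filter; you'd write a script that rewrites all line endings in the work tree (which, for this example, we'll call filter.sh; since the filter runs from a temporary directory, the script needs to be on your PATH or referenced by an absolute path), then give a command like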

git filter-branch --tree-filter filter.sh -- --all
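As a rough idea of what filter.sh might contain, here is a minimal sketch. The function name crlf_to_lf_tree is made up for this example, and it rests on a few assumptions: a grep that supports -I (GNU and BSD do), an available mktemp, no file names containing newlines, and that any file without NUL bytes is text. Note that tr removes every CR byte, including a lone CR that is not part of a CRLF pair.

```shell
#!/bin/sh
# Hypothetical body for filter.sh (the name crlf_to_lf_tree is illustrative).
# Strips CR bytes from every file under the given directory that grep
# treats as text; files grep considers binary are left untouched.

crlf_to_lf_tree() {
    dir=${1:-.}
    # One scratch file outside the tree, so find never sees it.
    tmp=$(mktemp) || return 1
    find "$dir" -type f ! -path '*/.git/*' | while IFS= read -r f; do
        # -I makes binary files count as non-matching; -q suppresses output.
        if grep -Iq . "$f"; then
            # Write back with cat so the file keeps its mode (e.g. the
            # executable bit) instead of being replaced by a new file.
            tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
        fi
    done
    rm -f "$tmp"
}
```

To use this as filter.sh, the script would end with a call such as crlf_to_lf_tree . so that it processes the work tree filter-branch has checked out.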

If the repo is large (lots of commits, and/or a very large work tree) then the command will take a long time. You can speed it up some by giving filter-branch a ramdisk on which to put its work tree (via its -d option), but it still is a resource-intensive operation.

It would be faster (but still maybe not "fast") to use an index-filter. It's more involved, though. Probably you'd pre-process every BLOB (file object) in history, producing a version with line endings replaced; then your script would run commands to update the index by replacing each "old" BLOB with the corresponding new one.

I realize that last paragraph gets into some terms and concepts that not everyone might know, and that's kind of the point. If you know enough, or it's worth learning enough, to understand and assemble such a process, then that's an option, and it would be faster than using a tree filter.
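To make the index-filter idea a little more concrete, here is a simplified sketch using Git plumbing commands. The function names lf_blob and rewrite_index_lf are invented for this example. It converts blobs on the fly for each commit rather than pre-processing them once, so a real run would probably want to cache the old-to-new BLOB mapping. It also assumes a grep with -I, treats any BLOB containing NUL bytes as binary, and does not handle paths with leading or trailing whitespace.

```shell
#!/bin/sh
# Hypothetical helpers for an --index-filter; names are illustrative only.

# Print the id of an LF-only copy of blob $1.
# Blobs that grep treats as binary (or that are empty) are returned unchanged.
lf_blob() {
    if git cat-file blob "$1" | grep -Iq .; then
        git cat-file blob "$1" | tr -d '\r' | git hash-object -w --stdin
    else
        printf '%s\n' "$1"
    fi
}

# Replace every regular-file entry in the current index with its LF-only blob.
rewrite_index_lf() {
    git ls-files -s | while read -r mode sha stage path; do
        case $mode in
            # Only regular files; symlinks and submodules pass through as-is.
            100644|100755) sha=$(lf_blob "$sha") ;;
        esac
        # Same "mode sha stage<TAB>path" layout that ls-files -s printed.
        printf '%s %s %s\t%s\n' "$mode" "$sha" "$stage" "$path"
    done | git update-index --index-info
}
```

With these definitions loaded, the command passed to --index-filter would call rewrite_index_lf, which operates on whatever index filter-branch has prepared for the commit being rewritten.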

Mark Adelsberger
0

As an alternative to rewriting all your repo's history, you could have Git normalize line endings.

In https://www.git-scm.com/docs/gitattributes you can find an example of how to use the --renormalize option of git add. In this example it is used in conjunction with gitattributes files, but it should also work when changing the global setting of core.autocrlf.

The description at https://www.git-scm.com/docs/git-add#git-add---renormalize sounds much like what you are aiming for:

Apply the "clean" process freshly to all tracked files to forcibly add them again to the index. This is useful after changing core.autocrlf configuration or the text attribute in order to correct files added with wrong CRLF/LF line endings. This option implies -u.
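Following the example in the gitattributes documentation, the steps might look roughly like the sketch below. The function name renormalize_lf is made up here, and it assumes a Git version that supports --renormalize and that a repository-wide "* text=auto" rule is acceptable. Keep in mind that this only changes what goes into the index and the next commit; commits already in history stay as they are.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps from the gitattributes docs.
# Run from the top level of the work tree.
renormalize_lf() {
    # Let Git detect text files and store them with LF in the repository.
    printf '* text=auto\n' >> .gitattributes
    # Re-run the "clean" conversion on all tracked files and restage them.
    git add --renormalize .
    # --renormalize implies -u, so a new .gitattributes is added explicitly.
    git add .gitattributes
}
```

After this, git status should list the files whose stored line endings changed, and a normal git commit records the normalization.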

salchint