In our company, we have an application that generates a large number of text files every day (about 30,000 files totaling roughly 100 megabytes). Most of the files come out identical from day to day, but some differ. The differences need to be commented on and stored, along with the author of each comment (these files are usually handled by about 5-10 people).
The obvious solution to this problem is to use a git repository. My plan was to organize the work as follows (a rough sketch of the corresponding commands appears after the list):
- Create a non-bare repository in a network folder, give employees access to that folder, and install a git client for each of them.
- Add all files to the repository and create an initial commit.
- Overwrite the files in the repository every day.
- Employees will go to the network folder every day and commit the changes to the files.
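For reference, this is roughly how I imagined those steps translating into commands; the share path and file names here are just placeholders, not our actual layout:

```
# One-time setup in the network folder (path is hypothetical)
cd /mnt/shared/daily-files
git init
git add .
git commit -m "Initial import"

# Daily routine for each employee, from the same shared working copy:
# review what the application changed, then commit with an explanatory message
git status
git add reports/changed-file.txt
git commit -m "Explain why this file differs today"
```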
This approach generally works, but because of the repository's size and the fact that all communication with it goes over the network, it is very slow. For example, git status can take 5-10 minutes to run.
Another option I considered is giving users SSH or RDP access to the machine hosting the repository, which should speed things up, but that is too complicated for the users.
How would you solve this problem?
I would really appreciate any suggestions you may have, thanks.
git performs poorly over network shares, which are nowhere near as fast as local storage; the same is true of many databases.
git also holds onto parent commit references forever, so the size of the repository keeps increasing. There are ways to download only part of the history, or to rewrite history, but both go beyond basic git usage. A hedged example of a partial (shallow) clone is shown below.
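A minimal sketch of a shallow clone that fetches only the most recent commit; the URL and branch name here are placeholders:

```
# Clone only the tip of one branch instead of the full history
git clone --depth 1 --single-branch --branch main \
    ssh://git@git.example.com/daily-files.git

# Keep the clone shallow when updating later
git fetch --depth 1
```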
Centralize this workflow.
Consider setting up a web interface for a version control system, and pick one that lets individual users comment on commits. Automate the step of committing each day's set of files on the server itself. The employees then open the change set in a browser and post their comments there; a sketch of such an automated commit job follows.
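As a rough illustration only, a nightly job on the machine hosting the repository might look like this; the paths, remote name, and branch are assumptions, not part of your setup:

```
#!/bin/sh
# Import today's files into the repository and commit them server-side,
# so nobody has to run git over the network share.
set -e

EXPORT_DIR=/srv/app-output/today   # where the application writes its files (assumed)
REPO_DIR=/srv/daily-files-repo     # local clone on the server (assumed)

cd "$REPO_DIR"
rsync -a --delete "$EXPORT_DIR"/ files/
git add -A files

# Commit and push only if something actually changed
if ! git diff --cached --quiet; then
    git commit -m "Daily import $(date +%F)"
    git push origin main
fi
```

Employees then review and comment on that day's commit through the web interface instead of touching the repository directly.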