Git


Git is a distributed version control system for files. It is open source software, available to developers around the world under the GNU General Public License. It is used both by industry giants such as Google, Facebook and Microsoft and by many independent open source projects such as the Open Compute Project or MediaWiki. Git is written largely in the C programming language. Its original purpose was to simplify work on the source code of the Linux kernel, which involves large amounts of data and many developers.

History

Linus Torvalds developed the idea of Git when the proprietary version control software BitKeeper, which the Linux kernel developers had been using, was no longer available to them free of charge because its developers had changed the license terms. As a result, Torvalds began developing his own solution, which was also meant to meet his high requirements for a distributed file management system. He was primarily concerned with several characteristics that had become important to him while working on the source code of the Linux kernel.

  • Git was to be a distributed, decentralized system that did not follow the client-server principle of conventional systems such as the Concurrent Versions System (CVS). Instead, it was to use a peer-to-peer approach in which data could be transferred via different protocols.
  • Git was to allow workflows similar to those of BitKeeper as well as nonlinear workflows, and to integrate with and support modern paradigms of software development (agile software development, Kanban or Scrum).
  • Git was to reliably protect files from unauthorized access and unintended changes.

In April 2005, the first version of Git was published. Since July 2005, Junio Hamano, who now works for Google, has been responsible for the project. As of October 2015, Git is available in version 2.6.1 and can be used on all Unix-like operating systems; ports exist for Windows and OS X. Git is also used as a basis for content management systems and wikis because it has the characteristics of a simply structured database.[1] GitHub was built on Git; it is a hosting service that complements Git with collaboration tools for developers.

Functionality

Git is built on six principles that distinguish it significantly from other version control systems.[2]

  • Branching and merging: Branching is the creation of new lines of development; merging is the combination of two or more branches. Git's data is structured like a hash tree (Merkle tree), in which the leaves are hash values of data blocks and the upper nodes are the hash values of their children. This data structure allows efficient handling of files, because identical content can be recognized by its hash value, and it provides a high degree of cryptographic integrity, because every object is protected by a checksum.
  • Small and fast: Because Git is decentralized, it stores the repository on the local computer. Even very large repositories can be handled this way, since changes are initially recorded only locally and no communication with a central server is necessary. The development of Git followed the KISS principle (Keep it simple and short), and according to the project's own benchmarks Git is between 1 and 325 times faster than other systems.[3]

  • Distributed: The decentralized structure is an essential feature of Git. A developer clones a repository before modifying it, so every developer has a complete copy saved locally. If a copy is lost or files are maliciously altered, both the full repository and individual files can be restored from another copy. This distributed architecture allows very different workflows, such as a shared repository, a blessed repository, or the integration of large development teams that work according to modern programming paradigms.
  • Data Assurance: Data integrity is ensured by checksumming every file and every commit (a recorded snapshot of the project) and addressing each of them by its checksum. Any change to the stored data changes the checksum, so third parties cannot modify the data unnoticed and your own changes verifiably build on the original developer’s data. Every commit gets its own ID, so the history of the project cannot be manipulated without changing those IDs, and the life cycle of the project remains traceable.
  • Staging Area: The staging area is an intermediate area in which a commit can be assembled and reviewed before it is completed, without changing the working directory. In this way, changes to several areas of the source code or to individual files can be committed selectively, so that a commit contains only the selected changes and not all modified files.
  • Free and Open Source: Git is released under the GNU General Public License version 2.0, which makes it free software available to all users. However, the logos and the name Git are subject to specific trademark guidelines, which are set out in the Git Trademark Policy.[4]
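
The checksum-based addressing described under "Branching and merging" and "Data Assurance" can be illustrated with a short Python sketch. It assumes Git's default SHA-1 object format, in which an object ID is the SHA-1 hash of a short header followed by the object's content. The function names object_id, blob_id and tree_id are illustrative and not part of Git, and tree_id only covers a flat directory of regular, non-executable files.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash "<kind> <size>\\0<body>" with SHA-1, as Git does for its objects."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """ID of a file's content (a leaf of the hash tree)."""
    return object_id("blob", content)


def tree_id(files: dict[str, bytes]) -> str:
    """ID of a flat directory (an inner node of the hash tree).

    Simplification: every entry is a regular file with mode 100644;
    subdirectories, executables and symlinks are not handled.
    """
    body = b""
    # Entries are ordered by their byte-wise name.
    for name in sorted(files, key=lambda n: n.encode("utf-8")):
        child = bytes.fromhex(blob_id(files[name]))  # 20 raw bytes
        body += b"100644 " + name.encode("utf-8") + b"\x00" + child
    return object_id("tree", body)


if __name__ == "__main__":
    before = {"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"}
    after = dict(before, README=b"hello, world\n")

    print(blob_id(b""))     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (empty file)
    print(tree_id(before))  # ID of the directory before the edit
    print(tree_id(after))   # a different ID: the change propagates upward
```

Because the directory's ID is computed from the IDs of its entries, changing a single file changes the ID of every node above it, while unchanged files keep their IDs and can be recognized as identical without comparing their content.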

Importance for programming

Git is considered a state-of-the-art solution for the decentralized management of files and source code. It is used not only by large companies but also by a wide range of independent developers. Within its first ten years, Git became one of the most widely used technologies in the IT sector. Its focus on decentralized version management has also made Git a prime example of a tool for modern software development, in particular for the programming and development paradigms that have evolved over the last ten to fifteen years, such as agile software development and its various methods. As a distributed version control system, Git is an ideal partner for this kind of software development and is increasingly replacing competing systems such as Subversion, Mercurial or Bazaar.

References

  1. Gollum github.com. Accessed on 10/19/2015.
  2. Git About github.com. Accessed on 10/19/2015.
  3. Small and Fast git-scm.com. Accessed on 10/06/2015.
  4. Git Trademark Policy git-scm.com. Accessed on 10/06/2015.
