Source Control and Git
Source control is one of the most important concepts to learn when coding. In this article we will run through what it is and look at one of its most commonly used forms, Git.
What is Source Control?
Source or version control is a method for monitoring and managing changes to code. It ensures the stability of code while enabling multiple developers to collaborate on the same project without causing large issues with the code already in production or released.
A detailed resource by Atlassian, titled "What is version control", gives an in-depth explanation of source control and its uses. Here we’ll go through it a little more and then look further into Git in particular.
In simpler terms, version control allows developers to expand upon and modify the original code confidently, knowing that their work on new features and fixes won't compromise the integrity of the source code. It also provides reassurance that, if any issues pop up from these changes, reverting to a previous stable version of the code is easy, sort of like time traveling back to an earlier version.
In my view, this is one of the first concepts that a developer should pick up after they start coding. It will help your learning and reduce your development time.
Source control and Git have been a lifesaver many times through my coding journey. You know how you can go down a rabbit hole of coding and then realise you need to go back? Thanks to Git, you can easily revert, saving the hours or possibly days you would otherwise spend trying to undo unwanted changes.
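To make the idea of "going back" concrete, here is a minimal sketch of undoing a change with Git. It runs in a throwaway directory, and the file name, commit messages, and identity are placeholders chosen for illustration, not anything prescribed by Git.

```shell
# Work in a throwaway directory so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"

# First commit: a known-good version of the file.
echo "stable version" > app.txt
git add app.txt
git commit -q -m "Add stable version"

# Second commit: a change we later regret.
echo "experimental change" > app.txt
git commit -q -am "Try an experiment"

# Undo the latest commit by adding a new commit that reverses it.
git revert --no-edit HEAD

cat app.txt          # the file is back to the stable content
git log --oneline    # the history keeps all three commits
```

Note that `git revert` does not erase the unwanted commit. It records a new commit that reverses it, so the full history stays available.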
What is Git?
Git is an open-source, distributed version control system. It’s designed for speed and efficiency in handling projects of any size. It lets each developer have a complete local copy of the project's repository, which allows for independent work, offline activity, and seamless collaboration. Its branching and merging capabilities support isolated development on new features or fixes, with easy integration back into the main production version of the project.
Git ensures data integrity through checksums for every file and commit, protecting the project's history and making any change visible. It stands out for its performance with large projects and its active, supportive community that contributes to its continuous improvement. As a free tool, Git offers an accessible, powerful solution for managing the complexities of software development, making it a preferred choice for developers.
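The checksums mentioned above can be inspected directly. The following is a small sketch in a throwaway repository (the file name and identity are placeholders) showing that the hash Git computes for a file's content is the same ID it records in the commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# The hash Git computes for the file's content...
git hash-object greeting.txt
# ...matches the object ID recorded for that file in the commit.
git rev-parse HEAD:greeting.txt

# The commit has its own hash as well.
git rev-parse HEAD

# Check that every stored object still matches its hash.
git fsck
```

If the stored content were altered, its hash would no longer match, which is how Git can detect corruption or tampering.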
Why Git?
Git has transformed the world of version control systems, bringing multiple compelling benefits that cater to modern development workflows. Its key features not only address common challenges faced by development teams but also improve productivity and project management.
Speed and Efficiency: Git is designed for fast performance, even when handling large projects with an extremely large number of commits. This efficiency suits developers who need to jump between previous versions or collaborate on complex codebases without slowdowns or bottlenecks.
Offline Capabilities: Once installed, Git can function fully offline. Since each developer has a complete local copy of the project's repository, they can continue making changes, committing, and even branching without an internet connection. However, pushing updates to (or pulling them from) a project stored in the cloud requires an online connection.
Data Integrity: Git uses checksums to uniquely identify each commit and its contents, so every change is tracked and identifiable. This not only gives you a reliable record of how your project has changed over time but also protects against accidental or malicious alterations, as any modification to stored content changes its checksum.
Branching and Merging: This is what makes Git stand out as one of the most effective source control tools. It allows developers to work on new features, fixes, or experiments in isolated environments (branches) without affecting the main codebase. When the developer is ready, these changes can be merged back into the main project smoothly. This encourages experimentation and testing of new concepts, as there's no risk to the main codebase of the project while working on branches.
Collaborative Workflows: Git allows for various collaborative workflows, making it easier for teams to adopt a methodology that suits their project's requirements and size. Thanks in particular to branching and merging, developers can work simultaneously on a project without causing larger issues to the codebase. Sometimes this leads to merge conflicts, but these can usually be resolved when reviewing the code to be merged. The result is a more organized and manageable development process.
Community Support and Resources: Being open-source and widely used, Git has a large ecosystem of tools and a strong and supportive community. There are vast amounts of resources available for learning Git at all skill levels, as well as third-party tools and integrations that extend its functionality.
Git and Branching
Branching is one of the main features of Git. It allows for parallel development, letting developers diverge from the main line of development and work independently on various tasks and features. When you create a branch, you're creating an environment where you can experiment, develop new features, or fix bugs without causing issues in the main codebase (the master or main branch).
Each branch in Git is a lightweight, movable pointer to a commit, which is identified by its checksum. Creating a branch is nearly instantaneous because Git only needs to write a new pointer rather than copy the project files. Switching between branches is just as quick: Git updates its reference to the current branch and changes only the files in your working directory that differ between the two branches. This makes branching an efficient way to manage multiple lines of development. Whenever you check out or switch to a branch, Git loads that version of the project into your working directory.
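The pointer behaviour can be observed with `git rev-parse`, which prints the commit hash a branch name points to. This is a minimal sketch in a throwaway repository; the branch name `feature-x`, the file name, and the identity are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git branch -M main          # name the branch 'main' whatever the local default is

# Creating a branch only writes a new pointer; no files are copied.
git branch feature-x

# Both names resolve to the same commit hash.
git rev-parse main
git rev-parse feature-x

# A commit on feature-x moves only that pointer forward.
git checkout -q feature-x
echo "v2" >> notes.txt
git commit -q -am "Work on feature-x"
git rev-parse main          # unchanged
git rev-parse feature-x     # now points to the new commit
```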
Working in separate branches allows developers the freedom to experiment with ideas in a sandboxed version of the project. It's like having a parallel universe for each new feature or fix where you can commit changes without impacting the stability of the main code. When a feature is complete and tested, it can be merged back into the main branch. Merging the branch into the main or master branch brings all the new code changes into the production version, which then becomes the new source of truth for all developers to branch off.
Branching and merging allow for various workflows, such as feature branching, Gitflow, and forking, to name a few, which provide flexibility in how teams collaborate on projects. Branching not only helps in isolating feature development but also in facilitating code review processes and managing releases. It's a powerful tool that, when used effectively, can significantly enhance productivity and code quality in software development projects.
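The branch, commit, and merge cycle described in this section can be sketched as follows. It runs in a throwaway repository, and the branch name `feature/contact-page`, the file, and the identity are hypothetical examples. `git checkout` is used so the sketch also works on older Git versions.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "home page" > site.txt
git add site.txt
git commit -q -m "Initial commit"
git branch -M main

# Start an isolated line of work.
git checkout -q -b feature/contact-page
echo "contact page" >> site.txt
git commit -q -am "Add contact page"

# main is untouched until the merge.
git checkout -q main
cat site.txt                 # only "home page"

# Bring the finished feature into main, then remove the branch.
git merge -q --no-edit feature/contact-page
git branch -d feature/contact-page
cat site.txt                 # now includes "contact page"
```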
Git Branching Models
There are some specific branching model frameworks that guide teams on how to use Git to handle development, releases, and maintain projects. This allows a consistent approach to source control within a team.
Below we will go through a couple of models that teams can follow, depending on their team and project requirements.
Gitflow Model: This is a highly structured model originally introduced by Vincent Driessen. It's best suited for projects that have a scheduled release cycle and a need for parallel development. Gitflow involves using different types of branches for different purposes:
Feature branches: For developing new features. Each feature gets its own branch and is merged back into the develop branch when completed.
Develop branch: Serves as an integration branch for features.
Release branches: For preparing releases. They allow for minor bug fixes and preparing metadata for a release.
Master branch: Contains the official release history and is where release branches are merged into.
Hotfix branches: For urgent and unplanned issue remediation, these branches are created from the master branch and are merged back into both master and develop.
Git Feature Branch Workflow: This is a simpler model that focuses on the use of feature branches only, without the complexity of multiple long-lived branches. Developers create new branches for each feature or bug fix from the main branch. After the work is completed and tested, these feature branches are merged back into the main branch. This model is often preferred by teams that deploy frequently and where releases are not tied to specific feature sets.
Choosing between these models (or others not mentioned here) is contingent upon the needs of your project and the preferences of your team. Larger teams with complex release strategies might prefer Gitflow for its clear structure and explicit separation of concerns, while smaller teams or those with a continuous deployment process might choose the Feature Branch Workflow for its simplicity and efficiency.
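As a rough sketch of how the Gitflow branch types interact, the sequence below walks one feature, one release, and one hotfix through a throwaway repository using plain Git commands. The branch names, version numbers, and file names are illustrative assumptions, and `--no-ff` is used here so that each merge is recorded as its own commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "1.0" > version.txt
git add version.txt
git commit -q -m "Initial release"
git branch -M master         # Gitflow's original naming; 'main' works the same way

# Long-lived integration branch.
git checkout -q -b develop

# Feature branch: starts from develop and merges back into develop.
git checkout -q -b feature/login develop
echo "login" > login.txt
git add login.txt
git commit -q -m "Add login feature"
git checkout -q develop
git merge -q --no-ff -m "Merge feature/login into develop" feature/login
git branch -d feature/login

# Release branch: final adjustments, then merged into master and develop.
git checkout -q -b release/1.1 develop
echo "1.1" > version.txt
git commit -q -am "Bump version to 1.1"
git checkout -q master
git merge -q --no-ff -m "Release 1.1" release/1.1
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.1 into develop" release/1.1
git branch -d release/1.1

# Hotfix branch: starts from master and merges into both master and develop.
git checkout -q -b hotfix/1.1.1 master
echo "1.1.1" > version.txt
git commit -q -am "Fix urgent bug"
git checkout -q master
git merge -q --no-ff -m "Hotfix 1.1.1" hotfix/1.1.1
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/1.1.1 into develop" hotfix/1.1.1
git branch -d hotfix/1.1.1
```

In the Feature Branch Workflow, only the feature portion of this sequence applies, with the main branch taking the place of develop.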
Git Services
While Git is a command-line interface (CLI) tool, many Git hosting services provide user-friendly interfaces, collaboration features and cloud codebase storage.
These services are platforms that host your repositories and add many user-friendly features on top of the standard functionality provided by Git itself.
A few of the popular services include:
GitHub: Owned by Microsoft, GitHub is arguably the most well-known Git service provider. It's home to a vast open-source community and offers seamless integrations with many other development tools and services. GitHub also emphasizes social coding, where developers can follow each other, star projects, and contribute to public repositories.
GitLab: GitLab provides a more integrated Continuous Integration/Continuous Deployment (CI/CD) experience, allowing users to not only host and review code but also to run automated tests and deploy code within the same environment. It promotes a more DevOps-focused workflow and can be self-hosted or used as a cloud service.
Bitbucket: Atlassian's Bitbucket is integrated tightly with their other services like Jira and Trello, making it especially appealing for users already within the Atlassian ecosystem. Bitbucket offers both private and public repositories and is known for its strong focus on serving professional teams with advanced permission settings.
Azure DevOps: Azure DevOps, formerly known as Visual Studio Team Services, is a suite of development tools provided by Microsoft. It's a service that integrates with Git for version control and offers a broad set of tools for software development and delivery.
Let's run through some of the core features these services provide:
Version Control: All Git services offer robust version control systems, which are fundamental for tracking and managing changes to the code over time. They allow developers to maintain a comprehensive history of their work, revert to previous states, branch off for new features, and merge updates.
Collaboration Tools: These services shine when it comes to team collaboration. They provide pull requests (or merge requests) which are a cornerstone for code review. Team members can suggest changes, review code written by others, comment for discussion, and approve or request additional modifications before the changes are merged into the main branch. Issue tracking is another key feature, where team members can report bugs, request features, and assign tasks to specific contributors.
Code Visualization: Git services often include tools that visualize the code and its changes over time. They show diffs, which are the differences between file versions, highlight changes in the codebase, and offer graphs to represent branches and commits, making it easier to understand the project's progress.
Access Control: Access control is critical for security and project management. Git hosting services provide ways to control who has access to the repository, with options ranging from read-only to full administrative privileges. They allow project leads to manage team member roles and responsibilities, ensuring only authorized personnel can make changes to the code.
Cloud Accessibility: Git hosting services like GitHub, GitLab, and Bitbucket are cloud-based platforms, making your repositories accessible from anywhere in the world. This online presence eliminates the need for a centralized server in your office or local network, and developers can push and pull code from the repository at their convenience.
Teams can use these services to simplify the development process, maintain code quality, and streamline the collaborative side of building projects. Each service comes with its own set of features and integrations, so your choice may depend on which of these matter most to your team.
Setting Up and Using Git
There are a few steps to install and set up Git on your machine, but once it is installed you will be able to use it in all your coding projects.
Below are the steps for each operating system to get it installed and start using it.
Remember that there are also graphical interfaces you can install to make things easier than working with the command line alone, but it’s always helpful to know the underlying commands.
Installing Git
On Linux:
Install Git using your distribution’s package manager. For Ubuntu or Debian:
sudo apt-get update
sudo apt-get install git
Verify installation:
git --version
On Windows:
Download the Git installer from git-scm.com.
Follow the installer's prompts.
Verify installation in Git Bash or Command Prompt:
git --version
On Mac:
The easiest way to install Git on a Mac is via the stand-alone installer:
Download the installer from git-scm.com.
Alternatively, you can use Homebrew, a package manager for macOS:
brew install git
Verify installation in the Terminal:
git --version
Basic Git Commands (Same for Linux, Windows, and Mac)
Initializing a Repository:
git init
Staging and Committing Changes:
Stage changes for a specific file:
git add <filename>
Stage all changes:
git add .
Commit staged changes:
git commit -m "Commit message"
Branching and Merging:
Create a new branch:
git branch <branch-name>
Switch to a branch:
git checkout <branch-name>
Or, in Git 2.23 and later:
git switch <branch-name>
Merge a branch:
Switch to the receiving branch, then:
git merge <branch-name>
Remote Repositories:
Add a remote repository:
git remote add origin <repository-URL>
Verify the remote URL:
git remote -v
Push changes to remote repository:
git push -u origin main
Clone a repository:
git clone <repository-URL>
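The remote commands above can be tried without any hosting account. In this sketch a local bare repository stands in for a hosted remote, and two hypothetical working copies, `alice` and `bob`, play the role of two developers. All paths, names, and the identity are placeholders.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for a hosted remote such as GitHub.
git init -q --bare remote.git

# First developer: create a project and publish it.
git init -q alice
cd alice
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "# Demo project" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main
git remote add origin "$work/remote.git"
git remote -v
git push -q -u origin main
cd ..

# Second developer: clone the published project, checking out main.
git clone -q -b main "$work/remote.git" bob
cat bob/README.md
```

With a real hosting service, the only difference is that the path to `remote.git` is replaced by the repository URL the service gives you.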
Differences in Operating Systems
While the Git commands themselves remain consistent across Linux, Windows, and Mac, the initial installation process varies due to differences in operating systems and available package managers.
Linux uses the distribution's native package manager (`apt` for Debian/Ubuntu, `dnf` for Fedora, etc.).
Windows installation is through a downloaded executable installer.
Mac users have the option of using the stand-alone installer or Homebrew, a popular package manager for macOS.
After installation, the configuration of Git with your name and email, which is crucial for identifying commit authors, is identical across systems:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
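The same settings can also be applied to a single repository by leaving out `--global`, which is handy if you use a different identity for work and personal projects. A small sketch in a throwaway repository, with placeholder name and email values:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the setting is stored in this repository's .git/config only.
git config user.name "Work Name"
git config user.email "work@example.com"

# Read back the values Git will use for commits in this repository.
git config user.name
git config user.email

# Show which file the value comes from (Git 2.8 and later).
git config --show-origin --get user.name
```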
For those preferring a graphical interface, GUI tools like GitHub Desktop, Sourcetree, or GitKraken offer similar functionalities across all three platforms, abstracting away the command line differences and making version control more accessible.
Terminology
Here are some of the terms used when working with source control and Git:
Source or Version Control: A system that tracks changes to code over time, allowing multiple developers to collaborate efficiently.
Git: An open-source distributed version control system, designed for speed and handling projects of any size with ease.
Repository (Repo): A storage space where your project's history is tracked by Git.
Branching: The process of diverging from the main line of development to work independently on changes.
Merging: Combining the changes from one branch back into another branch, often the main branch.
Commit: A recorded snapshot of your project's files at a point in time.
Master/Main Branch: The default development branch where the stable code lives and development occurs.
Feature Branches: Branches created for developing specific features or changes, isolated from the main codebase until completion.
Develop Branch: A branch where all the features are merged for testing before being released to the main branch.
Release Branches: Branches created to prepare a new project release, allowing for final adjustments and versioning.
Hotfix Branches: Quick branches made from the main branch to fix urgent bugs.
Remote Repository: A repository hosted on a server, facilitating collaboration by allowing multiple developers to push and pull changes.
Pull Request (or Merge Request): A method for notifying team members about changes in a branch, requesting their review and merging into another branch.
Git Services: Cloud-based platforms hosting Git repositories, offering collaboration tools and user-friendly interfaces.
GitHub, GitLab, Bitbucket, Azure DevOps: Various Git hosting services providing collaboration features, version control, and more.
Push: The command to upload local repository changes to a remote repository.
Pull: Fetching changes from a remote repository and integrating them into your local repository.
Remote: A command to manage the set of remote repositories that your local repository tracks.
Clone: Creating a local copy of a remote repository.
Origin: The default name given to the remote repository from which a project was cloned.
Checkout: Switching between branches or restoring working tree files.
Switch: A newer alternative to 'checkout' for switching branches.
Homebrew: A package manager for macOS used for installing Git and other software.
Gitflow: A branching model for Git, prescribing specific roles to different branches and defining how and when they should interact.
Staging: The process of preparing changes in some files to be committed to the repository.
Checksum: A value calculated from data and used to verify its integrity by detecting any alterations or errors.
Merge Conflict: Occurs when two branches have made edits to the same line in a file, or when one branch deletes a file while the other branch edits it, requiring manual intervention to resolve.
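Several of these terms come together when a merge conflict happens. The sketch below deliberately creates one in a throwaway repository and then resolves it. The branch name `feature/red`, the file contents, and the identity are made up for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, stored for this sandbox repository only.
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "color: blue" > theme.txt
git add theme.txt
git commit -q -m "Initial theme"
git branch -M main

# The feature branch changes the line one way.
git checkout -q -b feature/red
echo "color: red" > theme.txt
git commit -q -am "Use red"

# main changes the same line another way.
git checkout -q main
echo "color: green" > theme.txt
git commit -q -am "Use green"

# Git cannot pick a winner, so the merge stops with a conflict.
git merge feature/red || echo "merge stopped: conflict in theme.txt"
git status --short           # theme.txt is listed as unmerged
cat theme.txt                # both versions appear between conflict markers

# Resolve by writing the content you want, then stage and commit.
echo "color: red" > theme.txt
git add theme.txt
git commit -q -m "Merge feature/red, keep red"
```

In practice you would edit the conflicted file by hand (or with a merge tool) rather than overwrite it, keeping whichever parts of each side are needed.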
Resources
Learning Git takes time and practice. Here are some resources to get you started:
The official Git SCM website (https://git-scm.com/) offers extensive documentation and tutorials.
Online Courses:
Coursera
edX
Udemy
Interactive Git Training:
Websites like Atlassian Git Tutorial (https://www.atlassian.com/git/tutorials) provide a gamified approach to learning Git.
Git Graphical User Interfaces (GUIs):
GitKraken: https://www.gitkraken.com/ (one of my personal favourites)
The official Git site's list of GUI clients by platform: https://git-scm.com/downloads/guis
References
Atlassian (2024) What is version control, Atlassian, accessed 13 March 2024. https://www.atlassian.com/git/tutorials/what-is-version-control
draw.io (2020) How to create a gitflow diagram, draw.io, accessed 13 March 2024. https://www.drawio.com/blog/gitflow-diagram