Sunday, May 22, 2011

This Is a Flash Of Pure Inspiration, Més I Més I Messi, Però Més Però Molt Més (The Feet Continue To Dance - The Wizard Of Ox)



Git is a distributed revision control system, where every working directory is a full-fledged repository with complete history and full revision tracking capabilities. 

Git is categorized as a DVCS (Distributed Version Control System) because it does not depend on a central server. The textbook way of working with Git is therefore to push and pull changes directly between developers' repositories. This works well in small teams or in highly distributed development (open source projects with contributors around the world). Mid-size teams and companies, however, often need a central repository because of their infrastructure and workflow: a continuous integration system, QA checks before delivery, environment backups, external manual audits... It may seem that a traditional SCM would be the better choice, but this is far from the truth. Git can still be your VCS; how about creating a theoretical central repository? I say theoretical because in Git there is no central repository at a technical level. This repository acts as the central one purely by convention. I call this repository origin, as many other posts do.

The remote repository used here is a bare repository, that is, a repository without a working directory. It consists only of the contents of the project's .git directory and nothing else.

Nvie has created a nice diagram of this topology:


Note that each developer pulls from and pushes to origin, but may also exchange data with other peers. For example, if two or more developers are working on a new feature, they can push changes between themselves before pushing a stable version to the origin repository.
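As a rough sketch of that peer exchange, the script below builds everything inside a temporary directory, so local paths stand in for what would be SSH URLs in a real team. The names alice, bob and feature.txt are made up for the illustration.

```shell
#!/bin/sh
# Sketch: two developers exchange work directly, then publish to origin.
# Local paths replace real SSH URLs; alice, bob and feature.txt are
# hypothetical names used only for this example.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# The conventional "central" repository plus one clone per developer.
git init -q --bare origin.git
git clone -q origin.git alice
git clone -q origin.git bob

# Alice commits a work-in-progress feature in her own repository.
cd "$workdir/alice"
echo "draft" > feature.txt
git add feature.txt
git -c user.name=Alice -c user.email=alice@example.com commit -q -m "Draft feature"

# Bob adds Alice's repository as a second remote and pulls from it directly.
cd "$workdir/bob"
git remote add alice "$workdir/alice"
git pull -q alice HEAD

# Once the feature is considered stable, Bob publishes it to origin.
git push -q origin HEAD
```

Nothing in the script treats origin.git differently from the other two repositories; it is "central" only because both developers agree to publish there.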

Git is not tied to any particular transport protocol. It supports transmitting changes via USB stick, email, ..., or in the traditional way over HTTP, FTP, SSH, ...

So although Git has replaced the typical SCM hub architecture with a peer-to-peer structure, we can still create (by convention) a central repository for uploading stable code. And let me write it again: "This central repo is just another node among the peers, not THE REPOSITORY".

What I am going to explain is how to install and configure this "central repo" on an Ubuntu Server.

We can say that Git only takes care of repository management and leaves transport operations to lower layers. A typical transport configuration for these central repos is the SSH protocol, so let's install and configure an SSH server (if you already have one installed, skip to the next step).

Install SSH Server:

$ sudo apt-get install openssh-server

After installing it, try connecting:

$ ssh <username>@<servername>

Configure SSH Server:

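In /etc/ssh/sshd_config, configure the server to use only SSH protocol 2 (recent OpenSSH releases no longer support protocol 1, so there this line is unnecessary):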

Protocol 2

The next step is to install Git (you can skip this step if you already have it installed).

Install Git (not git-core package):

$ sudo apt-get install git

Then run a Git command, for example git --version, to check that it has been installed correctly.

The next step is creating a bare repository for the project. By convention, bare repository directories end with .git, so the first thing to do is create a .git directory for the project.

Creating a bare repository from an existing repository:

$ git clone --bare my_project my_project.git

This command copies the contents of my_project/.git into a new my_project.git directory; the original repository is left untouched.
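If you want to see the effect for yourself before touching a real project, here is a small sketch that runs the same clone against a throwaway repository in a temporary directory. The README file and the commit identity are placeholders I made up for the example.

```shell
#!/bin/sh
# Sketch: make a bare clone of a throwaway repository and inspect it.
# The README file and the Dev identity are placeholders.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# A stand-in for an existing project with one commit.
git init -q my_project
cd my_project
echo "hello" > README
git add README
git -c user.name=Dev -c user.email=dev@example.com commit -q -m "Initial commit"
cd "$workdir"

# Same command as in the post.
git clone -q --bare my_project my_project.git

# The new directory reports itself as bare and holds Git's internal files
# (HEAD, objects, refs, ...) instead of a checkout of the project.
is_bare=$(git -C my_project.git rev-parse --is-bare-repository)
echo "bare: $is_bare"
ls my_project.git

# The original working copy is still there.
ls my_project
```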

Creating a new bare repository:

If you are starting a new project, you can initialize it directly as a bare repository using:

$ mkdir my_project.git
$ cd my_project.git
$ git init --bare

Now the whole structure is created and ready to be transferred. If the initial project was started on a developer's computer, you should copy this directory (using scp, for example) to the server that will host origin.

Then run the following command inside the my_project.git directory on the server:

$ git init --bare --shared

This command sets the group read/write permissions properly.
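A quick way to check what the command did is to look at the repository configuration afterwards. The sketch below does this on a temporary repository; the sharing mode is recorded under core.sharedRepository. I am assuming here that every developer who pushes belongs to the group that owns the directory, which is something you have to arrange yourself.

```shell
#!/bin/sh
# Sketch: re-initialize an existing bare repository as shared and inspect it.
# Assumption: the developers who push all belong to the directory's group.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-in for the directory that was copied to the server.
git init -q --bare my_project.git
cd my_project.git

# Running init again on an existing repository does not overwrite its
# contents; with --shared it records the sharing mode.
git init -q --bare --shared

# Show the recorded setting and the directory permissions.
shared_mode=$(git config --get core.sharedRepository)
echo "core.sharedRepository = $shared_mode"
ls -ld objects refs
```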

Now it is time to clone the newly created repository to a developer's computer. I assume that the developer already has an account on the server (for connecting using SSH). So go to the developer's computer (or open another terminal) and type the following command:

$ git clone <username>@<servername>:/<directories>/my_project.git

If the user has read permissions on the my_project.git directory, the repository will be downloaded to the local computer. Write permissions are required for pushing changes.

Now I suppose you are thinking that creating a remote repository was easy, but another problem arises. If your company is small, you can manually create a user on the server for each developer, and that is easy to manage. If your company is bigger, managing all those users is hard. You must create an account for each one and, more importantly, each of them gets access to a shell on the server over SSH (not only for uploading code), or FTP, ... This is a security problem: you have to take care of what each user can and cannot do in their shell.

At this point, one option is to set up accounts for everyone, which is straightforward but can be cumbersome. Another is to use LDAP or some other centralized authentication system, but that is outside the scope of this post.

A second method is to create an account called "git" on the server, ask every user who needs access to send their SSH public key, and add each key to the .ssh/authorized_keys file of the "git" user. I am sure this approach sounds familiar (the GitHub way?). So let's walk through it:

First of all, each user should send you their public key (they can find it in the *.pub file in their .ssh directory), or simply create a new key pair using the ssh-keygen command. See this tutorial to learn how to generate both keys: http://github.com/guides/providing-your-ssh-key.

Setting up Git server with user public keys:

The first step is to create a git user with a .ssh directory.

#from server
$ sudo adduser git
$ su git
$ cd
$ mkdir .ssh

The next step is to create the authorized_keys file, where all public keys will be stored:

For example:

#from server
$ cat id_dsa.user1.pub >> ~/.ssh/authorized_keys
$ cat id_dsa.user2.pub >> ~/.ssh/authorized_keys
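One detail worth adding: with its default StrictModes setting, sshd ignores an authorized_keys file if the file or the .ssh directory is writable by other users, so it pays to tighten the permissions. The sketch below shows the idea on a temporary directory that stands in for /home/git; the key contents are placeholders, not real keys.

```shell
#!/bin/sh
# Sketch: collect public keys into authorized_keys with strict permissions.
# git_home is a temporary stand-in for /home/git, and the key lines are
# placeholders rather than real public keys.
set -eu
git_home=$(mktemp -d)
trap 'rm -rf "$git_home"' EXIT

mkdir -p "$git_home/.ssh"
chmod 700 "$git_home/.ssh"

# Placeholder files; real ones are the *.pub files sent by each developer.
printf '%s\n' "ssh-rsa AAAAplaceholder1 user1@example.com" > "$git_home/user1.pub"
printf '%s\n' "ssh-rsa AAAAplaceholder2 user2@example.com" > "$git_home/user2.pub"

# Append one key per line.
for key in "$git_home/user1.pub" "$git_home/user2.pub"; do
    cat "$key" >> "$git_home/.ssh/authorized_keys"
done
chmod 600 "$git_home/.ssh/authorized_keys"

# Count the registered keys and show the resulting permissions.
key_count=$(grep -c . "$git_home/.ssh/authorized_keys")
echo "keys registered: $key_count"
ls -ld "$git_home/.ssh" "$git_home/.ssh/authorized_keys"
```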

Now each developer whose public key is listed in authorized_keys, and who has the matching private key in their own .ssh directory, has access to the repository. Let's try it: open another terminal (it would be the developer's machine in a real scenario) and clone the existing repo from the server:

#from developer computer
$ git clone git@<servername>:<directories>/my_project.git

After the repository is cloned to the developer's computer, modifications can be made and pushed.
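For completeness, here is a minimal sketch of that edit, commit and push cycle. To keep it runnable anywhere, a local bare repository stands in for git@<servername>:<directories>/my_project.git; notes.txt and the commit identity are invented for the example.

```shell
#!/bin/sh
# Sketch: clone, change, commit and push, then compare both ends.
# A local bare repository replaces the SSH URL; notes.txt and the Dev
# identity are placeholders.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

git init -q --bare my_project.git
git clone -q "$workdir/my_project.git" my_project
cd my_project

# Make a change and record it locally.
echo "first line" > notes.txt
git add notes.txt
git -c user.name=Dev -c user.email=dev@example.com commit -q -m "Add notes"

# Publish the current branch to origin.
git push -q origin HEAD

# The commit at the tip of origin should now match the local one.
local_commit=$(git rev-parse HEAD)
remote_commit=$(git ls-remote origin | head -n 1 | cut -f1)
echo "local:  $local_commit"
echo "remote: $remote_commit"
```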

Now you may say, "OK, I don't have to create an account for each developer, but I still have a security problem": each developer still has access to a shell. That is true, but you can easily restrict the "git" user to Git activities only, with a limited shell called git-shell. The next step is to specify git-shell instead of the current login shell for the git user in /etc/passwd.

$ sudo vim /etc/passwd

and change

git:x:1000:1000::/home/git:/bin/sh

to

git:x:1000:1000::/home/git:/usr/bin/git-shell
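To make clear which part of the line matters, the sketch below works on a copy of the sample entry rather than on the real /etc/passwd. Fields are separated by colons and the seventh one is the login shell. The chsh command in the final comment is the usual way to make the change without editing the file by hand; I am assuming git-shell is on the PATH of your server.

```shell
#!/bin/sh
# Sketch: show which /etc/passwd field holds the login shell.
# This only manipulates a string copy of the sample entry from the post;
# it does not touch the real /etc/passwd.
set -eu
entry='git:x:1000:1000::/home/git:/bin/sh'

# Field 7 is the login shell.
old_shell=$(printf '%s\n' "$entry" | cut -d: -f7)

# Rewrite only that field and keep the rest of the entry unchanged.
new_entry=$(printf '%s\n' "$entry" |
    awk -F: -v OFS=: '{ $7 = "/usr/bin/git-shell"; print }')

echo "old shell: $old_shell"
echo "new entry: $new_entry"

# On a real server the same change is usually made with chsh (needs root):
#   sudo chsh -s "$(command -v git-shell)" git
```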

Now your server is secured: only Git operations are allowed through the "git" account, and only for users who have sent their SSH public key.

Your central remote repository is now configured and ready to be used. At this point you may consider installing Git tools like gitweb, gitosis or gitolite, but they are off topic for this post.

I hope you have found this post useful.

Music: http://www.youtube.com/watch?v=q2AemC0cwy0

5 comments:

Anonymous said...

- What about some regular/normal titles for the posts?

Noam said...

A client can also push its public key directly to the authorized keys of the git account using the command:

ssh-copy-id -i ~/.ssh/id_rsa.pub username@host

Alex said...

Noam, thank you very much for your input, I did not know it existed.

Here is also a link that explains the options of this command:

http://linux.die.net/man/1/ssh-copy-id

Thank you very much.

JJ said...

Without the ssh-copy-id tool, for example for OS X users, you can do this:

cat id_rsa.pub | ssh superuser@git-server "cat >> /home/user-to-authorize/.ssh/authorized_keys"

Anonymous said...
This comment has been removed by a blog administrator.