Permissions setup for a Debian web host

> This is another of those “note-to-self” posts, where I detail how I’m setting something up so that I can refer back to it, or so I can point someone else to it. As is often the case, some of this may be Debian-specific…

There are different ways of approaching the task of setting the permissions for web directories, depending on how many users have access to the server, how many sites are sharing the server, and lots of other concerns. I tend to be in a situation where anyone who has shell access to the server is at a level trusted with web content, so that simplifies the process somewhat. I’ll look at different ways of dealing with this question, in increasing order of security.

## Simplest approach: use the `www-data` group

This is the default group that Debian has for web daemons. If you add everyone who can log in to this group, you can then use this group for all web directories that the notional “web team” needs to access, and make them “group writable”. **Be aware that this configuration also allows the Apache daemon itself to write to the web directories, which is an obvious potential security issue, so you need to be sure that the web applications in there don’t/won’t allow that.**

You can either specify the group when creating the user:

adduser --ingroup www-data USER

or add an existing user to the group:

adduser USER www-data
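
To confirm that the change took, a small check along these lines can help. This is only a sketch: `in_group` is a hypothetical helper name, not a Debian tool, and it does nothing more than wrap `id`.

```shell
# Hypothetical helper: succeed if USER is a member of GROUP.
# Usage: in_group USER GROUP
in_group() {
    # `id -nG USER` prints the names of all groups the user belongs to,
    # separated by spaces; put one per line and look for an exact match.
    id -nG "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Example: report whether USER is in www-data.
# in_group USER www-data && echo "in www-data" || echo "not in www-data"
```

Because `id -nG USER` reads the group database, it shows the change straight away, whereas a session that was already open keeps its old groups until the user logs in again.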

## Almost as simple as simplest approach: create a `webdev` group

This is also a very simple approach, which won’t allow Apache to write to the web directories, unless you specifically allow it. (This would usually be for cache directories, image upload and so on.)

“webdev” is just an arbitrary name; it can be anything you like, as long as a group with that name doesn’t already exist. First create your new group, then add the user(s) to it:

addgroup webdev
adduser USER webdev

It goes without saying (or should do) that for the above to work, you also need to give the right group access to the web directories you need. A simple example of this, making a few assumptions about your directory layout, would be:

chgrp -R webdev /www/www.example.com/htdocs
chmod -R g+w /www/www.example.com/htdocs

So what did we just do?

First, we recursively (`-R`) changed the group to be `webdev` for the `htdocs` directory. Then, we (also recursively) allowed the group write-access (`g+w`) on `htdocs`. Which means: from now on, anyone in the `webdev` group can create and edit files in `htdocs` and any of its subdirectories. Note that these lines will stop any previously configured group-access from working (if it’s a different group from `webdev`). However, if we have a directory with write-access for everyone (AKA “`chmod 777`”) as is sometimes the case with cache directories, for example, they won’t be affected.
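
If you’d rather see the effect before touching a live site, the same `chmod` call can be rehearsed on a scratch directory. A minimal sketch, assuming `mktemp` is available; `demo_group_write` is a made-up name and nothing under `/www` is touched:

```shell
# Sketch: rehearse `chmod -R g+w` on a throwaway copy of the layout.
demo_group_write() (
    umask 022                      # common default: new files 644, dirs 755
    root=$(mktemp -d) || exit 1
    cd "$root" || exit 1
    mkdir -p htdocs/img
    : > htdocs/index.html          # empty placeholder file

    chmod -R g+w htdocs            # the same call as in the text

    # Print the first ten characters of `ls -ld` (type + permission bits).
    for f in htdocs htdocs/img htdocs/index.html; do
        printf '%s %s\n' "$(ls -ld "$f" | cut -c1-10)" "$f"
    done

    cd / && rm -rf "$root"
)
```

Calling `demo_group_write` should report `drwxrwxr-x` for the two directories and `-rw-rw-r--` for the file: the group has gained `w`, and nothing else has changed.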

So, how can we make this more granular?

## Multi-layered approach: create per-site groups

If we wanted to have some directories writable by all our web team, and others by certain people in certain sub-teams, we can create multiple groups.

Take, for example, two subdomain sites on example.com: foo.example.com and bar.example.com. Of course, these could be different domains; I’m just sticking with example.com for the, er, examples. We want to stop each team from editing the other team’s site. A solution is to create two groups: `webdev-foo` and `webdev-bar`, maybe.

addgroup webdev-foo
addgroup webdev-bar
adduser fooguy webdev-foo
adduser foogal webdev-foo
adduser barboy webdev-bar
chgrp -R webdev-foo /www/foo.example.com/htdocs
chgrp -R webdev-bar /www/bar.example.com/htdocs
chmod -R g+w /www/foo.example.com/htdocs /www/bar.example.com/htdocs

This takes care of giving write-access for their sites to `fooguy`, `foogal` and `barboy`. Neither `fooguy` nor `foogal` will be able to write to the bar.example.com site’s directory, and `barboy` won’t be able to edit foo.example.com. If we want to allow all three of them to edit or create inside the main site, we just add them to the `webdev` group, assuming we’ve already set the permissions for its root directory and children to be `g+w`.

adduser fooguy webdev
adduser foogal webdev
adduser barboy webdev
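
With more than a couple of sites, typing these out gets repetitive. One option is a helper that only prints the commands, so they can be read before anything runs. This is a sketch: `site_group_commands` is a hypothetical name, and it assumes the `/www/SITE/htdocs` layout used in the examples above.

```shell
# Hypothetical helper: print (not run) the commands that give the named
# users group write access to one site's htdocs directory.
# Usage: site_group_commands GROUP SITE USER...
site_group_commands() {
    group=$1 site=$2
    shift 2
    echo "addgroup $group"
    for user in "$@"; do
        echo "adduser $user $group"
    done
    echo "chgrp -R $group /www/$site/htdocs"
    echo "chmod -R g+w /www/$site/htdocs"
}

# Review the output first; if it looks right, pipe it to a root shell:
# site_group_commands webdev-foo foo.example.com fooguy foogal | sh
```

Printing rather than executing keeps the helper harmless, and it can be tried out without root.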

## Checking permissions

If we pop over and have a look at these directories, what should we see?

cd /www
ls -l *example.com

The output should be something like:

bar.example.com:
total 4
drwxrwxr-x 2 root webdev-bar 4096 2012-01-09 18:49 htdocs

foo.example.com:
total 4
drwxrwxr-x 2 root webdev-foo 4096 2012-01-09 18:49 htdocs

www.example.com:
total 4
drwxrwxr-x 2 root webdev 4096 2012-01-09 18:50 htdocs

What does that mean? What we’re seeing here is that in all cases, the permissions are set as `drwxrwxr-x`, which means:

1. It’s a directory
2. User permissions are `rwx` – Read/Write/eXecute
3. Group permissions are also `rwx`
4. Other (“world”) permissions are `r-x` – Read/eXecute
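
The same information is often written as three octal digits, which is where “`chmod 777`” comes from: read counts 4, write 2 and execute 1, added up per triplet. A small sketch of the conversion follows; `mode_to_octal` is a hypothetical helper, and it deliberately ignores special bits such as `s` and `t`.

```shell
# Hypothetical helper: turn an `ls -l` style mode string into octal.
# Usage: mode_to_octal drwxrwxr-x
mode_to_octal() {
    perms=${1#?}                   # drop the leading type character
    out=""
    for start in 1 4 7; do         # user, group and other triplets
        trip=$(printf '%s' "$perms" | cut -c"$start"-"$((start + 2))")
        n=0
        case $trip in r??) n=$((n + 4)) ;; esac   # read
        case $trip in ?w?) n=$((n + 2)) ;; esac   # write
        case $trip in ??x) n=$((n + 1)) ;; esac   # execute
        out="$out$n"
    done
    printf '%s\n' "$out"
}
```

For the listing above, `mode_to_octal drwxrwxr-x` prints `775`.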

We can also see that each of the `htdocs` entries has `root` as its owner, and the respective group we set before as its group. If we’ve already got a super simple site in these – just an index and an image directory – and list inside of htdocs, we should see:

bar.example.com/htdocs:
total 4
drwxrwxr-x 2 root webdev-bar 4096 2012-01-09 19:01 img
-rw-rw-r-- 1 root webdev-bar 0 2012-01-09 18:59 index.html

foo.example.com/htdocs:
total 4
drwxrwxr-x 2 root webdev-foo 4096 2012-01-09 19:01 img
-rw-rw-r-- 1 root webdev-foo 0 2012-01-09 18:59 index.html

www.example.com/htdocs:
total 4
drwxrwxr-x 2 root webdev 4096 2012-01-09 19:01 img
-rw-rw-r-- 1 root webdev 0 2012-01-09 18:59 index.html

This tells us that the index and the directory are both editable by the right groups as well. (Files are `-rw-rw-r--`, meaning user and group read/write and world read-only.)

*To clarify: “execute”, when applied to directories, means the ability to change into it or open it. Applied to a file, the execute-bit is a potential hazard, if the file has any code in there, but that’s another story for another day.*
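
One thing these listings don’t cover is what happens to files created later. By default a new file gets its creator’s primary group rather than the directory’s group, and depending on the umask it may not be group-writable. A common remedy, which is not part of the setup described above and is offered here as an assumption, is to set the setgid bit on the directories and use a umask of `002`. The sketch below shows the effect on a scratch directory on Linux; `demo_setgid` is a made-up name.

```shell
# Sketch: a setgid directory passes its group and the setgid bit on to
# new subdirectories, and new files inherit the directory's group.
demo_setgid() (
    umask 002                      # keep group write on newly created files
    root=$(mktemp -d) || exit 1
    cd "$root" || exit 1
    mkdir htdocs
    chmod g+s htdocs               # "s" replaces "x" in the group triplet
    mkdir htdocs/img               # created after the bit was set
    : > htdocs/index.html

    for f in htdocs htdocs/img htdocs/index.html; do
        printf '%s %s\n' "$(ls -ld "$f" | cut -c1-10)" "$f"
    done

    cd / && rm -rf "$root"
)

# On a real tree the equivalent would be something like:
# find /www/foo.example.com/htdocs -type d -exec chmod g+s {} +
```

The directories should show up as `drwxrwsr-x`, including `img`, which never had `chmod g+s` applied to it directly.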

## More granularity: ACL

The approach detailed above is usually enough for most web situations, but if more control is required, we move into ACL territory (Access Control Lists). This is something that has to be made available at the filesystem level, and isn’t usually available on normal web hosts. As such, it’s a bit out of the scope of this post.

A web development workflow

[Update: added illustration of overview.]

I originally called this post “My web development workflow” but although it is *my* workflow, the idea of the article was to suggest one possible methodology for the kind of development with which I’m usually involved. Others I’ve worked with, both in the past and on an ongoing basis, have found this method to be both flexible and fast. Once the concepts are taken on board, it’s also very easy to understand. Another key advantage is that it’s designed to allow work to be carried out from multiple workstations – I’m using the word “workstation” very loosely here, including mobile devices – because the working copy of the files is at a remote location. The number of times I’ve saved somebody else’s skin thanks to that…

The whole shebang depends on certain tools of course. Some of the core elements of the setup are ubiquitous and *de facto* standards, such as Git, `rsync`, `ssh`, public/private key (PPK) authentication, and any flavour of Unix-like OS. That brings me to the less ubiquitous elements, although within the scope of web development, they are not that far from standard. These components include TextMate, from [MacroMates](http://macromates.com/), which also means that OS X is a required element. If you’re not based on OS X, you’ll need to replace TextMate with a suitable editing environment. If you’re using Windows, don’t. Life’s too short, really. I’ll come on to how TextMate integrates into the workflow later on.

Here’s a step-by-step rundown of the process to set this workflow up:

1. [create a central *bare* git repository](#centralrepo)
1. [create placeholder project file(s) in a work directory on the dev server](#placeholder)
1. [initialise the working dir as a git project, configure it, and push the placeholder files to the central repo](#gitinit)
1. [synchronise the remote work directory to a local directory](#rsync)
1. [open the local work directory in TextMate](#mate)
1. [configure Remote Project for TextMate with the remote working directory](#rp1)
1. [symlink the web root of the remote working dir to a web-visible location](#symlink)
1. [use Remote Project in TextMate to keep your local version in sync with your remote version-controlled copy](#rp2)
1. start hacking…

Seems like a lot of hassle? Well, it’s all done in seconds after the first time, and the benefits far outweigh the setup steps. Let’s go through each step in more detail.

Here’s an illustration of the overview:

[Illustration: A web development workflow 01]

The “mobile device(s)” block can be any platform that can run `ssh` – your laptop, your tablet, your phone, somebody else’s gear, whatever. As long as you can run `ssh` and you know how to use a good text editor (make that `vim`!) you can edit your working copy from anywhere and commit/push your changes.

## Create a central *bare* git repository {#centralrepo}

This is an easy one to get us started. Let’s call our project “devflow” so we have a handle for it. Create a central directory – I use `/opt/git/devflow.git/` – then go in there and initialise it:

cd /opt/git/devflow.git/
git init --bare

Additionally, and predictably if you’ve ever done anything like this before, the permissions need to allow your development team read/write access to the repo. One option is to create a group – I always thought “gits” had a ring to it – and set up the perms:

addgroup gits
chgrp -R gits /opt/git/devflow.git
chmod -R g+w /opt/git/devflow.git

Now any user in the gits group can write to that repo. If you have to add a user to a group it’s as simple as

adduser {username} {groupname}
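
A caveat with `chmod -R g+w` is that it only fixes the files that exist now; objects written by later pushes are created with each user’s own umask. Git has an option aimed at this: `git init --shared=group`, which records `core.sharedRepository` in the repo’s config. This isn’t what I did above, so treat the following as a sketch of an alternative for a fresh repo; `make_shared_repo` is a hypothetical helper name.

```shell
# Sketch: create a bare repository that git itself keeps group-writable.
# Usage: make_shared_repo /opt/git/devflow.git gits
make_shared_repo() {
    repo=$1 group=$2
    mkdir -p "$repo" || return 1
    # --shared=group stores core.sharedRepository in the repo's config,
    # so files git creates later stay readable and writable by the group.
    git init --quiet --bare --shared=group "$repo" || return 1
    # Hand the repository to the team's group (needs suitable privileges).
    chgrp -R "$group" "$repo"
}
```

The group has to exist first, and whoever runs this needs to be root or a member of that group for the `chgrp` to succeed.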

## Create placeholder project files {#placeholder}

Also simple: I usually use `~/jobs/devflow` as the working directory on the dev server. Sometimes, to get the ball rolling, I just create a single `readme.markdown` file.

## Initialise your working copy as a git repo {#gitinit}

Inside your working copy on the dev server, initialise a new git repository:

git init

If you haven’t configured git yet, start by telling it a little about yourself:

git config --global user.name "A. N. Other"
git config --global user.email "another@example.com"

Once you’ve done that, you need to tell the local git repo where the central repo is – the “origin”:

git remote add origin another@devserver:/opt/git/devflow.git

Apart from the URL of the remote origin, we need to tell git what branch to use and where to find it:

git config --add branch.master.remote origin
git config --add branch.master.merge refs/heads/master

If at this point, you have a look at the config with `cat` you’ll see the results of the configuration we’ve just done:

cat .git/config

[remote "origin"]
url = another@devserver:/opt/git/devflow.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master

We can see that the `git config` commands have written the items to the configuration file. It’s possible to edit this file directly as well – whatever you prefer. Another way to view the state of the configuration is to use git’s own config command:

git config -l

This lists the configuration in a compact version of the config file:

remote.origin.url=another@devserver:/opt/git/devflow.git
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.master.remote=origin
branch.master.merge=refs/heads/master
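
What remains of this step is to commit the placeholder and push it to the central repo. A minimal sketch, assuming the single `readme.markdown` placeholder and a branch called `master`, as in the config above; the function name and commit message are mine:

```shell
# Sketch: commit the placeholder and push it to the central repository.
# Run inside the working copy (~/jobs/devflow on the dev server), after
# the `git init`, `git remote add` and `git config` steps.
push_placeholder() {
    git add readme.markdown &&
    git commit --quiet -m "Add placeholder readme" &&
    git push --quiet origin master
}
```

After that, anyone in the repo’s group should be able to clone from `another@devserver:/opt/git/devflow.git` (with their own username, naturally).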

## Synchronise local with remote {#rsync}

In your terminal – I use iTerm, but OS X Terminal is fine – change to a working directory. Once again, for this I use `~/Jobs/DevFlow` or similar. Once you `cd` into there, just use `rsync`:

rsync -Chavuz another@devserver:~/jobs/devflow .

which will bring your project, such as it is, to your local machine. Taking the switches in order: `-C` ignores version control directories (we don’t want any git files arriving locally, because a second copy of the repository data will likely create issues for us), `-h` gives human-readable output, `-a` is archive mode, `-v` is verbose, `-u` skips files that are newer at the destination, and `-z` compresses the files *en route*.
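
To see the exclusion at work without going near the server, the same idea can be wrapped in a function and pointed at two local directories. This is a sketch: `pull_without_vcs` is a made-up name, and unlike the command above it uses trailing slashes, so it copies the *contents* of the source into the destination instead of creating a `devflow` directory inside it.

```shell
# Sketch: copy a working tree while leaving version-control data behind.
# SRC can be a local directory or a remote one such as
# another@devserver:~/jobs/devflow
pull_without_vcs() {
    src=$1 dest=$2
    # -a archive, -u skip files that are newer in DEST, -z compress,
    # -C apply CVS-style ignore rules; the explicit --exclude covers
    # rsync versions whose -C list does not mention .git.
    rsync -auzC --exclude '.git' "$src/" "$dest/"
}
```

Afterwards the destination should contain the project files but no `.git` directory.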

Note: this assumes you’ve set up key-based authentication on the dev server. If you haven’t, step into my office…

### Setting up PPK

On your local machine – let’s just drop the pretence and call it “Mac” from now on! – run this command:

ssh-keygen

Then answer with Enter (i.e. blank) to all the questions. This will create a `.ssh` directory in your home. The key files (see what I did there?) in there are

id_rsa
id_rsa.pub

As you can guess, `id_rsa.pub` is the public key and, by elimination, `id_rsa` is the private key. The public key you can share around however you like – the worst thing that can happen is that someone will give you a login on their machine. In our case, we’re interested in the contents of `id_rsa.pub`. The contents of that file will be added to another file, on the remote dev server. On the dev server, the simplest thing is to do the same thing: `ssh-keygen`. (You don’t need to create a key on the server, but a side-effect of doing this is that the `~/.ssh` directory is created for you in the right place with the right permissions.) Once you have the `~/.ssh/` dir, you can add a file:

touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
echo "{contents of local id_rsa.pub}" >> ~/.ssh/authorized_keys

What does a public key look like? Something like:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDFmDZzW5XHKjsV2Fo624wXOPxmLkLSKiFkoTgOIqqlu
A3eD1tBpEGvhXYfVab9MJt08w/WdNo828GDUK3bWL74tLKiKyOY+yGwaqLQlqbKti0sFHTHyN/U76aen2
3hKy9ChMc2rFXFiM73fgQQveB1vZkmMezf6xRYcm2uJFiP76aen23hKy9ChMc2rFXFiM73fgQQveB1vZk
mMezf6xiryOCYiFSHvXZBVdToOg0OtDDO6sH1L8li664CDN8KOX1UebOw2DwHIFXNqvIjZQG3+DN8KOX1
UebOw2DwHIFXNqvIjZQG37IvuAlYAR9UPT4FV9mexCxGBJ6xURmHHYRGJwqUYzJKrjqyUIpY8wlHZjqvk
doqeJlBoUrLGNiT+A/gC4kN5haj65pJDvhOU4J0ctD6b dom@macbookhair

So once the version of you that lives on the dev server has your public key it can check it on its list of `authorized_keys` to see whether to let you in.

You should never send the private key to anyone over any transport. And because the key was created with a blank passphrase, you don’t have to type a password! This is most important for scripted access, like the type that Remote Project is doing for you behind the scenes in TextMate.
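
Appending by hand works, but it’s easy to end up with the same key in the file several times. Here is a sketch of a more careful version for the dev server end; `add_authorized_key` is a hypothetical helper, and it assumes the public key has already been copied across as a file.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present.
# Usage: add_authorized_key PUBKEY_FILE [AUTHORIZED_KEYS_FILE]
add_authorized_key() {
    pubkey_file=$1
    auth_file=${2:-$HOME/.ssh/authorized_keys}
    auth_dir=$(dirname "$auth_file")

    # With its default StrictModes setting, sshd ignores keys when these
    # are too permissive, so keep them private to the user.
    mkdir -p "$auth_dir" && chmod 700 "$auth_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    key=$(cat "$pubkey_file") || return 1
    [ -n "$key" ] || return 1      # refuse to append an empty line
    # -x whole line, -F literal string: skip the append on an exact match.
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

Running it twice with the same key leaves a single entry. Where it’s installed, OpenSSH’s `ssh-copy-id` does a similar job from the Mac end.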

## Open your project in TextMate {#mate}

> I would recommend installing the TextMate [ProjectPlus](http://ciaranwal.sh/2008/08/05/textmate-plug-in-projectplus) plug-in which brings us some nice additional project-handling features. See the above link for full details.

Once you install the `mate` shell command, this is just a question of

cd ~/Jobs/DevFlow/
mate devflow

This will open the whole directory as a project in TextMate.

## Set up Remote Project {#rp1}

The TextMate bundle Remote Project is really a wrapper for `rsync`, allowing you to get from remote, put to remote and compare local and remote copies of your project.

[This section may have to change because of Remote Project’s seeming lack of availability.]

The configuration of Remote Project is handled by an environment variable: `TM_REMOTE_PROJECT`. This is set at the project level. Configure it to the location of the remote working copy: `another@devserver:~/jobs/devflow`.

[Screenshot: TM_REMOTE_PROJECT.png]

This is a good point to save the TM project (⌃⌘S). I usually save it in the same directory as the root dir of the working copy. In this case, that would be `~/Jobs/DevFlow/`.

Remote Project, as I have it configured, ignores the `.git` directory on your remote working copy, so you never get into a muddle with two competing copies of that. It also means that wherever you are you can work with the repo as long as you have access to ssh.

## Symlink to a web-visible location {#symlink}

Back on the dev server now, we create a virtual host for our project. This is normally a subdomain, possibly a sub-subdomain, like devflow.work.domi.co.uk. This means you can configure the DNS with a wildcard record, so when a client gives you something to work on urgently out of the blue, it’s ready.

If you have your vhosts set up to live somewhere like `/var/www/work.domi.co.uk/htdocs/` you can quickly symlink from there to `~/jobs/devflow` and you can see your changes as soon as Remote Project uploads them.

cd /var/www/work.domi.co.uk/htdocs/
ln -s /home/another/jobs/devflow

An example snippet from an Apache Virtual Hosts configuration might be:


ServerName devflow.work.domi.co.uk
DocumentRoot /var/www/work.domi.co.uk/htdocs/devflow
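
Going back to the symlink for a moment: if the working directory ever moves, running plain `ln -s` again against an existing link creates a new link *inside* the old target instead of replacing it. A sketch of a repeatable version follows; `link_workdir` is a hypothetical name.

```shell
# Hypothetical helper: point LINK at TARGET, replacing an existing
# symlink of the same name instead of nesting a new one inside it.
# Usage: link_workdir /home/another/jobs/devflow /var/www/work.domi.co.uk/htdocs/devflow
link_workdir() {
    target=$1 link=$2
    # -s symbolic, -f replace an existing link, -n treat an existing
    # link to a directory as the link itself (GNU and BSD ln).
    ln -sfn "$target" "$link"
}
```

Apache also has to be allowed to follow symlinks and to traverse the path under `/home/another`, or the link will resolve in the shell but not in the browser.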

## Start syncing local and remote copies of the project {#rp2}

The Remote Project bundle has three main functions:

- Upload Project Changes
- Get Remote Project
- Compare to Remote Project

They do what they sound like they will do. Inside the bundle, the commands are fairly normal `rsync` statements. The first thing to try is the compare function – right now, there should be no difference between the local and remote. Edit the `readme.markdown` we created, then run the comparison again. Now we should see that the local copy is newer.

[Screenshot: TM-RP-Compare.png]

The fast way to use Remote Project is with its default shortcut: ⌃⌘P then 1 for upload, 2 for download and 3 for compare.

[Screenshot: Remote_Project_menu.png]

The confirmation of the sync is just a small tooltip next to the text cursor in the editor:

[Screenshot: RM_tooltip.png]

### My customised Remote Project commands:

#### Upload (excerpt)

rsync -auCz --exclude 'phpdoc' --exclude '.git' --exclude "cache" \
--exclude "logs" --exclude ".DS_Store" --exclude "Thumbs.db" \
--exclude ".*.swp" --exclude "stats" --include ".gitignore" \
--include ".htaccess" "$TM_PROJECT_DIRECTORY/" "$ESCAPED_REMOTE"

#### Download (excerpt)

rsync --delete -auCz --exclude '.git' --exclude "cache" \
--exclude "logs" --exclude ".DS_Store" --exclude "Thumbs.db" \
--exclude ".*.swp" --exclude "stats" --include ".gitignore" \
--include ".htaccess" "$ESCAPED_REMOTE/" "$TM_PROJECT_DIRECTORY"

Obviously, in the bundle these commands are one-liners.

You’ll notice something important in the Download excerpt: that `rsync` has its `--delete` option set. This will remove any files from your local directory that aren’t present in the remote. That keeps things tidy, but it can also remove new files you’ve just created locally that you haven’t yet uploaded, so beware of that.

Other elements to those command lines are:

- we don’t upload “phpdoc”, because it’s generated automatically on the remote
- we exclude some standard things that aren’t part of the project: cache, logs, preview files, stats, `vi` swap files
- we make sure .gitignore and .htaccess are included – they’re part of the project, but sometimes default `rsync` configuration has them excluded
- the `-C` switch to `rsync` is to ignore source control directories (“C” for “CVS”) and in current versions that includes git, but not all versions of `rsync` know about git, so we explicitly `--exclude` it as well.
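
Given how destructive the delete option can be on a download, it may be worth previewing what it would remove. `rsync` has a dry-run switch (`-n`) for this. The function below is a sketch and not part of the Remote Project bundle; `preview_download` is a made-up name and the exclude list is trimmed for brevity.

```shell
# Sketch: list what a download with --delete would remove locally,
# without changing anything.
# Usage: preview_download another@devserver:~/jobs/devflow ~/Jobs/DevFlow/devflow
preview_download() {
    remote=$1 local_dir=$2
    # -n (--dry-run) reports actions without performing them;
    # -v makes rsync name each file it would delete.
    rsync -n -v --delete -auCz --exclude '.git' "$remote/" "$local_dir" |
        grep '^deleting '
}
```

If it prints nothing (and returns non-zero, because `grep` found no match), the real download won’t delete anything.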

## Conclusion

Now you’ve got to this point, you can work locally, with the advantages that brings – being able to view files in Finder, save directly and so on – and handle the versioning remotely in a shell, as it should be! And while you’re working, everything is visible at a development URL that is as close as possible to how the production site will be, because it’s at the root of an Apache Virtual Host, with as much of the configuration exactly the same as the final destination for the project’s hosting.