November 2009

one liner using xargs to automate a repetitive task

This article illustrates using the *nix commandline to chain several commands together to automate a repetitive task - in this case preparing some drupal modules into neatly labelled directories.

The example here covers quite a few useful commands and commandline techniques, which we can't really hope to cover in exhaustive detail; there's plenty of good documentation out there for everything we mention.

We've been experimenting with different ways of keeping several drupal modules for multi-site drupal installations up-to-date. One approach we're trying out involves using symlinks, so that we only need to change one symlink and all sites using the module see the change to the updated version.

This is meant to make our lives easier, but it can still be a tedious job downloading several updated modules and preparing them. We want to have each version of a module in a directory named after the release - this is so we can see that we have multiple versions of a module, and can change our symlink to point at the updated version (and roll back to the other version if we need to).

To prepare one module (which we've downloaded using wget) manually, we'd do this:

$ tar zxvf link-6.x-2.8.tar.gz
link/
link/tests/
link/tests/link.crud.test
link/link.css
link/link.info
link/link.install
link/link.module
link/translations/
link/translations/link.da.po
link/translations/link.de.po
link/translations/link.fr.po
link/translations/link.nl.po
link/translations/link.pot
link/translations/link.ru.po
link/views/
link/views/link.views.inc
link/views/link_views_handler_argument_target.inc
link/views/link_views_handler_filter_protocol.inc
link/LICENSE.txt
$ ll
total 32
drwxr-xr-x 5 pef pef 4096  2009-11-05 11:15 link
-rw-r--r-- 1 pef pef 24911 2009-11-05 11:15 link-6.x-2.8.tar.gz
$ mv link link-6.x-2.8
$ ll
total 32
drwxr-xr-x 5 pef pef 4096  2009-11-05 11:15 link-6.x-2.8
-rw-r--r-- 1 pef pef 24911 2009-11-05 11:15 link-6.x-2.8.tar.gz

...so it's as simple as untarring the module archive and renaming the directory so the release details are in the directory name.

When there are several modules to do this for, it gets quite repetitive though. In the spirit of being a lazy programmer we put this one-liner together:

ls *.tar.gz | sed s/.tar.gz// | xargs -ifoo sh -c 'tar zxvf foo.tar.gz && mv $(echo foo | sed s/-.*$//) foo'

There are a couple of different techniques for chaining commands together in use here, so let's have a look at what each part is doing.

ls *.tar.gz

This simply lists all tarballs (everything in the current directory which ends in .tar.gz).
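It may help to see that it is the shell, not ls, which expands *.tar.gz into the list of names. The sketch below is only an illustration: it works in a throwaway directory made with mktemp, and the views-6.x-2.6 and README.txt file names are stand-ins we made up for the demo.

```shell
# Throwaway directory holding two empty stand-in tarballs and one other file.
demo_dir=$(mktemp -d)
touch "$demo_dir/link-6.x-2.8.tar.gz" \
      "$demo_dir/views-6.x-2.6.tar.gz" \
      "$demo_dir/README.txt"

# ls receives the names already expanded by the shell...
listed=$(cd "$demo_dir" && ls *.tar.gz)

# ...so printf, given the same glob, produces the same list.
globbed=$(cd "$demo_dir" && printf '%s\n' *.tar.gz)

printf '%s\n' "$listed"
# link-6.x-2.8.tar.gz
# views-6.x-2.6.tar.gz
```

When ls writes to a pipe rather than a terminal it prints one name per line, which is what the next command in the chain relies on.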

We then pipe the output of this to the next command, which is sed:

sed s/.tar.gz//

So sed receives the list of tarballs, one name per line, and we use it to trim .tar.gz off the end of each name (by replacing it with nothing). In the example above, this would take link-6.x-2.8.tar.gz as its input, and give link-6.x-2.8 as its output. If there are more module names in the input list, they will be trimmed in the same way.
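One thing worth knowing about this pattern: an unescaped . in a regex matches any character, and the pattern isn't anchored to the end of the name. For ordinary module names that makes no difference, but a stricter variant is easy to write. The sketch below compares the two; strip_loose and strip_strict are names we've invented here, and guitar-gz-6.x-1.0.tar.gz is a made-up file name chosen to trip up the loose pattern.

```shell
# The pattern as used in the one-liner: unquoted, dots unescaped, unanchored.
strip_loose()  { sed s/.tar.gz//; }

# A stricter variant: quoted, dots escaped, anchored to the end of the line.
strip_strict() { sed 's/\.tar\.gz$//'; }

# Both agree on a typical module tarball name.
printf '%s\n' link-6.x-2.8.tar.gz | strip_loose     # link-6.x-2.8
printf '%s\n' link-6.x-2.8.tar.gz | strip_strict    # link-6.x-2.8

# A contrived name shows the difference: the loose pattern matches "itar-gz".
printf '%s\n' guitar-gz-6.x-1.0.tar.gz | strip_loose    # gu-6.x-1.0.tar.gz
printf '%s\n' guitar-gz-6.x-1.0.tar.gz | strip_strict   # guitar-gz-6.x-1.0
```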

We then use a pipe again to pass the output along the chain - this time to the xargs command. You could think of xargs a little bit like foreach in php - it's taking a list of items as input, and doing something to each of those items in turn. Here we're basically saying something a bit like this pseudo code:

foreach ($module_names as $foo) {
  // do some stuff with $foo
}

If you're doing something very simple with xargs that involves just one command, you don't necessarily need the sh -c which comes next. Here, though, we're using xargs to iterate over several commands strung together, so we use sh -c to say "use the shell to execute this string which we're preparing". The string inside the single quotes which we pass to the shell effectively sits inside the foreach loop in the pseudo code above.
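To make the difference concrete, here's a small sketch of xargs with and without sh -c. The items alpha and beta are just placeholder input, and we've used the -I foo spelling of the replacement option, which is the POSIX form of the -ifoo used in this article.

```shell
# One simple command per item: xargs can run it directly, no sh -c needed.
simple=$(printf '%s\n' alpha beta | xargs -I foo echo "item: foo")

# Two commands chained per item: hand the whole string to a shell.
chained=$(printf '%s\n' alpha beta |
    xargs -I foo sh -c 'echo "start foo" && echo "end foo"')

printf '%s\n' "$simple"
# item: alpha
# item: beta

printf '%s\n' "$chained"
# start alpha
# end alpha
# start beta
# end beta
```

Without sh -c, the && would be interpreted by the shell we typed the pipeline into, not run once per item by xargs.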

Let's look at the string in question then:

tar zxvf foo.tar.gz && mv $(echo foo | sed s/-.*$//) foo

As we gave xargs the -ifoo option, each time this line is run through, xargs will replace foo with the next item in the list of input that it received. So using the link module as the example, the shell sees this:
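A handy way to check what xargs will substitute is a dry run: swap sh -c for echo, so the string is printed instead of executed. This is only a sketch; it feeds in the single name link-6.x-2.8 by hand and uses -I foo, which GNU xargs documents as the preferred replacement for the deprecated -i form.

```shell
# Print the command string for each item instead of running it.
expanded=$(printf '%s\n' link-6.x-2.8 |
    xargs -I foo echo 'tar zxvf foo.tar.gz && mv $(echo foo | sed s/-.*$//) foo')

printf '%s\n' "$expanded"
# tar zxvf link-6.x-2.8.tar.gz && mv $(echo link-6.x-2.8 | sed s/-.*$//) link-6.x-2.8
```

Every occurrence of foo in the string is replaced, so it pays to pick a placeholder which doesn't appear anywhere else in the command.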

tar zxvf link-6.x-2.8.tar.gz && mv $(echo link-6.x-2.8 | sed s/-.*$//) link-6.x-2.8

Remember what we're trying to achieve here is to untar the archive, which will give us a directory called just link. We then want to rename that directory to link-6.x-2.8.

So the first command untarring the archive is straightforward. We then use && to chain more commands on afterwards (the shell will only continue executing the commands after the ampersands if the commands before them complete without any errors).
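The behaviour of && can be seen in isolation with true and false, which here stand in for a tar that succeeds and a tar that fails. This is just an illustrative sketch; the echo of $? is there to show the exit status the shell is acting on.

```shell
# First command succeeds, so the command after && runs.
after_ok=$(true && echo "renamed"; echo "status=$?")

# First command fails, so the command after && is skipped.
after_fail=$(false && echo "renamed"; echo "status=$?")

printf '%s\n' "$after_ok"
# renamed
# status=0

printf '%s\n' "$after_fail"
# status=1
```

In the one-liner this means a tarball which fails to unpack never reaches the mv step.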

The next bit uses command substitution: the commands inside $( ... ) run in a subshell, and their output is substituted into the command line that the commands outside the brackets see. So let's look inside the brackets first (we'll look at the version where xargs has already dealt with foo):

echo link-6.x-2.8 | sed s/-.*$//

This is simply using the sed command to do a search-and-replace again. This time we're trimming off everything from the first - character to the end of the string ($ anchors the regex pattern to the end). So in this example we get simply link - which just so happens to be the name of the directory which came out of the tarball, and which we want to rename.
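Here's that step on its own, alongside an alternative which does the same trim with the shell's built-in parameter expansion and so avoids starting sed at all. Both are sketches which assume the module name itself contains no - character; the variable names are ours.

```shell
name=link-6.x-2.8

# Command substitution: capture the output of the echo | sed pipeline.
dir_sed=$(echo "$name" | sed 's/-.*$//')

# Parameter expansion: remove the longest suffix matching "-*".
dir_pe=${name%%-*}

echo "$dir_sed"   # link
echo "$dir_pe"    # link
```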

Therefore, once the subshell's done its work in our link example, we're left with:

mv link link-6.x-2.8

So we're done - if the pseudo code was a good way of visualising what's going on, here it is with the link example:

// prepare a list of all the module names
foreach ($module_names as $foo) {
  // $foo is link-6.x-2.8
  // untar link-6.x-2.8.tar.gz
  tar zxvf link-6.x-2.8.tar.gz
  // rename link to link-6.x-2.8
  mv link link-6.x-2.8
}

Here's the whole thing again in one line in the shell:

ls *.tar.gz | sed s/.tar.gz// | xargs -ifoo sh -c 'tar zxvf foo.tar.gz && mv $(echo foo | sed s/-.*$//) foo'
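For comparison, here is one of the other ways you might approach the same job: a plain for loop using parameter expansion in place of ls, sed and xargs. Treat it as a sketch and not a drop-in replacement. The prepare_modules function is a name we've invented, it assumes each tarball unpacks to a directory named after the part before the first -, and the demo builds its own stand-in tarball in a throwaway directory.

```shell
# Hypothetical helper: unpack every *.tar.gz in the current directory and
# rename each unpacked directory after its release.
prepare_modules() {
    for archive in *.tar.gz; do
        [ -e "$archive" ] || continue      # the glob matched nothing
        release=${archive%.tar.gz}         # e.g. link-6.x-2.8
        module=${release%%-*}              # e.g. link
        tar zxf "$archive" && mv "$module" "$release"
    done
}

# Demo: build a stand-in link tarball in a temporary directory, then run it.
demo_dir=$(mktemp -d)
(
    cd "$demo_dir" || exit 1
    mkdir link && echo 'name = Link' > link/link.info
    tar zcf link-6.x-2.8.tar.gz link && rm -r link
    prepare_modules
    ls
)
# link-6.x-2.8
# link-6.x-2.8.tar.gz
```

Like the one-liner, this doesn't check whether the release directory already exists; if it does, mv will move the freshly unpacked directory inside it.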

As always in *nix there are no doubt plenty of other ways you could approach this (the same thing can often be said for drupal!) - comments and suggestions for improvement are welcome, as are any corrections to our explanation.

Some suggestions for further reading: