Escaping strings for use at any command line

Okay, I have finally sussed this problem on both Windows and Linux.

The following code is written in Perl but it can be quite easily adapted to work for pretty much any programming language.

Procedure for escaping an arbitrary argument for use at a command line

sub escape_arg {
	my $arg = shift;

	# Windows cmd.exe:
	if($^O eq "MSWin32") {

		# Sequence of backslashes followed by a double quote:
		# double up all the backslashes and escape the double quote
		$arg =~ s/(\\*)"/$1$1\\"/g;
		
		# Sequence of backslashes followed by the end of the string
		# (which will become a double quote later):
		# double up all the backslashes
		# (\z rather than $, because $ also matches before a trailing newline)
		$arg =~ s/(\\*)\z/$1$1/;

		# All other backslashes occur literally

		# Quote the whole thing:
		$arg = "\"".$arg."\"";

		# Escape shell metacharacters:
		$arg =~ s/([()%!^"<>&|;, ])/\^$1/g;
	}

	# Unix shells:
	else {
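		# An empty argument needs quotes, or it vanishes entirely:
		return "''" if $arg eq "";

		# Backslash-escape any hairy characters (except newlines):
		$arg =~ s/([^a-zA-Z0-9_\n])/\\$1/g;

		# Backslash-newline is a line continuation, so quote newlines instead:
		$arg =~ s/\n/'\n'/g;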
	}

	return $arg;
}
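
To make the Windows branch easier to follow, here is a rough sketch that applies the same four steps unconditionally and returns every intermediate stage, so it behaves the same on any OS. The name escape_arg_windows and the stage list are my own illustration, not part of the subroutine above.

```perl
use strict;
use warnings;

# Hypothetical helper: the Windows branch of escape_arg, applied
# regardless of $^O, returning the string after each step.
sub escape_arg_windows {
    my $arg = shift;
    my @stages = ($arg);

    # 1. Backslashes before a double quote: double them, escape the quote
    $arg =~ s/(\\*)"/$1$1\\"/g;
    push @stages, $arg;

    # 2. Backslashes at the very end of the string: double them
    $arg =~ s/(\\*)\z/$1$1/;
    push @stages, $arg;

    # 3. Wrap the whole thing in double quotes
    $arg = "\"" . $arg . "\"";
    push @stages, $arg;

    # 4. Caret-escape cmd.exe metacharacters (including the quotes)
    $arg =~ s/([()%!^"<>&|;, ])/\^$1/g;
    push @stages, $arg;

    return @stages;
}

# Print the stages for two of the test strings
for my $input ('Hello"world', 'C:\Program Files\\') {
    print "$_\n" for escape_arg_windows($input);
    print "\n";
}
```

For C:\Program Files\ the last stage comes out as ^"C:\Program^ Files\\^" and for Hello"world it comes out as ^"Hello\^"world^".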

Procedure for escaping the name of an arbitrary program for use at a command line

That is, the 0th argument of the call. On Windows, this needs different treatment from the actual arguments.

sub escape_prog {
	my $prog = shift;

	# Windows cmd.exe: needs special treatment
	if($^O eq "MSWin32") {
		# Escape shell metacharacters
		$prog =~ s/([()%!^"<>&|;, ])/\^$1/g;
	}
	
	# Unix shells: same procedure as for arguments
	else {
		$prog = escape_arg($prog);
	}

	return $prog;
}

Procedure for escaping an arbitrary command

The command is given as a program name followed by the arguments for that program. Returns the whole escaped command as a single string.

sub escape_cmd {
	die "No call supplied\n" unless scalar @_ > 0;

	my @escaped = ();

	push @escaped, escape_prog($_[0]);
	push @escaped, map { escape_arg($_) } @_[ 1 .. $#_ ];

	return join " ", @escaped;
}

Tests

These subroutines worked on my Windows machine and the Linux machine which hosts this site. If you find faults or want to suggest some more test strings, be my guest.

The complete list of strings I used for unit tests is:

yes
no
child.exe
argument 1
Hello, world
Hello"world
\some\path with\spaces
C:\Program Files\
she said, "you had me at hello"
arg;,ument"2
\some\directory with\spaces\
"
\
\\
\\\
\\\\
\\\\\
"\
"\T
"\\T
!1
!A
"!\/'"
"Jeff's!"
$PATH
%PATH%
&
<>|&^
()%!^"<>&|
>\\.\nul
malicious argument"&whoami
*@$$A$@#?-_
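
One way to check strings like these is a round trip: escape the string, pass it through the shell to a child process that prints its first argument, and compare what comes back. The sketch below is only an illustration and assumes a Unix shell. escape_arg_unix is a condensed stand-in for the Unix branch above so that the snippet runs on its own, and survives_round_trip is a hypothetical name.

```perl
use strict;
use warnings;

# Condensed stand-in for the Unix branch of escape_arg, so this
# sketch is self-contained. Swap in the real subroutines to test them.
sub escape_arg_unix {
    my $arg = shift;
    $arg =~ s/([^a-zA-Z0-9_])/\\$1/g;
    return $arg;
}

# Send one string through the shell to a child Perl process that
# prints its first argument, and report whether it came back intact.
sub survives_round_trip {
    my $string = shift;
    my @call = ($^X, "-e", 'print $ARGV[0]', $string);
    my $cmd = join " ", map { escape_arg_unix($_) } @call;
    my $echoed = `$cmd`;
    return defined $echoed && $echoed eq $string;
}

my @tests = (
    'yes', 'argument 1', 'Hello"world', '"Jeff\'s!"',
    '$PATH', '<>|&^', '*@$$A$@#?-_',
);

for my $string (@tests) {
    printf "%-4s %s\n", (survives_round_trip($string) ? "ok" : "FAIL"), $string;
}
```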

Discussion (5)

2012-02-22 20:33:39 by qntm:

It seems that this procedure has a fault on Windows, when trying to invoke a program whose name has a space in it. I'm unable to figure out a workaround for this. In particular, I can't find any way to invoke a program named "foo %PATH%.exe" at the command line. Any ideas, anybody?

2012-10-22 23:11:27 by Phil:

Trivial: invoke "foo "%"PATH"%".exe"

2012-11-30 16:52:22 by Johan:

I've made a JavaScript version of it: http://jsbin.com/anitaz/11/

You can use it without installing any software.

2013-08-06 14:55:51 by RP:

I am trying to run "tf changeset /collection:tfsapp.dotcom.blabla.org/Misc 765 /noprompt" as a command and the escaping doesn't work, unfortunately. It would be great if you wrote a post about what rules you are trying to implement for escaping characters.

2015-10-27 14:48:53 by Resuna:

Um.

First, a general solution is not possible even in principle on Windows, because the command line is passed to the program as a single string, not a series of strings. This means that the parsing is not guaranteed to be handled by a component provided by Microsoft. It may be completely ad-hoc. If you need to do this on Windows, god help you.

Second, if you need to use this function in UNIX, I mean if you THINK you need to use this in UNIX, the first thing you need to do is take a step back and look at what you're doing. Because in UNIX, you should be using exec() to call programs. If you use exec(), you don't need to quote anything. So in most cases, you shouldn't need to use this. There are a few cases where you do (for example, pasting a file name into a terminal window) but mostly the solution is refactoring.
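
For what it's worth, here is a minimal Perl sketch of that point, assuming a Unix system. When system(), exec() or a piped open() is given a list instead of a single string, Perl hands the arguments straight to the child process and no shell ever parses them. The child one-liner and the sample arguments are made up for illustration.

```perl
use strict;
use warnings;

# Arguments that would all need escaping at a shell prompt
my @args = ('Hello"world', '$PATH', 'two words');

# List form of a piped open: the program and its arguments are passed
# as separate strings, so no shell is involved and nothing is escaped.
open(my $child, "-|", $^X, "-e", 'print join("|", @ARGV)', @args)
    or die "Cannot run child: $!";
my $received = do { local $/; <$child> };
close($child);

print $received eq join("|", @args) ? "ok\n" : "FAIL\n";
```

The same goes for system($prog, @args) and exec($prog, @args) when they are called with more than one argument.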