Comparing Strings and Equality

Well, you'd think this was easy, and I guess it is, but look at these comparisons:

$string1 = "Hello World"
$string2 = "hello world"

$string1 -eq $string2 # Returns $true
$string1 -ceq $string2 # Returns $false

PowerShell can be a little inconsistent and you have to remember which operators and methods are case sensitive and which aren't. Sometimes it's obvious, as in the case above, but often you'll just reach for -eq rather than -ceq without thinking about it. You could force both strings to the same case using .ToLower() or .ToUpper(), e.g. $string1.ToLower() -eq $string2.ToLower(), but when you want a case sensitive comparison using -ceq makes more sense, I think.
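As a quick sketch of the operator variants (the variable names are just illustrative), PowerShell also has an explicitly case insensitive -ieq. For strings it behaves like -eq but makes the intent obvious to whoever reads the script next:

```powershell
$left  = 'Hello World'   # illustrative values, same text as the example above
$right = 'hello world'

$default     = $left -eq  $right   # no prefix: case insensitive by default
$sensitive   = $left -ceq $right   # c prefix: case sensitive
$insensitive = $left -ieq $right   # i prefix: explicitly case insensitive

"{0} {1} {2}" -f $default, $sensitive, $insensitive   # True False True
```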

There are some other ways of comparing strings:

$string1.equals($string2) # Returns $false
$string1.toLower().equals($string2) # Returns $true
$string1.equals($string2,1) # Returns $true

That last one got me thinking: what was the number, and were there other values I could use?

Best practice according to MSDN when using strings:

  • Use overloads that explicitly specify the string comparison rules for string operations. Typically, this involves calling a method overload that has a parameter of type StringComparison.
  • Use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching.
  • Use comparisons with StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for better performance.
  • Use string operations that are based on StringComparison.CurrentCulture when you display output to the user.
  • Use the non-linguistic StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase values instead of string operations based on CultureInfo.InvariantCulture when the comparison is linguistically irrelevant (symbolic, for example).
  • Use the String.ToUpperInvariant method instead of the String.ToLowerInvariant method when you normalize strings for comparison.
  • Use an overload of the String.Equals method to test whether two strings are equal.
  • Use the String.Compare and String.CompareTo methods to sort strings, not to check for equality.
  • Use culture-sensitive formatting to display non-string data, such as numbers and dates, in a user interface.
  • Use formatting with the invariant culture to persist non-string data in string form.

OK, that's going to take us way off base if we start looking at all of those best practices, so let's cherry pick a few related to this post.

Use overloads that explicitly specify the string comparison rules for string operations. Typically, this involves calling a method overload that has a parameter of type StringComparison.

Let's look at the Equals method by typing $string1.Equals (without the brackets) at the PowerShell prompt:

Method OverloadDefinitions : {bool Equals(System.Object obj), bool Equals(string value), bool Equals(string value, System.StringComparison comparisonType)}

MSDN, then, is recommending we use:
Equals(string value, System.StringComparison comparisonType)
rather than Equals(string value)
i.e. $string1.Equals($string2, <comparisonType>) in preference to $string1.Equals($string2)

The number 1 then in my earlier example is System.StringComparison comparisonType but which one?

MSDN tells us which comparison type to prefer: "Use StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase for comparisons as your safe default for culture-agnostic string matching." For a case insensitive comparison, that means:

[System.StringComparison]::OrdinalIgnoreCase

I can assign this to a variable and use it instead of the 1 in the example above like this.

$ordinalIgnorecase = [System.StringComparison]::OrdinalIgnoreCase
$string1.Equals($string2,$ordinalIgnorecase)

or you could use [System.StringComparison]::OrdinalIgnoreCase directly in the code, e.g. $string1.Equals($string2,[System.StringComparison]::OrdinalIgnoreCase)
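If the long form feels like too much typing, one thing that may help: PowerShell will normally convert a string to an enum value when a method parameter expects one, so the member name in quotes should work here too. A small sketch, assuming the same two strings as before:

```powershell
$string1 = 'Hello World'
$string2 = 'hello world'

# PowerShell converts the string 'OrdinalIgnoreCase' to [System.StringComparison]
$viaName = $string1.Equals($string2, 'OrdinalIgnoreCase')

# The static overload of Equals accepts the same comparison type
$viaStatic = [string]::Equals($string1, $string2, [System.StringComparison]::OrdinalIgnoreCase)

$viaName     # True
$viaStatic   # True
```

The quoted name is shorter, but a typo in it only shows up at run time, so the full enum reference may still be the safer choice in shared scripts.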

It also appears, and I say appears, that you can use a number as the comparison type, but what number should I use? Try as I might, I was unable to find out what the constants were, and remembering to type [System.StringComparison]::OrdinalIgnoreCase seemed a bit of a tough ask, although this could be added to every script as part of a template of course.

In actual fact, whilst it seems like the string comparison is just an integer and you can use it that way in lots of instances, it won't work in all cases, so like it or lump it you are stuck with using the longhand definition of the comparison type.

If I type [System.StringComparison]:: in the ISE, I get the following tab completion list:

[Screenshot: StringComparisons]

Testing various numbers as follows, I'm guessing that the numbers 0 to 5 relate to the comparison types in that list. What do you think?

$string1.Equals($string2,0) # Returns $false
$string1.Equals($string2,1) # Returns $true
$string1.Equals($string2,2) # Returns $false
$string1.Equals($string2,3) # Returns $true
$string1.Equals($string2,4) # Returns $false
$string1.Equals($string2,5) # Returns $true

Using 1, 3 or 5 will return a true result as these are the case insensitive comparisons (CurrentCultureIgnoreCase, InvariantCultureIgnoreCase and OrdinalIgnoreCase respectively). I'll cover off some research on culture later, but for now using 5 should work just fine for you. If you want a case sensitive comparison, 4 is Ordinal and 0 is CurrentCulture, but then you don't need this anyway.
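Rather than guessing at the numbers, you can ask .NET for them. This is a minimal sketch that lists each member of the enum next to its underlying integer; the $map variable is just an illustrative name:

```powershell
# Build an ordered table of member name -> underlying integer value
$map = [ordered]@{}
foreach ($value in [Enum]::GetValues([System.StringComparison])) {
    $map[$value.ToString()] = [int]$value
}

$map
# CurrentCulture              0
# CurrentCultureIgnoreCase    1
# InvariantCulture            2
# InvariantCultureIgnoreCase  3
# Ordinal                     4
# OrdinalIgnoreCase           5
```

That lines up with the test results: the odd numbers are the IgnoreCase members, which is why 1, 3 and 5 return $true.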

MSDN also listed this as best practice: Use the String.Compare and String.CompareTo methods to sort strings, not to check for equality.

This means don't use CompareTo to test strings for equality, even though it could be done of course: $string1.CompareTo($string2) returns 0 if they are equal, a negative number if $string1 sorts before $string2, and a positive number if it sorts after.
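To make the distinction concrete, here is a short sketch that uses Equals for the yes/no question and Compare only for ordering. The 'apple' and 'banana' values are made up for illustration:

```powershell
$string1 = 'Hello World'
$string2 = 'hello world'

# Equality: ask the question directly, with the comparison rules spelled out
$areEqual = [string]::Equals($string1, $string2, [System.StringComparison]::OrdinalIgnoreCase)

# Ordering: Compare returns zero, a negative number or a positive number
$order = [string]::Compare('apple', 'banana', [System.StringComparison]::Ordinal)

$areEqual      # True
$order -lt 0   # True, 'apple' sorts before 'banana'
```

Testing the sign of the result rather than looking for exactly -1 or 1 is the safer habit, since only the sign is guaranteed.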

Strange that this exact example is used on the scripting guys blog to test for equality!

As suggested above, you can use integers in place of the comparison type, but in some cases the method overloads mean that PowerShell will misinterpret the parameter if you do this. As with parameter sets in a function definition, PowerShell needs to be able to figure out which overload you mean from the arguments you pass when there are multiple options.

In the case of the IndexOf method for example the overload definition shows this:

int IndexOf(char value),
int IndexOf(char value, int startIndex),
int IndexOf(char value, int startIndex, int count),
int IndexOf(string value),
int IndexOf(string value, int startIndex),
int IndexOf(string value, int startIndex, int count),
int IndexOf(string value, System.StringComparison comparisonType),
int IndexOf(string value, int startIndex, System.StringComparison comparisonType),
int IndexOf(string value, int startIndex, int count, System.StringComparison comparisonType)

So the second parameter could be a start index or a comparison type.

How does PowerShell know which one you are giving it, a start index or a comparison type?

If you use my assumption above that you can replace the System.StringComparison comparisonType with an integer then it’s going to assume it’s a start index as that can only be an integer.
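A small sketch of that ambiguity, assuming the search text below (the variable names are illustrative). The integer 1 is the value of CurrentCultureIgnoreCase, but passed as a bare number it is an exact match for the int startIndex parameter, so that is the overload PowerShell picks:

```powershell
$text = 'Hello World'

# A bare integer binds to startIndex: a case sensitive search starting at index 1
$asStartIndex = $text.IndexOf('world', 1)

# The enum value binds to comparisonType: a case insensitive search from the start
$asComparison = $text.IndexOf('world', [System.StringComparison]::CurrentCultureIgnoreCase)

$asStartIndex   # -1, lower case 'world' is not found
$asComparison   # 6
```

Same number, two different answers, which is why the longhand enum is the only reliable way to say what you mean with methods like IndexOf.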

Given there are so many ways to test equality, which is best? This usually means which is fastest. What I can tell you is that using -eq or -ceq is much faster than using the string comparisons shown above. Try it for yourself and see:

$count = 100
$s1 = @()
$s2 = @()
for ($index = 0; $index -lt $count; $index++) {
    $s1 += "a"
    $s2 += "a"
}
Measure-Command {
    for ($index = 0; $index -lt $count; $index++) {
        $($s1[$index]).Equals($($s2[$index]),0)
    }
}
Measure-Command {
    for ($index = 0; $index -lt $count; $index++) {
        $($s1[$index]).Equals($($s2[$index]))
    }
}
Measure-Command {
    for ($index = 0; $index -lt $count; $index++) {
        $($s1[$index]) -eq $($s2[$index])
    }
}