Writing about software development, testing and Python. (2023)

Comparing strings is a fundamental task common to any programming language.

When it comes to Python, there are several ways of doing it. The best one will always depend on the use case, but we can narrow them down to a few that best fit this goal.

In this article, we'll do exactly that.

By the end of this tutorial, you'll have learned:

  • how to compare strings using the == and != operators
  • how to use the is operator to compare two strings
  • how to compare strings using the <, >, <=, and >= operators
  • how to compare two strings ignoring the case
  • how to ignore whitespace when performing string comparison
  • how to determine if two strings are similar by doing fuzzy matching
  • how to compare two strings and return the difference
  • how to debug when the string comparison is not working

Let's go!

Comparing strings using the == and != operators

The simplest way to check if two strings are equal in Python is to use the == operator. And if you are looking for the opposite, then != is what you need. That's it!

== and != are boolean operators, meaning they return True or False. For example, == returns True if the two strings match, and False otherwise.

>>> name = 'Carl'
>>> another_name = 'Carl'
>>> name == another_name
True
>>> name != another_name
False
>>> yet_another_name = 'Josh'
>>> name == yet_another_name
False

These operators are also case sensitive, which means uppercase letters are treated differently from lowercase ones. The example below shows just that: name starts with an uppercase C whereas yet_another_name starts with a lowercase c. As a result, Python returns False when comparing them with ==.

>>> name = 'Carl'
>>> yet_another_name = 'carl'
>>> name == yet_another_name
False
>>> name != yet_another_name
True

Comparing strings using the is operator

Another way of checking whether two strings are equal in Python is the is operator. However, the kind of comparison it performs is different from ==. The is operator checks whether the two strings are the same instance.

In Python—and in many other languages—we say two objects are the same instance if they are the same object in memory.

>>> name = 'John Jabocs Howard'
>>> another_name = name
>>> name is another_name
True
>>> yet_another_name = 'John Jabocs Howard'
>>> name is yet_another_name
False
>>> id(name)
140142470447472
>>> id(another_name)
140142470447472
>>> id(yet_another_name)
140142459568816

In memory, name and another_name refer to the same string object, whereas yet_another_name refers to a separate object that happens to hold the same text.

As you can see, we're comparing identities, not content. Two names with the same identity refer to the same object and therefore share the same memory location. Keep that in mind when using the is operator.
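To make the difference between content and identity concrete, here is a minimal sketch. The variable names are only illustrative, and whether two separately created strings end up sharing an object is an implementation detail (CPython may reuse some strings), so the sketch only relies on the cases that are guaranteed.

```python
original = "John Jabocs Howard"
alias = original  # a second name bound to the very same object
# Equal text, but assembled at runtime from smaller pieces
rebuilt = " ".join(["John", "Jabocs", "Howard"])

same_text = original == rebuilt   # == compares the characters
same_object = original is alias   # is compares the identity

print(same_text)                   # True
print(same_object)                 # True
print(id(original) == id(alias))   # True: one object, two names
```

If the question is "do these strings contain the same text?", == is the operator to reach for; is only answers "are these two names bound to one object?".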

Comparing strings using the <, >, <=, and >= operators

The third way of comparing strings is alphabetically. This is useful when we need to determine the lexicographical order of two strings.

Let's see an example.

>>> name = 'maria'
>>> another_name = 'marcus'
>>> name < another_name
False
>>> name > another_name
True
>>> name <= another_name
False
>>> name >= another_name
True

To determine the order, Python compares the strings character by character. In our example, the first three letters, mar, are the same, but the next one is not: c from marcus comes before i from maria.
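The character-by-character rule can be sketched with a small helper. The function name first_difference is hypothetical, introduced only for illustration; it finds the first position where the two strings differ, which is the position that decides the ordering.

```python
def first_difference(s1, s2):
    """Return (index, char_from_s1, char_from_s2) for the first mismatch, or None."""
    for index, (c1, c2) in enumerate(zip(s1, s2)):
        if c1 != c2:
            return index, c1, c2
    # No mismatch in the overlapping part: one string is a prefix of the other
    return None


position = first_difference("maria", "marcus")
print(position)               # (3, 'i', 'c')
print(ord("c") < ord("i"))    # True, so 'marcus' sorts before 'maria'
```

When the helper returns None, the shorter string is a prefix of the longer one, and Python orders the shorter string first.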

It's important to keep in mind that these comparisons are case-sensitive. Python treats uppercase and lowercase letters differently. For example, if we change "maria" to "Maria", then the result is different because M comes before m.

>>> name = 'Maria'
>>> another_name = 'marcus'
>>> name < another_name
True
>>> ord('M') < ord('m')
True
>>> ord('M')
77
>>> ord('m')
109

⚠️ WARNING ⚠️: Avoid comparing strings that represent numbers using these operators. The comparison is based on character ordering, not numeric value, which causes "2" < "10" to evaluate to False.

>>> a = '2'
>>> b = '10'
>>> a < b
False
>>> a <= b
False
>>> a > b
True
>>> a >= b
True
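One way around this, sketched below, is to convert the strings to numbers before comparing or sorting. This assumes every string really holds an integer; int() raises ValueError otherwise.

```python
versions = ["10", "2", "33"]

# Sorting as text compares character by character, so "10" comes before "2"
as_text = sorted(versions)

# Sorting with key=int compares the numeric values instead
as_numbers = sorted(versions, key=int)

is_smaller = int("2") < int("10")

print(as_text)      # ['10', '2', '33']
print(as_numbers)   # ['2', '10', '33']
print(is_smaller)   # True
```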

Compare two strings by ignoring the case

Sometimes we may need to compare two strings—a list of strings, or even a dictionary of strings—regardless of the case.

Achieving that will depend on the alphabet we're dealing with. For ASCII strings, we can either convert both strings to lowercase using str.lower(), or uppercase with str.upper() and compare them.

For other alphabets, such as Greek or German, converting to lowercase to make the strings case insensitive doesn't always work. Let's see some examples.

Suppose we have the German word 'Straße', which means "street". The same word can also be written without the ß, in which case it becomes Strasse. Let's see what happens when we lowercase both spellings and compare them.

>>> a = 'Straße'
>>> b = 'strasse'
>>> a.lower() == b.lower()
False
>>> a.lower()
'straße'
>>> b.lower()
'strasse'

That happens because ß is already a lowercase letter, so a simple call to str.lower() leaves it unchanged. It is equivalent to ss, but lowercasing does not perform that substitution.

The best way to ignore case and make effective case insensitive string comparisons is to use str.casefold. According to the docs:

Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string.

Let's see what happens when we use str.casefold instead.

>>> a = 'Straße'
>>> b = 'strasse'
>>> a.casefold() == b.casefold()
True
>>> a.casefold()
'strasse'
>>> b.casefold()
'strasse'
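The same idea extends to a list of strings. The sketch below wraps str.casefold in two small helpers; the names equals_ignore_case and contains_ignore_case are hypothetical and only serve the illustration.

```python
def equals_ignore_case(first, second):
    """Compare two strings after removing case distinctions."""
    return first.casefold() == second.casefold()


def contains_ignore_case(items, target):
    """Check whether any string in items matches target, ignoring case."""
    folded_target = target.casefold()
    return any(item.casefold() == folded_target for item in items)


print(equals_ignore_case("Straße", "STRASSE"))                 # True
print(contains_ignore_case(["Berlin", "Straße"], "strasse"))   # True
print(equals_ignore_case("Carl", "Josh"))                      # False
```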

How to compare two strings and ignore whitespace

Sometimes you might want to compare two strings by ignoring space characters. The best solution for this problem depends on where the spaces are, whether there are multiple spaces in the string and so on.

The first example assumes that the only difference between the strings is that one of them has leading and/or trailing spaces. In this case, we can trim both strings using the str.strip method and use the == operator to compare them.

>>> s1 = 'Hey, I really like this post.'
>>> s2 = ' Hey, I really like this post. '
>>> s1.strip() == s2.strip()
True

However, sometimes you have a string with whitespace all over it, including multiple consecutive spaces inside it. If that is the case, then str.strip is not enough.

>>> s2 = '  Hey,  I really   like this post.  '
>>> s1 = 'Hey, I really like this post.'
>>> s1.strip() == s2.strip()
False

The alternative then is to collapse the duplicate whitespace using a regular expression. This approach only collapses runs of whitespace into a single space, so we still need to strip the leading and trailing ones.

>>> import re
>>> s2 = '  Hey,  I really   like this post.  '
>>> s1 = 'Hey, I really like this post.'
>>> re.sub(r'\s+', ' ', s1.strip())
'Hey, I really like this post.'
>>> re.sub(r'\s+', ' ', s2.strip())
'Hey, I really like this post.'
>>> re.sub(r'\s+', ' ', s1.strip()) == re.sub(r'\s+', ' ', s2.strip())
True
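If this comparison is needed in more than one place, the regex can be wrapped in a helper. This is only a sketch; the names normalize_spaces and equal_ignoring_extra_spaces are hypothetical.

```python
import re

# Matches one or more consecutive whitespace characters (spaces, tabs, newlines)
_WHITESPACE_RUN = re.compile(r"\s+")


def normalize_spaces(text):
    """Trim the ends and collapse every inner whitespace run to one space."""
    return _WHITESPACE_RUN.sub(" ", text.strip())


def equal_ignoring_extra_spaces(first, second):
    """Compare two strings after normalizing their whitespace."""
    return normalize_spaces(first) == normalize_spaces(second)


messy = "  Hey,  I really   like this post.  "
clean = "Hey, I really like this post."
print(normalize_spaces(messy))                     # Hey, I really like this post.
print(equal_ignoring_extra_spaces(messy, clean))   # True
```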

Or if you don't care about duplicates and want to remove everything, then just pass the empty string as the second argument to re.sub.

>>> s2 = '  Hey,  I really   like this post.  '
>>> s1 = 'Hey, I really like this post.'
>>> re.sub(r'\s+', '', s1.strip())
'Hey,Ireallylikethispost.'
>>> re.sub(r'\s+', '', s2.strip())
'Hey,Ireallylikethispost.'
>>> re.sub(r'\s+', '', s1.strip()) == re.sub(r'\s+', '', s2.strip())
True

The final method is to use a translation table. This solution is an interesting alternative to regex.

>>> table = str.maketrans({' ': None})
>>> table
{32: None}
>>> s1.translate(table)
'Hey,Ireallylikethispost.'
>>> s2.translate(table)
'Hey,Ireallylikethispost.'
>>> s1.translate(table) == s2.translate(table)
True

A nice thing about this method is that it allows removing not only spaces but other chars such as punctuation as well.

>>> import string
>>> table = str.maketrans(dict.fromkeys(string.punctuation + ' '))
>>> s1.translate(table)
'HeyIreallylikethispost'
>>> s2.translate(table)
'HeyIreallylikethispost'
>>> s1.translate(table) == s2.translate(table)
True

How to compare two strings for similarity (fuzzy string matching)

Another popular string comparison use case is checking if two strings are almost equal. In this task, we're interested in knowing how similar they are instead of comparing their equality.

To make it easier to understand, consider a scenario where we have two strings and we are willing to ignore spelling mistakes. Unfortunately, that's not possible with the == operator.

We can solve this problem in two different ways:

  • using difflib from the standard library
  • using an external library such as jellyfish

Using difflib

The difflib module in the standard library has a SequenceMatcher class that provides a ratio() method, which returns a measure of the strings' similarity as a float between 0 and 1.

Suppose you have two similar strings, say a = "preview", and b = "previeu". The only difference between them is the final letter. Let's imagine that this difference is small enough for you and you want to ignore it.

By using SequenceMatcher.ratio() we can get a similarity score and use that number to decide whether the two strings are similar enough.

>>> from difflib import SequenceMatcher
>>> a = "preview"
>>> b = "previeu"
>>> SequenceMatcher(a=a, b=b).ratio()
0.8571428571428571

In this example, SequenceMatcher tells us that the two strings are about 86% similar. We can then pick a threshold, compare this number against it, and ignore the difference when the score is high enough.

>>> def is_string_similar(s1: str, s2: str, threshold: float = 0.8) -> bool:
...     return SequenceMatcher(a=s1, b=s2).ratio() > threshold
...
>>> is_string_similar(s1="preview", s2="previeu")
True
>>> is_string_similar(s1="preview", s2="preview")
True
>>> is_string_similar(s1="preview", s2="previewjajdj")
False

There's one problem, though. The threshold depends on the length of the string. For example, two very small strings, say a = "ab" and b = "ac" will be 50% different.

>>> SequenceMatcher(a="ab", b="ac").ratio()
0.5
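The reason short strings score so low becomes clearer when the ratio is computed by hand. According to the difflib documentation, the ratio is 2.0 * M / T, where M is the number of matching characters and T is the total number of characters in both strings. The sketch below reproduces that formula; manual_ratio is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def manual_ratio(a, b):
    """Recompute SequenceMatcher.ratio() as 2 * matches / total length."""
    matcher = SequenceMatcher(a=a, b=b)
    # Each matching block reports how many consecutive characters it covers
    matches = sum(block.size for block in matcher.get_matching_blocks())
    total = len(a) + len(b)
    return 2.0 * matches / total if total else 1.0


print(manual_ratio("ab", "ac"))   # 0.5: one match out of four characters
print(manual_ratio("preview", "previeu") == SequenceMatcher(a="preview", b="previeu").ratio())
```

With only four characters in total, a single mismatch costs half the score, whereas in longer strings the same mismatch barely moves it.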

So, setting up a decent threshold may be tricky. As an alternative, we can try another algorithm, one that counts the edits, including transpositions of letters, needed to turn one string into the other. And the good news is, such an algorithm exists, and that's what we'll see next.

Using Damerau-Levenshtein distance

The Damerau-Levenshtein algorithm counts the minimum number of operations needed to change one string into another.

In other words, it tells us how many insertions, deletions, or substitutions of a single character, or transpositions of two adjacent characters, we need to perform so that the two strings become equal.

In Python, we can use the function damerau_levenshtein_distance from the jellyfish library.

Let's see what the Damerau-Levenshtein distance is for the last example from the previous section.

>>> import jellyfish
>>> jellyfish.damerau_levenshtein_distance('ab', 'ac')
1

It's 1! So that means to transform "ac" into "ab" we need 1 change. What about the first example?

>>> s1 = "preview"
>>> s2 = "previeu"
>>> jellyfish.damerau_levenshtein_distance(s1, s2)
1

It's 1 too! And that makes a lot of sense. After all, we just need to edit the last letter to make them equal.

This way, we can set the threshold based on the number of changes instead of a ratio.

>>> def are_strings_similar(s1: str, s2: str, threshold: int = 2) -> bool:
...     return jellyfish.damerau_levenshtein_distance(s1, s2) <= threshold
...
>>> are_strings_similar("ab", "ac")
True
>>> are_strings_similar("ab", "ackiol")
False
>>> are_strings_similar("ab", "cb")
True
>>> are_strings_similar("abcf", "abcd")
True
# these ones are not that similar, but we have a default threshold of 2
>>> are_strings_similar("abcf", "acfg")
True
>>> are_strings_similar("abcf", "acyg")
False
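To see what is being counted, here is a pure-Python sketch of the restricted variant of the algorithm, often called optimal string alignment. It is not jellyfish's implementation, and the two can disagree on some inputs where a substring is edited more than once; the function name osa_distance is hypothetical.

```python
def osa_distance(s1, s2):
    """Count insertions, deletions, substitutions and adjacent transpositions."""
    rows, cols = len(s1) + 1, len(s2) + 1
    # d[i][j] holds the distance between s1[:i] and s2[:j]
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of s1[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of s2[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Two adjacent characters swapped count as a single edit
            if (i > 1 and j > 1
                    and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("ab", "ac"))            # 1
print(osa_distance("preview", "previeu"))  # 1
print(osa_distance("ab", "ba"))            # 1, a single transposition
```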

How to compare two strings and return the difference

Sometimes we know in advance that two strings are different and we want to know what makes them different. In other words, we want to obtain their "diff".

In the previous section, we used difflib as a way of telling if two strings were similar enough. This module is actually more powerful than that, and we can use it to compare the strings and show their differences.

The annoying thing is that it requires a list of strings instead of just a single string. It then returns a generator whose items you can join into a single string to print the difference. Keep in mind that a generator can only be consumed once.

>>> import difflib
>>> d = difflib.Differ()
>>> diff = d.compare(['my string for test'], ['my str for test'])
>>> diff
<generator object Differ.compare at 0x7f27703250b0>
>>> list(diff)
['- my string for test', '?       ---\n', '+ my str for test']
>>> # list() exhausted the generator, so build a new one before printing
>>> diff = d.compare(['my string for test'], ['my str for test'])
>>> print('\n'.join(diff))
- my string for test
?       ---

+ my str for test
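When only the changed fragments are needed rather than a printable diff, SequenceMatcher.get_opcodes() is another option from the same module. The sketch below collects every non-equal operation; describe_changes is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def describe_changes(old, new):
    """Return (operation, removed_text, added_text) for each differing span."""
    changes = []
    matcher = SequenceMatcher(a=old, b=new)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of 'replace', 'delete' or 'insert'
            changes.append((tag, old[i1:i2], new[j1:j2]))
    return changes


print(describe_changes("my string for test", "my str for test"))
# [('delete', 'ing', '')]
```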

String comparison not working?

In this section, we'll discuss the reasons why your string comparison is not working and how to fix it. The two main reasons based on my experience are:

  • using the wrong operator
  • having a trailing space or newline

Comparing strings using is instead of ==

This one is very common amongst novice Python developers. It's easy to use the wrong operator, especially when comparing strings.

As we've discussed in this article, only use the is operator if you want to check whether the two strings are the same instance.

Having a trailing whitespace or newline (\n)

This one is very common when reading a string from the input function. Whenever we use this function to collect information, the user might accidentally add a trailing space.

If you store the result from the input in a variable, you won't easily see the problem.

>>> a = 'hello'
>>> b = input('Enter a word: ')
Enter a word: hello 
>>> a == b
False
>>> a
'hello'
>>> b
'hello '
>>> a == b.strip()
True

The solution here is to strip the whitespace from the string the user enters and then compare it. You can apply the same treatment to any input source you don't trust.
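When a comparison fails for no visible reason, printing the repr() of each string exposes hidden characters such as trailing spaces or newlines. A minimal sketch follows; the helper name matches_expected is hypothetical.

```python
def matches_expected(user_text, expected):
    """Compare after removing leading and trailing whitespace from the input."""
    return user_text.strip() == expected


raw = "hello \n"  # what a file or form field might hand over

print(raw == "hello")                   # False
print(repr(raw))                        # 'hello \n' -- the culprit is now visible
print(matches_expected(raw, "hello"))   # True
```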


In this guide, we saw 8 different ways of comparing strings in Python and the two most common mistakes. We saw how we can leverage different operators to perform string comparison and how to use external libraries to do fuzzy string matching.

Key takeaways:

  • Use the == and != operators to compare two strings for equality
  • Use the is operator to check if two strings are the same instance
  • Use the <, >, <=, and >= operators to compare strings alphabetically
  • Use str.casefold() to compare two strings ignoring the case
  • Trim strings using native methods or regex to ignore whitespace when performing string comparison
  • Use difflib or jellyfish to check if two strings are almost equal (fuzzy matching)
  • Use difflib to compare two strings and return the difference
  • String comparison is not working? Check for trailing or leading spaces, or understand if you are using the right operator for the job

That's it for today, and I hope you learned something new. See you next time!

This post was originally published at https://miguendes.me


