Python - Remove All Duplicated Words in a String

Sovary May 27, 2022

In this tutorial we will look at ways to remove duplicated words from a given sentence in Python. We check the given string for repeated words, remove them, and return a new string in which every word appears only once.

Here is an example of the expected input and output (the word order of the output depends on the method used):

Input:

Python is and Java is also great

Output:

and great Java Python is also

The methods below will help you get this result.

1. Loop and Distinct Words

Algorithm

  1. Split the given string to get a list of words
  2. Create an empty list to collect words in the next step
  3. Loop over the words; if a word is not yet in the new list, add it
  4. Join all the words together to form the new sentence

Syntax:

text = "cambotutorial is free tutorial and website and free tools"
# split the sentence into a list of words
words = text.split()
unique_words = []
for word in words:
    # keep only the first occurrence of each word
    if word not in unique_words:
        unique_words.append(word)
# join the unique words with a space
result = ' '.join(unique_words)
print(result)

Output:

cambotutorial is free tutorial and website tools
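
The same loop can be wrapped in a reusable function. The sketch below is one possible variant, not part of the original example: the function name `remove_duplicate_words` and the helper set `seen` are names chosen here for illustration. The set only speeds up the "have we seen this word" check; the list still keeps the first occurrence of each word in its original position.

```python
def remove_duplicate_words(text):
    """Return text with repeated words removed, keeping first occurrences."""
    seen = set()   # words already added (fast membership test)
    unique = []    # unique words in their original order
    for word in text.split():
        if word not in seen:
            seen.add(word)
            unique.append(word)
    return " ".join(unique)


print(remove_duplicate_words("cambotutorial is free tutorial and website and free tools"))
# cambotutorial is free tutorial and website tools
```

For short sentences the plain list version is just as good; the extra set mainly helps when the text is long.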

2. Using the set Data Type

set is a built-in Python data type that holds a collection of items. A set cannot store duplicated values, so repeated items are filtered out automatically. We can take advantage of this to remove duplicated items from a list. Note that a set is unordered, so the words may not come back in their original order.

Algorithm

  1. Split the given string to get a list of words
  2. Convert the list into a set (duplicates are removed automatically)
  3. Join all the words together to form the new sentence

Syntax:

text = "cambotutorial is free tutorial and website and free tools"
# split the sentence into a list of words
lst = text.split()
# convert the list to a set, which drops duplicates
no_dup = set(lst)
# join the set items with a space
result = " ".join(no_dup)
print(result)

Output:

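tools website cambotutorial free and is tutorial

This is only one possible result. The seven unique words are always printed, but because a set is unordered, their order is arbitrary and can change from run to run.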
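
Because a set does not remember the order of its items, the joined sentence can come out in a different order from one run to the next. If the original order matters, one option is to sort the unique words by the position of their first occurrence. This is a minimal sketch with illustrative variable names; `words.index` searches the list for every word, so it suits short sentences rather than long texts.

```python
text = "cambotutorial is free tutorial and website and free tools"
words = text.split()

# set() removes duplicates but does not keep the order of the words
unique = set(words)

# sort the unique words by the index of their first occurrence in the list
ordered = sorted(unique, key=words.index)
result = " ".join(ordered)
print(result)
# cambotutorial is free tutorial and website tools
```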

3. Using Dictionary fromkeys() Method

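This built-in method is quite similar to the set data type, because a Python dictionary can only have unique keys, so the keys never contain duplicates. Unlike a set, a dictionary keeps its keys in insertion order (Python 3.7 and later), so the words stay in their original order.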

Algorithm

  1. Split the given string to get a list of words
  2. Pass the list of words to dict.fromkeys(), which turns each word into a dictionary key
  3. Join all the keys together to form the new sentence

Syntax:

text = "cambotutorial is free tutorial and website and free tools"
# split the sentence into a list of words
lst = text.split()
# build a dictionary whose keys are the unique words (values are None)
key = dict.fromkeys(lst)
# join the dictionary keys with a space
result = " ".join(key)
print(result)

Output:

cambotutorial is free tutorial and website tools
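
All the examples in this tutorial compare words exactly, so `Free` and `free` count as two different words. If that is not wanted, the dictionary idea can be adapted so that the key is a case-folded copy of the word and the value is the first spelling seen. This is only a sketch under that assumption; the function name `remove_duplicates_ignore_case` is hypothetical and not part of the original method.

```python
def remove_duplicates_ignore_case(text):
    """Drop repeated words, treating 'Python' and 'python' as the same word."""
    first_seen = {}
    for word in text.split():
        # casefold() gives a case-insensitive key;
        # setdefault() stores the word only if the key is new
        first_seen.setdefault(word.casefold(), word)
    # the values are the first spellings, in insertion order
    return " ".join(first_seen.values())


print(remove_duplicates_ignore_case("Python is great and python IS fun"))
# Python is great and fun
```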

4. Using collections Counter()

Another way to filter out duplicated words is to use Counter(), which is similar to the 3rd method. Counter() converts the list into a dictionary-like object of key and value pairs, where each key is a word and each value is the number of times it occurs. Taking its keys() gives the words without duplicates.

Algorithm

  1. Split the given string to get a list of words
  2. Count the words with Counter(), which gives word and count pairs
  3. Get the unique words with the keys() method
  4. Join all the words together to form the new sentence

Syntax:

# import Counter from the collections module
from collections import Counter

text = "cambotutorial is free tutorial and website and free tools"
# split the sentence into a list of words
lst = text.split()
# keys are the words, values are how many times each word occurs
word_dic = Counter(lst)
# the keys are the unique words
key = word_dic.keys()
# join the keys with a space
result = " ".join(key)
print(result)

Output:

cambotutorial is free tutorial and website tools
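
What Counter() adds over the dictionary method is the count of each word. As a small illustration (the variable names `counts` and `repeated` are chosen here, not taken from the example above), the counts can be used to list which words were actually duplicated:

```python
from collections import Counter

text = "cambotutorial is free tutorial and website and free tools"
counts = Counter(text.split())

# words that occur more than once, in order of first appearance
repeated = [word for word, n in counts.items() if n > 1]
print(repeated)
# ['free', 'and']

# how many times a single word occurs
print(counts["free"])
# 2
```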

Which of these methods do you think is the best for removing duplicated words?

Author

Founder of CamboTutorial.com. I am happy to share programming knowledge that can help other people. I love writing tutorials on PHP, Laravel, Python, Java, and Android development; all published posts are kept simple and easy for beginners to understand.