How to remove duplicate values from an array in Ruby?

How to remove duplicate values from an array in Ruby?

Greetings

Hey guys, how's it going?

Introduction

In this article, let's talk about how to remove duplicate values from arrays in Ruby.

Nowadays, the most important asset someone might have is data. Data is everything. The only way to make a good decision is when you have enough information about a specific topic.

Think in a big picture, multiple duplicate records show inconsistency and inaccuracy in your data. As you know, Companies rely on accurate data to support marketing, sales and their customer service activities, so, duplicate records affect that.

Anyway, let's get down to business.

How to remove duplicate values from an array in Ruby?

The answer is the non-destructive method uniq or the destructive method uniq!.

Syntax

array.uniq

or

array.uniq!

Example

irb(main):001:0> array = [1,1,1,2,3,4,5]
irb(main):002:0> array.uniq
=> [1, 2, 3, 4, 5]

What does the uniq method do?

According to the documentation , the uniq method:

Returns a new Array containing those elements from self that are not duplicates, the first occurrence always being retained.

So, it returns a new array with no duplicate values, the first occurrence is chosen over the second one. If the goal is to change the array itself, and not creating a new one, then, the destructive uniq! method needs to be used.

Consider the following array:

irb(main):001:0> array = [4, 3, 4, 1, 2, 3, 1, 2, 8, 8]

Let's use the uniq method and see the result.

irb(main):001:0> array = [4, 3, 4, 1, 2, 3, 1, 2, 8, 8] 
irb(main):002:0> array.uniq
=> [4, 3, 1, 2, 8]

The result:

  • The second occurrence of the elements is not retained;
  • The array elements follow the same order;

Check it again, the bold elements are retained, while the italic elements are not: array = [4, 3, 4, 1, 2, 3, 1, 2, 8, 8]

irb(main):001:0> array = [4, 3, 4, 1, 2, 3, 1, 2, 8, 8] 
irb(main):002:0> array.uniq
=> [4, 3, 1, 2, 8]

Just a heads up, if the array elements are strings, there is a difference between capital and small letters. In the following example, there are no duplicate string elements inside the array. Check it out.

irb(main):001:0> array = ["DOG", "dog", "Dog", "doG", "cat", "cAt", "caT"]
irb(main):002:0> array.uniq
=> ["DOG", "dog", "Dog", "doG", "cat", "cAt", "caT"]

In the previous example, the uniq method returned the same array, no duplicate values were found. Every normal human being would say that these two words are the same: "Dog" and "dog", however, in our previous example, it's not true. Why does that happen? The uniq method uses the eql? method to compare, long story short, two strings are really equal when they are exactly the same, considering capital and small letters.

This article is not going to cover the details of the use of the eql? method, but this is just one example.

irb(main):002:0> "Dog".eql?("Dog")
=> true
irb(main):003:0> "Dog".eql?("doG")
=> false

How to use the Uniq method with a block

This such a powerful thing, because it is like a condition that you add to it.

Let's check one example. In the following code, the result is only one array item for each different size.

irb(main):014:0> array = ["dog",  "sheep",  "duck", "cat", "chicken", "mouse"]
irb(main):015:0> array.uniq {|item| item.size }
=> ["dog", "sheep", "duck", "chicken"]
irb(main):016:0>

There are 3 letters in "dog", 5 letters in "sheep", 4 letters in "duck" and 7 letters in "chicken". Only one item for each different size.

Return

uniq returns a new array.

irb(main):005:0> array = [1,1,1,2,3,4,5,1]
irb(main):006:0> array.uniq
=> [1, 2, 3, 4, 5]

It returns a new array even if there are no duplicate values.

irb(main):001:0> array1 = [1, 2, 3, 4, 5]
irb(main):002:0> array1.uniq
=> [1, 2, 3, 4, 5]

Now, the destructive 'uniq!' methods returns itself if any elements were removed.

irb(main):004:0> array = [1,1,1,2,3,4,5,1]
irb(main):005:0> array.uniq!
=> [1, 2, 3, 4, 5]
irb(main):006:0> array
=> [1, 2, 3, 4, 5]

It returns nil if no duplicate elements were found.

irb(main):017:0> array2 = [1, 2, 3, 4, 5]
irb(main):018:0> array2.uniq!
=> nil

Conclusion

That's all for today, guys. I hope this article has been useful for you.

Take care!