Find the Duplicate Number

Create a function findDuplicate that takes an array of integers where each integer is between 1 and n (inclusive), with n being the size of the array. The array is guaranteed to have only one duplicate number, but it could be repeated multiple times. The function should return the duplicate number.

Constraints

  • The array will have a length of at least 2 and at most 10,000.
  • Each integer in the array is between 1 and n, where n is the size of the array.
  • There is exactly one integer that is duplicated in the array, but it can be repeated more than once.

Examples

Input: [1, 3, 4, 2, 2]
Output: 2
Explanation: The number 2 is the only duplicate number, appearing twice.

Input: [3, 1, 3, 4, 2]
Output: 3
Explanation: The number 3 is the only duplicate number, appearing twice.

Tips:

  • Think about using a ‘tortoise and hare’ (Floyd’s cycle detection) approach to solve this problem without extra space.
  • Alternatively, consider binary search or sorting methods, but be mindful of the space complexity.

Solution

What’s the Exact Problem?

The problem is to find the single duplicate number in an array where each integer is between 1 and n (inclusive), and n is the size of the array. The duplicate number can be repeated multiple times.

Steps to Understand the Problem

  • Understand that you are given an array where each number should be unique but one number appears more than once.
  • The task is to identify this duplicate number.
  • The challenge is to find an efficient way to identify the duplicate without using extra space.

Examples

Input: [1, 3, 4, 2, 2]
Output: 2

Input: [3, 1, 3, 4, 2]
Output: 3

What Your Function Should Do?

The function findDuplicate should take an array of integers and return the integer that appears more than once in the array.

How to Solve the Problem?

  • Use the ‘tortoise and hare’ (Floyd’s cycle detection) approach. This approach is used to detect a cycle in a linked list and can be adapted for this problem.
  • Treat the array as if it were a linked list where each element points to the index of the next element.
  • Use two pointers, one moving one step at a time and the other moving two steps at a time.
  • When the two pointers meet, it indicates a cycle, and the repeated number is the entry point of this cycle.

Example Code Solution

def findDuplicate(nums):
    # Tortoise and hare
    tortoise = hare = nums[0]
    while True:
        tortoise = nums[tortoise]
        hare = nums[nums[hare]]
        if tortoise == hare:
            break

    # Find the duplicate number
    tortoise = nums[0]
    while tortoise != hare:
        tortoise = nums[tortoise]
        hare = nums[hare]

    return hare

Example Usage

print(findDuplicate([1, 3, 4, 2, 2]))  # Output: 2
print(findDuplicate([3, 1, 3, 4, 2]))  # Output: 3

How this Code Works?

  • The function initializes two pointers (tortoise and hare) at the start of the array.
  • It moves tortoise by one step and hare by two steps until they meet, indicating a cycle in the array (which corresponds to the presence of a duplicate).

  • Once a cycle is detected, the function resets one pointer to the start of the array and then moves both pointers one step at a time until they meet again. The meeting point is the duplicate number.

  • This approach effectively finds the entry point of the cycle, which corresponds to the duplicate number in the array.

Time Complexity

The time complexity of this solution is O(n), where n is the number of elements in the array. This efficiency is due to the fact that the Floyd’s cycle detection algorithm has a linear time complexity, as it involves iterating through the array with two pointers. The space complexity is O(1) since no additional space is used apart from the two pointer variables.