Published : December 12, 2021 - Yanis MANSOUR

This article is intended to explain the approach I took to develop PyCombs, a python package that generates all the possible combinations of length `k`

in a given set.

Before to start, Let's make a quick reminder of what is a combination :

In mathematics, a

combinationis a selection of items from a set that has distinct members, such that the order of selection does not matter. Wikipedia

Another reminder I'd like to make is the number of combinations in a set

This is a solution I thought about. It has the advantage to be non recursive, and is efficient if you are looking for combinations of a small length. It goes linearly with the length of the set. Here is the link of the full project on GitHub

Now that we have the necessary reminders, let's dive into the approach :

To simplify treatments, we will use lists to represent our sets. Let's suppose that we have the following set of 6 cities, in which we want tp look for all possible combinations of length `3`

:

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
```

To represent combinations, one approach is to have indexes that point to elements, and because we look for combinations of length `3`

, we will need `3`

indexes that will each point to an element of the set :

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
```

This will give us the first combination :

```
comb = ["Tizi Ouzou", "Paris", "Lyon"]
```

To go to the next combinations, we have to increment the last index `k`

:

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Paris", "Marseille"]
```

By incrementing `k`

until it reaches the end of the list, we obtain all combinations which start with the first two elements, combined with each element amongst the rest of the list :

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Paris", "Marseille"]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Paris", "Washington"]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Paris", "Dublin"]
```

Here is the list of combinations at this point :

```
combs = [
['Tizi Ouzou', 'Paris', 'Lyon'],
['Tizi Ouzou', 'Paris', 'Marseille'],
['Tizi Ouzou', 'Paris', 'Washington'],
['Tizi Ouzou', 'Paris', 'Dublin']
]
```

To enumerate the remaining combinations, we have to increment `j`

and replace `k`

by `j+1`

, and restart the same process of incrementing `k`

to the end of the list :

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Lyon", "Marseille"]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Lyon", "Washington"]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Lyon", "Dublin"]
```

Let `N`

be the length of our set. By repeating the previous process, we will reach this situation :

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Tizi Ouzou", "Washington", "Dublin"]
```

Now, it's the turn of `i`

to be incremented :

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Paris", "Lyon", "Marseille"]
```

Repeat the process until `i=N-3`

& `j=N-2`

& `k=N-1`

:

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
^ ^ ^
i j k
comb = ["Marseille", "Washington", "Dublin"]
```

To represent the indexes, let's store them in a list. The situation above will give the following indexes :

```
indexes = [3, 4, 5]
```

In order to implement the full generator, I divided the full process into 2 functions. Let's go :

`move`

functionNow, let's write the `move`

function that will calculate the next `indexes`

list, given the current `indexes`

list and the length of the set we are looking for combinations in (See examples below)

Let capital `I`

be the index of a given index in the list of indexes, `maxLength`

be the length of the set, and `length`

be the length of the `indexes`

list :

```
length = 3
I = 0
indexes = [0,1,2]
^
I
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
maxLength = 6
```

First, by looking the pattern showed above, we notice that the index we move is determined by going through the list of `indexes`

* backwards*, and looking for the the first index that doesn't verify the condition of

`indexes[length - 1 - i] == maxLength - 1 - i`

. That is, the index that is being moved is always the first (in reverse order) that doesn't point to the end of the list :```
indexes = [0,1,2]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
```

In this example, the index to move is `indexes[2]`

because :

- The length of the indexes list is 3 ⇒
`length == 3`

and the length of the set is 6 ⇒`maxLength == 6`

- We go through the
`indexes`

list backwards. Thus, the first element is`indexes[length - 1 -i]`

such that`i=0`

(the first element in reverse order) - The condition of
`indexes[length - 1 - i] == maxLength - 1 - i`

is not verified :`indexes[length - 1 - i] = indexes[3 - 1 - 0] = indexes[2] = 2`

`maxLength - 1 - i = 6 - 1 - 0 = 5`

Let's take another example :

```
indexes = [0,1,5]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
comb = ["Tizi Ouzou", "Paris", "Dublin"]
```

In this case, the index to move is `indexes[1]`

because :

- We go through the
`indexes`

list backwards. Thus, the first element is`indexes[length - 1 -i]`

such that`i=0`

(the first element in reverse order) - The condition of
`indexes[length - 1 - i] == maxLength - 1 - i`

is verified :`indexes[length - 1 - i] = indexes[3 - 1 - 0] = indexes[2] = 5`

`maxLength - 1 - i = 6 - 1 - 0 = 5`

- We increment
`i`

⇒`i=1`

to point to the second index in reverse order - The condition of
`indexes[length - 1 - i] == maxLength - 1 - i`

is not verified this time :`indexes[length - 1 - i] = indexes[3 - 1 - 1] = indexes[1] = 1`

`maxLength - 1 - i = 6 - 1 - 1 = 4`

The index to increment is thus `indexes[1]`

:

```
indexes = [0,2,5]
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
comb = ["Tizi Ouzou", "Lyon", "Dublin"]
```

But according to the logic we followed in the illustration in the beginning of the article, the element that comes after `indexes=[0,1,5]`

is `indexes=[0,2,3]`

and not `indexes=[0,2,5]`

. To achieve this, we have to set all the indexes that are after `indexes[1]`

:

```
indexes = [0,2,5]
indexToMove = 1
for i in range (indexToMove + 1, length) :
indexes[i] = indexes[i - 1] + 1
```

The process of the move function is the following :

- Find the index to increment :
`indexToIncrement`

- Increment the index :
`indexToIncrement += 1`

- Increment all the indexes that come after
`indexToIncrement`

Note that this function returns `True`

if it moved the `indexes`

list, and `False`

if no move is possible (the end of combinations)

Here is the full function :

```
def move(indexes, maxLength) :
# Determining the indexToMove
length = len(indexes)
indexToMove = -1
for i in range (length) :
if indexes[length - 1 -i] >= maxLength :
raise ValueError ("Indexes must be less than maxLength")
if indexes[length - 1 - i] != maxLength - 1 - i :
indexToMove = length - 1 - i
break
if indexToMove == -1 :
return False
# Moving the index and the indexes after it
indexes[indexToMove] += 1
for i in range (indexToMove + 1, length) :
indexes[i] = indexes[i - 1] + 1
return True
```

Some examples of outputs :

```
indexes = [0,1,2]
move(indexes, 6)
print(indexes) # [0, 1, 3]
indexes = [0,1,5]
move(indexes, 6)
print(indexes) # [0, 2, 3]
indexes = [0,4,5]
move(indexes, 6)
print(indexes) # [1, 2, 3]
```

`find_combinations`

functionThis function takes a set (that is represented by a list) and the length of the combinations to generate in parameter. It returns all the combinations in a list.

This is the here is the logic of the function :

- Initialise the
`indexes`

list :`indexes = [i for i in range (k)]`

, and adds the first combination to the combinations list - As long as a move is possible, move and add the corresponding combination to the combinations list

```
def find_combinations (list, k) :
n = len(list)
combinations = []
if k < 0 or k > n :
raise ValueError ("k must be equal to 0 or positive and less than or equal to n")
if k == 0 :
return [[]]
elif k == 1 :
return [[i] for i in list]
elif k == len (list) :
return [list]
else :
indexes = [i for i in range (k)]
combinations.append ([list[i] for i in indexes])
while move(indexes, n) :
combinations.append([list[i] for i in indexes])
return combinations
```

Here is an example of the output :

```
mySet = ["Tizi Ouzou", "Paris", "Lyon", "Marseille", "Washington", "Dublin"]
combs = find_combinations(mySet, 3)
print (combs)
```

Output

```
[
['Tizi Ouzou', 'Paris', 'Lyon'],
['Tizi Ouzou', 'Paris', 'Marseille'],
['Tizi Ouzou', 'Paris', 'Washington'],
['Tizi Ouzou', 'Paris', 'Dublin'],
['Tizi Ouzou', 'Lyon', 'Marseille'],
['Tizi Ouzou', 'Lyon', 'Washington'],
['Tizi Ouzou', 'Lyon', 'Dublin'],
['Tizi Ouzou', 'Marseille', 'Washington'],
['Tizi Ouzou', 'Marseille', 'Dublin'],
['Tizi Ouzou', 'Washington', 'Dublin'],
['Paris', 'Lyon', 'Marseille'],
['Paris', 'Lyon', 'Washington'],
['Paris', 'Lyon', 'Dublin'],
['Paris', 'Marseille', 'Washington'],
['Paris', 'Marseille', 'Dublin'],
['Paris', 'Washington', 'Dublin'],
['Lyon', 'Marseille', 'Washington'],
['Lyon', 'Marseille', 'Dublin'],
['Lyon', 'Washington', 'Dublin'],
['Marseille', 'Washington', 'Dublin']
]
```

This solution has the advantage of being non recursive. It is efficient for combinations of a small length. But if you are looking for combinations of an important length, performances may drop (not significantly) since the `move`

function is linear with `k`

(the length of the combinations).

Implementations that optimise the process of the `move`

function do exist (see this thread) but for combinations of small length, the solution of this article is good.

Please feel free to reach me out on my instagram for any suggestion / question.