# Coming up with a linear time solution for the following problem

There are n packages, numbered 1 to n. A set K of pairs (i, j) defines a list of dependencies such that package j cannot be installed without installing package i.

Come up with a linear time algorithm that takes a list K, and produces the list of all packages that are necessary to install package 1.

Here is my attempt:

```
function algortihm(n,K)
radix sort K(i,j) on key j
dependencies <- arr size n //declare array
currPackage <- 1
tempArr <- K
function func(currPackage, K)
dependencies.append(currPackage)
count <- -1
for (i,j) in K:
if j not in dependencies:
count <- count + 1
if j == currPackage:
tempArr.remove(count)
func(i, tempArr)
endif
endif
if j > currPackage:
break
endif
endfor
endfunction
return dependencies
endfunction
```

Using this input K = (1, 2) (3, 2) (3, 4) (4, 1) (5, 1) (5, 4) (6, 8) (8, 3) (7, 6)

Radix sort has complexity O(n) and will reduce the number of times the list is iterated through because if we sort by dependencies (key j) then we know that once j exceeds the size of the package we know there are no more dependency listings and the for loop can be broken out of.

List after sorting: (4, 1) (5, 1) (1, 2) (3, 2) (8, 3) (3, 4) (5, 4) (7, 6) (6, 8)

Furthermore, every time a dependency is found, it is removed from a temporary array, which is then passed recursively into the function. It also ensures that a call or iteration/comparison isnâ€™t made if the dependency is already recorded.

This works out to about 37 comparisons being made, but I know it is not linear time. I just can't spot what would be faster than what I've got it to already, it is difficult to analyse the complexity of the algorithm I have come up with but I reckon it is O(n^2/b)

@WillemVanOnsem I don't think so. Topological sort would help you identify an order in which all packages can be installed without violating any dependency but will not help you identify the minimal set of packages needed to be installed before installing a specific package

### 1 Answer

Using a sort is not appropriate for this problem. You would be able to sort all the nodes but identifying the minimal set of packages needed to be installed before a given package will be hard.

A better approach is to consider the dependencies as edges in a graph and represent them in reverse direction. That is if there is a dependency `(i, j)`

(meaning `i`

should be installed before `j`

), add an edge from `j`

to `i`

in your graph. Now having defined this graph, the list of packages that need to be installed before package `x`

are exactly those packages that are reachable from `x`

in the thus defined graph. To find which are those nodes you can use and graph search algorithm e.g. breadth first search or depth first search.

This is in essence topological sorting: en.wikipedia.org/wiki/Topological_sorting