Fastest Method Of Getting K Smallest Numbers In Unsorted List Of Size N In Python?
Solution 1:
You could use a heap queue; it can give you the K largest or smallest numbers out of a list of size N in O(NlogK) time.
The Python standard library includes the heapq
module, complete with a heapq.nsmallest()
function ready implemented:
importheapqk_smallest= heapq.nsmallest(k, input_list)
Internally, this creates a heap of size K with the first K elements of the input list, then iterating over the remaining N-K elements, pushing each to the heap, then popping off the largest one. Such a push and pop takes log K time, making the overall operation O(NlogK).
The function also optimises the following edge cases:
- If K is 1, the
min()
function is used instead, giving you a O(N) result. - If K >= N, the function uses sorting instead, since O(NlogN) would beat O(NlogK) in that case.
A better option is to use the introselect algorithm, which offers an O(n) option. The only implementation I am aware of is using the numpy.partition()
function:
import numpy
# assuming you have a python list, you need to convert to a numpy array firstarray = numpy.array(input_list)
# partition, slice back to the k smallest elements, convert back to a Python list
k_smallest = numpy.partition(array, k)[:k].tolist()
Apart from requiring installation of numpy
, this also takes N memory (versus K for heapq
), as a copy of the list is created for the partition.
If you only wanted indices, you can use, for either variant:
heapq.nsmallest(k, range(len(input_list)), key=input_list.__getitem__) # O(NlogK)
numpy.argpartition(numpy.array(input_list), k)[:k].tolist() # O(N)
Solution 2:
If the list of the kth smallest numbers doesn't need to be sorted, this can be done in O(n) time with a selection algorithm like introselect. The standard library doesn't come with one, but NumPy has numpy.partition
for the job:
partitioned = numpy.partition(l, k)
# The subarray partitioned[:k] now contains the k smallest elements.
Solution 3:
You might want to take a look at heapq
:
In [109]: L = [random.randint(1,1000) for _ in range(100)]
In [110]: heapq.nsmallest(10, L)
Out[110]: [1, 17, 17, 19, 24, 37, 37, 45, 63, 73]
Solution 4:
EDIT: this assumes that the list is immutable. If the list is an array and can be modified there are linear methods available.
You can get the complexity down to O(n * log k)
by using a heap of size k + 1
.
- Initially get the first
k
elements into a min-heap. - For every subsequent element, add the element as a leaf and heapify.
- Replace the last element with the next element.
Heapify can be done in logarithmic time and hence the time complexity is as above.
Solution 5:
You can do it in O(kn)
with a selection algorithm. Once kn >= n log n
, switch to sorting. That said, the constant on the selection algorithm tends to be a lot higher than the one on quicksort, so you really need to compare i (kn)
and j (n log n)
. In practice, it's usually more desirable to just sort unless you're dealing with large n or very small k.
Edit: see comments. It's actually a lot better.
Post a Comment for "Fastest Method Of Getting K Smallest Numbers In Unsorted List Of Size N In Python?"