In my last post, I compared R to Julia, showing how Julia brings a refreshing programming mindset to the Data Science community. The main takeaway was that with Julia, you no longer need to vectorize to improve performance. In fact, well-written loops might deliver the best performance.
In this post, I am adding Python to the mix. The language of choice of many Data Scientists deserves its say. We will solve a very simple problem for which built-in implementations are available and for which programming the algorithm from scratch is straightforward. The goal is to understand our options when we need to write efficient code.
Let us consider the problem of membership testing on an unsorted vector of integers:
julia> 10 ∈ [71,38,10,65,38]
true
julia> 20 ∈ [71,38,10,65,38]
false
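For readers more at home in Python, the same membership test can be sketched both ways: with the built-in `in` operator and with a hand-written linear search loop. This is only a minimal illustration; the function name `linear_search` is my own choice here and is not necessarily the code that was benchmarked.

```python
def linear_search(values, target):
    """Return True if target occurs in values, scanning left to right."""
    for v in values:
        if v == target:
            return True  # stop at the first match
    return False  # reached the end without finding target


data = [71, 38, 10, 65, 38]

# Hand-written loop next to the built-in operator: both should agree.
print(linear_search(data, 10), 10 in data)
print(linear_search(data, 20), 20 in data)
```

Both approaches scan the list element by element, so in the worst case (the value is absent) every element is visited.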
I implemented the linear search algorithm in R, Python and Julia, and compared CPU times against a C implementation (1,000 searches over an array of 1,000,000 unique integers). Several flavours of implementation were tested:
Looking at the results side by side for this simple problem, we observe that:
A comprehensive version of this article was originally published here (open access).