In my last post, I have compared R to Julia, showing how Julia brings a refreshening programming mindset to the Data Science community. The main takeaway is that with Julia, you no longer need to vectorize to improve performance. In fact, good use of loops might deliver the best performance.
In this post, I am adding Python to the mix. The language of choice of Data Scientists has a word to say. We will solve a very simple problem where built-in implementations are available and where programming the algorithm from scratch is straightforward. The goal is to understand our options when we need to write efficient code.
Let us consider the problem of membership testing on an unsorted vector of integers:
julia> 10 ∈ [71,38,10,65,38]
true
julia> 20 ∈ [71,38,10,65,38]
false
I implemented the linear search algorithm in R, Python and Julia, and compared CPU times against a C implementation (1.000 searches over an array with 1.000.000 unique integers). Several flavours of implementations were tested:
Looking to results side by side for this simple problem, we observe that:
A comprehensive version of this article was originally published here (open access).
Posted 29 March 2021
© 2021 TechTarget, Inc.
Powered by
Badges | Report an Issue | Privacy Policy | Terms of Service
Most Popular Content on DSC
To not miss this type of content in the future, subscribe to our newsletter.
Other popular resources
Archives: 2008-2014 | 2015-2016 | 2017-2019 | Book 1 | Book 2 | More
Most popular articles
You need to be a member of Data Science Central to add comments!
Join Data Science Central