Data scientists are in high demand, with compensation averaging upwards of $100,000 in San Francisco. Learn data science and you can become one of the lucky, well-compensated professionals filling the current gap in this field.

The first step to learning data science is usually asking, “How do I learn data science?” The response tends to be a long list of courses to take and books to read, starting with linear algebra or statistics. I went through this myself a few years ago when I was learning. I had no programming background, but I knew that I wanted to work with data.

In the mid-1990s, the Python community invested in Numeric, an “extension to Python to support numeric analysis as naturally as [M]atlab does” [1]. Numeric later evolved into NumPy [2]. Several years later, the plotting functionality from Matlab was ported to Python with matplotlib [3]. Libraries for scientific computing were built around NumPy and matplotlib and bundled into the SciPy package [4], which was commercially supported by Enthought [5]. Python’s support for Matlab-like array manipulation and plotting is a major reason to prefer it over Perl and Ruby.

Today, the most popular alternatives to Python for data scientists are R, Matlab/Octave, and Mathematica/Sage. Beyond the Matlab features ported to Python, as described above, recent work has also ported several popular features from R and Mathematica.