Things to Know Before Rushing to Start in Data Science

Here are some important things I wish I had known when I decided to start my Data Science journey:

 

1. High school Math is fundamental for Data Science.

Matrix calculations, derivatives, eigenvalues, set theory, functions, vectors, linear transformations, and similar topics are extremely important for understanding the theory behind statistical methods and programming. Before starting your next MOOC or Machine Learning book, it is therefore crucial to review these concepts. Most schools require students to be proficient in them in order to graduate, so the good news is that refreshing or acquiring this knowledge won’t take too much of your time.

There are plenty of resources to start with, but what worked for me was The Manga Guide to Linear Algebra, which is simple and visual, and provides a great foundation before getting into more complex material.

My suggestion is to set aside a few weeks to review these concepts, and to use the Feynman Technique so that you can explain each topic in simple terms.
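As a small taste of the concepts listed above, here is a minimal sketch in plain Python of a matrix acting on a vector (a linear transformation) and of the eigenvalues of a 2x2 matrix. The function names `mat_vec` and `eigenvalues_2x2` and the example matrix are my own illustrations, not taken from any particular book or library, and the eigenvalue routine assumes the eigenvalues are real.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def eigenvalues_2x2(matrix):
    """Eigenvalues of a 2x2 matrix, assuming they are real.

    Solves the characteristic polynomial
    lambda^2 - trace * lambda + det = 0 with the quadratic formula.
    """
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    discriminant = trace ** 2 - 4 * det
    if discriminant < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(discriminant)
    return ((trace + root) / 2, (trace - root) / 2)


# Illustrative matrix: stretches the x-axis by 2 and the y-axis by 3.
A = [[2, 0], [0, 3]]
v = [1, 1]

print(mat_vec(A, v))       # the transformed vector
print(eigenvalues_2x2(A))  # the two stretch factors
```

If you can explain to someone else why the eigenvalues of this matrix are exactly its two stretch factors, you are already applying the Feynman Technique.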

 

2. Although there are many useful internet resources, books are still one of the best tools to learn from.

One of the issues people face today when trying to get into a field such as Data Science is Information Overload, the effect of having too many resources at your disposal. There are hundreds of MOOCs, online courses, specialisations, videos, etc., but the best use of our most valuable resource, time, is to pick a book, work from the basics up to new concepts, and then keep filling the gaps with other books.

Learning Data Science should be seen like a building blocks game.

I believe this analogy applies to learning most things, but it is especially useful in our Data Science journey:

  • First, you need to select the toy model you would like to build.
  • Open all the plastic bags and lay all the different pieces on a flat surface, so you can see all the different parts.
  • Understand how each part can be used. Learn about the characteristics: dimension, color, weight, shape.
  • Start building small chunks until you’ve mastered all the uses.
  • Finally, after you’ve followed the instruction manual and built the model you wanted, take all the pieces apart and start experimenting.

The same should be done with the techniques in each area of Data Science. Learn what most of the blocks are, learn how to use them, and then, when you want to create something more complex, look for the missing parts that you don’t have.

 

3. Computing skills are essential, not just for Data Science but for tomorrow’s world.

It wasn’t until I started studying for my Data Science master’s that I realized something that had been whispered for some time across blog posts, books, and the news. It is the following message:

“Computer code accounts for more than 80% of our lives today.”

Code is in our smartphones, websites, cars, televisions, health system, public transportation, manufacturing of goods, etc.

Almost every job or profession in industry is directly affected by some program that handles the input, transformation, and output of information. Learning about programming and how code works is not only about making software, building apps, or creating a great website. Learning how to program will help you understand how technology impacts our lives. Instead of blaming the computer program for “not working”, you will think systematically and work out where the problem may be. And who knows, maybe you’ll come up with better ideas to improve technology from a user’s perspective.

 

4. Your critical and analytical skills are very important.

I am a big fan of TV shows about crime and problem solving. One example is Scorpion, which tells the story of a group of geniuses who solve a wide range of problems using technology and math skills. The highlight of this type of show, apart from all the action, jokes, and hero scenes, is the “Critical Thinking” the characters use to find solutions to different kinds of problems. This is one thing that is not mentioned in most Data Science resources. The ability to find the right angle from which to approach a problem will help you identify which tools to use, and will sometimes lead you to the most efficient solution.

 

5. Everyone likes a TED talk, everyone shares good keynotes about leaders. However, YOU must prepare to deliver your findings.

There are many visualization packages (seaborn, ggplot, matplotlib) and software tools (Tableau, Excel) that can help create wonderful, crisp charts, so avoid getting overwhelmed by too many options. The most important thing is how the message is delivered. Sometimes the simplest tools will generate a clear, relevant outcome.
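To illustrate how far a very simple tool can go, here is a minimal sketch that draws a bar chart as plain text, using only the Python standard library. The helper `text_bar_chart` and the `monthly_signups` numbers are made up purely for illustration. They are not real data and not part of any of the packages mentioned above.

```python
def text_bar_chart(data, width=20):
    """Return the lines of a horizontal bar chart.

    Bars are scaled so that the largest value fills `width` characters.
    """
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return lines


# Hypothetical numbers, invented only to have something to plot.
monthly_signups = {"Jan": 10, "Feb": 20, "Mar": 40}

chart = text_bar_chart(monthly_signups)
print("\n".join(chart))
```

A chart like this will not win any design awards, but the message (the value doubles every month) comes across immediately, which is the point.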

 

This post was originally published at http://www.kdnuggets.com
