
Data Science… A lot is being said about it.
One of the key questions is where to start.
You wouldn’t jump into machine learning before mastering basic skills in data science.
What do I mean by basic skills?
Is it how to access data?
Is it how you manipulate it to meet your requirements?
Or maybe even the definition of ‘data’ needs to be discussed first.
Can simple text be data? Or does it need to be in a certain format or type, such as floats or integers?
Or can I define it more broadly, as anything that yields information on a desired subject?
I would say yes, but it had better be processable.
Or at least it needs to have the potential to be processed by a machine.
So not only numbers and documents, but even more abstract concepts like ‘smell’ and ‘taste’ can be translated into usable and useful data.
So while we all have our own learning journeys at different levels and in different fields of data science, it is not a bad idea to always think about the data itself. While you are fascinated by your machine learning algorithm crunching numbers and producing results, remember that its limit will always be the data.
You don’t have to go further than Kaggle to see that, many times, better data with a mainstream algorithm can yield much more than average data with a better algorithm.
So yes, you have to work on those data science skills first.
I mean ‘really work’.
This is not about the variety of subjects you covered in the last popular course you took.
What you have learned will not stick unless you use it.
It is about learning by doing, ideally on a real-life project.
That is the only way to turn knowledge into skills before you forget it.
Let us experience this on a real-life problem: ‘buying and selling cars’.
Let us create a robust algorithm, from start to finish, using only basic data science skills.
Let’s add a real new dimension to our skills.
Python Real World Data Science Mega Project: Car Buyer App
Web Scraping in Python: Create Your Own Middleware in Scrapy