- Biocuration Group

# The Current Trend in Data Science with R and Python for Healthcare Data

The data scientist role is one of the most lucrative and hyped jobs out there. More and more businesses are becoming data-driven, the world is becoming more connected, and it looks like every business will need a data science practice. So the demand for data scientists is huge, and the industry widely acknowledges a shortage of talent.

But becoming a data scientist is complicated and competitive, and the career path is not easy. It takes a mix of problem-solving, structured thinking, coding, and various other technical skills to be truly successful.

The Field of Data Science is Broad and Varied

There is no single definition of data science: it varies with the industry, the specific business, and the purpose of the data scientist’s role. Different roles require different skill sets, so the educational and training path is not uniform.

The data scientist’s role is now generally broken down into two broad categories:

Type A: Data science for people – data collection and analysis to support decision-making based on the evidence

Type B: Data science for software – for example, the recommendations one might get for books or movies from Amazon or Netflix, based upon past behaviors.

Industry Demand from a Modern Data Scientist

In the current job market, data scientists are expected to know a lot: machine learning, computer science, statistics, mathematics, data visualization, communication, and deep learning.

Being a Data Scientist is much more than a glamorous job title and a generous salary. It takes serious commitment to become a great Data Science practitioner in today’s competitive, fast-growing, candidate-driven market.

Studying for further degrees is extremely time-consuming and takes a huge commitment. Moving into the commercial sector afterwards can be even more demanding. In a commercial environment, pressure can come from a variety of sources: time, colleagues, money, answers, and the list goes on.

Data plays a huge part in business decision-making, and the skills required to manage these data sets fall well outside the remit of managers and executives. This means Data Scientists can feel a lot of pressure when they work for companies whose shareholders expect to see profit and business input come directly from their insights.

Most organizations will expect some quick results, so picking projects with low-hanging fruit becomes important. This can be daunting for a rookie Data Scientist, so guidance from the wider Data Science team could be key to initial success. In a research environment, by contrast, the pressures, whilst still demanding, are perhaps not as pointed.

Required Skills for Data Science Job Roles

Jeff Hale looked at general data science skills and at specific languages and tools separately. He searched job listings on LinkedIn, Indeed, SimplyHired, Monster, and AngelList on October 10, 2018. Here’s a chart showing how many data scientist jobs each website listed.

Source: KDnuggets

As per Jeff’s analysis, machine learning, statistics, and computer science skills are the most frequent general data scientist skills sought by employers.

Source: KDnuggets

It is interesting that communication is mentioned in nearly half of job listings. After all, data scientists need to be able to communicate insights and work with others.

Source: KDnuggets

Among tech skills, Python is the most in-demand language. The popularity of this open-source language has been widely observed. R is not far behind Python. It was once the primary language for data science, and I was surprised to see how in demand it still is. The roots of this open-source language are in statistics, and it’s still very popular with statisticians.

Python or R is a must for virtually every data scientist position.


Here is how to become a data scientist in 4 steps. This is how I did it.

Learn Statistics First

I did this too late; you can start early. It is so easy, I don't know why I hesitated. Perhaps it was the mental block we all (well, not all) have with maths.

Descriptive statistics

Types of data variables

Central tendency measures

Spread of data, skew of data

Measures of dispersion
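To make the descriptive topics above concrete, here is a minimal sketch using only Python's built-in `statistics` module. The `stay_days` values are made up for illustration (hypothetical lengths of hospital stay), and the skew formula is one simple moment-based choice among several.

```python
import statistics

# Hypothetical lengths of hospital stay in days (made-up values).
stay_days = [2, 3, 3, 4, 5, 5, 5, 7, 9, 21]

# Central tendency measures
mean_stay = statistics.mean(stay_days)      # arithmetic mean
median_stay = statistics.median(stay_days)  # middle value of the sorted data
mode_stay = statistics.mode(stay_days)      # most frequent value

# Measures of dispersion (spread)
value_range = max(stay_days) - min(stay_days)
sample_sd = statistics.stdev(stay_days)     # sample standard deviation (n - 1)
q1, q2, q3 = statistics.quantiles(stay_days, n=4)
iqr = q3 - q1                               # interquartile range

# Skew: a simple moment-based coefficient (population form).
# A positive value means a longer tail on the right.
pop_sd = statistics.pstdev(stay_days)
skew = sum((x - mean_stay) ** 3 for x in stay_days) / (len(stay_days) * pop_sd ** 3)

print(f"mean={mean_stay}, median={median_stay}, mode={mode_stay}")
print(f"range={value_range}, sd={sample_sd:.2f}, iqr={iqr:.2f}, skew={skew:.2f}")
```

In this sample the single long stay of 21 days pulls the mean above the median, which is the kind of right skew often seen in length-of-stay data.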

Inferential statistics

Population and sample (sampling methods are optional, but read about them: simple random sampling and stratified random sampling)

Random variables, Probability distributions - normal, Poisson

Estimation and Hypothesis testing.
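The inferential topics above can be sketched in a few lines of standard-library Python. Everything here is an assumption made for illustration: the population is simulated, `stratified_sample` is a hypothetical helper, and the interval and test use a normal (z) approximation rather than a t distribution.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: (patient_id, ward, systolic blood pressure).
population = [
    (i, "cardiology" if i % 4 == 0 else "general", random.gauss(130, 15))
    for i in range(1000)
]

# Simple random sampling: every patient has the same chance of selection.
srs = random.sample(population, 100)


def stratified_sample(rows, key, fraction):
    """Sample the same fraction from each stratum defined by key(row)."""
    strata = {}
    for row in rows:
        strata.setdefault(key(row), []).append(row)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(random.sample(members, k))
    return sample


# Stratified random sampling: each ward is represented in proportion to its size.
strat = stratified_sample(population, key=lambda r: r[1], fraction=0.1)

# Estimation: 95% confidence interval for the mean (normal approximation).
values = [r[2] for r in srs]
m, s, n = mean(values), stdev(values), len(values)
z = NormalDist().inv_cdf(0.975)
ci = (m - z * s / n ** 0.5, m + z * s / n ** 0.5)

# Hypothesis testing: H0 says the population mean is 130 (two-sided z-test).
z_stat = (m - 130) / (s / n ** 0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"sample mean={m:.1f}, 95% CI=({ci[0]:.1f}, {ci[1]:.1f}), p={p_value:.3f}")
```

With small samples a t-test would be the more usual choice; the z approximation is used here only to keep the sketch free of third-party libraries.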

Hurrah! Now you are ready for the next step, and it's not R or Python! Ha ha!

Learn Excel and Power BI next: with 750 million users globally, Excel is the tool and platform that has seen more data than any other. It is also the mom of Power BI. Yes! Power BI is Excel's advanced features evolved into standalone software.

Relevance of inserting tables in Excel: 9 great reasons to insert tables in Excel for data analytics professionals.

Consolidating data with compatibility view features

Data manipulation with machine learning in Excel using XLSTAT

Analysis ToolPak for descriptive and inferential statistics

Power View, Power Query, Power Pivot and Power Map.

Learn R and Python (yes, it's "and", not "or")

Learn Tableau (it is integrated with Python and R)

Why? It's too hot to ignore and too easy not to try. You see, data science consists of data exploration, data analytics, and data presentation. Tableau, and to an extent Power BI, is great for exploration and presentation. R and Python are super for the analytics that comes in between.

So see this to understand what I mean.

Welcome to data science; it's a lovely world. I came here three years ago and have never regretted it.

Next, you can learn R and Python programming, and predictive modelling using machine learning algorithms.
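As a small taste of predictive modelling, here is a sketch of simple linear regression fitted by ordinary least squares in plain Python. The `ages` and `pressures` values are invented for illustration, and `predict` is a hypothetical helper, not part of any library.

```python
from statistics import mean

# Hypothetical training data: patient age (years) and systolic blood pressure.
ages = [25, 35, 45, 55, 65]
pressures = [118, 122, 128, 135, 141]

# Ordinary least squares with one feature:
# slope = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar) ** 2)
x_bar, y_bar = mean(ages), mean(pressures)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, pressures)) / sum(
    (x - x_bar) ** 2 for x in ages
)
intercept = y_bar - slope * x_bar


def predict(age):
    """Predict systolic pressure for a given age using the fitted line."""
    return intercept + slope * age


print(f"pressure = {intercept:.2f} + {slope:.2f} * age")
print(f"predicted pressure at age 50: {predict(50):.1f}")
```

In practice you would normally reach for a library such as scikit-learn in Python or `lm` in R, but writing the fit out once by hand makes it clearer what those tools compute.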

Now, as a fresher, you’re not required to know all of the above skills. Companies have different priorities when they hire a data scientist or a data analyst, and it depends completely on what they expect their candidate to do. While some companies might require you to do all of the above tasks, at others you only need to perform one or a few. However, it’s best to learn the basics of all of them, if possible. This will give you more opportunities, and you will find out what you are more comfortable with.

Next, as a fresher, the following are some major techniques you should focus on. They are frequently used in data analysis, and companies look for candidates who have a strong command of them. This will improve your chances of getting your first data science job.

1. Exploratory data analysis

2. Missing value analysis

3. Outlier analysis

4. Feature scaling

5. Sampling techniques

6. Error metrics for classification

7. Error metrics for regression

8. Random Forest

9. Linear Regression

10. Logistic regression

11. Visualisations

12. KNN

13. Naive Bayes
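Several of the techniques in the list above fit in a few lines each. The sketch below uses only the Python standard library and made-up numbers. It shows missing value analysis with median imputation, outlier analysis with the 1.5 × IQR rule, min-max feature scaling, and basic error metrics for classification and regression. The data, the variable names, and the choice of median imputation and the IQR rule are all assumptions for illustration.

```python
import math
import statistics

# Hypothetical glucose readings; None marks a missing value.
glucose = [92, 101, None, 88, 97, 310, 95, None, 104, 90]

# Missing value analysis: count the gaps, then impute with the median.
n_missing = sum(v is None for v in glucose)
observed = [v for v in glucose if v is not None]
median_value = statistics.median(observed)
filled = [median_value if v is None else v for v in glucose]

# Outlier analysis: flag points more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = statistics.quantiles(filled, n=4)
low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
outliers = [v for v in filled if v < low or v > high]

# Feature scaling: min-max scaling to [0, 1] after dropping the outliers.
kept = [v for v in filled if low <= v <= high]
lo, hi = min(kept), max(kept)
scaled = [(v - lo) / (hi - lo) for v in kept]

# Error metrics for classification, from true and predicted labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
accuracy = sum(t == p for t, p in pairs) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Error metrics for regression: mean absolute error and root mean squared error.
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]
errors = [a - p for a, p in zip(actual, predicted)]
mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"missing={n_missing}, outliers={outliers}")
print(f"accuracy={accuracy}, precision={precision}, recall={recall}")
print(f"MAE={mae}, RMSE={rmse:.3f}")
```

Whether an outlier such as the reading of 310 should be dropped, capped, or kept depends on the data; in healthcare data an extreme value can be a genuine clinical signal rather than an error.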

All in all, learn the above skills. Next, you should focus on projects. If you can work on projects related to all of the above, great; otherwise, focus mainly on data collection and cleaning projects, as this is the major skill that companies look for in freshers and, in most cases, also their primary task. Hence, if you have done projects related to collection and cleaning, it will immensely increase your chances of getting a job in data science.

Also, companies prefer candidates who come with data handling experience. Having projects shows that you can look after data without losing it. Losing data can cost companies millions and, in the case of sensitive data, can also jeopardise the business.

When you have the skills and projects, you can apply for jobs. Apply through AngelList, where you will find start-ups and early-stage start-ups, or apply through company websites.

Put simply, you should follow this simple approach:

Learn the skills required to be a data scientist

Work on projects

Get hired as a Data Scientist

You can use platforms like Intellipaat to learn the above skills. However, I would suggest using Educba. There you can learn all the above skills while working on projects, which can easily be used as your portfolio. Plus, a lot of analytics companies hire for data science roles through edwisor based on the projects people do there. So give it a spin.

for consultation - iftubip.blogspot.com