🤯 5 Lessons I’ve Learned as a Data Scientist


Hey friends,

As we're fundraising at Staq, this week we had more than 10 investor meetings with constant pitching and fine-tuning of our pitch deck. It has been a crazy week.

If you had asked me a few years ago whether fundraising was fun, I'd probably have said YES. Having been through the process myself, I can tell you fundraising is definitely not easy, although it's always exciting to speak with investors. 😂

Throughout this fundraising journey, I'll share the lessons I learn along the way to show you what goes on behind the scenes at a startup like ours. Stay tuned!

5 Lessons I’ve Learned as a Data Scientist

Throughout my data science journey, I've made tons of mistakes. Fortunately, I've also learned tons of lessons from those mistakes. Today, I want to share with you the 5 lessons I've learned that I wish I had learned earlier.

I hope you'll find these lessons useful. Let's get started. 🚀

Lesson #1: Storytelling, NOT Presentation.

One of the most profound questions I’ve ever been asked came from a great senior data scientist at my previous company:

“Admond, what’s the story that we are gonna tell in the meeting later?”

The first time I heard this question, I was stunned for a second. He didn’t ask what slides I’d prepared. He didn’t ask what results I was going to present.

NONE.

Before I began to understand the importance of storytelling, either stakeholders didn’t understand what I was saying, or the insights couldn’t convince them to take action.

Once I started to improve my storytelling skills, things changed.

Stakeholders began to understand what I delivered because I told it through stories instead of bombarding them with technical jargon and boring results. They took action.

Facts tell, but stories sell.

A good data scientist only focuses on technical skills.

A great data scientist focuses on storytelling skills.

Be a great data scientist.

Lesson #2: Data Is Messy, Embrace It.

Forget about having Kaggle-like data in your real working environment, because most of the time you won’t have clean data.

Even worse, sometimes you don’t even have data to begin with, or you’re just not sure where to get or query the data because it is scattered everywhere.

Data collection and data integrity are some of the most important steps in any data science project, yet a lot of junior data scientists might be oblivious to that.

The reality is that you need to know where to get your data based on business requirements and the existing data architecture.

You might breathe a sigh of relief after you’ve collected the data, but this is where the hard part begins — data integrity.

You need to perform a thorough check on the data collected by asking hard questions and confirming with different stakeholders that the data makes sense.

Without the right and accurate data in the first place, all of our data cleaning, EDA, machine learning model building and deployment are simply a luxury.
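To make that check a bit more concrete, here is a minimal sketch in plain Python of the kind of first-pass questions I'd ask of a freshly collected dataset: are required fields filled in, are values within a sensible range, and are supposedly unique keys really unique? The `integrity_report` helper, the order records and the 0 to 10,000 amount range are all made up for illustration, so treat it as a starting point rather than a complete validation routine.

```python
from collections import Counter


def integrity_report(rows, required, key, ranges):
    """Summarise basic integrity problems in a list of dict records.

    rows     -- list of dicts, one per record
    required -- field names that must not be None or an empty string
    key      -- field name expected to be unique across records
    ranges   -- {field: (low, high)} inclusive bounds for numeric fields
    """
    missing = Counter()
    out_of_range = Counter()

    for row in rows:
        # Count required fields that are absent, None or empty.
        for field in required:
            if row.get(field) in (None, ""):
                missing[field] += 1

        # Count present values that fall outside the agreed bounds.
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value in (None, ""):
                continue  # already reported as missing, if required
            if not (low <= value <= high):
                out_of_range[field] += 1

    # Any key value that appears more than once is a duplicate.
    key_counts = Counter(row.get(key) for row in rows)
    duplicates = sorted(
        k for k, n in key_counts.items() if k is not None and n > 1
    )

    return {
        "rows": len(rows),
        "missing": dict(missing),
        "out_of_range": dict(out_of_range),
        "duplicate_keys": duplicates,
    }


# Hypothetical order records, invented purely for this example.
orders = [
    {"order_id": 1, "customer": "A", "amount": 120.0},
    {"order_id": 2, "customer": "", "amount": 80.0},
    {"order_id": 2, "customer": "B", "amount": -5.0},
    {"order_id": 3, "customer": "C", "amount": None},
]

report = integrity_report(
    orders,
    required=["customer", "amount"],
    key="order_id",
    ranges={"amount": (0, 10_000)},  # assumed business rule
)
print(report)
```

A report like this won't tell you whether the data is right, but it gives you specific questions to bring to stakeholders, such as why an order amount is negative or why an order ID appears twice.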

Lesson #3: Soft Skills > Technical Skills

One of the most common questions for beginners in data science is this:

“What are the skills that I need to learn when starting out in data science?”

My short answer is, "Learn technical skills first, soft skills later."

Here's my long answer. 👇🏻

In my opinion, learning technical skills (programming, statistics etc.) should be the priority when starting out in data science.

Once we have a solid foundation in technical skills, we should focus more on building and improving our soft skills (communication, storytelling etc.).

Data scientists are problem solvers.

We don’t just write code, build some fancy machine learning models and call it a day.

Every step, from understanding a business problem and collecting and analysing data to prototyping, fine-tuning and deploying models to real-world applications, requires teamwork, communication and storytelling skills. We need them to work with team members, manage the expectations of stakeholders and ultimately drive business decisions and actions.

You can:

  • Write the cleanest code in the world
  • Perform the best data analytics in the world
  • Build the best machine learning model in the world

But if you can’t convince people to use what you’ve got, and can’t use your results to drive business decisions and actions, those results will just sit in your PowerPoint slides without any real impact.

Sad, but true.

Lesson #4: Explainable Models Matter, A Lot.

Unless you’re working at a cutting-edge technology company, most businesses prefer simple and explainable machine learning models for predictions.

Your boss and stakeholders want to understand why the model behaves and predicts the way it does. Therefore, you need to be able to explain what’s going on behind your results.

For example, some questions they might ask include:

  • What caused this anomaly to be detected? And why is that so?
  • Does it make sense in the business context?
  • Why is the prediction the way it is?
  • Are our assumptions correct?

All these questions boil down to one simple question:

“What’s the pattern behind what we observed?”

Being able to understand what’s going on behind our models and results is crucial to driving business decisions by convincing stakeholders to take action.

Huge enterprises like banks and hospitals simply can’t afford to deploy a black box model in the real world and let it run wild on the ground without understanding how it works or why it fails.

And this is exactly why simple models like decision trees and logistic regression are still widely used across industries.
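Here is a small sketch of why a logistic regression is easy to explain. Each feature's contribution to the log-odds is just its coefficient times its value, so you can rank the reasons behind a single prediction. The intercept, coefficients, feature names and customer values below are hypothetical numbers I made up for illustration; they don't come from a real model.

```python
import math


def explain_prediction(intercept, coefficients, features):
    """Return the predicted probability and per-feature contributions.

    In logistic regression, log-odds = intercept + sum(coef * value),
    so each term in the sum is that feature's contribution.
    """
    contributions = {
        name: coefficients[name] * features[name] for name in coefficients
    }
    log_odds = intercept + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))

    # Rank features by how strongly they push the prediction either way.
    ranked = sorted(
        contributions.items(), key=lambda item: abs(item[1]), reverse=True
    )
    return probability, ranked


# Hypothetical churn model: all numbers are assumptions for this example.
intercept = -2.0
coefficients = {
    "late_payments": 0.8,
    "years_as_customer": -0.3,
    "open_tickets": 0.5,
}
customer = {"late_payments": 3, "years_as_customer": 2, "open_tickets": 1}

probability, ranked = explain_prediction(intercept, coefficients, customer)

print(f"Predicted probability: {probability:.2f}")
for name, contribution in ranked:
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{name} {direction} the log-odds by {abs(contribution):.2f}")
```

With a breakdown like this, "Why is the prediction the way it is?" becomes a question you can answer in one sentence per feature, which is much harder to do with a black box model.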

Lesson #5: Always See The Big Picture.

When I first started in data science, I focused too much on coding and lost sight of the big picture that was truly important: the end-to-end pipeline integration in production and how the solution performed in the real world.

I was so fixated on the technical part that I over-optimized my code and models without having any real impact on the overall project or business.

Unfortunately, I learned this the hard way.

Fortunately, I’m currently using what I’ve learned to always remind myself to see the big picture.

Hopefully, you’ll begin to realise the importance of seeing the big picture in your day-to-day work as a data scientist.

And the first step is to understand the business domain and the problems that you’re solving.

Be clear about what you and your team aim to achieve in a project. Understand how your role fits into the big picture, and how the different small pieces work together as a whole to achieve the common goals.


Conclusion

My data science journey has definitely been a tough one, but I've really enjoyed the ride and learned a lot along the way.

I hope you found today's issue helpful and will apply the lessons here in your data science job.

Remember, keep learning and never stop improving.

My 5 lessons learned:

  1. Storytelling, NOT Presentation.
  2. Data Is Messy, Embrace It.
  3. Soft Skills > Technical Skills
  4. Explainable Models Matter, A Lot.
  5. Always See The Big Picture.

What lessons have you learned in your data science journey? I can't wait to hear your story. Just reply to this email and let's chat! 😁


That's all for today

Thanks for reading. I hope you enjoyed today's issue. More than that, I hope it has helped you in some way and brought you some peace of mind.

You can always write to me by simply replying to this newsletter and we can chat.

See you again next week.

- Admond



Admond Lee

Hi! Admond here 👋🏻 I am a data scientist currently building a tech startup. Sign up for Hustle Hub - my weekly newsletter where I share actionable data science career tips, mistakes and lessons learned from building a startup - directly to your inbox.
