Taking you behind the scenes of Portfolio creation 🤸‍♀️

Portfolios for Data Science

You get a portfolio, and you get a portfolio. Everybody gets a portfolio!

That’s what I imagine when I picture myself building portfolios with amazing women who have been working hard to learn new data skills but are unsure how to create a portfolio.

​

So what better than sharing a Data Science portfolio 101 (a mini introduction) today, along with how I would tackle the process of creating a portfolio for getting a job or internship in Data Science or AI?

I think it’s a lot of fun to share with you the ups and downs of creating a portfolio from scratch.

​

First things first: what is a Data Science portfolio?

I know, I know, it may be obvious, but sometimes, starting with the most essential piece of information is the best path to understanding the end result and the required steps to get there.

So, a DS portfolio is a collection of all your shiny skills and the achievements you feel most proud of.

It is the one place for anyone who wants to get to know you better (professionally speaking).

If someone in the DS realm asks you about your skills and the tools you master, then just glancing at your portfolio should answer their questions.

And that leads me to the next question.

​

What should be included in a portfolio?

A portfolio could include all your data projects, but realistically, almost no one will take hours to review an extensive list of projects to understand whether your skills would be valuable for the company or if you’re the right person for the job.

So, a portfolio should include the projects you’re the most proud of, the projects that highlight your skills in the best possible way.

​

What is a good project for a Portfolio?

Thousands (if not millions) of projects could fit in the “good portfolio project” category.

Almost any project has the potential to be a good addition to your portfolio. What makes a project worth adding is the story, the algorithms, the data source, the plots, and the interesting results.

Am I exploring the data and extracting the best possible insights, or did I make a superficial exploration that led to halfway conclusions?

Did I make beautiful, self-explanatory graphs or put together 20 graphs that almost no one would understand unless I stood there to explain them one by one?

Short, clear, at-a-glance results are a must in our current fast-paced world.

As I wrote in a previous post (you can read it in full here​):

You can take a simple data set and create something remarkable.

Looking for the golden egg (in terms of datasets) will only make us procrastinate and add unnecessary pressure to our project. In other words, an extensive dataset alone won’t create an outstanding portfolio.

​

Have you read a beautiful poem, a short description, or a fascinating explanation of something? What do they have in common?

They are easy to read and understand and flow seamlessly to the end result without you even noticing it.

Creating something simple involves a deep understanding, and that’s why when we lack clarity, we usually end up overcomplicating things or adding unnecessary layers. (I’m guilty of this, and you can read more about it here.)

​

What is currently missing in the DS portfolios sphere?

The most important piece in getting into DS and creating an outstanding portfolio is something I wrote about some months ago (read in full here):

“I learned that we need programs that not only teach technical skills but also boost our self-confidence and teach us to speak up for ourselves and defend our points of view.
​
What is missing is learning how to be so confident that it’s obvious we should apply to that interesting job, EVEN when we don’t meet ALL the requirements.
​
What is missing is learning how to act and feel confident even AFTER learning all the technical skills.
​
What is missing is to be heard. What is missing is to embrace our emotions and remember that we are not just logical beings but also beings of heart, love, and compassion.

And I know this is, and will be, my most valuable contribution to the world: Empowering women with technical skills and self-confidence to get into DS.”

Now let’s see the behind-the-scenes. And for the sake of clarity let’s imagine I’m looking for a job in Data Science.

​

What is my first step in creating a portfolio?

Creating a portfolio is a stepping stone to getting into Data Science jobs or internships.

And the first thing for me is to be clear about the industry (or industries) in which I would love to get a job.

That is because the topic and skills in my portfolio should ideally align with the industry in which I’m planning to get a job.

Let’s say I’m planning to get a job in an industry focused on forecasting. Then, creating a project that includes ML skills would make a lot of sense.

This first step helps me define the skills, tools, and libraries I should use and gives me an overall idea of the projects I should work on.

This step aims to get to something like: I really want to get a job working with Natural Language Processing, and my desired industry is healthcare. (I’m assuming I know NLP and have some experience with the healthcare industry).

​

The second step: the project idea and data sets

Now comes a fun moment of exploring and evaluating the potential of my project ideas. I would explore datasets online and see if any spark my curiosity. The ideal scenario here is to work with a dataset that makes me feel excited and interested in the insights I can get.

I see it as a treasure hunt or a puzzle. We start with a lot of data, and step by step, we organize it so we can see clearly what is “hidden.”

I also consider information that I can scrape. This approach has the added benefit that there probably aren’t many other projects using my specific dataset, because I scraped it myself (although it’s still possible that there are).

Also, in this step, I consider how much time the project would take me and check whether I can invest that time to finish it. This last point is very important because, speaking from experience, there is nothing worse for me than starting a project, investing several hours, and then leaving it unfinished.

At the end of this step, I’m looking for something like a data set that sounds very interesting and intriguing. Of course, I would need data in text form because I want to use NLP. Also, knowing the data roughly helps me define the possible insights and plots I can create. But keep in mind that at this stage, it is still a rough idea.

I can add many more details to this step. Let me know if you want to hear more about choosing or scraping datasets and planning your projects.

​

Third step: exploring the data and extracting insights

This step will take the majority of my time.

First, I clean the data and handle missing values.
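To make the cleaning step a bit more concrete, here is a minimal sketch using pandas. The column names (`note_text`, `patient_age`) and the sample rows are made up for illustration, and filling missing ages with the median is just one possible choice, not a rule. Your own dataset will need its own decisions.

```python
import pandas as pd


def clean_notes(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: rows without text dropped, missing ages filled.

    The column names are hypothetical; adapt them to your dataset.
    """
    # Treat missing and whitespace-only notes the same way: as empty text.
    text = df["note_text"].fillna("").str.strip()
    has_text = text != ""

    # Keep only rows that actually contain text (we need it for NLP).
    cleaned = df.loc[has_text].copy()
    cleaned["note_text"] = text[has_text]

    # Assumption: fill missing ages with the median of the remaining rows.
    median_age = cleaned["patient_age"].median()
    cleaned["patient_age"] = cleaned["patient_age"].fillna(median_age)

    return cleaned.reset_index(drop=True)


# Tiny made-up sample, only to show the function in action.
raw = pd.DataFrame(
    {
        "note_text": [
            "Patient reports mild headache.",
            None,
            "   ",
            "Follow-up visit, no complaints.",
        ],
        "patient_age": [34, 51, 29, None],
    }
)

clean = clean_notes(raw)
print(clean)
```

Whatever you decide to drop or fill, it helps to write the reason down in the project itself, so a reader can follow your choices.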

Then, I explore the possible graphs and the story the data unveils.
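For text data, a first exploration can be as simple as counting which words show up most often. The sketch below uses only the Python standard library. The sample notes and the tiny stopword list are invented for illustration, and a real NLP project would likely use a proper tokenizer and a fuller stopword list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this example only.
STOPWORDS = {"the", "a", "and", "of", "to", "no", "with", "for", "in"}


def top_terms(notes, n=5):
    """Return the n most frequent non-stopword terms across all notes."""
    counts = Counter()
    for note in notes:
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", note.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(n)


# Made-up notes, only to show the function in action.
notes = [
    "Patient reports mild headache and fatigue.",
    "Follow-up visit: headache resolved, fatigue persists.",
    "New patient with persistent cough.",
]

print(top_terms(notes, n=3))
```

A quick count like this often suggests which plots are worth making and which questions the data can actually answer.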

This will take time, but it will be worth it!

In this step, I focus on adding as much of my knowledge and skills as I can. Also, this is the moment where keeping up with new Python libraries will be a valuable asset.

For example, I recently discovered a Python library called Anywidget that simplifies creating and publishing custom Jupyter Widgets.

So, I started to explore the many options available and got particularly excited to try the statistical visualization library called Vega-Altair:

​

Source: Vega-Altair https://github.com/vega/altair​

Isn’t that amazing?

​

Fourth step: Presenting the results

The last step is to present the results cohesively. At this stage in the process, I may have many graphs, many results, and some insights. But I must be sure that with my results, I’m sharing a story that is understandable and makes sense.

So, I start with a draft of my results. I select plots that are very clear and speak for themselves. If a plot requires a long explanation to be understood, then I don’t use it. I focus on finding a way of sharing the insights without a long explanation. That’s because nowadays, almost no one will read a project with many pages of text.

​

Final thoughts

I hope this step-by-step process can help you clear your doubts about creating a portfolio.

And please keep in mind that I’m sharing a simplification of the process. But no worries. Let me know in the comments if you would like to get more details on any of these steps, and I’ll do my best to share more in future posts.

​

Thanks for reading! I’d love to hear your thoughts. Also, do you want to see more of the behind the scenes?

​
