A Complete Data Analytics Project with Python

Data collection, analysis, visualization, and presentation.

A Complete Data Analytics Project with Python
Photo by Thought Catalog on Unsplash

Data analytics is a field that involves the analysis of raw data to come up with useful insights.

The field of analytics has been around for a long time. Companies have been utilizing existing customer data (in the form of surveys or internal transaction data) to come up with marketing strategies.

In the past, these companies would hire analysts or statisticians to derive insight from this data, identify market potential, and come up with models to drive sales.

However, the field of analytics has seen massive growth in the past few years. With the increase in computing power, companies are able to collect, store, and process data at a scale that would have been unimaginable before.

This transition has opened up a variety of roles in the domain of analytics, and there is an increasing demand for people who are able to work at the intersection of statistics, technology, and business.

Analysts are able to collect and process large amounts of data with their technical skill, they can derive insight from this data with statistical knowledge, and derive business value with their domain knowledge.

In this article, I'm going to present a data analytics project I created some time back.

I did some market research around a popular eyewear brand Warby Parker. I scraped publicly available information on Twitter and customer review sites to come up with a data-driven analysis, and I will break down my findings here.


Step 1

I first did some research around Warby Parker, and came up with some findings on their brand positioning:

Warby Parker was the first brand to disrupt the eyewear industry and make glasses affordable for everyone. Customers can pay as little as $95 for a pair of eyeglasses, and their buy a pair, give a pair program donates a pair of glasses to someone in need for every pair sold.

Step 2

Then, I decided to take a look at Warby Parker's audience demographic. To do this, I scraped 10,000 of their follower's Twitter profiles.

I used a gender prediction package to predict their gender based on their Twitter usernames. Here is a gender breakdown of Warby Parker's Twitter audience:

Warby Parker has high gender diversity. They have a higher number of female followers as compared to the average US population distribution, which could be due to their position as an all-inclusive brand. They design eyewear with frames and colours specifically catered to women.

Next, I took a look at the top location of their Twitter followers:

Most of Warby Parker's followers come from New York. I checked the average US population distribution on Twitter, and their followers aren't from regions with the highest number of Twitter followers. Instead, it seems like most of their followers come from areas that already have Warby Parker stores.

This could indicate that the presence of physical stores have spiked interest in these regions. Another possibility is that Warby Parker could have selected these areas to open their stores due to high demand.

Step 3

Next, I analyzed the top interests and professions of Warby Parker users. I scraped their follower's Twitter bios to do this, and extracted keywords with the help of a Python library.

After some data cleaning, I was able to find the top interests and professions of their followers:

From the charts above, it seems like the top interest amongst Warby Parker's followers is social issues. They are also interested in music, technology, and art.

Many of them are writers and artists, indicating that they are highly creative individuals. Some of them are entrepreneurs, founders, and CEOs, and are running their own businesses.

I also noticed that there were many people interested in technology, who had professions like programmer, developer, and data scientist in their bio. These titles are not present in the chart above, but technical professions combined made up a huge portion of Warby Parker's customer base.

This is understandable, since people in these professions are more likely to develop issues with their eyesight and need to use prescription glasses.

Step 4

Next, I built customer personas based on my findings above:

Step 5

The personas above were built based on the demographic and interest based data analyzed from Twitter. After that, I came up with some sample marketing strategies for each persona:

Step 6

After building personas and analyzing Warby Parker's audience interests, I decided to do some competitor analysis. I went online and found Warby Parker's top competitors, along with information like revenue and company size:

After doing some reading on Warby Parker's competitors, I came up with the following insights:

  • Warby Parker's main competitors include Zenni Optical, Pair Eyewear, and MyOptique group. These companies are positioned similarly to Warby Parker.
  • Warby Parker has higher total revenue compared to all its competitors within the same niche, and Zenni Optical is a close second.
  • Warby Parker also has higher social media engagement compared to all its competitors.

Step 7

Finally, I did some sentiment analysis around Warby Parker and its competitors. I chose to do a sentiment comparison between Warby Parker and Zenni Optical for this analysis, since Zenni Optical is one of their biggest competitors.

I grouped customer sentiment into four different areas: price, features, quality, and customer service.

I scraped a site called consumeraffairs.com for this analysis.

First, I looked at overall rating distribution on the site:

Zenni Optical had a larger number of positive reviews on the site. They had a total of 858 ratings, with an overall rating score of 4.4/5. Warby Parker, on the other hand, had only 173 reviews with an overall rating score of 3.8/5.

Then, I grouped ratings for both companies in terms of price, features, quality, and customer service:

  • Zenni Optical scored better than Warby Parker in terms of price and quality. There were many negative reviews on Warby Parker describing broken lenses and low quality.
  • Some Warby Parker customers still thought their eyewear were too pricey, while Zenni Optical customers seemed satisfied with the prices. Zenni Optical’s eyewear also cost less than Warby Parker.
  • Warby Parker scored better than Zenni Optical in terms of customer service. There were many positive comments on Warby Parker’s customer service, and people were happy with the team’s quick response at addressing their problems.
  • Customers seemed equally satisfied with both brands in terms of product features.

Next, I looked at rating distribution over time for Warby Parker and Zenni Opticals:

From here, it can be observed that:

  • Zenni Optical has higher overall ratings than Warby Parker in 2021.
  • Negative ratings spiked for Warby Parker in the beginning of the year, and reduced dramatically in April. April also saw a huge increase in positive ratings. This might have been due to their April Fool’s Day campaign. Ratings seem to even out and decline in May.
  • Zenni Optical had higher positive ratings during the beginning of the year. This changed around May, when there was a sudden spike in negative ratings. This could be due to a drop in quality or change in customer service.

That's all for this analysis!

I used Python libraries like BeautifulSoup and Selenium for most of the web scraping involved, NLTK for text cleaning and processing, and Pandas for most of the data manipulation. Excel and Tableau were used for data visualization.

I hope this analysis gave you a better idea as to how an end-to-end data analytics project is structured.

As an analyst, you need to have a good grasp of at least one programming language to be able to collect and manipulate external data. You also need to have the ability to tell a story around the data present, which is a skill that can be developed over time.

Communicating these insights to a non-technical person is one of the most important tasks of a data analyst. Make your visualizations as simple as possible so it can be digested easily. Also make sure to list down important findings based on the charts presented. It is important to get to the point because people have the tendency to lose focus easily.


I hope you enjoyed this article, thanks for reading!