Instacart Analysis Project

Marketing Strategy for Instacart Analysis Overview

I have been hired as an analyst at Instacart, the online grocery delivery service. Instacart has good sales, but the company wants to learn more about its sales patterns. I have been tasked with performing an exploratory analysis of the initial data to derive insights and suggest strategies for better customer segmentation.

The Instacart stakeholders are considering a new segmented marketing strategy, and my analysis will help define the segments and the strategy. The stakeholders want answers to a series of questions about their customers.

Data:

The data for this project consists of open source Instacart data and a customer data set created specifically for the project. The analysis used the following libraries: pandas, NumPy, os, matplotlib, SciPy, and seaborn.

Skills/Tools

  • Python

  • Jupyter Notebooks

  • Data wrangling, merging, grouping

  • Deriving Variables

  • Aggregating Data

  • Population Flows

  • Excel Reporting

Questions Asked:

How are users distributed with regard to brand loyalty?

Are there differences in ordering habits based on a customer’s loyalty status?

Are there differences in ordering habits based on a customer’s region?

Is there a connection between age and family status in terms of ordering habits?

What different classifications does the demographic information suggest?

What differences can you find in ordering habits of different customer profiles?

Project Steps

1

In the first step of this project, I downloaded the data and imported it into a notebook as a pandas DataFrame. I then wrangled the data and created subsets. In this step, I performed initial exploratory analysis, checked for missing values and duplicates, and began to clean the data. I transposed data to enhance readability and started to drop columns. An example of my code can be viewed below in Figure 1. For more code examples, please view my GitHub link on the homepage.

Figure 1
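
The missing-value and duplicate checks described in this step can be sketched roughly as follows. This is a minimal illustration on a small made-up table, not the project code: the column names and values are assumptions, and the real data is not reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy orders table (assumed column names and values).
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "user_id": [10, 10, 10, 11, 12],
    "days_since_prior_order": [np.nan, 7.0, 7.0, np.nan, 3.0],
})

# Count missing values in each column.
missing_counts = orders.isnull().sum()

# Collect rows that are exact copies of an earlier row, then drop them.
duplicate_rows = orders[orders.duplicated()]
orders_clean = orders.drop_duplicates().reset_index(drop=True)

print(missing_counts)
print(duplicate_rows)
print(orders_clean.shape)
```

Whether a missing value should be dropped, imputed, or kept depends on what it means; in a table like this one, a missing gap since the prior order could simply mark a customer's first order.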

3

I created new columns using conditional logic in the form of if-statements, user-defined functions, the loc indexer, and for-loops. I also grouped and aggregated this data. To identify the key customer segments the stakeholders want to view, I created flag columns, all of which can be viewed in my final project deliverable. For example, I created a column that flags loyal customers based on certain purchasing requirements. Figure 3 below shows example code that creates the price label conditions.

Figure 3
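
As a rough sketch of how flag columns like these can be derived with the loc indexer, the example below labels prices and flags loyalty on a small made-up table. The column names, label text, and thresholds are all assumptions chosen for illustration; the actual requirements used in the project are in the final deliverable.

```python
import pandas as pd

# Hypothetical toy data (assumed column names and values).
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "order_number": [1, 45, 1, 12, 3],
    "prices": [3.5, 12.0, 20.0, 4.9, 7.0],
})

# Price label: the cut-offs 5 and 15 are illustrative assumptions.
df.loc[df["prices"] <= 5, "price_label"] = "Low-range product"
df.loc[(df["prices"] > 5) & (df["prices"] <= 15), "price_label"] = "Mid-range product"
df.loc[df["prices"] > 15, "price_label"] = "High-range product"

# Loyalty flag based on each user's highest order number.
# The cut-offs 10 and 40 are illustrative assumptions.
df["max_order"] = df.groupby("user_id")["order_number"].transform("max")
df.loc[df["max_order"] > 40, "loyalty_flag"] = "Loyal customer"
df.loc[(df["max_order"] > 10) & (df["max_order"] <= 40), "loyalty_flag"] = "Regular customer"
df.loc[df["max_order"] <= 10, "loyalty_flag"] = "New customer"

print(df)
```

Using boolean masks with loc applies each condition to the whole column at once, which tends to scale better on large order tables than a row-by-row for-loop.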

5

In the last step of this project, I analyzed customer behavior based on existing columns and further aggregated the data. I described the connections I found in a final report to stakeholders and listed relevant recommendations. The report shows the columns that were combined and added to the data, describes the steps taken, and includes the final visuals for presentation. Both an Excel report and a Jupyter notebook were produced in this step. As noted, please view the GitHub link on the home page for access to my code. View the final Excel report here. Figure 5 shows pages from the Excel report: the population flow of the data, the list of derived columns, and relevant visualizations produced in Python.

Figure 5
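
The kind of aggregation by customer profile described in this step might look like the sketch below. The table, the `loyalty_flag` and `prices` column names, and the values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data (assumed column names and values).
df = pd.DataFrame({
    "loyalty_flag": [
        "Loyal customer", "Loyal customer",
        "New customer", "New customer", "New customer",
    ],
    "prices": [10.0, 6.0, 4.0, 8.0, 3.0],
})

# Summarize spending for each customer group.
summary = df.groupby("loyalty_flag")["prices"].agg(["mean", "min", "max"])

print(summary)
```

A summary table like this could then be written to a spreadsheet with `summary.to_excel(...)`, which needs an Excel engine such as openpyxl to be installed.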

2

In this step, I started to perform data consistency checks and began combining the data. I checked for mixed data types and combined the cleaned orders data with a new products data set. This step consisted of formatting the data in a way that would benefit the final analysis in the steps to come. An example of my code can be seen in Figure 2.

Figure 2
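
A mixed-type check followed by a merge, as described in this step, can be sketched as follows. The tables, column names, and the helper `mixed_type_columns` are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical toy tables (assumed column names and values).
orders_products = pd.DataFrame({
    "order_id": [1, 1, 2],
    "product_id": [100, 101, 100],
})
products = pd.DataFrame({
    "product_id": [100, 101, 102],
    "product_name": ["Bananas", 12345, "Oat Milk"],  # deliberately mixed types
    "prices": [0.5, 3.2, 4.1],
})


def mixed_type_columns(frame):
    """Return the names of columns whose values have more than one Python type."""
    return [col for col in frame.columns if frame[col].map(type).nunique() > 1]


# Find the mixed-type columns, then fix the offending column.
mixed = mixed_type_columns(products)
products["product_name"] = products["product_name"].astype(str)

# Combine the tables on the shared key; the indicator column records
# whether each row matched in both tables.
merged = orders_products.merge(products, on="product_id", how="inner", indicator=True)

print(mixed)
print(merged["_merge"].value_counts())
```

Inspecting the `_merge` column after an outer merge is a quick way to see how many rows failed to match before deciding which join type to keep.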

4

I finished combining all data sets (orders, products, customers) and used the columns created in the previous step to generate visualizations that would answer some of the stakeholder questions. I analyzed the charts seen below in Figure 4 and worked to group the data in a way that would produce customer segments for Instacart to view.

Figure 4
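
The grouping behind charts like these can be sketched as below. The example only builds the tables that a bar chart would be drawn from; the column names (`order_dow`, `region`, `loyalty_flag`) and the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data (assumed column names and values).
df = pd.DataFrame({
    "order_dow": [0, 0, 1, 6, 6, 6],
    "region": ["South", "West", "South", "Northeast", "South", "Midwest"],
    "loyalty_flag": [
        "Loyal customer", "New customer", "Loyal customer",
        "Regular customer", "New customer", "Loyal customer",
    ],
})

# Number of orders on each day of the week, in day order.
orders_per_day = df["order_dow"].value_counts().sort_index()

# Count of rows for every region and loyalty group combination.
region_by_loyalty = pd.crosstab(df["region"], df["loyalty_flag"])

print(orders_per_day)
print(region_by_loyalty)
```

With matplotlib installed, either table can be charted directly, for example with `orders_per_day.plot.bar()`.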

Final Results

My findings are packaged in the final Excel report. The presentation comprises:

  • population flow of data

  • consistency check steps

  • column derivation steps

  • relevant visualizations of customer groups

  • recommendations

I found that Saturday and Sunday were the busiest days of the week. Customers pay the most for products in the very early morning hours. Produce sells the most and bulk sells the least. There are fewer new customers than loyal customers, and loyal customers spend the most money during hour 10 of the day. The Southern region has the most purchases. The Northeast has the largest young population. Some baby aisle consumers are single, and some do not have a dependent. Income spikes significantly after the age of 40. More male consumers bought pet items.

Recommendations

I recommended that the marketing team target the busy days (the weekend). I also recommended marketing higher-priced items during the late night and early morning hours, when consumers appear to be making emergency runs. This could be done with more signage in the app. I think that Instacart should continue to promote loyalty and make an effort to target loyal customers during daytime hours. I also noted that alcohol should be marketed toward all demographics and that pet products should target men and gender-neutral audiences.

Challenges

I faced several challenges throughout this project. Prior to the work, I had limited Python knowledge, and at first it was a challenge to learn the language and its syntax. Running through examples was easier, but I received error messages many times and had to look up answers. Initially I did not know where to start, but by the end of the project I had become very efficient at solving these issues. The project also helped me become confident in reading and interpreting graphs and taught me how to use online resources to my advantage. This was also my first experience with Jupyter Notebooks, and learning its functionality was not easy.

In the future, I plan to continue practicing my Python skills. As with anything, it takes practice to be good. I plan to keep reviewing the constructs that were challenging, such as for-loops, and to keep practicing combining and cleaning data.

© 2023 by Lilly Hooper. Powered and secured by Wix.com
