JOSHUA GREENHALGH

OVERVIEW

This Excel dashboard serves as a valuable tool for both car buyers (and sellers) in the competitive automotive market. By analysing average sale prices across different states, kilometer ranges,transmission types, and vehicle conditions, users can make more informed decisions about their
car transactions.

The underlying data for this project was sourced from Kaggle, incorporating sales data from various online Australian car platforms throughout 2023. Extensive data cleaning was necessary to prepare it for the dashboard, which presented numerous opportunities to leverage powerful functions like XLOOKUP and IFS. The final result showcases relevant insights through pivot tables and visually appealing charts, allowing users to slice and dice the information based on their specific vehicle preferences.

OBJECTIVE

Given that purchasing or selling a car is a relatively infrequent event for most individuals, people often lack awareness of the current market prices for their vehicles. This knowledge gap is exacerbated by the multitude of factors that can influence vehicle values. The primary objective of this project was to create a user-friendly tool that empowers users to gain valuable insights into vehicle worth based on a variety of factors.

RESEARCH

According to a Carsales survey from August 2023, the number one variable in buying is price, with a staggering 76% of respondents considering it the most crucial factor. However, it’s essential to recognize that price is influenced by several variables, including brand,model, year, odometer reading, fuel efficiency, and location. Given this, the most critical variable for users of the dashboard remains price. To provide valuable insights, the visualisations could focus on dissecting these factors and presenting measures to help buyers identify the best options within their target budget.

DATA PREPARATION

Due to the nature of the data being car sales where duplicates sales can often exist, I avoided removing duplicate entries in this table.I started filtering out null values across columns containing essential data (price; kilometres; transmission; body type;condition; year). Rows with missing critical data were deleted (1166).

According to a Carsales survey from August 2023, the number one variable in buying is price, with a staggering 76% of respondents considering it the most crucial factor. However, it’s essential to recognize that price is influenced by several variables, including brand,model, year, odometer reading, fuel efficiency, and location. Given this, the most critical variable for users of the dashboard remains price. To provide valuable insights, the visualisations could focus on dissecting these factors and presenting measures to help buyers identify the best options within their target budget.

To improve visualisation clarity, I identified columns that could be grouped into brackets. Using an IFS formula I divided year of the cars into decades, and kilometers into ranges.

I combined premium and unleaded fuel entries into a single category called 'petrol' using find and replace. For entries labeled as 'other', I manually researched online to determine their actual fuel type.To improve visualisation clarity, I identified columns that could be grouped into brackets. Using an IFS formula I divided year of the cars into decades, and kilometers into ranges.

Since the dashboard requirement was to provide a state-level breakdown of sales, suburb information was unnecessary. I used a combination of RIGHT and LEN functions to create a formula which takes the length of the string, and subtracts from this the position of the comma ','. This leaves us with the text that follows after the comma, the state.

A similar formula was used to remove the interior colours listed in the 'ColourExtInt' column, allowing us to focus on popular exterior colours.

To maintain data integrity, I used a lookup table to fill in empty fuel consumption entries based on matching car titles. Entries that remained empty after this process were removed, reducing loss from 316 to 189 entries.A similar formula was used to remove the interior colours listed in the 'ColourExtInt' column, allowing us to focus on popular exterior colours.

Before cleaning there was a total of 16733 entries, after cleaning the data set was reduced to 13902 entries. 16.9% of the entries were removed to ensure completeness of data.