Manipulating Data Part 4

MORGAN KATZ
Nov 6, 2020
  • Add a new column called “Seats” and fill it in.
  • Add an MPG value for each car.
  • Calculate the total fuel used, in gallons, based on the odometer reading.
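
The three steps above might look like the following minimal sketch. The DataFrame contents, the seat counts, and the MPG figures are made-up illustration values, and the column names "MPG" and "Total fuel used (gal)" are assumptions rather than names from the original data. Because the odometer is recorded in kilometers, the sketch converts it to miles before dividing by MPG.

```python
import pandas as pd

# Hypothetical stand-in for the car_sales DataFrame used in this series.
car_sales = pd.DataFrame({
    "Make": ["Toyota", "Honda", "BMW"],
    "Odometer (KM)": [150043, 87899, 11179],
})

# 1. Add a "Seats" column; a list needs exactly one value per row.
car_sales["Seats"] = [5, 5, 5]

# 2. Add an MPG value for each car (made-up figures).
car_sales["MPG"] = [30.0, 35.0, 25.0]

# 3. Total fuel in gallons = miles driven / miles per gallon.
#    The odometer is in kilometers, so convert to miles first
#    (1.609 km per mile).
miles = car_sales["Odometer (KM)"] / 1.609
car_sales["Total fuel used (gal)"] = miles / car_sales["MPG"]

print(car_sales)
```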

Now for analysis: shuffle the order of the data to determine if order influences the outcome:

car_sales.sample(frac=1)

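frac=1 is the fraction of rows to return: 1 means 100% of the rows, so the whole DataFrame comes back in random order. Note that sample returns a new DataFrame, so assign the result to a variable if you want to keep the shuffled order.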
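
As a small sketch of how the frac argument behaves, the snippet below shuffles a made-up four-row frame. The data is hypothetical, and random_state and reset_index are optional extras not used in the original line; they are included only to make the shuffle repeatable and to renumber the rows afterwards.

```python
import pandas as pd

# Hypothetical stand-in for car_sales.
car_sales = pd.DataFrame({
    "Make": ["Toyota", "Honda", "BMW", "Nissan"],
    "Odometer (KM)": [150043, 87899, 11179, 213095],
})

# frac=1 returns 100% of the rows in random order.
# random_state (optional) makes the shuffle repeatable.
shuffled = car_sales.sample(frac=1, random_state=42)

# The original index labels travel with their rows; reset_index(drop=True)
# renumbers them 0..n-1 when the old labels are not needed.
shuffled = shuffled.reset_index(drop=True)

# A smaller fraction returns a subset: frac=0.5 keeps half of the rows.
half = car_sales.sample(frac=0.5, random_state=42)

print(shuffled)
print(len(half))
```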

Apply a calculation with .apply:

Example

Change Odometer from KM to Miles. Note, conversion is 1.609 KM per mile:

car_sales["Odometer (KM)"] = car_sales["Odometer (KM)"].apply(lambda x: x / 1.609)
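
For simple arithmetic like this, dividing the column directly should give the same numbers as .apply with a lambda. The sketch below compares the two on a made-up two-row frame; the variable names via_apply and via_division are only for illustration. .apply remains the more general tool when the per-value logic cannot be written as plain column arithmetic.

```python
import pandas as pd

# Hypothetical odometer readings in kilometers.
car_sales = pd.DataFrame({"Odometer (KM)": [160900.0, 16090.0]})

# Element-by-element conversion with .apply and a lambda.
via_apply = car_sales["Odometer (KM)"].apply(lambda x: x / 1.609)

# The same conversion written as column arithmetic.
via_division = car_sales["Odometer (KM)"] / 1.609

# Largest absolute difference between the two results.
print((via_apply - via_division).abs().max())
```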

Then rename the Odometer (KM) column so its title reflects miles instead of kilometers:

car_sales = car_sales.rename(columns={"Odometer (KM)": "Odometer (Miles)"})
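
Put together, the conversion and the rename can be sketched end to end as below, on a made-up two-row frame. One thing worth watching for: until the rename runs, the column still says KM while holding miles, so re-running the conversion line would divide the values a second time.

```python
import pandas as pd

# Hypothetical stand-in for car_sales.
car_sales = pd.DataFrame({
    "Make": ["Toyota", "Honda"],
    "Odometer (KM)": [160900.0, 16090.0],
})

# Convert the values from kilometers to miles (1.609 km per mile).
car_sales["Odometer (KM)"] = car_sales["Odometer (KM)"].apply(lambda x: x / 1.609)

# rename returns a new DataFrame, so the result is assigned back.
car_sales = car_sales.rename(columns={"Odometer (KM)": "Odometer (Miles)"})

print(car_sales.columns.tolist())
```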

That’s all for the Data Manipulation series. Next up… NumPy!

MORGAN KATZ

Business Analyst with an MBA in Business Intelligence. Machine learning student. Exploring data through analysis and solving problems using ML tools.