Find Credit Card Info from Ecommerce using Python

This is an example of how to extract customer information, such as the credit card number from an Ecommerce using Python.

We start loading the data and showing the first 10 observations. We also can see the number of columns (14) and rows (10000) of the dataset.

import pandas as pd
ecom = pd.read_csv('C:/07 - R Website/dataset/Graph/Ecommerce Purchases')
ecom.head(10)
##                                              Address  ... Purchase Price
## 0  16629 Pace Camp Apt. 448\nAlexisborough, NE 77...  ...          98.14
## 1  9374 Jasmine Spurs Suite 508\nSouth John, TN 8...  ...          70.73
## 2                   Unit 0065 Box 5052\nDPO AP 27450  ...           0.95
## 3              7780 Julia Fords\nNew Stacy, WA 45798  ...          78.04
## 4  23012 Munoz Drive Suite 337\nNew Cynthia, TX 5...  ...          77.82
## 5  7502 Powell Mission Apt. 768\nTravisland, VA 3...  ...          25.15
## 6     93971 Conway Causeway\nAndersonburgh, AZ 75107  ...          88.56
## 7  260 Rachel Plains Suite 366\nCastroberg, WV 24...  ...          44.25
## 8            2129 Dylan Burg\nNew Michelle, ME 28650  ...          59.54
## 9    3795 Dawson Extensions\nLake Tinafort, ID 88739  ...          95.63
## 
## [10 rows x 14 columns]
ecom.columns
## Index(['Address', 'Lot', 'AM or PM', 'Browser Info', 'Company', 'Credit Card',
##        'CC Exp Date', 'CC Security Code', 'CC Provider', 'Email', 'Job',
##        'IP Address', 'Language', 'Purchase Price'],
##       dtype='object')
len(ecom.columns) # give the number of columns
## 14
len(ecom.index) # give the number of rows
## 10000

Now we can calculate the average, highest and lowest purchase price from the Purchase Price variable iside the Ecommerce dataset. Moreover, we can find from the Language column how many customers came speak english.

ecom['Purchase Price'].mean()
## 50.34730200000025
ecom['Purchase Price'].max()
## 99.99
ecom['Purchase Price'].min()
## 0.0
ecom[ecom['Language']=='en']['Language'].count()
## 1098

Here, we calculate how manu people made a purchase both at AM and PM from the variable called AM or PM. We can also calculate the 10 most common job titles of the Ecommerce customers from the Job column.

ecom['AM or PM'].value_counts()
## PM    5068
## AM    4932
## Name: AM or PM, dtype: int64
ecom['Job'].value_counts().head(10) # top job of customers
## Interior and spatial designer        31
## Lawyer                               30
## Social researcher                    28
## Purchasing manager                   27
## Research officer, political party    27
## Designer, jewellery                  27
## Dietitian                            26
## Designer, fashion/clothing           26
## Social worker                        26
## Charity fundraiser                   26
## Name: Job, dtype: int64

We can also find the email of the person with a specific credit card number. Moreover we can find how many peole have an American Express as their credit card and made a purchase above $80. Moreover, how many people have a credit card that expired on 2020.

ecom[ecom['Credit Card']==4926535242672853]['Email']
## 1234    bondellen@williams-garza.com
## Name: Email, dtype: object
ecom[(ecom['CC Provider']=='American Express') & (ecom['Purchase Price']>80)].count()
## Address             174
## Lot                 174
## AM or PM            174
## Browser Info        174
## Company             174
## Credit Card         174
## CC Exp Date         174
## CC Security Code    174
## CC Provider         174
## Email               174
## Job                 174
## IP Address          174
## Language            174
## Purchase Price      174
## dtype: int64
sum(ecom['CC Exp Date'].apply(lambda exp: exp[3:]== '20'))
## 988

Now we are searching from the most popular, top 5, email provider used to log-in in the Ecommerce.

# email = ecom['Email'].iloc[0]
# email.split('@')[1]

ecom['Email'].apply(lambda emailnum: emailnum.split('@')[1]).value_counts().head(5)
## hotmail.com     1638
## yahoo.com       1616
## gmail.com       1605
## smith.com         42
## williams.com      37
## Name: Email, dtype: int64