Interactive Data Visualization with Plotly Express

Task 1. Plotting Interactive Scatterplots

In [1]:
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
import plotly.express as px
In [2]:
salary = pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-employee_salaries.csv')
salary
Out[2]:
Years_of_Experience Salary
0 1.000000 40000.00000
1 2.257942 65979.42119
2 2.450875 67253.57549
3 2.498713 67342.43510
4 2.613729 70532.20448
... ... ...
1995 19.178575 421534.69100
1996 19.254499 430478.02650
1997 19.353369 438090.84540
1998 19.842520 482242.16080
1999 20.000000 500000.00000

2000 rows × 2 columns

Plot the Scatterplot for Years of Experience vs. Salary

In [3]:
fig=px.scatter(salary, x='Years_of_Experience', y='Salary', height=600, width=1000)
fig.show()
In [4]:
admission=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-university_admission.csv')
admission
Out[4]:
Serial No. GRE Score TOEFL Score University Rating SOP LOR CGPA Research Chance of Admit
0 1 337 118 4 4.5 4.5 9.65 1 0.92
1 2 324 107 4 4.0 4.5 8.87 1 0.76
2 3 316 104 3 3.0 3.5 8.00 1 0.72
3 4 322 110 3 3.5 2.5 8.67 1 0.80
4 5 314 103 2 2.0 3.0 8.21 0 0.65
... ... ... ... ... ... ... ... ... ...
495 496 332 108 5 4.5 4.0 9.02 1 0.87
496 497 337 117 5 5.0 5.0 9.87 1 0.96
497 498 330 120 5 4.5 5.0 9.56 1 0.93
498 499 312 103 4 4.0 5.0 8.43 0 0.73
499 500 327 113 4 4.5 4.5 9.04 0 0.84

500 rows × 9 columns

Plot the Scatterplot for GRE Score vs. Chance of Admit. Show the University Rating as a third dimension (marker color).

In [5]:
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', height=600, width=1000)
fig.show()

Task 2. Plotting Interactive Bubble Charts

Add SOP as a fourth dimension, encoded as the bubble size

In [6]:
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', size='SOP',
              height=600, width=1000)
fig.show()

Add LOR to the hover data

In [7]:
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', size='SOP',
              hover_data=['LOR'], height=600, width=1000)
fig.show()

Modify the SOP to make the bubble variations more prominent

In [8]:
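# Square SOP so that differences in bubble size stand out.
# Note: this overwrites the column in place, so re-running the cell squares it again.
admission['SOP']=admission['SOP']**2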

fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', size='SOP',
              hover_data=['LOR'], height=600, width=1000)
fig.show()

Task 3. Plotting Interactive Single Line Plot

In [9]:
crypto=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-crypto_prices.csv')
crypto
Out[9]:
Date BTC-USD Price ETH-USD Price LTC-USD Price
0 9/17/2014 457.334015 NaN 5.058550
1 9/18/2014 424.440002 NaN 4.685230
2 9/19/2014 394.795990 NaN 4.327770
3 9/20/2014 408.903992 NaN 4.286440
4 9/21/2014 398.821014 NaN 4.245920
... ... ... ... ...
2380 3/28/2021 55950.746090 1691.355957 185.028488
2381 3/29/2021 57750.199220 1819.684937 194.474777
2382 3/30/2021 58917.691410 1846.033691 196.682098
2383 3/31/2021 58918.832030 1918.362061 197.499100
2384 4/1/2021 59095.808590 1977.276855 204.112518

2385 rows × 4 columns

Plotting BTC data

In [10]:
fig=px.line(crypto, x='Date', y='BTC-USD Price', height=600, width=1000)
fig.show()

Plotting ETH data

In [11]:
fig=px.line(crypto, x='Date', y='ETH-USD Price', height=600, width=1000)
fig.show()

Plotting LTC data

In [12]:
fig=px.line(crypto, x='Date', y='LTC-USD Price', height=600, width=1000)
fig.show()

Task 4. Plotting Interactive Multiple Line Plots

Plotting BTC, ETH and LTC data

In [13]:
crypto.columns
Out[13]:
Index(['Date', 'BTC-USD Price', 'ETH-USD Price', 'LTC-USD Price'], dtype='object')
In [14]:
fig=px.line(height=600, width=980)

for i in crypto.columns[1:]:
    fig.add_scatter(x=crypto['Date'], y=crypto[i], name=i)

fig.show()

Retrieving ETH and LTC prices where BTC reached its peak

In [15]:
crypto[crypto['BTC-USD Price']==crypto['BTC-USD Price'].max()]
Out[15]:
Date BTC-USD Price ETH-USD Price LTC-USD Price
2365 3/13/2021 61243.08594 1924.685425 226.578293
In [16]:
crypto[crypto['Date'] == '3/13/2021']
Out[16]:
Date BTC-USD Price ETH-USD Price LTC-USD Price
2365 3/13/2021 61243.08594 1924.685425 226.578293
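
The same lookup can be sketched with `idxmax`, which returns the index label of the maximum directly and avoids hard-coding the date. `crypto_demo` is a hypothetical stand-in using three rows from the outputs above. Note that `idxmax` returns only the first occurrence, whereas the boolean mask would return every row tied for the maximum.

```python
import pandas as pd

# Three rows taken from the outputs above, including the BTC peak.
crypto_demo = pd.DataFrame({
    'Date': ['3/13/2021', '3/31/2021', '4/1/2021'],
    'BTC-USD Price': [61243.08594, 58918.832030, 59095.808590],
    'ETH-USD Price': [1924.685425, 1918.362061, 1977.276855],
    'LTC-USD Price': [226.578293, 197.499100, 204.112518],
})

# Index label of the row where BTC is highest.
peak_label = crypto_demo['BTC-USD Price'].idxmax()

# ETH and LTC prices (plus the date) on that row.
peak_row = crypto_demo.loc[peak_label, ['Date', 'ETH-USD Price', 'LTC-USD Price']]
```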

Task 5. Plotting Interactive Pie Charts

Create a dataset

In [17]:
my_dict={'allocation %':[20, 20, 20, 20, 20]}
my_dict
Out[17]:
{'allocation %': [20, 20, 20, 20, 20]}
In [18]:
crypto=pd.DataFrame(data=my_dict, index=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'])
crypto
Out[18]:
allocation %
BTC 20
ETH 20
LTC 20
XRP 20
ADA 20

Plot the Pie Chart

In [19]:
fig=px.pie(crypto, values='allocation %', names=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'], title='Crypto allocation',
           height=600, width=980)
fig.show()

Assume that you became bullish on XRP and decided to allocate 60% of your assets to it. You also decided to divide the rest of your assets equally among the other coins (BTC, LTC, ADA and ETH). Change the allocations and plot the pie chart.

In [20]:
my_dict={'allocation %':[10, 10, 10, 60, 10]}
crypto=pd.DataFrame(data=my_dict, index=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'])
crypto
Out[20]:
allocation %
BTC 10
ETH 10
LTC 10
XRP 60
ADA 10
In [21]:
fig=px.pie(crypto, values='allocation %', names=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'], title='Crypto allocation',
           height=600, width=980, hole=0.55)
fig.show()
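
The 10/10/10/60/10 split above was typed in by hand. A minimal sketch of computing it follows, so that changing the favoured coin or its share keeps the total at 100%. The names `favoured`, `favoured_share` and `allocation` are hypothetical.

```python
import pandas as pd

coins = ['BTC', 'ETH', 'LTC', 'XRP', 'ADA']
favoured, favoured_share = 'XRP', 60

# Split whatever is left equally among the remaining coins.
other_share = (100 - favoured_share) / (len(coins) - 1)

allocation = pd.DataFrame(
    {'allocation %': [favoured_share if coin == favoured else other_share
                      for coin in coins]},
    index=coins,
)
```

The resulting frame can be passed to `px.pie` in the same way as before, for example with `names=allocation.index` instead of repeating the list of coins.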

Task 6. Plotting Interactive Bar Charts

In [22]:
data=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-gapminder.csv')
data
Out[22]:
Unnamed: 0 country continent year lifeExp pop gdpPercap iso_alpha iso_num
0 0 Afghanistan Asia 1952 28.801 8425333 779.445314 AFG 4
1 1 Afghanistan Asia 1957 30.332 9240934 820.853030 AFG 4
2 2 Afghanistan Asia 1962 31.997 10267083 853.100710 AFG 4
3 3 Afghanistan Asia 1967 34.020 11537966 836.197138 AFG 4
4 4 Afghanistan Asia 1972 36.088 13079460 739.981106 AFG 4
... ... ... ... ... ... ... ... ... ...
1699 1699 Zimbabwe Africa 1987 62.351 9216418 706.157306 ZWE 716
1700 1700 Zimbabwe Africa 1992 60.377 10704340 693.420786 ZWE 716
1701 1701 Zimbabwe Africa 1997 46.809 11404948 792.449960 ZWE 716
1702 1702 Zimbabwe Africa 2002 39.989 11926563 672.038623 ZWE 716
1703 1703 Zimbabwe Africa 2007 43.487 12311143 469.709298 ZWE 716

1704 rows × 9 columns

Retrieving data for Morocco

In [23]:
search_value = 'Morocco'
is_present = search_value in data['country'].values

result = 'Yes' if is_present else 'No'

print(result)
Yes
In [24]:
morocco=data[data.country=='Morocco']
morocco
Out[24]:
Unnamed: 0 country continent year lifeExp pop gdpPercap iso_alpha iso_num
1020 1020 Morocco Africa 1952 42.873 9939217 1688.203570 MAR 504
1021 1021 Morocco Africa 1957 45.423 11406350 1642.002314 MAR 504
1022 1022 Morocco Africa 1962 47.924 13056604 1566.353493 MAR 504
1023 1023 Morocco Africa 1967 50.335 14770296 1711.044770 MAR 504
1024 1024 Morocco Africa 1972 52.862 16660670 1930.194975 MAR 504
1025 1025 Morocco Africa 1977 55.730 18396941 2370.619976 MAR 504
1026 1026 Morocco Africa 1982 59.650 20198730 2702.620356 MAR 504
1027 1027 Morocco Africa 1987 62.677 22987397 2755.046991 MAR 504
1028 1028 Morocco Africa 1992 65.393 25798239 2948.047252 MAR 504
1029 1029 Morocco Africa 1997 67.660 28529501 2982.101858 MAR 504
1030 1030 Morocco Africa 2002 69.615 31167783 3258.495584 MAR 504
1031 1031 Morocco Africa 2007 71.164 33757175 3820.175230 MAR 504

Plotting the Bar Chart

In [25]:
fig=px.bar(morocco, x='year', y='pop', labels={'pop':'Population of Morocco'}, 
           height=600, width=980, color_discrete_sequence=['#c1272d'])
fig.show()

Adding life expectancy as a third dimension (bar color) and GDP per capita on hover

In [26]:
fig=px.bar(morocco, x='year', y='pop', color='lifeExp', hover_data=['gdpPercap'], labels={'pop':'Population of Morocco'}, 
           height=600, width=980, color_continuous_scale='reds')
fig.show()

Task 7. Plotting Interactive Sunburst

In [27]:
restaurant=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-restaurant.csv')
restaurant
Out[27]:
Unnamed: 0 total_bill tip sex smoker day time size
0 0 16.99 1.01 Female No Sun Dinner 2
1 1 10.34 1.66 Male No Sun Dinner 3
2 2 21.01 3.50 Male No Sun Dinner 3
3 3 23.68 3.31 Male No Sun Dinner 2
4 4 24.59 3.61 Female No Sun Dinner 4
... ... ... ... ... ... ... ... ...
239 239 29.03 5.92 Male No Sat Dinner 3
240 240 27.18 2.00 Female Yes Sat Dinner 2
241 241 22.67 2.00 Male Yes Sat Dinner 2
242 242 17.82 1.75 Male No Sat Dinner 2
243 243 18.78 3.00 Female No Thur Dinner 2

244 rows × 8 columns

In [28]:
fig=px.sunburst(restaurant, path=['day', 'time', 'sex'], values='total_bill', height=600, width=980)
fig.show()
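
With `values='total_bill'`, each outer sector of the sunburst represents the summed bill for one day/time/sex combination, and each inner sector is the sum of its children. The sketch below reproduces those numbers with a pandas `groupby`, which can help when checking what the chart shows. `restaurant_demo` is a hypothetical stand-in using five rows from the table above.

```python
import pandas as pd

# Five rows from the restaurant table, reduced to the columns used in the path.
restaurant_demo = pd.DataFrame({
    'total_bill': [16.99, 10.34, 21.01, 27.18, 22.67],
    'sex': ['Female', 'Male', 'Male', 'Female', 'Male'],
    'day': ['Sun', 'Sun', 'Sun', 'Sat', 'Sat'],
    'time': ['Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner'],
})

# Outer ring: one total per (day, time, sex) combination.
leaf_totals = restaurant_demo.groupby(['day', 'time', 'sex'])['total_bill'].sum()

# Innermost ring: one total per day, equal to the sum of its leaves.
day_totals = restaurant_demo.groupby('day')['total_bill'].sum()
```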