Interactive Data Visualization with Plotly Express
Task 1. Plotting Interactive Scatterplot
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import plotly.express as px
salary = pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-employee_salaries.csv')
salary
| | Years_of_Experience | Salary |
---|---|---|
0 | 1.000000 | 40000.00000 |
1 | 2.257942 | 65979.42119 |
2 | 2.450875 | 67253.57549 |
3 | 2.498713 | 67342.43510 |
4 | 2.613729 | 70532.20448 |
... | ... | ... |
1995 | 19.178575 | 421534.69100 |
1996 | 19.254499 | 430478.02650 |
1997 | 19.353369 | 438090.84540 |
1998 | 19.842520 | 482242.16080 |
1999 | 20.000000 | 500000.00000 |
2000 rows × 2 columns
Plot the Scatterplot for Years of Experience vs. Salary
fig=px.scatter(salary, x='Years_of_Experience', y='Salary', height=600, width=1000)
fig.show()
admission=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-university_admission.csv')
admission
| | Serial No. | GRE Score | TOEFL Score | University Rating | SOP | LOR | CGPA | Research | Chance of Admit |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 337 | 118 | 4 | 4.5 | 4.5 | 9.65 | 1 | 0.92 |
1 | 2 | 324 | 107 | 4 | 4.0 | 4.5 | 8.87 | 1 | 0.76 |
2 | 3 | 316 | 104 | 3 | 3.0 | 3.5 | 8.00 | 1 | 0.72 |
3 | 4 | 322 | 110 | 3 | 3.5 | 2.5 | 8.67 | 1 | 0.80 |
4 | 5 | 314 | 103 | 2 | 2.0 | 3.0 | 8.21 | 0 | 0.65 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
495 | 496 | 332 | 108 | 5 | 4.5 | 4.0 | 9.02 | 1 | 0.87 |
496 | 497 | 337 | 117 | 5 | 5.0 | 5.0 | 9.87 | 1 | 0.96 |
497 | 498 | 330 | 120 | 5 | 4.5 | 5.0 | 9.56 | 1 | 0.93 |
498 | 499 | 312 | 103 | 4 | 4.0 | 5.0 | 8.43 | 0 | 0.73 |
499 | 500 | 327 | 113 | 4 | 4.5 | 4.5 | 9.04 | 0 | 0.84 |
500 rows × 9 columns
Plot the Scatterplot for GRE Score vs. Chance of Admit. Show the University Rating as a third dimension through marker color
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', height=600, width=1000)
fig.show()
Task 2. Plotting Interactive Bubble Charts
Add SOP as a fourth dimension, encoded as bubble size
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', size='SOP',
               height=600, width=1000)
fig.show()
Add LOR as hover data
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', size='SOP',
               hover_data=['LOR'], height=600, width=1000)
fig.show()
Square SOP to make the differences in bubble size more prominent
# Overwrites SOP in place; re-running this cell squares the values again
admission['SOP']=admission['SOP']**2
fig=px.scatter(admission, x='GRE Score', y='Chance of Admit', color='University Rating', size='SOP',
               hover_data=['LOR'], height=600, width=1000)
fig.show()
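Squaring SOP in place changes the column for every later cell. A possible alternative, sketched here on five rows copied from the admission table, is to keep the original scores and derive a separate sizing column. `SOP_squared` is a hypothetical name, not a column in the CSV.

```python
import pandas as pd

# Five rows copied from the admission table above.
sample = pd.DataFrame({
    "GRE Score": [337, 324, 316, 322, 314],
    "SOP": [4.5, 4.0, 3.0, 3.5, 2.0],
    "LOR": [4.5, 4.5, 3.5, 2.5, 3.0],
})

# assign() returns a new frame; 'SOP' in the original stays as loaded.
sized = sample.assign(SOP_squared=sample["SOP"] ** 2)

print(sized[["SOP", "SOP_squared"]])
```

Passing `size='SOP_squared'` together with `hover_data=['LOR', 'SOP']` to `px.scatter` would then size the bubbles by the squared value while the hover box still reports the original score.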
Task 3. Plotting Interactive Single Line Plot
crypto=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-crypto_prices.csv')
crypto
| | Date | BTC-USD Price | ETH-USD Price | LTC-USD Price |
---|---|---|---|---|
0 | 9/17/2014 | 457.334015 | NaN | 5.058550 |
1 | 9/18/2014 | 424.440002 | NaN | 4.685230 |
2 | 9/19/2014 | 394.795990 | NaN | 4.327770 |
3 | 9/20/2014 | 408.903992 | NaN | 4.286440 |
4 | 9/21/2014 | 398.821014 | NaN | 4.245920 |
... | ... | ... | ... | ... |
2380 | 3/28/2021 | 55950.746090 | 1691.355957 | 185.028488 |
2381 | 3/29/2021 | 57750.199220 | 1819.684937 | 194.474777 |
2382 | 3/30/2021 | 58917.691410 | 1846.033691 | 196.682098 |
2383 | 3/31/2021 | 58918.832030 | 1918.362061 | 197.499100 |
2384 | 4/1/2021 | 59095.808590 | 1977.276855 | 204.112518 |
2385 rows × 4 columns
Plotting BTC data
fig=px.line(crypto, x='Date', y='BTC-USD Price', height=600, width=1000)
fig.show()
Plotting ETH data
fig=px.line(crypto, x='Date', y='ETH-USD Price', height=600, width=1000)
fig.show()
Plotting LTC data
fig=px.line(crypto, x='Date', y='LTC-USD Price', height=600, width=1000)
fig.show()
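Without `parse_dates`, `pd.read_csv` leaves the Date column as plain strings such as `9/17/2014`, and a plotting library may then treat the x values as labels rather than points in time. A minimal sketch of converting the column first, using three rows copied from the table above and assuming the values are month/day/year:

```python
import pandas as pd

# Three rows copied from the crypto table above.
sample = pd.DataFrame({
    "Date": ["9/17/2014", "9/18/2014", "4/1/2021"],
    "BTC-USD Price": [457.334015, 424.440002, 59095.808590],
})

# Month/day/year is an assumption read off the values shown (e.g. 9/17/2014).
sample["Date"] = pd.to_datetime(sample["Date"], format="%m/%d/%Y")

print(sample.dtypes)
print(sample["Date"].min(), sample["Date"].max())
```

With a datetime column in place, the same `px.line(..., x='Date', ...)` calls can be reused unchanged.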
Task 4. Plotting Interactive Multiple Line Plots
Plotting BTC, ETH and LTC data
crypto.columns
Index(['Date', 'BTC-USD Price', 'ETH-USD Price', 'LTC-USD Price'], dtype='object')
fig=px.line(height=600, width=980)
# Add one trace per price column (every column after 'Date')
for i in crypto.columns[1:]:
    fig.add_scatter(x=crypto['Date'], y=crypto[i], name=i)
fig.show()
Retrieving ETH and LTC prices where BTC reached its peak
crypto[crypto['BTC-USD Price']==crypto['BTC-USD Price'].max()]
| | Date | BTC-USD Price | ETH-USD Price | LTC-USD Price |
---|---|---|---|---|
2365 | 3/13/2021 | 61243.08594 | 1924.685425 | 226.578293 |
crypto[crypto['Date'] == '3/13/2021']
| | Date | BTC-USD Price | ETH-USD Price | LTC-USD Price |
---|---|---|---|---|
2365 | 3/13/2021 | 61243.08594 | 1924.685425 | 226.578293 |
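The peak row can also be looked up by label with `idxmax`, which avoids hard-coding the date. A small sketch on three rows copied from the tables above, keeping the original index labels:

```python
import pandas as pd

# Rows 2365, 2380 and 2384 copied from the crypto tables above.
sample = pd.DataFrame(
    {
        "Date": ["3/13/2021", "3/28/2021", "4/1/2021"],
        "BTC-USD Price": [61243.08594, 55950.746090, 59095.808590],
        "ETH-USD Price": [1924.685425, 1691.355957, 1977.276855],
        "LTC-USD Price": [226.578293, 185.028488, 204.112518],
    },
    index=[2365, 2380, 2384],
)

# idxmax returns the index label of the first maximum.
peak_label = sample["BTC-USD Price"].idxmax()
peak_row = sample.loc[peak_label]

print(peak_row[["Date", "ETH-USD Price", "LTC-USD Price"]])
```

Note that `idxmax` returns only the first occurrence of the maximum, whereas the boolean mask used above returns every row that ties for it.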
Task 5. Plotting Interactive Pie Charts
Create a dataset
my_dict={'allocation %':[20, 20, 20, 20, 20]}
my_dict
{'allocation %': [20, 20, 20, 20, 20]}
crypto=pd.DataFrame(data=my_dict, index=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'])
crypto
| | allocation % |
---|---|
BTC | 20 |
ETH | 20 |
LTC | 20 |
XRP | 20 |
ADA | 20 |
Plot the Pie Chart
fig=px.pie(crypto, values='allocation %', names=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'], title='Crypto allocation',
           height=600, width=980)
fig.show()
Assume that you became bullish on XRP and decided to allocate 60% of your assets to it. You also decided to divide the rest of your assets equally among the other coins (BTC, LTC, ADA and ETH). Change the allocations and plot the pie chart.
my_dict={'allocation %':[10, 10, 10, 60, 10]}
crypto=pd.DataFrame(data=my_dict, index=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'])
crypto
| | allocation % |
---|---|
BTC | 10 |
ETH | 10 |
LTC | 10 |
XRP | 60 |
ADA | 10 |
fig=px.pie(crypto, values='allocation %', names=['BTC', 'ETH', 'LTC', 'XRP', 'ADA'], title='Crypto allocation',
           height=600, width=980, hole=0.55)
fig.show()
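The 10/10/10/60/10 split follows from the exercise: 100 - 60 = 40 is left over, and 40 / 4 = 10 for each remaining coin. A short sketch that computes the split instead of typing it, where `favourite` and `favourite_share` are hypothetical names chosen for illustration:

```python
import pandas as pd

coins = ["BTC", "ETH", "LTC", "XRP", "ADA"]
favourite, favourite_share = "XRP", 60  # the coin to overweight and its share in percent

# Whatever is left is divided equally among the other coins.
others = [coin for coin in coins if coin != favourite]
equal_share = (100 - favourite_share) / len(others)

allocation = pd.DataFrame(
    {"allocation %": [favourite_share if coin == favourite else equal_share for coin in coins]},
    index=coins,
)
print(allocation)
```

Changing `favourite_share` then updates every slice consistently, and the column always sums to 100.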
Task 6. Plotting Interactive Bar Charts
data=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-gapminder.csv')
data
| | Unnamed: 0 | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
---|---|---|---|---|---|---|---|---|---|
0 | 0 | Afghanistan | Asia | 1952 | 28.801 | 8425333 | 779.445314 | AFG | 4 |
1 | 1 | Afghanistan | Asia | 1957 | 30.332 | 9240934 | 820.853030 | AFG | 4 |
2 | 2 | Afghanistan | Asia | 1962 | 31.997 | 10267083 | 853.100710 | AFG | 4 |
3 | 3 | Afghanistan | Asia | 1967 | 34.020 | 11537966 | 836.197138 | AFG | 4 |
4 | 4 | Afghanistan | Asia | 1972 | 36.088 | 13079460 | 739.981106 | AFG | 4 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1699 | 1699 | Zimbabwe | Africa | 1987 | 62.351 | 9216418 | 706.157306 | ZWE | 716 |
1700 | 1700 | Zimbabwe | Africa | 1992 | 60.377 | 10704340 | 693.420786 | ZWE | 716 |
1701 | 1701 | Zimbabwe | Africa | 1997 | 46.809 | 11404948 | 792.449960 | ZWE | 716 |
1702 | 1702 | Zimbabwe | Africa | 2002 | 39.989 | 11926563 | 672.038623 | ZWE | 716 |
1703 | 1703 | Zimbabwe | Africa | 2007 | 43.487 | 12311143 | 469.709298 | ZWE | 716 |
1704 rows × 9 columns
Retrieving data for Morocco
search_value = 'Morocco'
is_present = search_value in data['country'].values
result = 'Yes' if is_present else 'No'
print(result)
Yes
morocco=data[data.country=='Morocco']
morocco
| | Unnamed: 0 | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
---|---|---|---|---|---|---|---|---|---|
1020 | 1020 | Morocco | Africa | 1952 | 42.873 | 9939217 | 1688.203570 | MAR | 504 |
1021 | 1021 | Morocco | Africa | 1957 | 45.423 | 11406350 | 1642.002314 | MAR | 504 |
1022 | 1022 | Morocco | Africa | 1962 | 47.924 | 13056604 | 1566.353493 | MAR | 504 |
1023 | 1023 | Morocco | Africa | 1967 | 50.335 | 14770296 | 1711.044770 | MAR | 504 |
1024 | 1024 | Morocco | Africa | 1972 | 52.862 | 16660670 | 1930.194975 | MAR | 504 |
1025 | 1025 | Morocco | Africa | 1977 | 55.730 | 18396941 | 2370.619976 | MAR | 504 |
1026 | 1026 | Morocco | Africa | 1982 | 59.650 | 20198730 | 2702.620356 | MAR | 504 |
1027 | 1027 | Morocco | Africa | 1987 | 62.677 | 22987397 | 2755.046991 | MAR | 504 |
1028 | 1028 | Morocco | Africa | 1992 | 65.393 | 25798239 | 2948.047252 | MAR | 504 |
1029 | 1029 | Morocco | Africa | 1997 | 67.660 | 28529501 | 2982.101858 | MAR | 504 |
1030 | 1030 | Morocco | Africa | 2002 | 69.615 | 31167783 | 3258.495584 | MAR | 504 |
1031 | 1031 | Morocco | Africa | 2007 | 71.164 | 33757175 | 3820.175230 | MAR | 504 |
Plotting the Bar Chart
fig=px.bar(morocco, x='year', y='pop', labels={'pop':'Population of Morocco'},
           height=600, width=980, color_discrete_sequence=['#c1272d'])
fig.show()
Adding life expectancy as a third dimension (bar color), and GDP per capita on hover
fig=px.bar(morocco, x='year', y='pop', color='lifeExp', hover_data=['gdpPercap'], labels={'pop':'Population of Morocco'},
           height=600, width=980, color_continuous_scale='reds')
fig.show()
Task 7. Plotting Interactive Sunburst
restaurant=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Interactive Data Viz with Plotly Express/00-restaurant.csv')
restaurant
| | Unnamed: 0 | total_bill | tip | sex | smoker | day | time | size |
---|---|---|---|---|---|---|---|---|
0 | 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
1 | 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
2 | 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
3 | 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
4 | 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
239 | 239 | 29.03 | 5.92 | Male | No | Sat | Dinner | 3 |
240 | 240 | 27.18 | 2.00 | Female | Yes | Sat | Dinner | 2 |
241 | 241 | 22.67 | 2.00 | Male | Yes | Sat | Dinner | 2 |
242 | 242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |
243 | 243 | 18.78 | 3.00 | Female | No | Thur | Dinner | 2 |
244 rows × 8 columns
fig=px.sunburst(restaurant, path=['day', 'time', 'sex'], values='total_bill', height=600, width=980)
fig.show()
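To read the sunburst, it helps to know what each sector stands for: as I understand it, the size of a sector is the sum of `values` over all rows that share its path. The sketch below reproduces those totals with a plain pandas `groupby` on the ten rows shown in the table above, so the numbers cover only that sample and not the full file.

```python
import pandas as pd

# Ten rows copied from the restaurant table above (the first five and last five shown).
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 29.03, 27.18, 22.67, 17.82, 18.78],
    "sex": ["Female", "Male", "Male", "Male", "Female", "Male", "Female", "Male", "Male", "Female"],
    "day": ["Sun"] * 5 + ["Sat"] * 4 + ["Thur"],
    "time": ["Dinner"] * 10,
})

# Outermost ring: one total per (day, time, sex) combination.
bill_by_path = sample.groupby(["day", "time", "sex"])["total_bill"].sum()

# Innermost ring: the same bills rolled up to the first path level.
bill_by_day = sample.groupby("day")["total_bill"].sum()

print(bill_by_path)
print(bill_by_day)
```

Comparing these totals with the hover values in the chart is a quick way to check that the `path` order matches the intended hierarchy.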