Mastering Data Analysis with Pandas #1
Task 1. Define a Pandas Series
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
df=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Mastering Data Analysis with Pandas Part 1/00-crypto.csv')
df
| | BTC-USD Price |
|---|---|
| 0 | 457.334015 |
| 1 | 424.440002 |
| 2 | 394.795990 |
| 3 | 408.903992 |
| 4 | 398.821014 |
| ... | ... |
| 2380 | 55950.746090 |
| 2381 | 57750.199220 |
| 2382 | 58917.691410 |
| 2383 | 58918.832030 |
| 2384 | 59095.808590 |
2385 rows × 1 columns
Define a Python List that contains 5 cryptocurrencies
crypto_list = ['BTC','XRP','LTC','ADA','ETH']
crypto_list
['BTC', 'XRP', 'LTC', 'ADA', 'ETH']
Confirm the Datatype
type(crypto_list)
list
Create a one dimensional Pandas Series from a Python List
crypto_series = pd.Series(data=crypto_list)
crypto_series
0    BTC
1    XRP
2    LTC
3    ADA
4    ETH
dtype: object
Confirm the Datatype
type(crypto_series)
pandas.core.series.Series
Create a one dimensional Pandas Series from Numeric Values
crypto_prices_series=pd.Series(data=[2000,500,2000,20,50])
crypto_prices_series
0    2000
1     500
2    2000
3      20
4      50
dtype: int64
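The dtype shown in the outputs above is inferred from the data. A minimal sketch of that inference, using hypothetical sample values rather than the CSV data:

```python
import pandas as pd

# Hypothetical sample values, chosen only to illustrate dtype inference
int_prices = pd.Series(data=[2000, 500, 20])       # all integers
mixed_prices = pd.Series(data=[2000, 500.5, 20])   # one float upcasts the whole Series

print(int_prices.dtype)    # an integer dtype
print(mixed_prices.dtype)  # float64
```

A Series holds a single dtype, so mixing integers and floats promotes every element to float.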
Task 2. Define a Pandas Series with Custom Index
Define a Python List that contains 5 cryptocurrencies
crypto_list = ['BTC','XRP','LTC','ADA','ETH']
crypto_list
['BTC', 'XRP', 'LTC', 'ADA', 'ETH']
Define a Python List to be used as Index
crypto_labels=['crypto#1', 'crypto#2', 'crypto#3', 'crypto#4', 'crypto#5']
crypto_labels
['crypto#1', 'crypto#2', 'crypto#3', 'crypto#4', 'crypto#5']
Create a one dimensional Pandas Series with index
crypto_series=pd.Series(data=crypto_list,index=crypto_labels)
crypto_series
crypto#1    BTC
crypto#2    XRP
crypto#3    LTC
crypto#4    ADA
crypto#5    ETH
dtype: object
Confirm the Datatype
type(crypto_series)
pandas.core.series.Series
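A custom index is mainly useful for lookups by label. The sketch below rebuilds the same Series and reads from it by label and by position; the variable names by_label, by_position and label_slice are illustrative only.

```python
import pandas as pd

crypto_series = pd.Series(
    data=["BTC", "XRP", "LTC", "ADA", "ETH"],
    index=["crypto#1", "crypto#2", "crypto#3", "crypto#4", "crypto#5"],
)

by_label = crypto_series.loc["crypto#3"]    # lookup by index label
by_position = crypto_series.iloc[2]         # lookup by integer position
# Label-based slices include both endpoints, unlike positional slices
label_slice = crypto_series.loc["crypto#2":"crypto#4"]

print(by_label, by_position)
print(label_slice)
```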
Task 3. Define a Pandas Series from a Dictionary
Define a Dictionary using key-value pairs
employee_dict = {'Employee ID':1,
'Employee Name':'Steve',
'Salary [$]':2000,
'Years with Company':10}
employee_dict
{'Employee ID': 1, 'Employee Name': 'Steve', 'Salary [$]': 2000, 'Years with Company': 10}
Confirm the Datatype
type(employee_dict)
dict
Define a Pandas Series using a Dictionary
employee_series=pd.Series(employee_dict)
employee_series
Employee ID               1
Employee Name         Steve
Salary [$]             2000
Years with Company       10
dtype: object
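When a Series is built from a dictionary, the keys become the index labels. As a small sketch, passing an index argument as well selects and orders the keys, and a label with no matching key is filled with NaN. The 'Department' label is hypothetical and deliberately absent from the dictionary.

```python
import pandas as pd

employee = {
    "Employee ID": 1,
    "Employee Name": "Steve",
    "Salary [$]": 2000,
    "Years with Company": 10,
}

employee_series = pd.Series(employee)
salary = employee_series["Salary [$]"]   # dictionary keys act as index labels

# 'Department' is a hypothetical label with no matching key, so it becomes NaN
subset = pd.Series(employee, index=["Employee Name", "Department"])
print(subset)
```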
Task 4. Pandas Attributes
Define a Pandas Series
crypto_list = ['BTC','XRP','LTC','ADA','ETH']
crypto_series=pd.Series(data=crypto_list)
crypto_series
0    BTC
1    XRP
2    LTC
3    ADA
4    ETH
dtype: object
Return the Values of the Series
crypto_series.values
array(['BTC', 'XRP', 'LTC', 'ADA', 'ETH'], dtype=object)
Return the Index of the Series
crypto_series.index
RangeIndex(start=0, stop=5, step=1)
Return the DataType of the Series
crypto_series.dtype
dtype('O')
Check if all elements are unique or not
crypto_series.is_unique
True
Check the Shape of the Series
crypto_series.shape
(5,)
Check the Size of the Series
crypto_series.size
5
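The attributes above all report True or simple counts because the Series has no repeated values. For contrast, here is a short sketch using the numeric Series from Task 1, where 2000 appears twice:

```python
import pandas as pd

crypto_prices_series = pd.Series(data=[2000, 500, 2000, 20, 50])

unique_flag = crypto_prices_series.is_unique    # False, because 2000 is repeated
distinct_count = crypto_prices_series.nunique() # number of distinct values
shape = crypto_prices_series.shape              # one-element tuple for a Series
size = crypto_prices_series.size                # total number of elements

print(unique_flag, distinct_count, shape, size)
```

Note that attributes are read without parentheses, while nunique is a method and needs them.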
Task 5. Pandas Methods
Define another Pandas Series that contains numeric values
crypto_prices=pd.Series(data=[400,500,1500,20,70])
crypto_prices
0     400
1     500
2    1500
3      20
4      70
dtype: int64
Obtain the sum of all elements
crypto_prices.sum()
2490
Obtain the product of all elements
crypto_prices.product()
420000000000
Obtain the mean of all elements
crypto_prices.mean()
498.0
Show the first 2 rows of the Series
crypto_prices.head(2)
0    400
1    500
dtype: int64
Create a new Series using Head
new_crypto_prices=crypto_prices.head(2)
new_crypto_prices
0    400
1    500
dtype: int64
Check Memory Usage of the Series
crypto_prices.memory_usage()
172
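The figure reported by memory_usage() includes the index by default. A minimal sketch that separates the two contributions (exact byte counts can vary across pandas versions, so none are asserted here):

```python
import pandas as pd

crypto_prices = pd.Series(data=[400, 500, 1500, 20, 70])

with_index = crypto_prices.memory_usage()              # values plus index
values_only = crypto_prices.memory_usage(index=False)  # values only

# For a numeric Series, the values take size * bytes-per-element
expected_values = crypto_prices.size * crypto_prices.dtype.itemsize
print(with_index, values_only, expected_values)
```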
Task 6. Import One Dimensional CSV Data
Use Squeeze to convert a single-column Dataframe to a one dimensional Pandas Series
BTC_price_series=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Mastering Data Analysis with Pandas Part 1/00-crypto.csv').squeeze()
BTC_price_series
0         457.334015
1         424.440002
2         394.795990
3         408.903992
4         398.821014
            ...
2380    55950.746090
2381    57750.199220
2382    58917.691410
2383    58918.832030
2384    59095.808590
Name: BTC-USD Price, Length: 2385, dtype: float64
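To try squeeze without the CSV file, the sketch below reads a few rows from an in-memory string instead. The csv_text content is a three-row stand-in for the real file. It also shows one edge case: on a frame with a single row and a single column, squeeze() with no argument returns a scalar, while squeeze("columns") still returns a Series.

```python
import io
import pandas as pd

# Stand-in for the CSV file: one header line and three prices
csv_text = "BTC-USD Price\n457.334015\n424.440002\n394.795990\n"

frame = pd.read_csv(io.StringIO(csv_text))   # DataFrame with one column
series = frame.squeeze("columns")            # collapse the column axis only

one_cell = frame.head(1)                     # 1 row x 1 column
as_scalar = one_cell.squeeze()               # both axes collapse to a scalar
as_series = one_cell.squeeze("columns")      # still a Series of length 1

print(type(frame).__name__, type(series).__name__)
```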
Task 7. Python built-in Functions on a Pandas Series
Obtain the Datatype of the Pandas Series
type(BTC_price_series)
pandas.core.series.Series
Obtain the Length of the Pandas Series
len(BTC_price_series)
2385
Obtain the Max of the Pandas Series
max(BTC_price_series)
61243.08594
Obtain the Min of the Pandas Series
min(BTC_price_series)
178.1029968
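The built-in functions and the Series methods agree on this dataset, but they can differ once values are missing. A small sketch with hypothetical prices containing one NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical prices with one missing entry (not taken from the CSV)
prices = pd.Series([457.33, np.nan, 394.80, 408.90])

total_rows = len(prices)       # built-in len counts every row, NaN included
non_missing = prices.count()   # Series.count skips NaN
highest = prices.max()         # Series.max skips NaN by default

print(total_rows, non_missing, highest)
```

When a Series may contain missing values, the Series methods are usually the safer choice because their NaN handling is explicit.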
Task 8. Sorting Pandas Series
Sort Values Ascending
BTC_price_series.sort_values()
119       178.102997
122       199.259995
121       208.097000
120       209.843994
123       210.339004
            ...
2382    58917.691410
2383    58918.832030
2384    59095.808590
2366    59302.316410
2365    61243.085940
Name: BTC-USD Price, Length: 2385, dtype: float64
Sort Values Descending
BTC_price_series.sort_values(ascending=False)
2365    61243.085940
2366    59302.316410
2384    59095.808590
2383    58918.832030
2382    58917.691410
            ...
123       210.339004
120       209.843994
121       208.097000
122       199.259995
119       178.102997
Name: BTC-USD Price, Length: 2385, dtype: float64
Sort Values inplace
BTC_price_series_1=BTC_price_series.copy()
BTC_price_series_1.sort_values(inplace=True)
BTC_price_series_1
119       178.102997
122       199.259995
121       208.097000
120       209.843994
123       210.339004
            ...
2382    58917.691410
2383    58918.832030
2384    59095.808590
2366    59302.316410
2365    61243.085940
Name: BTC-USD Price, Length: 2385, dtype: float64
Sort Index inplace
BTC_price_series.sort_index(inplace=True)
BTC_price_series
0         457.334015
1         424.440002
2         394.795990
3         408.903992
4         398.821014
            ...
2380    55950.746090
2381    57750.199220
2382    58917.691410
2383    58918.832030
2384    59095.808590
Name: BTC-USD Price, Length: 2385, dtype: float64
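Without inplace=True, sort_values returns a new Series and leaves the original untouched. Each value keeps its original index label, which is why sort_index can restore the starting order. A minimal sketch on the small price Series from Task 5:

```python
import pandas as pd

prices = pd.Series(data=[400, 500, 1500, 20, 70])

ascending = prices.sort_values()                     # new Series, labels travel with values
restored = ascending.sort_index()                    # back to the original order
renumbered = prices.sort_values(ignore_index=True)   # sorted, with a fresh 0..n-1 index

print(ascending)
print(restored.equals(prices))
```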
Task 9. Perform Math Operations on Pandas Series
Apply the Sum Method to the Pandas Series
BTC_price_series.sum()
15435379.738852698
Apply the Count Method to the Pandas Series
BTC_price_series.count()
2385
Apply the Max Method to the Pandas Series
BTC_price_series.max()
61243.08594
Apply the Min Method to the Pandas Series
BTC_price_series.min()
178.1029968
Apply the Mean Method to the Pandas Series
BTC_price_series.mean()
6471.857332852284
Apply the Describe Method to the Pandas Series
BTC_price_series.describe()
count     2385.000000
mean      6471.857333
std       9289.022505
min        178.102997
25%        454.618988
50%       4076.632568
75%       8864.766602
max      61243.085940
Name: BTC-USD Price, dtype: float64
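The result of describe() is itself a Series indexed by statistic name, so individual statistics can be read by label. A short sketch on the small price Series from Task 5:

```python
import pandas as pd

crypto_prices = pd.Series(data=[400, 500, 1500, 20, 70])

stats = crypto_prices.describe()   # Series labelled count, mean, std, min, 25%, 50%, 75%, max

mean_value = stats["mean"]         # same as crypto_prices.mean()
median_value = stats["50%"]        # the 50th percentile is the median
print(mean_value, median_value)
```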
Task 10. Check if a given Element exists in a Pandas Series
Check if a given number exists in a Pandas Series (the in operator tests the index labels, not the values)
1295 in BTC_price_series
True
Check if a given number exists in a Pandas Series Values
1295 in BTC_price_series.values
False
Check if a given number exists in a Pandas Series Index
1295 in BTC_price_series.index
True
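The three results above differ because the in operator, applied directly to a Series, tests the index labels rather than the values. A minimal sketch with three hypothetical prices makes the distinction visible:

```python
import pandas as pd

# Hypothetical prices with the default 0, 1, 2 index
prices = pd.Series([457.33, 424.44, 394.80])

label_hit = 1 in prices                          # True: 1 is an index label
value_as_label = 457.33 in prices                # False: 457.33 is a value, not a label
value_hit = 457.33 in prices.values              # True: searches the underlying array
isin_hit = bool(prices.isin([457.33]).any())     # True: vectorised value check

print(label_hit, value_as_label, value_hit, isin_hit)
```

For value checks, prices.values or isin states the intent explicitly and avoids the index lookup.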