Mastering Data Analysis with Pandas #1

Task 1. Define a Pandas Series

In [1]:
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
In [2]:
df=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Mastering Data Analysis with Pandas Part 1/00-crypto.csv')
In [3]:
df
Out[3]:
BTC-USD Price
0 457.334015
1 424.440002
2 394.795990
3 408.903992
4 398.821014
... ...
2380 55950.746090
2381 57750.199220
2382 58917.691410
2383 58918.832030
2384 59095.808590

2385 rows × 1 columns

Define a Python List that contains 5 cryptocurrencies

In [4]:
crypto_list = ['BTC','XRP','LTC','ADA','ETH']
crypto_list
Out[4]:
['BTC', 'XRP', 'LTC', 'ADA', 'ETH']

Confirm the Datatype

In [5]:
type(crypto_list)
Out[5]:
list

Create a one dimensional Pandas Series from a Python List

In [6]:
crypto_series = pd.Series(data=crypto_list)
crypto_series
Out[6]:
0    BTC
1    XRP
2    LTC
3    ADA
4    ETH
dtype: object

Confirm the Datatype

In [7]:
type(crypto_series)
Out[7]:
pandas.core.series.Series

Create a one dimensional Pandas Series from Numeric Values

In [8]:
crypto_prices_series=pd.Series(data=[2000,500,2000,20,50])
crypto_prices_series
Out[8]:
0    2000
1     500
2    2000
3      20
4      50
dtype: int64

Task 2. Define a Pandas Series with Custom Index

Define a Python List that contains 5 cryptocurrencies

In [9]:
crypto_list = ['BTC','XRP','LTC','ADA','ETH']
crypto_list
Out[9]:
['BTC', 'XRP', 'LTC', 'ADA', 'ETH']

Define a Python List to be used as Index

In [10]:
crypto_labels=['crypto#1', 'crypto#2', 'crypto#3', 'crypto#4', 'crypto#5']
crypto_labels
Out[10]:
['crypto#1', 'crypto#2', 'crypto#3', 'crypto#4', 'crypto#5']

Create a one dimensional Pandas Series with index

In [11]:
crypto_series=pd.Series(data=crypto_list,index=crypto_labels)
crypto_series
Out[11]:
crypto#1    BTC
crypto#2    XRP
crypto#3    LTC
crypto#4    ADA
crypto#5    ETH
dtype: object

Confirm the Datatype

In [12]:
type(crypto_series)
Out[12]:
pandas.core.series.Series
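
With a custom index in place, elements can be looked up either by label or by position. The sketch below rebuilds the same Series and uses `.loc` and `.iloc`, which are not part of the original notebook; the variable names `by_label` and `by_position` are made up for illustration.

```python
import pandas as pd

crypto_series = pd.Series(
    data=['BTC', 'XRP', 'LTC', 'ADA', 'ETH'],
    index=['crypto#1', 'crypto#2', 'crypto#3', 'crypto#4', 'crypto#5'],
)

# Label-based lookup uses the custom index.
by_label = crypto_series.loc['crypto#3']

# Position-based lookup ignores the labels and counts from zero.
by_position = crypto_series.iloc[2]

print(by_label, by_position)  # LTC LTC
```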

Task 3. Define a Pandas Series from a Dictionary

Define a Dictionary using key-value pairs

In [13]:
# Use a descriptive name; calling the variable "dict" would shadow Python's built-in dict type
employee_dict = {'Employee ID':1,
                 'Employee Name':'Steve',
                 'Salary [$]':2000,
                 'Years with Company':10}
employee_dict
Out[13]:
{'Employee ID': 1,
 'Employee Name': 'Steve',
 'Salary [$]': 2000,
 'Years with Company': 10}

Confirm the Datatype

In [14]:
type(employee_dict)
Out[14]:
dict

Define a Pandas Series using a Dictionary

In [15]:
employee_series=pd.Series(employee_dict)
employee_series
Out[15]:
Employee ID               1
Employee Name         Steve
Salary [$]             2000
Years with Company       10
dtype: object
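
The dictionary keys become the index, and the dtype above is `object` because the values mix integers and a string. As a small illustration (the names `mixed` and `numeric_only` are hypothetical), a dictionary holding only integers should produce an integer dtype instead:

```python
import pandas as pd

mixed = pd.Series({'Employee ID': 1, 'Employee Name': 'Steve', 'Salary [$]': 2000})
numeric_only = pd.Series({'Employee ID': 1, 'Salary [$]': 2000})

print(list(mixed.index))   # the dictionary keys, in insertion order
print(mixed.dtype)         # object, because the values are of mixed types
print(numeric_only.dtype)  # an integer dtype
```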

Task 4. Pandas Attributes

Define a Pandas Series

In [16]:
crypto_list = ['BTC','XRP','LTC','ADA','ETH']
crypto_series=pd.Series(data=crypto_list)
crypto_series
Out[16]:
0    BTC
1    XRP
2    LTC
3    ADA
4    ETH
dtype: object

Return the Values of the Series

In [17]:
crypto_series.values
Out[17]:
array(['BTC', 'XRP', 'LTC', 'ADA', 'ETH'], dtype=object)

Return the Index of the Series

In [18]:
crypto_series.index
Out[18]:
RangeIndex(start=0, stop=5, step=1)

Return the DataType of the Series

In [19]:
crypto_series.dtype
Out[19]:
dtype('O')

Check if all elements are unique or not

In [20]:
crypto_series.is_unique
Out[20]:
True

Check the Shape of the Series

In [21]:
crypto_series.shape
Out[21]:
(5,)

Check the Size of the Series

In [22]:
crypto_series.size
Out[22]:
5
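
For contrast, here is a sketch of the same attributes on a Series that does contain a repeated value. It reuses the numeric values from Task 1; `nunique()` is a method that the notebook does not otherwise use.

```python
import pandas as pd

prices = pd.Series(data=[2000, 500, 2000, 20, 50])

print(prices.is_unique)  # False: 2000 appears twice
print(prices.nunique())  # 4 distinct values
print(prices.shape)      # (5,)
print(prices.size)       # 5
```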

Task 5. Pandas Methods

Define another Pandas Series that contains numeric values

In [23]:
crypto_prices=pd.Series(data=[400,500,1500,20,70])
crypto_prices
Out[23]:
0     400
1     500
2    1500
3      20
4      70
dtype: int64

Obtain the sum of all elements

In [24]:
crypto_prices.sum()
Out[24]:
2490

Obtain the product of all elements

In [25]:
crypto_prices.product()
Out[25]:
420000000000

Obtain the mean of all elements

In [26]:
crypto_prices.mean()
Out[26]:
498.0

Show the first 2 rows of the Series

In [27]:
crypto_prices.head(2)
Out[27]:
0    400
1    500
dtype: int64

Create a new Series using Head

In [28]:
new_crypto_prices=crypto_prices.head(2)
new_crypto_prices
Out[28]:
0    400
1    500
dtype: int64

Check Memory Usage of the Series

In [29]:
crypto_prices.memory_usage()
Out[29]:
172
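
The number reported by `memory_usage()` covers both the values and the index. A minimal sketch of separating the two with the `index=False` argument follows; the exact index overhead may vary between pandas versions, so no figure is assumed for it here.

```python
import pandas as pd

crypto_prices = pd.Series(data=[400, 500, 1500, 20, 70])

total_bytes = crypto_prices.memory_usage()              # values plus index
values_bytes = crypto_prices.memory_usage(index=False)  # values only

print(values_bytes)                # bytes used by the five stored numbers
print(total_bytes - values_bytes)  # bytes attributed to the index
```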

Task 6. Import One Dimensional CSV Data

Use squeeze() to convert a single-column DataFrame into a one dimensional Pandas Series

In [30]:
BTC_price_series=pd.read_csv('/Users/mekki/Python_Projects_Datasets/Mastering Data Analysis with Pandas Part 1/00-crypto.csv').squeeze()
BTC_price_series
Out[30]:
0         457.334015
1         424.440002
2         394.795990
3         408.903992
4         398.821014
            ...     
2380    55950.746090
2381    57750.199220
2382    58917.691410
2383    58918.832030
2384    59095.808590
Name: BTC-USD Price, Length: 2385, dtype: float64
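
To try the conversion without the CSV file on disk, the sketch below reads a small made-up CSV from memory. The `csv_text` content is a hypothetical stand-in for the real file. Passing `"columns"` to `squeeze` is a cautious variant: a bare `squeeze()` on a DataFrame with one row and one column would return a scalar rather than a Series.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for a one-column CSV file.
csv_text = "BTC-USD Price\n457.334015\n424.440002\n394.795990\n"

df = pd.read_csv(io.StringIO(csv_text))
series = df.squeeze("columns")  # squeeze only the column axis

print(type(df).__name__)      # DataFrame
print(type(series).__name__)  # Series
print(series.name)            # BTC-USD Price
```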

Task 7. Python built-in Functions on a Pandas Series

Obtain the Datatype of the Pandas Series

In [31]:
type(BTC_price_series)
Out[31]:
pandas.core.series.Series

Obtain the Length of the Pandas Series

In [32]:
len(BTC_price_series)
Out[32]:
2385

Obtain the Max of the Pandas Series

In [33]:
max(BTC_price_series)
Out[33]:
61243.08594

Obtain the Min of the Pandas Series

In [34]:
min(BTC_price_series)
Out[34]:
178.1029968
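
The built-ins agree with the Series methods on this dataset, but they can differ once values are missing. A small sketch with made-up prices and one gap shows the difference:

```python
import pandas as pd

prices = pd.Series([457.3, float('nan'), 394.8])  # made-up values with one gap

print(len(prices))     # 3: counts every row
print(prices.count())  # 2: counts only the non-missing rows
print(prices.max())    # 457.3: the method skips NaN by default
```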

Task 8. Sorting Pandas Series

Sort Values Ascending

In [35]:
BTC_price_series.sort_values()
Out[35]:
119       178.102997
122       199.259995
121       208.097000
120       209.843994
123       210.339004
            ...     
2382    58917.691410
2383    58918.832030
2384    59095.808590
2366    59302.316410
2365    61243.085940
Name: BTC-USD Price, Length: 2385, dtype: float64

Sort Values Descending

In [36]:
BTC_price_series.sort_values(ascending=False)
Out[36]:
2365    61243.085940
2366    59302.316410
2384    59095.808590
2383    58918.832030
2382    58917.691410
            ...     
123       210.339004
120       209.843994
121       208.097000
122       199.259995
119       178.102997
Name: BTC-USD Price, Length: 2385, dtype: float64

Sort Values inplace

In [37]:
BTC_price_series_1=BTC_price_series.copy()
BTC_price_series_1.sort_values(inplace=True)
BTC_price_series_1
Out[37]:
119       178.102997
122       199.259995
121       208.097000
120       209.843994
123       210.339004
            ...     
2382    58917.691410
2383    58918.832030
2384    59095.808590
2366    59302.316410
2365    61243.085940
Name: BTC-USD Price, Length: 2385, dtype: float64

Sort Index inplace

In [38]:
BTC_price_series.sort_index(inplace=True)
BTC_price_series
Out[38]:
0         457.334015
1         424.440002
2         394.795990
3         408.903992
4         398.821014
            ...     
2380    55950.746090
2381    57750.199220
2382    58917.691410
2383    58918.832030
2384    59095.808590
Name: BTC-USD Price, Length: 2385, dtype: float64
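
As an alternative to `inplace=True`, the sorted result can be assigned to a new name, which leaves the original Series untouched. The values below are made up for illustration:

```python
import pandas as pd

prices = pd.Series([457.3, 178.1, 394.8])  # made-up values

sorted_prices = prices.sort_values()   # new Series; prices is unchanged
restored = sorted_prices.sort_index()  # back to the original row order

print(list(sorted_prices.index))  # [1, 2, 0]
print(list(prices.index))         # [0, 1, 2]
print(restored.equals(prices))    # True
```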

Task 9. Perform Math Operations on Pandas Series

Apply the Sum Method to the Pandas Series

In [39]:
BTC_price_series.sum()
Out[39]:
15435379.738852698

Apply the Count Method to the Pandas Series

In [40]:
BTC_price_series.count()
Out[40]:
2385

Apply the Max Method to the Pandas Series

In [41]:
BTC_price_series.max()
Out[41]:
61243.08594

Apply the Min Method to the Pandas Series

In [42]:
BTC_price_series.min()
Out[42]:
178.1029968

Apply the Mean Method to the Pandas Series

In [43]:
BTC_price_series.mean()
Out[43]:
6471.857332852284

Apply the Describe Method to the Pandas Series

In [44]:
BTC_price_series.describe()
Out[44]:
count     2385.000000
mean      6471.857333
std       9289.022505
min        178.102997
25%        454.618988
50%       4076.632568
75%       8864.766602
max      61243.085940
Name: BTC-USD Price, dtype: float64
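
The result of `describe()` is itself a Series indexed by statistic name, so single statistics can be read by label. A short sketch with made-up values (the name `summary` is hypothetical):

```python
import pandas as pd

prices = pd.Series([10.0, 20.0, 30.0, 40.0])  # made-up values
summary = prices.describe()

print(summary.loc['count'])  # 4.0
print(summary.loc['mean'])   # 25.0
print(summary.loc['50%'])    # 25.0, the median
```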

Task 10. Check if a given Element exists in a Pandas Series

Check if a given number exists in a Pandas Series (the in operator tests the index labels, not the values)

In [45]:
1295 in BTC_price_series
Out[45]:
True

Check if a given number exists in a Pandas Series Values

In [46]:
1295 in BTC_price_series.values
Out[46]:
False

Check if a given number exists in a Pandas Series Index

In [47]:
1295 in BTC_price_series.index
Out[47]:
True
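
The three checks above can be reproduced on a small made-up Series. The sketch also shows two options that the notebook does not use: `isin` for exact matches on values, and NumPy's `isclose` for a tolerant comparison of floats. The tolerance of 0.01 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

prices = pd.Series([457.334015, 424.440002, 394.795990])  # made-up sample

print(2 in prices)         # True: 2 is an index label
print(2 in prices.values)  # False: no price equals 2

# Exact match against the values.
has_exact = prices.isin([424.440002]).any()

# Tolerant match, useful when the stored float is not known exactly.
has_close = np.isclose(prices, 424.44, atol=0.01).any()

print(has_exact, has_close)  # True True
```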