Mekki MOURADI
Data Analyst
Rabat, Maroc
À Propos
Data Analyst Junior avec une solide expérience en marketing numérique et des compétences approfondies dans l’utilisation de Microsoft Excel, SQL, Tableau Software et Python.
Performing Data Aggregation using SQL
Liens
Compétences & Outils
Description du projet
Apprendre à manipuler les données via les Fonctions d’Aggregation :
- COUNT
- SELECT DISTINCT & GROUP BY
- HAVING
- SUM
- MIN & MAX
- AVERAGE
- ROUND
Objectifs de l'analyse
- Utiliser les différentes Fonctions d’Aggregation pour aider l’entreprise à mieux comprendre ses données.
- COUNT
- How many employees are in the company?
SELECT COUNT(DISTINCT emp_no)
FROM employees;
| number_employees |
|---|
| 11797 |
- Is there any employee without a first name
SELECT COUNT(emp_no)
FROM employees
WHERE first_name IS NULL;
| count |
|---|
| 0 |
- How many records are in the salaries table?
SELECT COUNT (emp_no)
FROM salaries;
| count |
|---|
| 111943 |
- How many annual contracts with a value higher than or equal to 100k$ have been registered in the salaries table?
SELECT COUNT(*)
FROM salaries
WHERE salary >=100000;
| count |
|---|
| 3876 |
- How many times have we payed salaries to each employee?
SELECT emp_no,
COUNT(emp_no) AS times_payed
FROM salaries
GROUP BY emp_no
ORDER BY emp_no
LIMIT 10;
| emp_no | times_payed |
|---|---|
| 10001 | 17 |
| 10002 | 6 |
| 10003 | 7 |
| 10004 | 16 |
| 10005 | 13 |
| 10006 | 12 |
| 10007 | 14 |
| 10008 | 3 |
| 10009 | 18 |
| 10010 | 6 |
- SELECT DISTINCT & GROUP BY
- How many different names can be found in the employees table?
SELECT COUNT(DISTINCT first_name)
FROM employees;
| count |
|---|
| 1275 |
- Select different names from the employees table
SELECT DISTINCT(first_name)
FROM employees
LIMIT 10;
| first_name |
|---|
| Harjit |
| Divine |
| Masaru |
| Conrado |
| Magy |
| Torsten |
| Mandell |
| Abdelghani |
| Steve |
| Gila |
- List all the names and how many of them are in the employees table
SELECT first_name,
COUNT(first_name) AS frequency
FROM employees
GROUP BY first_name
ORDER BY first_name
LIMIT 10;
| first_name | frequency |
|---|---|
| Aamer | 6 |
| Aamod | 8 |
| Abdelaziz | 8 |
| Abdelghani | 5 |
| Abdelkader | 7 |
| Abdelwaheb | 12 |
| Abdulah | 8 |
| Abdulla | 10 |
| Achilleas | 12 |
| Adam | 8 |
- Retrieve a list of how many employees earn over 80k$ and how much they earn. Rename the second column as emp_with_same_salary
SELECT salary,
COUNT(emp_no) AS emp_with_same_salary
FROM salaries
WHERE salary > 80000
GROUP BY salary
ORDER BY 2 DESC
LIMIT 10;
| salary | emp_with_same_salary |
|---|---|
| 81949 | 7 |
| 80252 | 7 |
| 81186 | 6 |
| 85790 | 6 |
| 85152 | 6 |
| 80606 | 6 |
| 87179 | 5 |
| 81175 | 5 |
| 89356 | 5 |
| 92572 | 5 |
- HAVING
- Extract a list of names of employees, where the number of employees is more than 15. Order by first_name
SELECT first_name,
COUNT(first_name) AS frequency
FROM employees
GROUP BY first_name
HAVING COUNT(first_name) > 15
ORDER BY COUNT(first_name) DESC
LIMIT 10;
| first_name | frequency |
|---|---|
| Kish | 20 |
| Stepehn | 19 |
| JiYoung | 19 |
| Morris | 18 |
| Shawna | 18 |
| Ramalingam | 17 |
| Valeri | 17 |
| Oguz | 17 |
| Mohammad | 17 |
| Hisao | 17 |
- Retrieve a list of employee numbers and the average salary, make sure to return where the average salary is more than 120k$
SELECT emp_no,
AVG(salary) AS average_salary
FROM salaries
GROUP BY emp_no
HAVING AVG(salary) > 120000
ORDER BY AVG(salary) DESC
LIMIT 10;
| emp_no | average_salary |
|---|---|
| 18006 | 128619.200000000000 |
| 18997 | 128594.000000000000 |
| 20004 | 128211.500000000000 |
| 21587 | 127105.692307692308 |
| 14523 | 126436.444444444444 |
| 14487 | 125684.875000000000 |
| 11486 | 124462.111111111111 |
| 12149 | 123656.928571428571 |
| 15039 | 123642.687500000000 |
| 21413 | 121109.000000000000 |
- Extract a list of all names that have encountered less than 200 times. Let the data refer to people hired after 1st of january, 1999
SELECT emp_no,
first_name,
COUNT(first_name) AS count_names, hire_date
FROM employees
WHERE hire_date > '1999-01-01'
GROUP BY emp_no
HAVING COUNT(first_name) < 200
ORDER BY COUNT(first_name) DESC
LIMIT 10;
| emp_no | first_name | count_names | hire_date |
|---|---|---|---|
| 18235 | Elvia | 1 | 1999-06-16 |
| 12550 | Shawna | 1 | 1999-01-04 |
| 15124 | Adhemar | 1 | 1999-02-20 |
| 19854 | Youssef | 1 | 1999-08-28 |
| 16665 | Zhilian | 1 | 1999-03-04 |
| 19056 | Leaf | 1 | 1999-07-29 |
| 16244 | Parviz | 1 | 1999-02-22 |
| 18000 | Tzvetan | 1 | 1999-05-03 |
| 15368 | Seongbin | 1 | 1999-04-01 |
| 16308 | Yannis | 1 | 1999-05-08 |
- Select the employees numbers of all individuals who have signed more than 1 contract after the 1st january, 2000
SELECT emp_no,
COUNT(emp_no)
FROM dept_emp
WHERE from_date > '2000-01-01'
GROUP BY emp_no
HAVING COUNT(emp_no)>=2
ORDER BY COUNT(emp_no);
| emp_no | count |
|---|---|
| 20594 | 2 |
| 13626 | 2 |
| 15893 | 2 |
| 19964 | 2 |
| 16689 | 2 |
| 11609 | 2 |
- SUM
- Retrieve the total amount the company has paid in salary
SELECT SUM(salary)
FROM salaries;
| sum |
|---|
| 7162796017 |
- What is the total amount of money spent on salaries for all contracts starting after the 1st of january, 1997
SELECT SUM(salary)
FROM salaries
WHERE from_date > '1997-01-01';
| sum |
|---|
| 3712936185 |
- MIN & MAX
- Retrieve the MIN and MAX salary paid by the company
SELECT MIN(salary)
FROM salaries;
| min |
|---|
| 38812 |
SELECT MAX(salary)
FROM salaries;
| max |
|---|
| 145732 |
- Retrieve the LOWEST and HIGHEST employee number
SELECT MIN(emp_no)
FROM salaries;
| min |
|---|
| 10001 |
SELECT MAX(emp_no)
FROM salaries;
| max |
|---|
| 21797 |
- AVERAGE
- How much have the company paid on average to employees?
SELECT AVG(salary)
FROM salaries;
| avg |
|---|
| 63986.100220648008 |
- What is the average annual salary paid to employees who started after the 1st of january, 1997?
SELECT AVG(salary)
FROM salaries
WHERE from_date > '1997-01-01';
| avg |
|---|
| 67775.335140463282 |
- ROUND
- Round the average salary to the nearest whole number
SELECT ROUND(AVG(salary))
FROM salaries;
| round |
|---|
| 63986 |
- Round the average amount of money spent on salaries for all contracts that started after the 1st of january, 1997 to a precision of cents
SELECT ROUND(AVG(salary),2)
FROM salaries
WHERE from_date > '1997-01-01';
| round |
|---|
| 67775.34 |
- Retrieve the range of salary
SELECT ROUND(MAX(salary)-MIN(salary), 2) AS salary_range
FROM salaries;
| salary_range |
|---|
| 106920.00 |
- Retrieve the mid-range of salary
SELECT ROUND((MAX(salary)-MIN(salary))/2, 2) AS salary_mid_range
FROM salaries;
| salary_mid_range |
|---|
| 53460.00 |