Mekki MOURADI

Data Analyst

Rabat, Maroc

À Propos

Data Analyst Junior avec une solide expérience en marketing numérique et des compétences approfondies dans l’utilisation de Microsoft Excel, SQL, Tableau Software et Python.

Coffee Sales Project with SQL

Frame 28

Compétences & Outils

Description du projet

Utiliser toute la panoplie de fonctions que nous offre SQL pour récupérer, modifier, extraire, et trier les données :
  • WHERE CLAUSE + AND & OR
  • IN, NOT IN, IS NULL, BETWEEN,
  • ORDER BY, DISTINCT
  • EXTRACT
  • UPPER, LOWER, LENGTH
  • CONCATENATION, BOOLEAN, EXPRESSIONS AND WILDCARDS
  • SUBSTRING, POSITION, COALESCE
  • MIN, MAX, AVG, SUM, COUNT
  • GROUP BY, HAVING
  • CASE, GROUP BY, TRANSPOSITION
  • SUBQUERIES

Objectifs de l'analyse

  • Répondre à différentes business questions pour accompagner l’entreprise dans sa prise de décision
  1. WHERE CLAUSE + AND & OR
  1. Select only the employees who work in Common Grounds and make more than 50k
				
					SELECT *
FROM employees
WHERE coffeeshop_id=1 AND salary > 50000
LIMIT 10;
				
			
employee_idfirst_namelast_nameemailhire_dategendersalarycoffeeshop_id
144108KhalilCorrkcorr1@github.io2014-04-23M528021
232690PincasMariottepmariotte1b@reuters.com2018-12-08M638601
193281AubertaGrevesagreves1e@wikispaces.com2012-07-06F546981
577294EstevanWornumNULL2011-02-27M644901
986129NedCastellanosncastellanos1x@hostgator.com2016-05-28M653171
858668WrightJermyNULL2021-12-30M518181
181482JacentaBettinsonjbettinson2t@sourceforge.net2019-01-25F579141
633934KimberleyVautinkvautin2w@prweb.com2014-08-31F566691
578451MurvynLanneymlanney38@squarespace.com2018-02-09M560111
503156JulitaGellyjgelly3m@woothemes.com2016-09-11F668371
  1. Select only the employees who work in Early Rise or make more thant 50k
				
					SELECT *
FROM employees
WHERE coffeeshop_id=2 OR salary > 50000
LIMIT 10;
				
			
employee_idfirst_namelast_nameemailhire_dategendersalarycoffeeshop_id
144108KhalilCorrkcorr1@github.io2014-04-23M528021
782284VilhelminaRaymanvrayman2@jigsy.com2015-08-17F578552
614903HamelGrocockhgrocock4@nasa.gov2016-03-27M665663
938560SamanthaNewallsnewalla@comsenz.com2013-08-10F102232
746871JoshuaWinscumjwinscumb@sciencedirect.com2022-12-29M282322
75097WallyHuebnerwhuebnerc@dmoz.org2020-08-30F577314
353657BrighamBoucherbboucherh@army.mil2016-03-09M388992
608868AnnabelleOttiwillaottiwillj@wordpress.com2016-07-19F548575
891720AmyeDilgeradilgern@amazon.com2014-12-29F636713
826977LucasCuphusNULL2021-12-21M585665
  1. IN, NOT IN, IS NULL, BETWEEN
  1. Select all rows from the table where the supplier is Beans and Barley
				
					SELECT *
FROM suppliers
WHERE supplier_name = 'Beans and Barley';
				
			
coffeeshop_idsupplier_namecoffee_type
1Beans and BarleyArabica
2Beans and BarleyArabica
3Beans and BarleyArabica
5Beans and BarleyArabica
  1. Select all rows from the table where the supplier is not Beans and Barley
				
					SELECT *
FROM suppliers
WHERE NOT supplier_name = 'Beans and Barley';
-- WHERE supplier_name <> 'Beans and Barley';
				
			
coffeeshop_idsupplier_namecoffee_type
1Cool BeansRobusta
2Vanilla BeanLiberica
2Cool BeansRobusta
3Bean Me UpExcelsa
3Vanilla BeanLiberica
3Cool BeansRobusta
4Vanilla BeanLiberica
4Bean Me UpExcelsa
5Vanilla BeanLiberica
5Bean Me UpExcelsa
  1. Select all Robusta and Arabica coffee types
				
					SELECT *
FROM suppliers
WHERE coffee_type IN ('Robusta', 'Arabica');
				
			
coffeeshop_idsupplier_namecoffee_type
1Beans and BarleyArabica
1Cool BeansRobusta
2Beans and BarleyArabica
2Cool BeansRobusta
3Cool BeansRobusta
3Beans and BarleyArabica
5Beans and BarleyArabica
  1. Select all coffee types that are not Robusta and Arabica
				
					SELECT *
FROM suppliers
WHERE coffee_type NOT IN ('Robusta', 'Arabica');
				
			
coffeeshop_idsupplier_namecoffee_type
2Vanilla BeanLiberica
3Bean Me UpExcelsa
3Vanilla BeanLiberica
4Vanilla BeanLiberica
4Bean Me UpExcelsa
5Vanilla BeanLiberica
5Bean Me UpExcelsa
  1. Select all employees with missing emails
				
					SELECT *
FROM employees
WHERE email IS NULL
LIMIT 10;
				
			
employee_idfirst_namelast_nameemailhire_dategendersalarycoffeeshop_id
877425HortenByreNULL2022-05-21M404585
826977LucasCuphusNULL2021-12-21M585665
824952MarjaLaceyNULL2010-02-28F286285
793404AiliStoweNULL2014-08-09F193371
702218ElizaGonzaloNULL2022-12-11F612765
815230RockieSimionatoNULL2016-01-23M190165
577294EstevanWornumNULL2011-02-27M644901
117778LoriannaTomeiNULL2021-06-18F528653
725760PaxtonNequestNULL2019-09-02M174841
141019WilliCroshawNULL2019-02-06M393862
  1. Select all employees who make between 35k and 50k
				
					SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE salary BETWEEN 35000 AND 50000
LIMIT 10;

				
			
employee_idfirst_namelast_namesalary
225709EleenTarpey48048
243999JeremyWhitlam41159
599230WebbHevey48477
755091ClemKitchingman46818
925779PavelButchard35003
353657BrighamBoucher38899
877425HortenByre40458
593979RockieMeriot45651
790257RabDafter49327
854959GustavusKettlestringe47048
  1. ORDER BY, DISTINCT
  1. Order the data from the lowest salary to the highest
				
					SELECT employee_id, first_name, last_name, salary
FROM employees
ORDER BY salary
LIMIT 10;
				
			
employee_idfirst_namelast_namesalary
230049KevonPerrigo9878
714848AraRinge10086
310537GilbertineWhiscard10176
615880LemarCotgrove10220
938560SamanthaNewall10223
765510GloriaDoring10239
783014CeceliaHuws10241
771498MalvinaKloser10256
684717EvviePenlington10436
123607DewainSeadon10447
  1. Retrieve a list of the top 5 highest paid employees
				
					SELECT employee_id, first_name, last_name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
				
			
employee_idfirst_namelast_namesalary
830873ForresterRoze67724
356659DillonBourges67599
275505HuntleyLovewell67560
297631NolanCoye67548
249536AustineBenge67539
  1. Return all unique countries
				
					SELECT DISTINCT country
FROM locations;
				
			
country
United States
United Kingdom
  1. EXTRACT
  1. Return the year, month and day of the hire_date for each employee
				
					SELECT  EXTRACT (YEAR FROM hire_date) AS year,
		EXTRACT (MONTH FROM hire_date) AS month,
		EXTRACT (DAY FROM hire_date) AS day
FROM employees
LIMIT 10;
				
			
yearmonthday
2015829
2014423
2015817
2016914
2016327
20211222
2014121
2010427
201473
2015817
  1. UPPER, LOWER, LENGTH
  1. Uppercase last name, lowercase first name, return email and length of email
				
					SELECT  UPPER(last_name) AS upper_last_name,
		LOWER(first_name) AS lower_first_name,
		email,
		LENGTH(email) AS email_length
FROM employees
LIMIT 10;
				
			
upper_last_namelower_first_nameemailemail_length
MOSCONIcarsoncmosconi0@census.gov20
CORRkhalilkcorr1@github.io16
RAYMANvilhelminavrayman2@jigsy.com18
TARPEYeleenetarpey3@devhub.com19
GROCOCKhamelhgrocock4@nasa.gov18
BALLSfrazierfballs5@nydailynews.com23
WHITLAMjeremyjwhitlam6@nydailynews.com25
HEVEYwebbwhevey7@wikia.com17
SEXCEYkatharineksexcey8@harvard.edu20
LILLOWbartonblillow9@cam.ac.uk18
  1. CONCATENATION, BOOLEAN, EXPRESSIONS AND WILDCARDS
  1. Concatenate the first name and last name
				
					SELECT first_name || ' ' ||last_name AS full_name
FROM employees
LIMIT 5;
				
			
full_name
Carson Mosconi
Khalil Corr
Vilhelmina Rayman
Eleen Tarpey
Hamel Grocock
  1. If the person is Female and makes more than 50k then true else false
				
					SELECT  first_name || ' ' ||last_name AS full_name,
		salary,
		gender,
		(salary>50000 AND gender='F') AS less_than_50k
FROM employees
LIMIT 10;
				
			
full_namesalarygenderless_than_50k
Carson Mosconi32973MFalse
Khalil Corr52802MFalse
Vilhelmina Rayman57855FTrue
Eleen Tarpey48048FFalse
Hamel Grocock66566MFalse
Frazier Balls15235MFalse
Jeremy Whitlam41159MFalse
Webb Hevey48477MFalse
Katharine Sexcey23772FFalse
Barton Lillow15083MFalse
  1. If the email contains ".com" return true else false
				
					SELECT  email,
		(email like '%.com%') AS ".com"
FROM employees
LIMIT 10;
				
			
email.com
cmosconi0@census.govFalse
kcorr1@github.ioFalse
vrayman2@jigsy.comTrue
etarpey3@devhub.comTrue
hgrocock4@nasa.govFalse
fballs5@nydailynews.comTrue
jwhitlam6@nydailynews.comTrue
whevey7@wikia.comTrue
ksexcey8@harvard.eduFalse
blillow9@cam.ac.ukFalse
  1. SUBSTRING, POSITION, COALESCE
  1. Find the email clients of the employees
				
					SELECT  email,
		SUBSTRING(email FROM POSITION('@' IN email))
FROM employees
LIMIT 10;
				
			
emailsubstring
cmosconi0@census.gov@census.gov
kcorr1@github.io@github.io
vrayman2@jigsy.com@jigsy.com
etarpey3@devhub.com@devhub.com
hgrocock4@nasa.gov@nasa.gov
fballs5@nydailynews.com@nydailynews.com
jwhitlam6@nydailynews.com@nydailynews.com
whevey7@wikia.com@wikia.com
ksexcey8@harvard.edu@harvard.edu
blillow9@cam.ac.uk@cam.ac.uk
  1. Fill missing emails with 'NO EMAIL PROVIDED'
				
					SELECT  employee_id,
		email,
		COALESCE(email,'NO EMAIL PROVIDED')
FROM employees
ORDER BY email DESC
LIMIT 10;
				
			
employee_idemailcoalesce
117778NULLNO EMAIL PROVIDED
725760NULLNO EMAIL PROVIDED
702218NULLNO EMAIL PROVIDED
793404NULLNO EMAIL PROVIDED
815230NULLNO EMAIL PROVIDED
577294NULLNO EMAIL PROVIDED
826977NULLNO EMAIL PROVIDED
877425NULLNO EMAIL PROVIDED
824952NULLNO EMAIL PROVIDED
141019NULLNO EMAIL PROVIDED
  1. MIN, MAX, AVG, SUM, COUNT
  1. Return the min salary, max salary, average salary, sum of all salaries, and the count of emails of employees
				
					SELECT  MIN(salary),
		MAX(salary),
		ROUND(AVG(salary),2),
		SUM(salary),
		COUNT(email)
FROM employees;
				
			
minmaxroundsumcount
98786772438181.2038181199791
  1. GROUP BY, HAVING
  1. Return the total number of employees for each coffeeshop, the total salaries for each coffeeshop, average, min and max salaries. Return only the coffeeshops with more than 200 employees
				
					SELECT  coffeeshop_id,
		COUNT(employee_id),
		SUM(salary),
		MAX(salary),
		MIN(salary),
		ROUND(AVG(salary),2)
FROM employees
GROUP BY coffeeshop_id
HAVING COUNT(employee_id)>=200;
				
			
coffeeshop_idcountsummaxminround
32148585485675601059240119.09
52037343255677241022036173.67
4210787549367599987837502.35
  1. CASE WITH GROUP BY AND CASE FOR TRANSPOSING DATA
  1. Return a list of pay categories for salaries lower than 25k (Low Pay), salaries between 25k and 45k (Medium Pay), and salaries higher than 45k (High Pay)
				
					SELECT  employee_id,
		first_name,
		last_name,
		salary,
		CASE
			WHEN salary < 25000 THEN 'Low Pay'
			WHEN salary BETWEEN 25000 AND 45000 THEN 'Medium Pay'
			ELSE 'High Pay'
		END	AS pay_range
FROM employees
LIMIT 10;
				
			
employee_idfirst_namelast_namesalarypay_range
501559CarsonMosconi32973Medium Pay
144108KhalilCorr52802High Pay
782284VilhelminaRayman57855High Pay
225709EleenTarpey48048High Pay
614903HamelGrocock66566High Pay
590293FrazierBalls15235Low Pay
243999JeremyWhitlam41159Medium Pay
599230WebbHevey48477High Pay
758331KatharineSexcey23772Low Pay
561012BartonLillow15083Low Pay
  1. Return the count of employees in each pay category
				
					WITH cte AS (
SELECT  employee_id,
		first_name,
		last_name,
		salary,
		CASE
			WHEN salary < 25000 THEN 'Low Pay'
			WHEN salary BETWEEN 25000 AND 45000 THEN 'Medium Pay'
			ELSE 'High Pay'
		END	AS pay_range
FROM employees
)
SELECT pay_range, COUNT(*) AS count_employees
FROM cte
GROUP BY pay_range;
				
			
pay_rangecount_employees
High Pay380
Low Pay273
Medium Pay347
  1. Transpose the output above
				
					SELECT
		SUM(CASE WHEN salary < 25000 THEN 1 ELSE 0 END) AS "Low Pay",
		SUM(CASE WHEN salary BETWEEN 25000 AND 45000 THEN 1 ELSE 0 END) AS "Medium Pay",
		SUM(CASE WHEN salary > 45000 THEN 1 ELSE 0 END) AS "High Pay"
FROM employees;
				
			
Low PayMedium PayHigh Pay
273347380
  1. SUBQUERIES
  1. Return first and last name, salary, and the difference between the salary and the average salary of all employees
				
					SELECT  first_name,
		last_name,
		salary,
		salary - (SELECT ROUND(AVG(salary),0) FROM employees) AS salary_difference
FROM employees
LIMIT 10;
				
			
first_namelast_namesalarysalary_difference
CarsonMosconi32973-5208
KhalilCorr5280214621
VilhelminaRayman5785519674
EleenTarpey480489867
HamelGrocock6656628385
FrazierBalls15235-22946
JeremyWhitlam411592978
WebbHevey4847710296
KatharineSexcey23772-14409
BartonLillow15083-23098
  1. Return all employees who work in US coffee shops and make more than 35k
				
					SELECT *
FROM employees
WHERE salary > 35000 AND coffeeshop_id IN
	(
	SELECT coffeeshop_id
	FROM shops
	WHERE city_id IN
		(SELECT city_id 
		 FROM locations
		 WHERE country = 'United States')
	)
LIMIT 10;
				
			
employee_idfirst_namelast_nameemailhire_dategendersalarycoffeeshop_id
144108KhalilCorrkcorr1@github.io2014-04-23M528021
782284VilhelminaRaymanvrayman2@jigsy.com2015-08-17F578552
243999JeremyWhitlamjwhitlam6@nydailynews.com2014-01-21M411594
599230WebbHeveywhevey7@wikia.com2010-04-27M484774
75097WallyHuebnerwhuebnerc@dmoz.org2020-08-30F577314
755091ClemKitchingmanckitchingmane@pinterest.com2014-07-23M468184
925779PavelButchardpbutchardf@opera.com2016-09-21M350035
353657BrighamBoucherbboucherh@army.mil2016-03-09M388992
877425HortenByreNULL2022-05-21M404585
608868AnnabelleOttiwillaottiwillj@wordpress.com2016-07-19F548575
  1. Return the 30 day moving total pay
				
					SELECT  hire_date,
		salary,
		(SELECT SUM(salary)
		FROM employees e2
		WHERE e2.hire_date BETWEEN e1.hire_date - 30 AND e1.hire_date) AS pay_pattern
FROM employees e1
ORDER BY hire_date
LIMIT 20;
				
			
hire_datesalarypay_pattern
2010-01-011769117691
2010-01-113240850099
2010-01-1365993116092
2010-01-2321508150505
2010-01-2312905150505
2010-02-0639072171886
2010-02-0735271207157
2010-02-0913829220986
2010-02-1020835241821
2010-02-1219639229052
2010-02-2828628157274
2010-03-0255740213014
2010-03-0337106250120
2010-03-0439757289877
2010-03-1218104219809
2010-03-2737313263663
2010-03-2747015263663
2010-03-2955722319385
2010-04-0719095177249
2010-04-1323985183130