Pivoting Data in SQL
Contents
Pivoting Data in SQL¶
Setup¶
Start by installing MySQL from here: https://dev.mysql.com/downloads/
Follow the instructions from here to setup MySQL on your local machine.
Install MySQL Workbench from here: https://dev.mysql.com/downloads/workbench/
Download dataset used in this lecture from here: https://github.com/ardhiraka/PFDS_sources/blob/master/players.csv and https://github.com/ardhiraka/PFDS_sources/blob/master/teams.csv
Pivot Table¶
Let’s start by aggregating the data to show the number of players of each year in each conference, similar to the first example in the inner join lesson:
SELECT teams.conference AS conference,
players.year,
COUNT(1) AS players
FROM hacktiv8.players players
JOIN hacktiv8.teams teams
ON teams.school_name = players.school_name
GROUP BY 1,2
ORDER BY 1,2
In order to transform the data, we’ll need to put the above query into a subquery. It can be helpful to create the subquery and select all columns from it before starting to make transformations. Re-running the query at incremental steps like this makes it easier to debug if your query doesn’t run. Note that you can eliminate the ORDER BY clause from the subquery since we’ll reorder the results in the outer query.
SELECT *
FROM (
SELECT teams.conference AS conference,
players.year,
COUNT(1) AS players
FROM hacktiv8.players players
JOIN hacktiv8.teams teams
ON teams.school_name = players.school_name
GROUP BY 1,2
) sub
Assuming that works as planned (results should look exactly the same as the first query), it’s time to break the results out into different columns for various years. Each item in the SELECT statement creates a column, so you’ll have to create a separate column for each year:
SELECT conference,
SUM(CASE WHEN year = 'FR' THEN players ELSE NULL END) AS fr,
SUM(CASE WHEN year = 'SO' THEN players ELSE NULL END) AS so,
SUM(CASE WHEN year = 'JR' THEN players ELSE NULL END) AS jr,
SUM(CASE WHEN year = 'SR' THEN players ELSE NULL END) AS sr
FROM (
SELECT teams.conference AS conference,
players.year,
COUNT(1) AS players
FROM hacktiv8.players players
JOIN hacktiv8.teams teams
ON teams.school_name = players.school_name
GROUP BY 1,2
) sub
GROUP BY 1
ORDER BY 1
But this could still be made a little better. You’ll notice that the above query produces a list that is ordered alphabetically by Conference. It might make more sense to add a “total players” column and order by that (largest to smallest):
SELECT conference,
SUM(players) AS total_players,
SUM(CASE WHEN year = 'FR' THEN players ELSE NULL END) AS fr,
SUM(CASE WHEN year = 'SO' THEN players ELSE NULL END) AS so,
SUM(CASE WHEN year = 'JR' THEN players ELSE NULL END) AS jr,
SUM(CASE WHEN year = 'SR' THEN players ELSE NULL END) AS sr
FROM (
SELECT teams.conference AS conference,
players.year,
COUNT(1) AS players
FROM hacktiv8.players players
JOIN hacktiv8.teams teams
ON teams.school_name = players.school_name
GROUP BY 1,2
) sub
GROUP BY 1
ORDER BY 2 DESC
SQL with Python¶
Install MySQL Driver¶
If you want to use Python, you need to install mysql-connector first with pip install -q mysql-connector-python
Test MySQL Connector¶
Below are the example commands to use MySQL with Python directly.
import mysql.connector
cnx = mysql.connector.connect(
user='root',
password='',
host='localhost')
# get cursor from connection
cur = cnx.cursor()
# Query to execute
query = """select *
from hacktiv8.players
limit 5"""
# execute this query
cur.execute(query)
# get rows
rows = cur.fetchall()
for r in rows:
print(r)
('Cincinnati Bearcats', 'Cincinnati', 'Ralph Abernathy', 'RB', 67.0, 161.0, 'JR', 'ATLANTA, GA', 'GA', 1)
('Cincinnati Bearcats', 'Cincinnati', 'Mekale McKay', 'WR', 78.0, 195.0, 'SO', 'LOUISVILLE, KY', 'KY', 2)
('Cincinnati Bearcats', 'Cincinnati', 'Trenier Orr', 'CB', 71.0, 177.0, 'SO', 'WINTER GARDEN, FL', 'FL', 3)
('Cincinnati Bearcats', 'Cincinnati', 'Bennie Coney', 'QB', 75.0, 216.0, 'FR', 'PLANT CITY, FL', 'FL', 4)
('Cincinnati Bearcats', 'Cincinnati', 'Johnny Holton', 'WR', 75.0, 190.0, 'JR', 'MIAMI, FL', 'FL', 5)
Python MySQL Example¶
Now let’s move to Python. First we are going to make connect.py
file to test the connection.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd=""
)
if db.is_connected():
print("Successfully connected to MySQL database")
Try running the above code and see if you can connect to MySQL with python3 connect.py
.
Now let’s make some database, tables, values to start with. Make new file create_db.py
and run it.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd=""
)
cursor = db.cursor()
cursor.execute("CREATE DATABASE ftds")
print("Database Successfully Created")
To check if the database is created, you can open your MySQL Workbench and select the database ftds
.
Then you can create tables and insert values.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = """CREATE TABLE student (
student_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(255),
address Varchar(255)
)
"""
cursor.execute(sql)
print("Student Table Successfully Created")
Save with create_table.py
and run it.
Insert one values into the table, save with insert_one.py
.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = "INSERT INTO student (name, address) VALUES (%s, %s)"
val = ("Raka", "Semarang")
cursor.execute(sql, val)
db.commit()
print("{} data added ".format(cursor.rowcount))
Look at the code we use to insert one value.
sql = "INSERT INTO student (name, address) VALUES (%s, %s)"
val = ("Raka", "Semarang")
cursor.execute(sql, val)
db.commit()
We use %s
to indicate that we are passing in a string. Never use sql = "INSERT INTO customers (name, address) VALUES ('"+ name +"', '" + address +"')"
to avoid SQL Injection.
Finally we use db.commit() to commit the changes to the database. This method usually used when we need to insert, update, or delete data. But how about if we need to insert multiple data at once? Let’s make insert_many.py
file and run it.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = "INSERT INTO customers (name, address) VALUES (%s, %s)"
values = [
("Danu", "Jakarta"),
("Fahmi", "Surabaya"),
("Sardi", "Bandung"),
("Hana", "Depok")
]
for val in values:
cursor.execute(sql, val)
db.commit()
print("{} data added".format(len(values)))
We can show data from the table with select_all.py
file and run it.
We can use fetchall()
to get all data from the table, fetchone()
to get one data from the table, and fetchmany()
to get multiple data from the table.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = "SELECT * FROM student"
cursor.execute(sql)
results = cursor.fetchall()
for data in results:
print(data)
Next we want to try to use fetchoone()
to get one data from the table, with select_one.py
file and run it.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = "SELECT * FROM student"
cursor.execute(sql)
result = cursor.fetchone()
print(result)
Let’s try update data with update_one.py
file and run it.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = "UPDATE student SET name=%s, address=%s WHERE student_id=%s"
val = ("Rachman", "Lombok", 1)
cursor.execute(sql, val)
db.commit()
print("{} data updated".format(cursor.rowcount))
We give new value to the data with student_id
= 1.
We can use delete_one.py
file and run it.
import mysql.connector
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
cursor = db.cursor()
sql = "DELETE FROM student WHERE student_id=%s"
val = (1, )
cursor.execute(sql, val)
db.commit()
print("{} data deleted".format(cursor.rowcount))
Finally we can combine everything we have learned in this session with py_sql.py
file and run it.
import mysql.connector
import os
db = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database="ftds"
)
def insert_data(db):
name = input("Name: ")
address = input("Address: ")
val = (name, address)
cursor = db.cursor()
sql = "INSERT INTO student (name, address) VALUES (%s, %s)"
cursor.execute(sql, val)
db.commit()
print("{} data added".format(cursor.rowcount))
def show_data(db):
cursor = db.cursor()
sql = "SELECT * FROM student"
cursor.execute(sql)
results = cursor.fetchall()
if cursor.rowcount < 0:
print("No data")
else:
for data in results:
print(data)
def update_data(db):
cursor = db.cursor()
show_data(db)
student_id = input("Select student id> ")
name = input("New Name: ")
address = input("New Address: ")
sql = "UPDATE student SET name=%s, address=%s WHERE student_id=%s"
val = (name, address, student_id)
cursor.execute(sql, val)
db.commit()
print("{} data changed".format(cursor.rowcount))
def delete_data(db):
cursor = db.cursor()
show_data(db)
student_id = input("Select student id> ")
sql = "DELETE FROM student WHERE student_id=%s"
val = (student_id,)
cursor.execute(sql, val)
db.commit()
print("{} data deleted".format(cursor.rowcount))
def search_data(db):
cursor = db.cursor()
keyword = input("Keyword: ")
sql = "SELECT * FROM student WHERE name LIKE %s OR address LIKE %s"
val = ("%{}%".format(keyword), "%{}%".format(keyword))
cursor.execute(sql, val)
results = cursor.fetchall()
if cursor.rowcount < 0:
print("No Data")
else:
for data in results:
print(data)
def show_menu(db):
print("=== H8 Py MySQL Exercise ===")
print("1. Insert Data")
print("2. Show Data")
print("3. Update Data")
print("4. Delete Data")
print("5. Search Data")
print("0. Exit")
print("------------------")
menu = input("Select menu> ")
#clear screen
os.system("clear")
if menu == "1":
insert_data(db)
elif menu == "2":
show_data(db)
elif menu == "3":
update_data(db)
elif menu == "4":
delete_data(db)
elif menu == "5":
search_data(db)
elif menu == "0":
exit()
else:
print("Wrong Input")
if __name__ == "__main__":
while(True):
show_menu(db)
Full code here https://github.com/ardhiraka/FTDS_post/tree/master/py_sql