Pivoting Data in SQL

Setup

  1. Start by installing MySQL from here: https://dev.mysql.com/downloads/

  2. Follow the instructions from here to setup MySQL on your local machine.

  3. Install MySQL Workbench from here: https://dev.mysql.com/downloads/workbench/

  4. Download dataset used in this lecture from here: https://github.com/ardhiraka/PFDS_sources/blob/master/players.csv and https://github.com/ardhiraka/PFDS_sources/blob/master/teams.csv

Pivot Table

Let’s start by aggregating the data to show the number of players of each year in each conference, similar to the first example in the inner join lesson:

SELECT teams.conference AS conference,
       players.year,
       COUNT(1) AS players
  FROM hacktiv8.players players
  JOIN hacktiv8.teams teams
    ON teams.school_name = players.school_name
 GROUP BY 1,2
 ORDER BY 1,2

In order to transform the data, we’ll need to put the above query into a subquery. It can be helpful to create the subquery and select all columns from it before starting to make transformations. Re-running the query at incremental steps like this makes it easier to debug if your query doesn’t run. Note that you can eliminate the ORDER BY clause from the subquery since we’ll reorder the results in the outer query.

SELECT *
  FROM (
        SELECT teams.conference AS conference,
               players.year,
               COUNT(1) AS players
            FROM hacktiv8.players players
			JOIN hacktiv8.teams teams
            ON teams.school_name = players.school_name
         GROUP BY 1,2
       ) sub

Assuming that works as planned (results should look exactly the same as the first query), it’s time to break the results out into different columns for various years. Each item in the SELECT statement creates a column, so you’ll have to create a separate column for each year:

SELECT conference,
       SUM(CASE WHEN year = 'FR' THEN players ELSE NULL END) AS fr,
       SUM(CASE WHEN year = 'SO' THEN players ELSE NULL END) AS so,
       SUM(CASE WHEN year = 'JR' THEN players ELSE NULL END) AS jr,
       SUM(CASE WHEN year = 'SR' THEN players ELSE NULL END) AS sr
  FROM (
        SELECT teams.conference AS conference,
               players.year,
               COUNT(1) AS players
          FROM hacktiv8.players players
          JOIN hacktiv8.teams teams
            ON teams.school_name = players.school_name
         GROUP BY 1,2
       ) sub
 GROUP BY 1
 ORDER BY 1

But this could still be made a little better. You’ll notice that the above query produces a list that is ordered alphabetically by Conference. It might make more sense to add a “total players” column and order by that (largest to smallest):

SELECT conference,
       SUM(players) AS total_players,
       SUM(CASE WHEN year = 'FR' THEN players ELSE NULL END) AS fr,
       SUM(CASE WHEN year = 'SO' THEN players ELSE NULL END) AS so,
       SUM(CASE WHEN year = 'JR' THEN players ELSE NULL END) AS jr,
       SUM(CASE WHEN year = 'SR' THEN players ELSE NULL END) AS sr
  FROM (
        SELECT teams.conference AS conference,
               players.year,
               COUNT(1) AS players
          FROM hacktiv8.players players
          JOIN hacktiv8.teams teams
            ON teams.school_name = players.school_name
         GROUP BY 1,2
       ) sub
 GROUP BY 1
 ORDER BY 2 DESC

SQL with Python

Install MySQL Driver

If you want to use Python, you need to install mysql-connector first with pip install -q mysql-connector-python

Test MySQL Connector

Below are the example commands to use MySQL with Python directly.

import mysql.connector

cnx = mysql.connector.connect(
    user='root', 
    password='',
    host='localhost')
# get cursor from connection
cur = cnx.cursor()

# Query to execute

query = """select * 
from hacktiv8.players
limit 5"""

#  execute this query 
cur.execute(query)

# get rows
rows  = cur.fetchall()
for r in rows:
    print(r)
('Cincinnati Bearcats', 'Cincinnati', 'Ralph Abernathy', 'RB', 67.0, 161.0, 'JR', 'ATLANTA, GA', 'GA', 1)
('Cincinnati Bearcats', 'Cincinnati', 'Mekale McKay', 'WR', 78.0, 195.0, 'SO', 'LOUISVILLE, KY', 'KY', 2)
('Cincinnati Bearcats', 'Cincinnati', 'Trenier Orr', 'CB', 71.0, 177.0, 'SO', 'WINTER GARDEN, FL', 'FL', 3)
('Cincinnati Bearcats', 'Cincinnati', 'Bennie Coney', 'QB', 75.0, 216.0, 'FR', 'PLANT CITY, FL', 'FL', 4)
('Cincinnati Bearcats', 'Cincinnati', 'Johnny Holton', 'WR', 75.0, 190.0, 'JR', 'MIAMI, FL', 'FL', 5)

Python MySQL Example

Now let’s move to Python. First we are going to make connect.py file to test the connection.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd=""
)

if db.is_connected():
  print("Successfully connected to MySQL database")

Try running the above code and see if you can connect to MySQL with python3 connect.py.

Now let’s make some database, tables, values to start with. Make new file create_db.py and run it.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd=""
)

cursor = db.cursor()
cursor.execute("CREATE DATABASE ftds")

print("Database Successfully Created")

To check if the database is created, you can open your MySQL Workbench and select the database ftds.

Then you can create tables and insert values.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = """CREATE TABLE student (
  student_id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255),
  address Varchar(255)
)
"""
cursor.execute(sql)

print("Student Table Successfully Created")

Save with create_table.py and run it.

Insert one values into the table, save with insert_one.py.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = "INSERT INTO student (name, address) VALUES (%s, %s)"
val = ("Raka", "Semarang")
cursor.execute(sql, val)

db.commit()

print("{} data added ".format(cursor.rowcount))

Look at the code we use to insert one value.

sql = "INSERT INTO student (name, address) VALUES (%s, %s)"
val = ("Raka", "Semarang")
cursor.execute(sql, val)

db.commit()

We use %s to indicate that we are passing in a string. Never use sql = "INSERT INTO customers (name, address) VALUES ('"+ name +"', '" + address +"')" to avoid SQL Injection. Finally we use db.commit() to commit the changes to the database. This method usually used when we need to insert, update, or delete data. But how about if we need to insert multiple data at once? Let’s make insert_many.py file and run it.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = "INSERT INTO customers (name, address) VALUES (%s, %s)"
values = [
  ("Danu", "Jakarta"),
  ("Fahmi", "Surabaya"),
  ("Sardi", "Bandung"),
  ("Hana", "Depok")
]

for val in values:
  cursor.execute(sql, val)
  db.commit()

print("{} data added".format(len(values)))

We can show data from the table with select_all.py file and run it.

We can use fetchall() to get all data from the table, fetchone() to get one data from the table, and fetchmany() to get multiple data from the table.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = "SELECT * FROM student"
cursor.execute(sql)

results = cursor.fetchall()

for data in results:
  print(data)

Next we want to try to use fetchoone() to get one data from the table, with select_one.py file and run it.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = "SELECT * FROM student"
cursor.execute(sql)

result = cursor.fetchone()

print(result)

Let’s try update data with update_one.py file and run it.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = "UPDATE student SET name=%s, address=%s WHERE student_id=%s"
val = ("Rachman", "Lombok", 1)
cursor.execute(sql, val)

db.commit()

print("{} data updated".format(cursor.rowcount))

We give new value to the data with student_id = 1.


We can use delete_one.py file and run it.

import mysql.connector

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)

cursor = db.cursor()
sql = "DELETE FROM student WHERE student_id=%s"
val = (1, )
cursor.execute(sql, val)

db.commit()

print("{} data deleted".format(cursor.rowcount))

Finally we can combine everything we have learned in this session with py_sql.py file and run it.

import mysql.connector
import os

db = mysql.connector.connect(
  host="localhost",
  user="root",
  passwd="",
  database="ftds"
)


def insert_data(db):
  name = input("Name: ")
  address = input("Address: ")
  val = (name, address)
  cursor = db.cursor()
  sql = "INSERT INTO student (name, address) VALUES (%s, %s)"
  cursor.execute(sql, val)
  db.commit()
  print("{} data added".format(cursor.rowcount))


def show_data(db):
  cursor = db.cursor()
  sql = "SELECT * FROM student"
  cursor.execute(sql)
  results = cursor.fetchall()
  
  if cursor.rowcount < 0:
    print("No data")
  else:
    for data in results:
      print(data)


def update_data(db):
  cursor = db.cursor()
  show_data(db)
  student_id = input("Select student id> ")
  name = input("New Name: ")
  address = input("New Address: ")

  sql = "UPDATE student SET name=%s, address=%s WHERE student_id=%s"
  val = (name, address, student_id)
  cursor.execute(sql, val)
  db.commit()
  print("{} data changed".format(cursor.rowcount))


def delete_data(db):
  cursor = db.cursor()
  show_data(db)
  student_id = input("Select student id> ")
  sql = "DELETE FROM student WHERE student_id=%s"
  val = (student_id,)
  cursor.execute(sql, val)
  db.commit()
  print("{} data deleted".format(cursor.rowcount))


def search_data(db):
  cursor = db.cursor()
  keyword = input("Keyword: ")
  sql = "SELECT * FROM student WHERE name LIKE %s OR address LIKE %s"
  val = ("%{}%".format(keyword), "%{}%".format(keyword))
  cursor.execute(sql, val)
  results = cursor.fetchall()
  
  if cursor.rowcount < 0:
    print("No Data")
  else:
    for data in results:
      print(data)


def show_menu(db):
  print("=== H8 Py MySQL Exercise ===")
  print("1. Insert Data")
  print("2. Show Data")
  print("3. Update Data")
  print("4. Delete Data")
  print("5. Search Data")
  print("0. Exit")
  print("------------------")
  menu = input("Select menu> ")

  #clear screen
  os.system("clear")

  if menu == "1":
    insert_data(db)
  elif menu == "2":
    show_data(db)
  elif menu == "3":
    update_data(db)
  elif menu == "4":
    delete_data(db)
  elif menu == "5":
    search_data(db)
  elif menu == "0":
    exit()
  else:
    print("Wrong Input")


if __name__ == "__main__":
  while(True):
    show_menu(db)

Full code here https://github.com/ardhiraka/FTDS_post/tree/master/py_sql