• Live notes about this first tutorial are available at this link

Visual Studio Code

The preliminary step of this tutorial requires to install Visual Studio Code.

  1. We can install Visual Studio from this web page
  2. We can handle .py files according to this page. In this tutorial execute this section
    1. Install Jupyter Notebook plugin
  3. Create an empty folder for our exercises and open it with File > Open File... menu on Visual Studio

Great! We are ready to move on to the next part of the tutorial.

CSV files in Python

In this tutorial, we see how to use Python to manage CSV files following an online tutorial.

  1. Copy and save this snippet of code in a .csv file:
Title,Release Date,Director
And Now For Something Completely Different,1971,Ian MacNaughton
Monty Python And The Holy Grail,1975,Terry Gilliam and Terry Jones
Monty Python's Life Of Brian,1979,Terry Jones
Monty Python Live At The Hollywood Bowl,1982,Terry Hughes
Monty Python's The Meaning Of Life,1983,Terry Jones
  1. Execute part of this tutorial on this newly saved .csv file:

    a. Run this example reading the previous .csv file

import csv
 
filename = "film.csv"
 
fields = []
rows = []
 
 
with open(filename, 'r') as csvfile:
    csvreader = csv.reader(csvfile)
    fields = next(csvreader)
    for row in csvreader:
        rows.append(row)
    print("Total no. of rows: %d" % (csvreader.line_num))
 
print('Field names are:' + ', '.join(field for field in fields))
 
print('\nFirst 5 rows are:\n')
for row in rows[:5]:
    for col in row:
        print("%10s" % col, end=" "),
    print('\n')

b. Run this other example reading the same .csv file

import csv
 
with open('film.csv', mode='r') as file:
    csv_reader = csv.DictReader(file)
    data_list = []
    for row in csv_reader:
        data_list.append(row)
 
for data in data_list:
    print(data)
 

c. Run this example writing a .csv file

import csv
fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [['Nikhil', 'COE', '2', '9.0'],
        ['Sanchit', 'COE', '2', '9.1'],
        ['Aditya', 'IT', '2', '9.3'],
        ['Sagar', 'SE', '1', '9.5'],
        ['Prateek', 'MCE', '3', '7.8'],
        ['Sahil', 'EP', '2', '9.1']]
 
filename = "university_records.csv"
 
with open(filename, 'w') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(fields)
    csvwriter.writerows(rows)
 

d. Create a Python script that reads from the previous .csv file and a new one with the content below and create a third .csv file with the content of both the files (choose on the reading/writing technique from this tutorial)

Title,Release Date,Director
Monty Python's Flying Circus - Series 1,1969-1970,Various Directors
Monty Python's Flying Circus - Series 2,1970-1971,Various Directors
Monty Python's Flying Circus - Series 3,1972-1973,Various Directors
Monty Python's Flying Circus - Series 4,1974,Various Directors
Monty Python Conquers America,2003,Frank Cvitanovich
Monty Python Live (Mostly),2014,Eric Idle

Pandas

A smarter and more abstract tool to handle .csv files is another Python tool named Pandas

We use Pandas to:

  1. Manipulate data.
  2. Visualize and plot data combined with matplotlib library

Install Pandas and Matplotlib

Install pandas following the instruction available at this web page.

Install matplotlib following the instruction available at this web page.

Manipulate and Visualize data

Let us take a look at this presentation.

Keeping Pandas API Documentation at hand with this link, let us see Pandas in action now:

  1. Create a file named ManipulateVisualize.ipynb
  2. Run the Jupiter Notebook in Visual Studio Code (more details here) on the .csv dataset Salary Data.csv
  3. Follow and repeat my instructions on your Jupiter!

Exercises

Complete this exercise on Jupiter Notebook on the same dataset.