Who this is for: Grade 8–9 students who already know Python basics (loops, functions, lists/dicts) and want real-world data skills.
What you'll learn:
- What JSON and CSV are and when to use them
- Reading and writing JSON files in Python
- Reading and writing CSV files in Python
- Converting between JSON and CSV
- Handling common pitfalls (encodings, numbers, headers)
- A hands-on mini project: Gradebook Analyzer
1) Quick concepts
JSON (JavaScript Object Notation)
- Great for structured data with nested objects and lists.
- Common in web APIs and app configs.
- Type mapping (JSON → Python):
- object → dict
- array → list
- string → str
- number → int/float
- true/false → True/False
- null → None
CSV (Comma-Separated Values)
- Great for tabular data (rows and columns), like spreadsheets.
- Simple and human-readable, but not good for deeply nested structures.
2) JSON in Python (module: json)
Read JSON from a file:
Use json.load to read files, json.loads to parse a string.
Example: reading, updating, and saving JSON
import json
from pathlib import Path
data_path = Path("data")
data_path.mkdir(exist_ok=True)
# Create a sample JSON file
sample = {
"class": "9A",
"teacher": "Ms. Lee",
"students": [
{"name": "Alex", "age": 14, "scores": {"math": 88, "science": 92}},
{"name": "Rita", "age": 15, "scores": {"math": 95, "science": 85}}
]
}
with (data_path / "class.json").open("w", encoding="utf-8") as f:
json.dump(sample, f, indent=2, ensure_ascii=False)
# Read it back
with (data_path / "class.json").open("r", encoding="utf-8") as f:
data = json.load(f)
# Pretty print to console
print(json.dumps(data, indent=2, ensure_ascii=False))
# Update a value
data["students"][0]["scores"]["math"] = 90
# Save updated JSON
with (data_path / "class_updated.json").open("w", encoding="utf-8") as f:
json.dump(data, f, indent=2, ensure_ascii=False, sort_keys=True)Notes:
indentmakes it human-readable.ensure_ascii=Falsekeeps non-English characters readable.sort_keys=Truewrites keys in alphabetical order (useful for consistent files).
Handling bad JSON safely:
bad = "{ name: 'Alex' }" # invalid JSON (unquoted key, single quotes)
try:
parsed = json.loads(bad)
except json.JSONDecodeError as e:
print("Invalid JSON:", e)3) CSV in Python (module: csv)
Read CSV as rows (each row is a list):
Good for simple data without headers.
Read CSV as dictionaries (DictReader):
- Best when the first row has column names.
- You get rows as dicts like
{"name": "Alex", "math": "88"}
Example: reading and writing CSV
import csv
from pathlib import Path
data_path = Path("data")
data_path.mkdir(exist_ok=True)
# Create a sample CSV file
with (data_path / "students.csv").open("w", newline="", encoding="utf-8") as f:
writer = csv.writer(f)
writer.writerow(["name", "math", "science"])
writer.writerow(["Alex", "88", "92"])
writer.writerow(["Rita", "95", "85"])
writer.writerow(["Sam", "", "78"]) # missing math score
# Read with DictReader
def to_float(s):
try:
return float(s)
except (TypeError, ValueError):
return None
rows = []
with (data_path / "students.csv").open("r", newline="", encoding="utf-8") as f:
reader = csv.DictReader(f)
for row in reader:
math = to_float(row["math"])
science = to_float(row["science"])
avg = None
nums = [x for x in (math, science) if x is not None]
if nums:
avg = sum(nums) / len(nums)
rows.append({
"name": row["name"],
"math": math,
"science": science,
"average": avg
})
print(rows)
# Write a new CSV with averages
with (data_path / "averages.csv").open("w", newline="", encoding="utf-8") as f:
fieldnames = ["name", "math", "science", "average"]
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)Important CSV tips:
- Always open CSV files with
newline=""to avoid extra blank lines (especially on Windows). - Use
encoding="utf-8"for international text. - DictReader requires that the first row has headers.
- All values read from CSV are strings. Convert to int/float if needed.
4) Converting between JSON and CSV
CSV → JSON (flat rows):
Read with DictReader, convert types, dump with json.dump.
JSON → CSV:
- Works well if JSON is a list of flat objects.
- For nested objects, you can flatten keys like "scores.math" or split into multiple files.
Example: flat JSON to CSV
import json, csv
from pathlib import Path
data_path = Path("data")
# Suppose we have a flat JSON list
people = [
{"name": "Lina", "city": "Lagos", "age": 14},
{"name": "Jai", "city": "Mumbai", "age": 15}
]
with (data_path / "people.json").open("w", encoding="utf-8") as f:
json.dump(people, f, indent=2, ensure_ascii=False)
# Read JSON and write CSV
with (data_path / "people.json").open("r", encoding="utf-8") as f:
people_data = json.load(f)
fieldnames = sorted({k for p in people_data for k in p.keys()})
with (data_path / "people.csv").open("w", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(people_data)Example: flatten a simple nested JSON for CSV
def flatten(prefix, d, out):
for k, v in d.items():
key = f"{prefix}.{k}" if prefix else k
if isinstance(v, dict):
flatten(key, v, out)
else:
out[key] = v
return out
record = {"name": "Alex", "scores": {"math": 90, "science": 92}}
flat = flatten("", record, {})
# flat -> {"name": "Alex", "scores.math": 90, "scores.science": 92}5) Practical project: Gradebook Analyzer
Goal:
- Read a CSV of students and their test scores.
- Compute each student's average and letter grade.
- Save a JSON report for the class.
- Save a CSV leaderboard of top students.
Input CSV (data/gradebook.csv):
- Columns: name, math, science, english
- Missing values allowed (blank cells)
Step-by-step code:
import csv, json
from pathlib import Path
data_path = Path("data")
data_path.mkdir(exist_ok=True)
# 1) Create a sample gradebook
with (data_path / "gradebook.csv").open("w", newline="", encoding="utf-8") as f:
writer = csv.writer(f)
writer.writerow(["name", "math", "science", "english"])
writer.writerow(["Alex", "88", "92", "84"])
writer.writerow(["Rita", "95", "85", "91"])
writer.writerow(["Sam", "", "78", "80"]) # missing math
writer.writerow(["Maya", "100", "98", ""]) # missing english
# 2) Helpers
def to_float(s):
try:
return float(s)
except (TypeError, ValueError):
return None
def average(values):
vals = [v for v in values if v is not None]
return sum(vals) / len(vals) if vals else None
def letter_grade(avg):
if avg is None:
return "N/A"
if avg >= 90: return "A"
if avg >= 80: return "B"
if avg >= 70: return "C"
if avg >= 60: return "D"
return "F"
# 3) Read CSV and compute metrics
students = []
with (data_path / "gradebook.csv").open("r", newline="", encoding="utf-8") as f:
reader = csv.DictReader(f)
for row in reader:
scores = {
"math": to_float(row.get("math")),
"science": to_float(row.get("science")),
"english": to_float(row.get("english"))
}
avg = average(scores.values())
students.append({
"name": row["name"],
"scores": scores,
"average": avg,
"letter": letter_grade(avg)
})
# 4) Save JSON report
report = {
"class": "9A",
"count": len(students),
"students": students,
"top_student": max(students, key=lambda s: (- (s["average"] or -1), s["name"]))["name"]
}
with (data_path / "grade_report.json").open("w", encoding="utf-8") as f:
json.dump(report, f, indent=2, ensure_ascii=False)
# 5) Save CSV leaderboard (sorted by average desc)
leaderboard = sorted(
(s for s in students if s["average"] is not None),
key=lambda s: s["average"],
reverse=True
)
with (data_path / "leaderboard.csv").open("w", newline="", encoding="utf-8") as f:
fieldnames = ["name", "average", "letter"]
writer = csv.DictWriter(f, fieldnames=fieldnames)
writer.writeheader()
for s in leaderboard:
writer.writerow({
"name": s["name"],
"average": round(s["average"], 2),
"letter": s["letter"]
})
# 6) Print a quick summary
print("Saved:", data_path / "grade_report.json")
print("Saved:", data_path / "leaderboard.csv")What you practiced:
- Reading CSV with DictReader
- Handling missing numbers
- Writing formatted JSON with json.dump
- Writing a CSV with DictWriter and headers
- Sorting data and computing fields
6) Common pitfalls and pro tips
- CSV numbers are strings: Always convert to int/float before math.
- Newline handling: When opening CSV files, use
newline=""to prevent blank lines on Windows. - Encodings: Use
encoding="utf-8"for both JSON and CSV to support all characters. - Missing data: Use helper functions to safely convert or default values.
- Nested JSON to CSV: Flatten nested dicts or choose a subset of fields.
- Pretty JSON: Use
indent=2in dumps/dump for readability; skip for maximum speed/size. - Validation: Wrap
json.loadin try/exceptjson.JSONDecodeErrorfor untrusted files. - Big files: Process CSV line by line (streaming). For JSON, consider JSON Lines (each line is a JSON object) if you need streaming.
Writing JSON Lines:
# write each record on its own line
with open("data/students.jsonl", "w", encoding="utf-8") as f:
for s in students:
f.write(json.dumps(s, ensure_ascii=False) + "\n")7) Summary
- JSON is best for nested, structured data; CSV is best for tables.
- In Python, use
json.load/dumpfor files andcsv.DictReader/DictWriterfor labeled rows. - Always handle encodings, newlines, and type conversions carefully.
- You can convert between JSON and CSV when data is flat or after flattening.
- The Gradebook Analyzer project shows how to read data, compute results, and produce both JSON and CSV outputs—skills you'll use in real data tasks.
Next steps:
- Try adding median and standard deviation to the Gradebook Analyzer.
- Add per-subject leaderboards.
- Explore pandas for more powerful data analysis once you're comfortable with csv/json.