Python for beginners 11 min read

Python course Part 5: Error Handling & Modern File Operations (Bulletproofing Your Code)

Adrian Kuczyński
Senior Security Developer

Up to this point, your Python code has lived in a sterile, temporary bubble. The moment the terminal closes, your data is gone. We've processed lists, filtered dictionaries, and calculated metrics, but every result has dissolved into the ether the instant the program ended.

Worse, we've been operating on the most dangerous assumption in software: that the happy path is the only path.

We've assumed users will behave, files will exist, and input will always be clean. But if you come from .NET, systems work, or security engineering, you already know the truth — the real world is hostile. Connections drop. Disks fill up. Input files get corrupted mid-write. A script that throws a raw stack trace and dies the moment it meets one malformed line is not production-ready.

Today we bridge the gap between fragile scripts and resilient software.

We're going to build a Server Log Cleanser: a utility that ingests a large, partially-corrupted log file, extracts the valid IP addresses, skips the broken lines without crashing, and streams the clean output to a new file on disk.

This isn't an academic exercise — it's the exact shape of utility you end up writing in DevOps pipelines, backend services, and security ingestion jobs. To build it, we first need to understand how Python handles the filesystem without falling into the resource-leak traps of older languages.

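Here's the kind of output we're aiming for (the periodic progress lines are illustrative, and the exact counts depend on your log file):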

[INFO] Attempting to process log file...
[INFO] Processing line 1...
[INFO] Processing line 100...
[INFO] Processing line 250...
[WARN] Skipping corrupted line 312: "GARBAGE_DATA%%!!"
[INFO] Processing line 500...
[SUCCESS] Extracted 487 valid IP addresses from 500 lines.
[SUCCESS] Clean data saved to clean_ips.txt

Notice how the script doesn't crash on a corrupted line. It reports the problem, handles it, and keeps moving. That's what bulletproof code looks like.

🛑 Dev Callout: IDisposable vs. Context Managers

Coming from C# or .NET? You know the pain of a leaked file handle when someone forgets to call .Dispose() — which is exactly why you wrap things in a using block. Python has the same concept, spelled differently. Instead of using, you get context managers via the with keyword. It guarantees the file is closed the instant the block ends, even if an exception is thrown inside it.

// C#: you know this pattern
using (var reader = new StreamReader("log.txt"))
{
    var content = reader.ReadToEnd();
} // stream disposed automatically here

# Python: the equivalent
with open("log.txt") as reader:
    content = reader.read()
# stream closed automatically here

The mental model is identical: "set up this resource, use it, and tear it down no matter what happens." Skip the with and you're leaving file handles dangling — same bug as forgetting Dispose().
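To see that guarantee in action, here is a minimal sketch that raises an exception inside a with block and then checks the handle afterwards. The scratch file and the simulated RuntimeError are assumptions made up for the demo; only open() and the .closed attribute come from Python itself.

```python
import tempfile
from pathlib import Path

# Scratch directory and file exist only for this demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "log.txt"
    demo_path.write_text("line one\n", encoding="utf-8")

    try:
        with open(demo_path, encoding="utf-8") as reader:
            # Simulate something going wrong halfway through the block.
            raise RuntimeError("simulated failure inside the with block")
    except RuntimeError:
        pass

    # The name `reader` still exists, but the underlying handle is closed.
    closed_after_error = reader.closed

print(closed_after_error)  # True
```

The exception still propagates out of the with block (we catch it one level up); the context manager's only job is to close the file on the way out.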

1. When Things Go Wrong (try / except / finally)

Errors aren't failures. Errors are information — they tell you something unexpected happened, and your code gets to decide what to do about it. Python handles them with the try/except block, the equivalent of try/catch in C# or Java.

The danger of "Pokémon exception handling"

There's an anti-pattern common enough to have a name: Pokémon exception handling — because you "gotta catch 'em all."

# ❌ DANGER: a bare except catches EVERYTHING
try:
    result = do_something_risky()
except:  # also catches KeyboardInterrupt, SystemExit, typos in your own code...
    print("Something went wrong")

A bare except: is dangerously broad. It swallows:

  • KeyboardInterrupt — the user pressing Ctrl+C to stop the program

  • SystemExit — a deliberate sys.exit() call

  • NameError and other typo-class bugs: mistakes in your own code that should fail loudly while you're developing

You don't want to silently eat any of those. You want Ctrl+C to actually stop the program, and you want a typo to crash so you can fix it. Catch only what you expect; let everything else propagate.
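Why is the bare form so much wider? It behaves like except BaseException, the root of the exception hierarchy, while KeyboardInterrupt and SystemExit deliberately sit outside Exception. The sketch below shows the difference. run_guarded and the two helper functions are hypothetical names invented for the demo, and except Exception is shown only as a last-resort boundary: it still hides typos like NameError, so naming the specific error remains the better default.

```python
# Ordinary errors inherit from Exception; exit signals do not.
print(issubclass(ValueError, Exception))         # True
print(issubclass(NameError, Exception))          # True
print(issubclass(KeyboardInterrupt, Exception))  # False
print(issubclass(SystemExit, Exception))         # False


def run_guarded(action):
    """Call action(); handle ordinary errors, let exit signals through."""
    try:
        return action()
    except Exception as exc:  # narrower than a bare except
        return f"handled {type(exc).__name__}"


def bad_conversion():
    return int("admin")  # raises ValueError


def user_pressed_ctrl_c():
    raise KeyboardInterrupt  # simulates Ctrl+C


print(run_guarded(bad_conversion))  # handled ValueError

interrupt_escaped = False
try:
    run_guarded(user_pressed_ctrl_c)
except KeyboardInterrupt:
    # The guard did not swallow the interrupt; it reached the caller.
    interrupt_escaped = True

print(interrupt_escaped)  # True
```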

Catch specific errors

Always name the exception you anticipate:

try:
    user_id = int("admin")
except ValueError as e:
    print(f"[ERROR] Cannot convert input to integer: {e}")

Output:

[ERROR] Cannot convert input to integer: invalid literal for int() with base 10: 'admin'

The string "admin" can't become an integer, so Python raises a ValueError. We catch that specific error, log something useful, and carry on. A TypeError or KeyError would still crash — which, during development, is exactly what you want.
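Here is the same idea wrapped in a small function, as a sketch. parse_user_id is a hypothetical helper, not something from the standard library. It handles the one error we anticipate and lets an unexpected TypeError escape to the caller.

```python
def parse_user_id(raw):
    """Return raw as an int, or None if it is not a valid integer string."""
    try:
        return int(raw)
    except ValueError as e:
        print(f"[ERROR] Cannot convert input to integer: {e}")
        return None


print(parse_user_id("42"))     # 42
print(parse_user_id("admin"))  # logs the error, then prints None

type_error_propagated = False
try:
    parse_user_id(None)  # int(None) raises TypeError, which we did not name
except TypeError:
    type_error_propagated = True

print(type_error_propagated)  # True
```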

The finally block

finally runs no matter what — whether the try succeeded, an exception was caught, or an uncaught exception is on its way out the door. It's where cleanup that must happen goes:

# open_connection() is a placeholder for your database driver's connect call.
database_connection = open_connection()

try:
    result = database_connection.query("SELECT * FROM users")
    print(result)
except ConnectionError as e:
    print(f"[ERROR] Database unreachable: {e}")
finally:
    # Always runs — success, handled error, or unhandled error:
    database_connection.close()
    print("[INFO] Database connection closed.")
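If you want to run that pattern without a real database, a stand-in class is enough. In the sketch below, FakeConnection and run_query are hypothetical names invented for the demo. One connection fails with the error we handle, the other with an error we don't, and finally closes both.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection (not a real driver)."""

    def __init__(self, error):
        self.error = error
        self.closed = False

    def query(self, sql):
        raise self.error  # every query fails, so we can watch the cleanup

    def close(self):
        self.closed = True


def run_query(connection):
    try:
        return connection.query("SELECT * FROM users")
    except ConnectionError as e:
        print(f"[ERROR] Database unreachable: {e}")
        return None
    finally:
        # Runs for the handled error AND for the one that escapes.
        connection.close()


handled = FakeConnection(ConnectionError("network down"))
run_query(handled)

unhandled = FakeConnection(KeyError("users"))
try:
    run_query(unhandled)
except KeyError:
    pass  # the KeyError escaped run_query, but only after finally ran

print(handled.closed, unhandled.closed)  # True True
```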

2. Reading and Writing Files (the with statement)

Now that we can handle errors, let's move data in and out of the program. The built-in open() function is your gateway to the filesystem.

File modes

| Mode | Name | Behaviour |
|------|------|-----------|
| 'r' | Read | Opens for reading (the default). Raises FileNotFoundError if the file doesn't exist. |
| 'w' | Write | Opens for writing. Overwrites the file if it exists; creates it if it doesn't. |
| 'a' | Append | Opens for writing. Adds to the end if it exists; creates it if it doesn't. |

Choose 'w' when you want a fresh file every run. Choose 'a' when you're adding to an existing log.
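A quick way to convince yourself of the difference is to write to the same file three times. This is a minimal sketch that uses a temporary directory so it leaves nothing behind; the file name modes_demo.txt is made up for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "modes_demo.txt"

    # Two writes in 'w' mode: the second one replaces the first.
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("first run\n")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("second run\n")
    after_write = demo_path.read_text(encoding="utf-8")

    # One write in 'a' mode: the existing content is kept.
    with open(demo_path, "a", encoding="utf-8") as f:
        f.write("third run\n")
    after_append = demo_path.read_text(encoding="utf-8")

print(repr(after_write))   # 'second run\n'
print(repr(after_append))  # 'second run\nthird run\n'
```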

Read line-by-line (and why)

You could slurp an entire file into memory with file.read() — but what happens when the log is 10 GB? Your program grinds to a halt trying to load it all into RAM. Instead, iterate over the file object directly:

# ✅ Memory-efficient — one line in memory at a time
with open("huge_log.txt", "r") as file:
    for line in file:
        process(line)  # process() is a placeholder for your own per-line logic

This reads a single line, processes it, discards it, then grabs the next. A 10 GB file flows through just as smoothly as a 10 KB one.
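As a concrete version of that loop, here is a sketch that counts matching lines without ever holding the whole file. count_matching_lines is a hypothetical helper and the three sample lines are invented test data; the iteration pattern is the same one shown above.

```python
import tempfile
from pathlib import Path


def count_matching_lines(path, needle):
    """Count lines containing needle, keeping one line in memory at a time."""
    count = 0
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            if needle in line:
                count += 1
    return count


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "huge_log.txt"
    sample_path.write_text(
        "192.168.1.10 - GET /index.html\n"
        "192.168.1.11 - POST /login\n"
        "192.168.1.10 - GET /about.html\n",
        encoding="utf-8",
    )
    get_requests = count_matching_lines(sample_path, "GET")

print(get_requests)  # 2
```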

Writing: a practical example

Appending a timestamped event to an audit log:

from datetime import datetime

new_event = "User 'admin' logged in from 192.168.1.10"
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

with open("audit_log.txt", "a") as file:
    file.write(f"[{timestamp}] {new_event}\n")

print("[INFO] Audit event recorded.")

Run it a few times and audit_log.txt fills up:

[2026-01-15 09:14:22] User 'admin' logged in from 192.168.1.10
[2026-01-15 09:15:01] User 'admin' logged in from 192.168.1.10
[2026-01-15 09:17:45] User 'admin' logged in from 192.168.1.10

The with block closes the file the moment it ends. No file.close(), no risk of forgetting it.

3. The Modern Standard: pathlib

Stop building file paths with string concatenation. Hardcoding a / or a \ is a bug waiting to fire the moment your code runs on a different operating system.

The old way (os.path)

You'll see this in older tutorials:

import os

# Functional, but clunky — functions wrapped in functions:
log_path = os.path.join(os.path.expanduser("~"), "server_logs", "access.log")

The modern way (pathlib.Path)

pathlib arrived in Python 3.4 and is the standard now. It overloads the / operator to join paths and handles the Windows-vs-Unix separator difference for you:

from pathlib import Path

# Works correctly on Linux, macOS, and Windows:
log_path = Path.home() / "server_logs" / "access.log"

if not log_path.exists():
    print("Log file missing!")

Path.home() resolves to the user's home directory (C:\Users\adrian on Windows, /home/adrian on Linux), the / operator joins the segments, and Python emits the right separator for the current OS. No os.path.join, no expanduser. Just readable code.
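You can watch the separator handling on a single machine by using pathlib's "pure" path classes, which only manipulate path strings and never touch the disk. A small sketch, reusing the example home directories from the paragraph above:

```python
from pathlib import PurePosixPath, PureWindowsPath

# Same three segments, joined with the same / operator.
posix_path = PurePosixPath("/home/adrian") / "server_logs" / "access.log"
windows_path = PureWindowsPath(r"C:\Users\adrian") / "server_logs" / "access.log"

print(posix_path)    # /home/adrian/server_logs/access.log
print(windows_path)  # C:\Users\adrian\server_logs\access.log
```

In everyday code you just use Path, which picks the right flavour for the OS the script is running on.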

A few pathlib methods you'll use constantly:

| Method / property | Purpose |
|-------------------|---------|
| path.exists() | True if the path exists |
| path.is_file() | True if it's a file |
| path.is_dir() | True if it's a directory |
| path.parent | The parent directory, as a Path |
| path.name | The filename with extension (access.log) |
| path.stem | The filename without extension (access) |
| path.suffix | Just the extension (.log) |
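Here are those members in one short sketch. The name-related properties work on any path, so a PurePosixPath with a made-up location is enough; the existence checks need a real file, so the demo creates one in a temporary directory.

```python
import tempfile
from pathlib import Path, PurePosixPath

# Properties that only look at the path text (the path itself is an example).
log_path = PurePosixPath("/home/adrian/server_logs/access.log")
print(log_path.name)    # access.log
print(log_path.stem)    # access
print(log_path.suffix)  # .log
print(log_path.parent)  # /home/adrian/server_logs

# Methods that ask the filesystem need something that actually exists.
with tempfile.TemporaryDirectory() as tmp_dir:
    real_file = Path(tmp_dir) / "access.log"
    real_file.write_text("192.168.1.10\n", encoding="utf-8")
    checks = (
        real_file.exists(),
        real_file.is_file(),
        real_file.is_dir(),
        real_file.parent.is_dir(),
    )

print(checks)  # (True, True, False, True)
```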

4. Bringing It Together: The Server Log Cleanser

Now we combine everything — pathlib, with open(), and try/except — into one robust utility.

The scenario: you have a file called raw_traffic.log. Some lines hold valid IP addresses, some are corrupted with garbage, and sometimes the file doesn't exist at all. The script needs to survive every one of those without falling over.

We'll build it up one idea at a time.

Step 1 — Define paths with pathlib:

from pathlib import Path

input_path = Path("raw_traffic.log")
output_path = Path("clean_ips.txt")

Step 2 — Wrap the file opening in try / except FileNotFoundError:

try:
    with open(input_path, "r") as infile:
        pass  # processing goes here
except FileNotFoundError:
    print(f"[ERROR] File not found: {input_path}")

Step 3 — Open both files at once. with happily manages more than one resource on a single line:

try:
    with open(input_path, "r") as infile, open(output_path, "w") as outfile:
        pass  # read from infile, write to outfile
except FileNotFoundError:
    print(f"[ERROR] File not found: {input_path}")

If anything goes wrong, both files still get closed cleanly.
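One detail worth knowing: the items in a with statement are opened left to right. In the sketch below the input file is deliberately missing, so the first open() raises before the second one ever runs, and no empty output file is left behind. The temporary directory is just scaffolding for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    input_path = Path(tmp_dir) / "raw_traffic.log"  # deliberately never created
    output_path = Path(tmp_dir) / "clean_ips.txt"

    try:
        with open(input_path, "r") as infile, open(output_path, "w") as outfile:
            pass  # never reached: the first open() fails
    except FileNotFoundError:
        print(f"[ERROR] File not found: {input_path.name}")

    # The second open() never ran, so nothing was created or truncated.
    output_was_created = output_path.exists()

print(output_was_created)  # False
```

That ordering is why the input file goes first: swap the two and a missing input would still wipe out any existing clean_ips.txt.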

Step 4 — Loop, and handle corrupted lines with an inner try/except. This is the pattern that matters: we want to skip a single bad line without aborting the whole job, so the small try goes inside the loop.

import re
from pathlib import Path

# A rough pattern for an IP address: four groups of 1-3 digits, dot-separated.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

input_path = Path("raw_traffic.log")
output_path = Path("clean_ips.txt")

valid_ips = []
skipped_lines = 0
total_lines = 0

print("[INFO] Attempting to process log file...")

try:
    with open(input_path, "r") as infile, open(output_path, "w") as outfile:
        # enumerate gives us the line number for free; start=1 so we count like humans.
        for line_number, line in enumerate(infile, start=1):
            total_lines += 1
            try:
                match = IP_PATTERN.search(line.strip())
                if match:
                    ip_address = match.group()
                    valid_ips.append(ip_address)
                    outfile.write(ip_address + "\n")
                else:
                    # No IP on this line — treat it as corrupted, but keep going.
                    skipped_lines += 1
                    print(f'[WARN] Skipping corrupted line {line_number}: "{line.strip()[:40]}"')
            except (ValueError, AttributeError):
                skipped_lines += 1
                print(f"[WARN] Skipping unreadable line {line_number}.")

except FileNotFoundError:
    print(f"[ERROR] File not found: {input_path}")
    print("[INFO] Please make sure raw_traffic.log exists in the current directory.")
else:
    # The else block runs ONLY if the try finished with no exception.
    print(f"[SUCCESS] Extracted {len(valid_ips)} valid IP addresses from {total_lines} lines.")
    if skipped_lines:
        print(f"[WARN] Skipped {skipped_lines} corrupted lines.")
    print(f"[SUCCESS] Clean data saved to {output_path}")

Look at the architecture:

  • The outer try/except catches the catastrophic failure — the file isn't there at all.

  • The inner try/except catches line-level problems — one bad line doesn't kill the run.

  • The else block on the outer try runs only when the file was found and fully processed.

This is the pattern that separates production code from a toy script. It degrades gracefully, reports what went wrong, and keeps working.
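One possible refinement, offered as a sketch rather than the way the script has to work: the regex is deliberately rough, so something like 999.300.1.1 would pass as an IP address. The standard library's ipaddress module can do the range check, and it raises a ValueError subclass on bad input, which gives a line-level try/except something real to catch. extract_valid_ip is a hypothetical helper name.

```python
import ipaddress
import re

# Same rough pattern as the script: four dot-separated groups of 1-3 digits.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_valid_ip(line):
    """Return the first IPv4 address on the line, or None if there is no usable one."""
    match = IP_PATTERN.search(line)
    if match is None:
        return None
    try:
        # IPv4Address rejects out-of-range octets such as 999.
        return str(ipaddress.IPv4Address(match.group()))
    except ValueError:
        return None


print(extract_valid_ip("192.168.1.10 - GET /index.html"))  # 192.168.1.10
print(extract_valid_ip("999.300.1.1 - GET /index.html"))   # None
print(extract_valid_ip("GARBAGE_DATA%%!!"))                # None
```

Inside the main loop you would call this helper in place of the bare regex search and treat a None result as a skipped line.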

Want a file to test against? Create raw_traffic.log with a few good lines like 192.168.1.10 - GET /index.html and a junk line like GARBAGE_DATA%%!!, then run the script and watch it sort the good from the bad.
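If you'd rather generate that file than type it, a few lines will do. This is a sketch: write_sample_log is a hypothetical helper, and every line except the two quoted above is made-up sample data. The demo writes into a temporary directory so it can verify the result without touching your working folder.

```python
import tempfile
from pathlib import Path

# Invented sample traffic: three good lines and one junk line.
SAMPLE_LINES = [
    "192.168.1.10 - GET /index.html",
    "10.0.0.7 - POST /login",
    "GARBAGE_DATA%%!!",
    "172.16.4.22 - GET /status",
]


def write_sample_log(path):
    """Write the sample lines to path, one per line, and return the line count."""
    with open(path, "w", encoding="utf-8") as outfile:
        for line in SAMPLE_LINES:
            outfile.write(line + "\n")
    return len(SAMPLE_LINES)


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "raw_traffic.log"
    lines_written = write_sample_log(sample_path)
    round_trip = sample_path.read_text(encoding="utf-8").splitlines()

print(lines_written)  # 4
```

To feed the cleanser itself, call write_sample_log(Path("raw_traffic.log")) once from the directory where you run the script.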

What's Next?

We now have scripts that make decisions, store complex data, use functions, and read and write files that persist. Our code survives the real world.

But what if we could compress ten lines of data-cleaning logic into a single, readable line? What if we could filter, transform, and reshape collections with a conciseness that makes other languages jealous? In Part 6: The Modern Developer's Toolkit, we dive into list comprehensions and lambdas — the secret weapons of genuinely Pythonic code.
