Chapter 6: Automating Desktop Applications

Automate Windows-based applications and repetitive tasks. Develop desktop automation solutions using real Python code, browser automation and local computer workflows.

Topics: Desktop Automation, Windows Apps, PyAutoGUI, Pywinauto, Selenium, Mouse Automation, Keyboard Automation, Window Control, Website Testing

6.1 Chapter Overview

Desktop automation means using software to control repetitive tasks on a local computer. Instead of a human repeatedly clicking, typing, copying, pasting, saving files and opening applications, an automation script can perform these actions quickly and consistently.

In RPA and industrial digital transformation, desktop automation is useful when business applications do not provide APIs or system integration. It allows bots to interact with existing Windows applications, forms, spreadsheets, websites and reports.

Learning Outcome: By the end of this chapter, learners should be able to develop desktop automation solutions for Windows-based applications and repetitive tasks using real working Python examples.
Figure: Desktop automation flow. Script (Python bot) -> Actions (click / type) -> Windows app (Notepad / ERP) -> Output (saved file). A desktop automation bot controls mouse, keyboard, windows, forms and files.

6.2 Learning Objectives

  • Understand desktop automation concepts and use cases.
  • Install Python libraries for local automation.
  • Use PyAutoGUI for mouse, keyboard and screen automation.
  • Automate Notepad and simple Windows tasks.
  • Automate file and CSV processing tasks.
  • Use pywinauto for Windows application control.
  • Use Selenium to test web-based automation locally or online.
  • Develop simple desktop automation projects with real working code.

6.3 Desktop Automation Use Cases

Use Case | Manual Work | Automation Solution
Daily report preparation | Open files, copy data, prepare summary and save report. | Python script reads CSV/Excel and generates report automatically.
Data entry into Windows application | User types repeated data into forms. | Bot enters data using keyboard and window automation.
Email attachment processing | User downloads files and renames them. | Bot monitors folder and processes files.
Website form testing | User manually opens website and fills form. | Selenium automates browser interaction.
System health check | User opens apps and checks logs. | Bot runs checks and generates status report.
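
The "Email attachment processing" row can be sketched with the standard library alone. The example below is a minimal illustration, not a complete bot: the function names process_inbox and today_stamp and the folder names are hypothetical, and it assumes the attachments have already been saved into an inbox folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def process_inbox(inbox: Path, processed: Path, stamp: str) -> list[str]:
    """Move every file from inbox to processed, adding a date prefix to its name."""
    processed.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() gives a predictable order, which makes runs easier to compare.
    for source in sorted(inbox.iterdir()):
        if not source.is_file():
            continue  # Skip sub-folders.
        target = processed / f"{stamp}_{source.name}"
        shutil.move(str(source), str(target))
        moved.append(target.name)
    return moved


def today_stamp() -> str:
    """Return today's date as YYYYMMDD for use as a file-name prefix."""
    return datetime.now().strftime("%Y%m%d")
```

A call such as process_inbox(Path("inbox"), Path("processed"), today_stamp()) could be run from a scheduled task so that new files are renamed and filed without manual work.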

6.4 Setup for Working Examples

The examples in this chapter can be tested on a local Windows computer. Some browser automation examples can also be tested using local HTML files.

Step 1: Install Python

Install Python 3.10 or above from the official Python website. During installation, select Add Python to PATH.

Step 2: Install Required Libraries

pip install pyautogui
pip install pywinauto
pip install selenium
pip install pandas
pip install openpyxl

Step 3: Test Python

print("PDTC Desktop Automation Setup Successful")
Expected Output:
PDTC Desktop Automation Setup Successful
Safety Note: Desktop automation controls your mouse and keyboard. Test on sample files first. Do not run automation on live business systems until it has been tested and approved.
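
Before moving on, it can help to confirm that every library from Step 2 is actually installed. The following sketch uses importlib.util.find_spec from the standard library, which looks a module up without importing it. The helper name check_libraries is hypothetical.

```python
import importlib.util

# Libraries installed in Step 2.
REQUIRED = ["pyautogui", "pywinauto", "selenium", "pandas", "openpyxl"]


def check_libraries(names):
    """Return {name: True/False} depending on whether each module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Print one status line per library.
for name, found in check_libraries(REQUIRED).items():
    print(f"{name:<10} {'OK' if found else 'MISSING'}")
```

Any library reported as MISSING can be installed again with the matching pip install command.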

6.5 PyAutoGUI for Mouse and Keyboard Automation

PyAutoGUI is a Python library that can control the mouse, keyboard and screen. It is useful for simple desktop automation tasks such as clicking buttons, typing text, pressing shortcut keys and taking screenshots.

Example 1: Move Mouse and Click

import pyautogui
import time

# Wait 3 seconds so you can move your hand away from the mouse.
time.sleep(3)

# Move mouse to x=500, y=300 position on screen.
pyautogui.moveTo(500, 300, duration=1)

# Click at the current mouse position.
pyautogui.click()

print("Mouse moved and clicked successfully.")

Line-by-Line Explanation

Code | Explanation
import pyautogui | Imports the library used to control the mouse and keyboard.
import time | Imports the time module, used here for delays.
time.sleep(3) | Waits 3 seconds before starting.
pyautogui.moveTo(500, 300, duration=1) | Moves the mouse pointer to screen coordinate x=500, y=300, taking 1 second.
pyautogui.click() | Clicks the left mouse button at the current pointer position.
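
Hard-coded coordinates such as (500, 300) may fall outside the screen on a different display. A minimal sketch of a safer click is shown below. The helper names is_on_screen and safe_click are hypothetical, while pyautogui.FAILSAFE, pyautogui.PAUSE and pyautogui.size() are PyAutoGUI settings and functions.

```python
def is_on_screen(x, y, width, height):
    """Return True when (x, y) lies inside a screen of the given size."""
    return 0 <= x < width and 0 <= y < height


def safe_click(x, y):
    """Click at (x, y) only if the point is on the primary screen."""
    # Imported here so is_on_screen() can be used without a graphical display.
    import pyautogui

    pyautogui.FAILSAFE = True  # Moving the mouse to the top-left corner aborts the script.
    pyautogui.PAUSE = 0.5      # Pause half a second after each PyAutoGUI call.

    width, height = pyautogui.size()
    if not is_on_screen(x, y, width, height):
        raise ValueError(f"({x}, {y}) is outside the {width}x{height} screen")
    pyautogui.moveTo(x, y, duration=1)
    pyautogui.click()
```

Calling safe_click(500, 300) behaves like Example 1, but it refuses to click when the point is off-screen and leaves an emergency stop available while the bot runs.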

Example 2: Type Text Automatically

import pyautogui
import time

time.sleep(3)

pyautogui.write("Welcome to PDTC Desktop Automation Training", interval=0.05)

print("Text typed successfully.")
How to Test: Open Notepad manually, place cursor inside Notepad, then run this script. The text should appear automatically.

6.6 Getting Mouse Position

When building desktop automation, you may need to know the x and y coordinates of a button or field. The following code prints the current mouse position every second.

import pyautogui
import time

print("Move your mouse. Press Ctrl + C to stop.")

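try:
    while True:
        x, y = pyautogui.position()
        print("Mouse Position:", x, y)
        time.sleep(1)
except KeyboardInterrupt:
    # Ctrl + C ends the loop without printing a traceback.
    print("Stopped.")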
Expected Output Example:
Mouse Position: 624 312
Mouse Position: 711 448

6.7 Working Example: Automate Notepad

This example opens Notepad, types a training message and saves the file on the desktop. It can be tested on a local Windows computer.

import pyautogui
import time
import os

# Open the Windows Run dialog.
pyautogui.hotkey("win", "r")
time.sleep(1)

# Type notepad and press Enter.
pyautogui.write("notepad")
pyautogui.press("enter")
time.sleep(2)

# Type content into Notepad.
message = """PDTC Desktop Automation Example

This file was created automatically using Python and PyAutoGUI.

Learning Outcome:
Develop desktop automation solutions.
"""

pyautogui.write(message, interval=0.01)

# Save file using Ctrl + S.
pyautogui.hotkey("ctrl", "s")
time.sleep(1)

# Prepare file path.
desktop = os.path.join(os.path.expanduser("~"), "Desktop")
file_path = os.path.join(desktop, "PDTC_Automation_Test.txt")

# Type file path in Save As dialog.
pyautogui.write(file_path)
pyautogui.press("enter")

print("Notepad file created successfully:", file_path)
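Important: This script depends on Windows dialogs and keyboard focus. Do not touch the keyboard or mouse while it is running. If PDTC_Automation_Test.txt already exists on the Desktop, Notepad asks whether to replace it and the script does not answer that prompt, so delete the old file before running the script again.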

Expected Result

A text file named PDTC_Automation_Test.txt should be created on your Desktop.
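
The Notepad script prints a success message without checking that the file was really written. One possible check, sketched below, polls for the file for a few seconds instead of relying on a single fixed delay. The helper name wait_for_file and its default timings are assumptions for illustration.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout=5.0, interval=0.2):
    """Poll until the file exists or the timeout (in seconds) runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if Path(path).is_file():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the Notepad example, the final print could be replaced by a test such as: if wait_for_file(file_path) then report success, otherwise report that the save did not complete.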

6.8 Screenshot Automation

Screenshots are useful in RPA for evidence, error logging and audit trail.

import pyautogui
import os
from datetime import datetime

desktop = os.path.join(os.path.expanduser("~"), "Desktop")

timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

file_name = "PDTC_Screenshot_" + timestamp + ".png"

file_path = os.path.join(desktop, file_name)

screenshot = pyautogui.screenshot()

screenshot.save(file_path)

print("Screenshot saved:", file_path)

Real Use Case

Scenario | Why Screenshot Helps
Bot completes transaction | Screenshot provides evidence.
Bot encounters error | Screenshot helps troubleshooting.
Compliance audit | Screenshot can support audit logs.
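
For evidence and audit use, screenshots are easier to trace when each file name records the time and the stage of the task. The sketch below shows one way to do this. The names evidence_path and capture, the label values and the folder name are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def evidence_path(folder, label, moment=None):
    """Build a path such as <folder>/20240131_093000_before.png."""
    moment = moment or datetime.now()
    stamp = moment.strftime("%Y%m%d_%H%M%S")
    return Path(folder) / f"{stamp}_{label}.png"


def capture(folder, label):
    """Save a full-screen screenshot and return the file path."""
    import pyautogui  # Imported here because it needs a graphical display.

    path = evidence_path(folder, label)
    path.parent.mkdir(parents=True, exist_ok=True)
    pyautogui.screenshot().save(path)
    return path
```

A bot could call capture("evidence", "before") ahead of a transaction and capture("evidence", "after") once it finishes, giving two timestamped images per run.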

6.9 Working Example: Automate CSV Report Generation

This example can run on any computer with Python and pandas installed. It creates sample sales data, calculates totals and produces a summary CSV report.

import pandas as pd

# Create sample transaction data.
data = {
    "Department": ["Finance", "HR", "Finance", "Training", "Training", "HR"],
    "Task": ["Invoice", "Payroll", "Report", "Attendance", "Certificate", "Onboarding"],
    "Minutes_Saved": [120, 90, 60, 45, 75, 80]
}

# Convert dictionary into DataFrame.
df = pd.DataFrame(data)

# Group data by department and calculate total minutes saved.
summary = df.groupby("Department")["Minutes_Saved"].sum().reset_index()

# Save original and summary report.
df.to_csv("pdtc_automation_tasks.csv", index=False)
summary.to_csv("pdtc_automation_summary.csv", index=False)

print("Original Data:")
print(df)

print("\nSummary Report:")
print(summary)

print("\nCSV reports created successfully.")
Expected Output:
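Two files, pdtc_automation_tasks.csv and pdtc_automation_summary.csv, are created in the current working directory (normally the folder from which the script is run). The summary contains one row per department: Finance 180, HR 170 and Training 120 minutes saved.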
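
In practice the task data usually arrives as an existing CSV file rather than a dictionary typed into the script. As an alternative to pandas, the sketch below reads such a file with the standard csv module and totals the minutes per department. It assumes the column names Department and Minutes_Saved used in this section; the function name summarise_minutes is hypothetical.

```python
import csv
from collections import defaultdict


def summarise_minutes(csv_path):
    """Return {department: total minutes saved} read from a CSV file."""
    totals = defaultdict(int)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the header row as the keys of each row.
        for row in csv.DictReader(handle):
            totals[row["Department"]] += int(row["Minutes_Saved"])
    return dict(totals)
```

Running summarise_minutes("pdtc_automation_tasks.csv") on the file written by the pandas script should give the same department totals as the summary report.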

6.10 Working Example: Excel Report with openpyxl

This example creates an Excel workbook with automation savings data.

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill

# Create workbook and worksheet.
wb = Workbook()
ws = wb.active
ws.title = "Automation Savings"

# Add header row.
headers = ["Department", "Process", "Minutes Saved"]
ws.append(headers)

# Add data rows.
rows = [
    ["Finance", "Invoice Processing", 120],
    ["HR", "Payroll Support", 90],
    ["Training", "Attendance Report", 45],
    ["Training", "Certificate Generation", 75]
]

for row in rows:
    ws.append(row)

# Format header.
for cell in ws[1]:
    cell.font = Font(bold=True)
    cell.fill = PatternFill(start_color="FFD400", end_color="FFD400", fill_type="solid")

# Save workbook.
wb.save("PDTC_Automation_Savings.xlsx")

print("Excel report created successfully.")
How to Test: Run the script. Open the generated file named PDTC_Automation_Savings.xlsx.

6.11 Pywinauto for Windows Application Control

Pywinauto is a Python library designed for automating Windows GUI applications. Unlike pure coordinate-based automation, it can connect to windows and controls by title, class or automation ID.

PyAutoGUI | Pywinauto
Controls mouse and keyboard by coordinates and keystrokes. | Controls Windows applications using UI elements.
Simple but sensitive to screen position. | More reliable for Windows application automation.
Good for quick tasks. | Good for structured desktop application automation.

Working Example: Open Notepad and Type Text with Pywinauto

from pywinauto.application import Application
import time

# Start Notepad application.
app = Application(backend="uia").start("notepad.exe")

# Wait for Notepad to open.
time.sleep(2)

# Connect to Notepad window.
window = app.window(title_re=".*Notepad.*")

# Type text inside Notepad.
window.type_keys("PDTC Pywinauto Desktop Automation Example", with_spaces=True)

print("Text typed in Notepad using pywinauto.")
Note: This works on Windows. If your Notepad version has a different window title, you may need to inspect the title and adjust title_re.
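
To adjust title_re, it helps to see the real window titles and try a pattern against them first. The sketch below is one way to do that. The helper names are hypothetical, and it assumes that pywinauto applies title_re from the start of the title, which is why the helper uses re.match.

```python
import re


def matching_titles(pattern, titles):
    """Return the titles that the regular expression matches from the start."""
    compiled = re.compile(pattern)
    return [title for title in titles if compiled.match(title)]


def list_window_titles():
    """Return the titles of all open top-level windows (Windows only)."""
    # Imported here because pywinauto is only available on Windows.
    from pywinauto import Desktop

    windows = Desktop(backend="uia").windows()
    return [w.window_text() for w in windows if w.window_text()]
```

On a Windows computer, matching_titles(".*Notepad.*", list_window_titles()) shows which open windows the pattern would select before the pattern is used in app.window().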

6.12 Working Example: Calculator Automation with Pywinauto

This example opens Windows Calculator and performs a calculation. Windows Calculator versions may differ, so this script uses keyboard input after launching the app.

from pywinauto.application import Application
import time
import pyautogui

# Start Windows Calculator.
app = Application(backend="uia").start("calc.exe")

# Wait for calculator to open.
time.sleep(2)

# Use keyboard input to calculate 25 + 17.
pyautogui.write("25")
pyautogui.press("+")
pyautogui.write("17")
pyautogui.press("enter")

print("Calculator automation completed.")
Expected Result: Windows Calculator should display 42.

6.13 Website Automation with Selenium

Many modern business applications are browser-based. Selenium automates web browsers such as Chrome, Edge and Firefox. It is useful for testing web forms, login screens, dashboards and online processes.

Install Selenium

pip install selenium

Working Example: Open a Website and Print Page Title

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Create Chrome browser options.
options = Options()

# Open browser.
driver = webdriver.Chrome(options=options)

# Go to PDTC website.
driver.get("https://perakskills.com")

# Print page title.
print("Page Title:", driver.title)

# Close browser.
driver.quit()
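Note: This requires Google Chrome to be installed. From Selenium 4.6 onward, the bundled Selenium Manager locates or downloads a matching ChromeDriver automatically, so no separate driver download is normally needed.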

6.14 Website Form Automation You Can Test Locally

Create a file named test_form.html using the code below. Then run the Selenium script after it.

Step 1: Create test_form.html

<!DOCTYPE html>
<html>
<head>
<title>PDTC Test Form</title>
</head>
<body>

<h1>PDTC Automation Test Form</h1>

<label>Name</label>
<input id="name" type="text">

<br><br>

<label>Course</label>
<input id="course" type="text">

<br><br>

<button id="submitBtn" onclick="showResult()">Submit</button>

<p id="result"></p>

<script>
function showResult(){
  let name = document.getElementById("name").value;
  let course = document.getElementById("course").value;
  document.getElementById("result").innerHTML =
  "Submitted: " + name + " - " + course;
}
</script>

</body>
</html>

Step 2: Selenium Script to Fill the Local Form

from selenium import webdriver
from selenium.webdriver.common.by import By
from pathlib import Path
import time

# Get full file path of local HTML form.
file_path = Path("test_form.html").resolve()

# Open Chrome browser.
driver = webdriver.Chrome()

# Open local HTML file.
driver.get(file_path.as_uri())

# Fill name field.
driver.find_element(By.ID, "name").send_keys("PDTC Student")

# Fill course field.
driver.find_element(By.ID, "course").send_keys("Certified Industrial Automation Professional")

# Click submit button.
driver.find_element(By.ID, "submitBtn").click()

# Wait so user can see result.
time.sleep(3)

# Print result text.
result = driver.find_element(By.ID, "result").text
print(result)

# Close browser.
driver.quit()
Expected Result: The browser opens the local form, fills the name and course, clicks Submit and prints the submitted result.
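
The script above pauses with time.sleep(3) so that a person can watch the result. For automated testing, an explicit wait is usually more dependable because it continues as soon as the expected text appears. The sketch below uses Selenium's WebDriverWait and expected_conditions with the element IDs from test_form.html. The function names expected_result and submit_and_wait are hypothetical.

```python
def expected_result(name, course):
    """Text that test_form.html writes into the result paragraph."""
    return "Submitted: " + name + " - " + course


def submit_and_wait(driver, name, course, timeout=5):
    """Fill the form, click Submit and wait until the result text appears."""
    # Imported here so expected_result() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.find_element(By.ID, "name").send_keys(name)
    driver.find_element(By.ID, "course").send_keys(course)
    driver.find_element(By.ID, "submitBtn").click()

    wanted = expected_result(name, course)
    # Raises TimeoutException if the text does not appear within the timeout.
    WebDriverWait(driver, timeout).until(
        EC.text_to_be_present_in_element((By.ID, "result"), wanted)
    )
    return driver.find_element(By.ID, "result").text == wanted
```

After driver.get(file_path.as_uri()), a call such as submit_and_wait(driver, "PDTC Student", "Certified Industrial Automation Professional") returns True when the page shows the expected line.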

6.15 Interactive Web Example inside This Chapter

[Interactive form: a small browser form that simulates a web application. It is not Python-based, but it shows what Selenium would automate in a browser. Clicking its button displays simulated web automation output.]

6.16 Error Handling in Desktop Automation

Real automation must handle errors such as missing windows, wrong file paths, failed clicks and unexpected popups.

import pyautogui
import time

try:
    time.sleep(2)

    # Try to press Ctrl + S.
    pyautogui.hotkey("ctrl", "s")

    print("Save shortcut executed successfully.")

except Exception as error:
    print("Automation error occurred:", error)

Better Error Logging Example

from datetime import datetime

try:
    result = 10 / 0

except Exception as error:
    with open("automation_error_log.txt", "a") as log:
        log.write(str(datetime.now()) + " - " + str(error) + "\n")

    print("Error logged successfully.")
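
The same idea can be taken further with Python's built-in logging module, which adds timestamps and severity levels and can record the full traceback. The sketch below is a minimal illustration. The names make_logger and run_step and the logger name "automation" are hypothetical.

```python
import logging


def make_logger(log_file):
    """Create a logger that appends timestamped lines to log_file."""
    logger = logging.getLogger("automation")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # Avoid duplicate lines if called more than once.
    handler = logging.FileHandler(log_file, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def run_step(logger, name, action):
    """Run one step, log the outcome and return True on success."""
    try:
        action()
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("Step failed: %s", name)
        return False
    logger.info("Step completed: %s", name)
    return True
```

A bot can wrap each stage, for example run_step(logger, "save file", lambda: pyautogui.hotkey("ctrl", "s")), so the log shows which steps completed and where a failure occurred.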

6.17 Desktop Automation Best Practices

Best Practice | Reason
Start with sample files | Prevents damage to real business data.
Add time delays | Allows applications to load properly.
Use screenshots for error evidence | Helps troubleshooting and audit.
Use stable selectors when possible | More reliable than screen coordinates.
Log all important steps | Helps track what the bot completed.
Do not store passwords directly in code | Protects security and compliance.
Test with normal and exception cases | Prepares the bot for real-world variations.
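
One common way to keep passwords out of the script is to read them from an environment variable at run time. The sketch below shows the idea. The function name get_secret and the variable name PDTC_BOT_PASSWORD are hypothetical, and a dedicated credential store may be more suitable in a production environment.

```python
import os


def get_secret(name):
    """Read a secret from an environment variable, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With the variable set outside the code (for example through the Windows environment variable settings), the bot calls get_secret("PDTC_BOT_PASSWORD") and the password never appears in the source file.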

6.18 Complete Mini Projects

Mini Project 1: Notepad Report Generator

Create a Python script that opens Notepad, writes a daily automation report and saves it on the desktop.

Mini Project 2: CSV Automation Report

Create a CSV file with department, process name and minutes saved. Use Python to generate a summary report by department.

Mini Project 3: Local Website Form Testing

Create a local HTML form and use Selenium to fill and submit it automatically.

Mini Project 4: Screenshot Evidence Bot

Create a script that takes a screenshot before and after a desktop task and saves both images with timestamps.

6.19 Practical Activities

Activity 1: Mouse Position Practice

Use PyAutoGUI to print your mouse coordinates and identify coordinates for a button on your screen.

Activity 2: Keyboard Automation

Open Notepad and use PyAutoGUI to type your name, course title and date automatically.

Activity 3: File Automation

Create a Python script that creates three text files named report1.txt, report2.txt and report3.txt.

Activity 4: Browser Automation

Use Selenium to open a website, print the page title and close the browser.

6.20 Interactive Final Assessment Quiz

Each correct answer gives +1 mark.
Each wrong answer gives -0.5 mark.

Instructions: Select the correct answer for each question and click Submit Assessment.

1. Desktop automation can control mouse and keyboard actions.

2. Which Python library is commonly used for mouse and keyboard automation?

3. Pywinauto is used for Windows GUI application automation.

4. Selenium is mainly used for:

5. Desktop automation should be tested on sample files first.

6. Which command presses Ctrl + S in PyAutoGUI?

7. Screenshots can be used as evidence in automation logs.

8. It is safe to store business passwords directly inside automation code.

9. openpyxl can be used to create and edit Excel files in Python.

10. Error handling is important in desktop automation.

6.21 Chapter Summary

In this chapter, learners studied desktop automation for Windows-based applications and repetitive tasks. They explored real Python code using PyAutoGUI, pywinauto, pandas, openpyxl and Selenium. Learners also practiced local Notepad automation, CSV report generation, Excel report creation, website form automation and error handling.

Remember: A good desktop automation solution is reliable, tested, secure, documented and able to handle errors gracefully.