Chapter 6: Automating Desktop Applications
Automate Windows-based applications and repetitive tasks. Develop desktop automation solutions using real Python code, browser automation and local computer workflows.
6.1 Chapter Overview
Desktop automation means using software to control repetitive tasks on a local computer. Instead of a human repeatedly clicking, typing, copying, pasting, saving files and opening applications, an automation script can perform these actions quickly and consistently.
In RPA and industrial digital transformation, desktop automation is useful when business applications do not provide APIs or system integration. It allows bots to interact with existing Windows applications, forms, spreadsheets, websites and reports.
6.2 Learning Objectives
- Understand desktop automation concepts and use cases.
- Install Python libraries for local automation.
- Use PyAutoGUI for mouse, keyboard and screen automation.
- Automate Notepad and simple Windows tasks.
- Automate file and CSV processing tasks.
- Use pywinauto for Windows application control.
- Use Selenium to test web-based automation locally or online.
- Develop simple desktop automation projects with real working code.
6.3 Desktop Automation Use Cases
| Use Case | Manual Work | Automation Solution |
|---|---|---|
| Daily report preparation | Open files, copy data, prepare summary and save report. | Python script reads CSV/Excel and generates report automatically. |
| Data entry into Windows application | User types repeated data into forms. | Bot enters data using keyboard and window automation. |
| Email attachment processing | User downloads files and renames them. | Bot monitors folder and processes files. |
| Website form testing | User manually opens website and fills form. | Selenium automates browser interaction. |
| System health check | User opens apps and checks logs. | Bot runs checks and generates status report. |
6.4 Setup for Working Examples
The examples in this chapter can be tested on a local Windows computer. Some browser automation examples can also be tested using local HTML files.
Step 1: Install Python
Install Python 3.10 or above from the official Python website. During installation, select Add Python to PATH.
Step 2: Install Required Libraries
pip install pyautogui
pip install pywinauto
pip install selenium
pip install pandas
pip install openpyxl
Step 3: Test Python
print("PDTC Desktop Automation Setup Successful")
PDTC Desktop Automation Setup Successful
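Beyond the print test, it can help to confirm that each library from Step 2 is importable before starting the examples. The following is a minimal sketch that uses only the standard library; the function name check_libraries is illustrative and not part of any package.

```python
import importlib.util

# Import names of the libraries installed in Step 2.
REQUIRED = ["pyautogui", "pywinauto", "selenium", "pandas", "openpyxl"]


def check_libraries(names):
    """Return a dict mapping each module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Print one status line per library.
for name, found in check_libraries(REQUIRED).items():
    print(f"{name:<10} {'OK' if found else 'MISSING'}")
```

Any library reported as MISSING can be installed again with the matching pip command from Step 2.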
6.5 PyAutoGUI for Mouse and Keyboard Automation
PyAutoGUI is a Python library that can control the mouse, keyboard and screen. It is useful for simple desktop automation tasks such as clicking buttons, typing text, pressing shortcut keys and taking screenshots.
Example 1: Move Mouse and Click
import pyautogui
import time
# Wait 3 seconds so you can move your hand away from the mouse.
time.sleep(3)
# Move mouse to x=500, y=300 position on screen.
pyautogui.moveTo(500, 300, duration=1)
# Click at the current mouse position.
pyautogui.click()
print("Mouse moved and clicked successfully.")
Line-by-Line Explanation
| Code | Explanation |
|---|---|
| import pyautogui | Imports the library used to control mouse and keyboard. |
| import time | Imports the time module, used here for delays. |
| time.sleep(3) | Waits 3 seconds before starting. |
| pyautogui.moveTo(500, 300, duration=1) | Moves the mouse pointer to screen coordinate x=500, y=300 over a period of 1 second. |
| pyautogui.click() | Performs a mouse click. |
Example 2: Type Text Automatically
import pyautogui
import time
time.sleep(3)
pyautogui.write("Welcome to PDTC Desktop Automation Training", interval=0.05)
print("Text typed successfully.")
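Because PyAutoGUI takes over the real mouse and keyboard, a script with wrong coordinates can be hard to stop. The sketch below shows two PyAutoGUI settings that help, FAILSAFE and PAUSE, together with a coordinate check. The helper names is_on_screen and safe_click are hypothetical, chosen only for this illustration.

```python
def is_on_screen(x, y, width, height):
    """Return True if (x, y) lies inside a screen of the given size."""
    return 0 <= x < width and 0 <= y < height


def safe_click(x, y):
    """Click at (x, y) only if the point is inside the visible screen."""
    import pyautogui  # imported here so is_on_screen works without a display

    # With FAILSAFE on, moving the mouse into a screen corner aborts the script.
    pyautogui.FAILSAFE = True
    # Pause half a second after every PyAutoGUI call.
    pyautogui.PAUSE = 0.5

    width, height = pyautogui.size()
    if not is_on_screen(x, y, width, height):
        raise ValueError(f"Point ({x}, {y}) is outside the {width}x{height} screen")
    pyautogui.click(x, y)
```

Calling safe_click(500, 300) on a desktop session performs the same click as Example 1, but refuses coordinates that do not exist on the current monitor.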
6.6 Getting Mouse Position
When building desktop automation, you may need to know the x and y coordinates of a button or field. The following code prints the current mouse position every second.
import pyautogui
import time
print("Move your mouse. Press Ctrl + C to stop.")
while True:
    x, y = pyautogui.position()
    print("Mouse Position:", x, y)
    time.sleep(1)
Mouse Position: 624 312
Mouse Position: 711 448
6.7 Working Example: Automate Notepad
This example opens Notepad, types a training message and saves the file on the desktop. It can be tested on a local Windows computer.
import pyautogui
import time
import os
# Open the Windows Run dialog.
pyautogui.hotkey("win", "r")
time.sleep(1)
# Type notepad and press Enter.
pyautogui.write("notepad")
pyautogui.press("enter")
time.sleep(2)
# Type content into Notepad.
message = """PDTC Desktop Automation Example
This file was created automatically using Python and PyAutoGUI.
Learning Outcome:
Develop desktop automation solutions.
"""
pyautogui.write(message, interval=0.01)
# Save file using Ctrl + S.
pyautogui.hotkey("ctrl", "s")
time.sleep(1)
# Prepare file path.
desktop = os.path.join(os.path.expanduser("~"), "Desktop")
file_path = os.path.join(desktop, "PDTC_Automation_Test.txt")
# Type file path in Save As dialog.
pyautogui.write(file_path)
pyautogui.press("enter")
print("Notepad file created successfully:", file_path)
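Expected Result
Notepad opens, the training message is typed, and the file PDTC_Automation_Test.txt is saved on the desktop.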
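The Notepad script prints a success message without checking that the file was actually written. One possible check, sketched here with the standard library only, is to poll for the file for a few seconds. The helper name wait_until and the timeout values are assumptions made for this illustration.

```python
import time
from pathlib import Path


def wait_until(condition, timeout=10.0, interval=0.25):
    """Call condition() repeatedly until it is true or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One final check in case the condition became true at the deadline.
    return bool(condition())


# Same path that the Notepad example types into the Save As dialog.
file_path = Path.home() / "Desktop" / "PDTC_Automation_Test.txt"

if wait_until(file_path.exists, timeout=3):
    print("File found:", file_path)
else:
    print("File was not saved:", file_path)
```

Polling for a result in this way is usually more dependable than a single fixed time.sleep, because the script continues as soon as the file appears and reports a failure when it never does.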
6.8 Screenshot Automation
Screenshots are useful in RPA for evidence, error logging and audit trail.
import pyautogui
import os
from datetime import datetime
desktop = os.path.join(os.path.expanduser("~"), "Desktop")
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
file_name = "PDTC_Screenshot_" + timestamp + ".png"
file_path = os.path.join(desktop, file_name)
screenshot = pyautogui.screenshot()
screenshot.save(file_path)
print("Screenshot saved:", file_path)
Real Use Case
| Scenario | Why Screenshot Helps |
|---|---|
| Bot completes transaction | Screenshot provides evidence. |
| Bot encounters error | Screenshot helps troubleshooting. |
| Compliance audit | Screenshot can support audit logs. |
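The scenarios in the table can share one small helper that names each screenshot by label and time. This is a sketch under stated assumptions: the function names evidence_path and capture_evidence, and the PDTC_label_timestamp.png naming pattern, are illustrative choices rather than an established convention.

```python
from datetime import datetime
from pathlib import Path


def evidence_path(folder, label, moment=None):
    """Build a timestamped PNG path such as PDTC_error_20240102_030405.png."""
    moment = moment or datetime.now()
    timestamp = moment.strftime("%Y%m%d_%H%M%S")
    return Path(folder) / f"PDTC_{label}_{timestamp}.png"


def capture_evidence(folder, label):
    """Take a screenshot and save it under a timestamped name."""
    import pyautogui  # needs a desktop session, so it is imported only here

    path = evidence_path(folder, label)
    path.parent.mkdir(parents=True, exist_ok=True)
    pyautogui.screenshot().save(path)
    return path
```

A bot could call capture_evidence("evidence", "before") at the start of a task and capture_evidence("evidence", "error") inside an except block, so that each image name shows when and why it was taken.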
6.9 Working Example: Automate CSV Report Generation
This example can run on any computer with Python and pandas installed. It creates sample automation task data, calculates the total minutes saved per department and produces a summary CSV report.
import pandas as pd
# Create sample transaction data.
data = {
    "Department": ["Finance", "HR", "Finance", "Training", "Training", "HR"],
    "Task": ["Invoice", "Payroll", "Report", "Attendance", "Certificate", "Onboarding"],
    "Minutes_Saved": [120, 90, 60, 45, 75, 80]
}
# Convert dictionary into DataFrame.
df = pd.DataFrame(data)
# Group data by department and calculate total minutes saved.
summary = df.groupby("Department")["Minutes_Saved"].sum().reset_index()
# Save original and summary report.
df.to_csv("pdtc_automation_tasks.csv", index=False)
summary.to_csv("pdtc_automation_summary.csv", index=False)
print("Original Data:")
print(df)
print("\nSummary Report:")
print(summary)
print("\nCSV reports created successfully.")
The script creates two files, pdtc_automation_tasks.csv and pdtc_automation_summary.csv, in the folder from which the script is run (the current working directory).
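To see what the groupby step computes, the same totals can be reproduced with the standard library alone. The sketch below is not a replacement for the pandas version; it inlines the same six rows as text so that it runs on its own, and the function name summarize_minutes is illustrative.

```python
import csv
import io
from collections import defaultdict

# Same rows as pdtc_automation_tasks.csv, inlined so the sketch is self-contained.
CSV_TEXT = """Department,Task,Minutes_Saved
Finance,Invoice,120
HR,Payroll,90
Finance,Report,60
Training,Attendance,45
Training,Certificate,75
HR,Onboarding,80
"""


def summarize_minutes(records):
    """Add up Minutes_Saved per Department, sorted by department name."""
    totals = defaultdict(int)
    for record in records:
        totals[record["Department"]] += int(record["Minutes_Saved"])
    return dict(sorted(totals.items()))


summary = summarize_minutes(csv.DictReader(io.StringIO(CSV_TEXT)))
for department, minutes in summary.items():
    print(department, minutes)
```

With this data the totals are Finance 180, HR 170 and Training 120, which should match the rows of pdtc_automation_summary.csv.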
6.10 Working Example: Excel Report with openpyxl
This example creates an Excel workbook with automation savings data.
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
# Create workbook and worksheet.
wb = Workbook()
ws = wb.active
ws.title = "Automation Savings"
# Add header row.
headers = ["Department", "Process", "Minutes Saved"]
ws.append(headers)
# Add data rows.
rows = [
    ["Finance", "Invoice Processing", 120],
    ["HR", "Payroll Support", 90],
    ["Training", "Attendance Report", 45],
    ["Training", "Certificate Generation", 75]
]
for row in rows:
    ws.append(row)
# Format header.
for cell in ws[1]:
    cell.font = Font(bold=True)
    cell.fill = PatternFill(start_color="FFD400", end_color="FFD400", fill_type="solid")
# Save workbook.
wb.save("PDTC_Automation_Savings.xlsx")
print("Excel report created successfully.")
6.11 Pywinauto for Windows Application Control
Pywinauto is a Python library designed for automating Windows GUI applications. Unlike pure coordinate-based automation, it can connect to windows and controls by title, class or automation ID.
| PyAutoGUI | Pywinauto |
|---|---|
| Controls mouse and keyboard by coordinates and keystrokes. | Controls Windows applications using UI elements. |
| Simple but sensitive to screen position. | More reliable for Windows application automation. |
| Good for quick tasks. | Good for structured desktop application automation. |
Working Example: Open Notepad and Type Text with Pywinauto
from pywinauto.application import Application
import time
# Start Notepad application.
app = Application(backend="uia").start("notepad.exe")
# Wait for Notepad to open.
time.sleep(2)
# Connect to Notepad window.
window = app.window(title_re=".*Notepad.*")
# Type text inside Notepad.
window.type_keys("PDTC Pywinauto Desktop Automation Example", with_spaces=True)
print("Text typed in Notepad using pywinauto.")
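One detail to watch with type_keys is that some characters are treated as key modifiers or grouping symbols rather than literal text. For example, + stands for Shift, ^ for Ctrl and % for Alt. The sketch below escapes such characters by wrapping them in braces. It assumes the special-character set described in the pywinauto keyboard documentation, and the helper names escape_for_type_keys and type_literal are hypothetical.

```python
# Characters assumed to have a special meaning in pywinauto key sequences.
SPECIAL_CHARS = set("+^%~(){}")


def escape_for_type_keys(text):
    """Wrap each special character in braces so it is typed literally."""
    return "".join("{" + ch + "}" if ch in SPECIAL_CHARS else ch for ch in text)


def type_literal(window, text):
    """Type text into a pywinauto window without triggering modifier keys."""
    window.type_keys(escape_for_type_keys(text), with_spaces=True)


print(escape_for_type_keys("25+17"))
```

With the window object from the example above, type_literal(window, "Total: 25+17 (approx.)") would send the plus sign and parentheses as ordinary characters.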
6.12 Working Example: Calculator Automation with Pywinauto
This example opens Windows Calculator and performs a calculation. Windows Calculator versions may differ, so this script uses keyboard input after launching the app.
from pywinauto.application import Application
import time
import pyautogui
# Start Windows Calculator.
app = Application(backend="uia").start("calc.exe")
# Wait for calculator to open.
time.sleep(2)
# Use keyboard input to calculate 25 + 17.
pyautogui.write("25")
pyautogui.press("+")
pyautogui.write("17")
pyautogui.press("enter")
print("Calculator automation completed.")
6.13 Website Automation with Selenium
Many modern business applications are browser-based. Selenium automates web browsers such as Chrome, Edge and Firefox. It is useful for testing web forms, login screens, dashboards and online processes.
Install Selenium
pip install selenium
Working Example: Open a Website and Print Page Title
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# Create Chrome browser options.
options = Options()
# Open browser.
driver = webdriver.Chrome(options=options)
# Go to PDTC website.
driver.get("https://perakskills.com")
# Print page title.
print("Page Title:", driver.title)
# Close browser.
driver.quit()
6.14 Website Form Automation You Can Test Locally
Create a file named test_form.html using the code below. Then run the Selenium script after it.
Step 1: Create test_form.html
<!DOCTYPE html>
<html>
<head>
<title>PDTC Test Form</title>
</head>
<body>
<h1>PDTC Automation Test Form</h1>
<label>Name</label>
<input id="name" type="text">
<br><br>
<label>Course</label>
<input id="course" type="text">
<br><br>
<button id="submitBtn" onclick="showResult()">Submit</button>
<p id="result"></p>
<script>
function showResult(){
let name = document.getElementById("name").value;
let course = document.getElementById("course").value;
document.getElementById("result").innerHTML =
"Submitted: " + name + " - " + course;
}
</script>
</body>
</html>
Step 2: Selenium Script to Fill the Local Form
from selenium import webdriver
from selenium.webdriver.common.by import By
from pathlib import Path
import time
# Get full file path of local HTML form.
file_path = Path("test_form.html").resolve()
# Open Chrome browser.
driver = webdriver.Chrome()
# Open local HTML file.
driver.get(file_path.as_uri())
# Fill name field.
driver.find_element(By.ID, "name").send_keys("PDTC Student")
# Fill course field.
driver.find_element(By.ID, "course").send_keys("Certified Industrial Automation Professional")
# Click submit button.
driver.find_element(By.ID, "submitBtn").click()
# Wait so user can see result.
time.sleep(3)
# Print result text.
result = driver.find_element(By.ID, "result").text
print(result)
# Close browser.
driver.quit()
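The script above pauses for a fixed three seconds and leaves the browser open if any step fails. A possible variation, sketched below, checks that the HTML file exists, waits only until the result text appears, and always closes the browser. WebDriverWait and expected_conditions are part of Selenium; the function names local_form_url and submit_form are illustrative, and a working Chrome installation is assumed.

```python
from pathlib import Path


def local_form_url(file_name="test_form.html"):
    """Return a file:// URL for the form, or raise if the file is missing."""
    path = Path(file_name).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Create {path} before running the script")
    return path.as_uri()


def submit_form(url, name, course, timeout=5):
    """Fill the test form, wait for the result text and return it."""
    # Selenium is imported here so local_form_url can be used without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        driver.find_element(By.ID, "name").send_keys(name)
        driver.find_element(By.ID, "course").send_keys(course)
        driver.find_element(By.ID, "submitBtn").click()
        # Wait until the page shows the submitted text instead of sleeping.
        WebDriverWait(driver, timeout).until(
            EC.text_to_be_present_in_element((By.ID, "result"), "Submitted:")
        )
        return driver.find_element(By.ID, "result").text
    finally:
        # Close the browser even if a step above raised an error.
        driver.quit()
```

A typical call would be print(submit_form(local_form_url(), "PDTC Student", "Certified Industrial Automation Professional")), run from the folder that contains test_form.html.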
6.15 Interactive Web Example inside This Chapter
This small form below simulates a web application. It is not Python-based, but it helps learners understand what Selenium would automate in a browser.
6.16 Error Handling in Desktop Automation
Real automation must handle errors such as missing windows, wrong file paths, failed clicks and unexpected popups.
import pyautogui
import time
try:
    time.sleep(2)
    # Try to press Ctrl + S.
    pyautogui.hotkey("ctrl", "s")
    print("Save shortcut executed successfully.")
except Exception as error:
    print("Automation error occurred:", error)
Better Error Logging Example
from datetime import datetime
try:
    result = 10 / 0
except Exception as error:
    with open("automation_error_log.txt", "a") as log:
        log.write(str(datetime.now()) + " - " + str(error) + "\n")
    print("Error logged successfully.")
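The same idea can be expressed with Python's built-in logging module, which adds timestamps and tracebacks automatically. The sketch below wraps each automation step in a helper so that a failure is recorded and the bot can decide what to do next. The helper name run_step and the log format are assumptions made for this example.

```python
import logging

# Append timestamped messages to the same log file used above.
logging.basicConfig(
    filename="automation_error_log.txt",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)


def run_step(name, action):
    """Run action(); log success or failure and return (ok, result)."""
    try:
        result = action()
    except Exception:
        # logging.exception records the message together with the traceback.
        logging.exception("Step failed: %s", name)
        return False, None
    logging.info("Step completed: %s", name)
    return True, result


ok, value = run_step("divide", lambda: 10 / 0)
print("Step succeeded:", ok)
```

Here the division by zero is caught inside run_step, so the script prints Step succeeded: False and continues instead of stopping with a traceback.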
6.17 Desktop Automation Best Practices
| Best Practice | Reason |
|---|---|
| Start with sample files | Prevents damage to real business data. |
| Add time delays | Allows applications to load properly. |
| Use screenshots for error evidence | Helps troubleshooting and audit. |
| Use stable selectors when possible | More reliable than screen coordinates. |
| Log all important steps | Helps track what the bot completed. |
| Do not store passwords directly in code | Protects security and compliance. |
| Test with normal and exception cases | Prepares the bot for real-world variations. |
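As one way to follow the password guideline in the table, a bot can read secrets from environment variables instead of the source file. This is a minimal sketch: the variable name PDTC_BOT_PASSWORD and the helper get_secret are hypothetical, and organisations with a credential vault would normally use that instead.

```python
import os


def get_secret(name):
    """Read a secret from an environment variable, failing clearly if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical variable name used only for this illustration.
if "PDTC_BOT_PASSWORD" in os.environ:
    password = get_secret("PDTC_BOT_PASSWORD")
    print("Password loaded, length:", len(password))
else:
    print("Set PDTC_BOT_PASSWORD before running the bot.")
```

The password never appears in the script, so the file can be shared or stored in version control without exposing it.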
6.18 Complete Mini Projects
Mini Project 1: Notepad Report Generator
Create a Python script that opens Notepad, writes a daily automation report and saves it on the desktop.
Mini Project 2: CSV Automation Report
Create a CSV file with department, process name and minutes saved. Use Python to generate a summary report by department.
Mini Project 3: Local Website Form Testing
Create a local HTML form and use Selenium to fill and submit it automatically.
Mini Project 4: Screenshot Evidence Bot
Create a script that takes a screenshot before and after a desktop task and saves both images with timestamps.
6.19 Practical Activities
Activity 1: Mouse Position Practice
Use PyAutoGUI to print your mouse coordinates and identify coordinates for a button on your screen.
Activity 2: Keyboard Automation
Open Notepad and use PyAutoGUI to type your name, course title and date automatically.
Activity 3: File Automation
Create a Python script that creates three text files named report1.txt, report2.txt and report3.txt.
Activity 4: Browser Automation
Use Selenium to open a website, print the page title and close the browser.
6.20 Interactive Final Assessment Quiz
Each correct answer gives +1 mark.
Each wrong answer gives -0.5 mark.
1. Desktop automation can control mouse and keyboard actions.
2. Which Python library is commonly used for mouse and keyboard automation?
3. Pywinauto is used for Windows GUI application automation.
4. Selenium is mainly used for:
5. Desktop automation should be tested on sample files first.
6. Which command presses Ctrl + S in PyAutoGUI?
7. Screenshots can be used as evidence in automation logs.
8. It is safe to store business passwords directly inside automation code.
9. openpyxl can be used to create and edit Excel files in Python.
10. Error handling is important in desktop automation.
6.21 Chapter Summary
In this chapter, learners studied desktop automation for Windows-based applications and repetitive tasks. They explored real Python code using PyAutoGUI, pywinauto, pandas, openpyxl and Selenium. Learners also practiced local Notepad automation, CSV report generation, Excel report creation, website form automation and error handling.