Lab Experiment 2 - Transport Optimization Model using Python¶
_by Parikshit Pareek
This Notebook will walk you through defining a simple transport flow model using Pyomo, a Python-based optimization modeling language. You will then interact with the solutions and modify the model to add additional constraints. The base code is taken from the course materials are jointly developed by Michael Davidson and Jesse Jenkins at Princeton and UC San Diego
Setting up the Model¶
Although you should already have these, I am just placing them for completeness.
Load Packages¶
We need the following Python packages:
- Pyomo: The optimization modeling framework (similar to JuMP in Julia)
- Pandas: For data manipulation using DataFrames
- NumPy: For numerical array operations
If you don't have Pyomo installed, run: pip install pyomo (well what were you doing in last class! )
You'll also need a solver. Install GLPK or HiGHS Solver
# Import required packages
import pyomo.environ as pyo # Pyomo optimization library
import pandas as pd # For DataFrames
import numpy as np # For numerical operations
Define Sets¶
Sets are collections of indices that organize our problem. Think of them as the "dimensions" of our optimization problem.
In optimization terminology:
- A set is a collection of elements used to index variables, parameters, and constraints
- Sets make models scalable - if we add more plants or markets, the model structure remains the same
Production Plants, $P$
These are the locations where products are manufactured. In our problem, we have two plants: Mumbai and Delhi.
# Define the set of production plants as a Python list
P = ["Mumbai", "Delhi"]
print("Production Plants (P):", P)
Markets for Products, $M$
These are the locations where products need to be delivered. We have three markets: Jaipur, Delhi, and Roorkee.
# Define the set of markets as a Python list
M = ["Jaipur", "Delhi", "Roorkee"]
print("Markets (M):", M)
Note: In Python, sets can also be defined using
range()for numerical indices (e.g.,range(1, 11)for 1 to 10) or as any iterable collection.
Define Parameters¶
Parameters are the known input data for our optimization problem. They represent quantities that don't change during optimization. We'll use the defined sets as indices for our parameters.
A standard Linear Programming (LP) problem can be expressed as:
$$ \begin{aligned} \min \quad & \sum_{j \in J} c_j x_j \\ \text{s.t.} \quad & \sum_{j \in J} a_{ij} x_j \geq b_i, \quad \forall i \in I \\ & x_j \geq 0, \quad \forall j \in J \end{aligned} $$
In this formulation, the parameters are the fixed input values $c_j$, $a_{ij}$, and $b_i$, which represent the objective coefficients, constraint coefficients, and right-hand side values, respectively.
Plant Production Capacities
Each plant has a maximum production capacity - the most it can produce and ship. We store this in a Pandas DataFrame for easy access.
# Create DataFrame with plant capacities
# Mumbai can produce 350 units, Delhi can produce 650 units
plants = pd.DataFrame({
'plant': P,
'capacity': [350, 650]
})
plants
Demand for Products
Each market has a specific demand that must be satisfied. This is stored in a Pandas DataFrame.
# Create DataFrame with market demands
# Jaipur needs 325, Delhi needs 300, Roorkee needs 275
markets = pd.DataFrame({
'market': M,
'demand': [325, 300, 275]
})
markets
Accessing DataFrame Values
Here are a few different ways to access data from our DataFrames in Python/Pandas (all are equivalent):
# Option 1: Using .loc with boolean indexing
plants.loc[plants['plant'] == 'Delhi', 'capacity']
# Option 2: Using query method
plants.query('plant == "Delhi"')['capacity']
# Option 3: Using boolean mask directly
plants[plants['plant'] == 'Delhi']['capacity']
Note: Pandas indexing returns a Series by default. To get a single scalar value, use
.values[0]at the end:
plants[plants['plant'] == 'Delhi']['capacity'].values[0] # Returns: 650
# Getting a single value (not a Series)
delhi_capacity = plants[plants['plant'] == 'Delhi']['capacity'].values[0]
print(f"Delhi capacity: {delhi_capacity}")
print(f"Type: {type(delhi_capacity)}")
Distance from Plants to Markets
We need to know the distance from each plant to each market. This is stored in a 2D array (matrix) where:
- Rows represent plants (sources)
- Columns represent markets (destinations)
We'll use a Pandas DataFrame with the plant and market names as indices for easy access.
# Create a distance matrix as a DataFrame
# Rows = plants, Columns = markets
# Values are in hundreds of km
distances = pd.DataFrame(
[
[10.5, 14.0, 15.0], # Mumbai to [Jaipur, Delhi, Roorkee]
[2.7, 0.0, 1.8] # Delhi to [Jaipur, Delhi, Roorkee]
],
index=P, # Row labels = plant names
columns=M # Column labels = market names
)
distances
Accessing Distance Values
Here are a few ways to access the distance matrix values:
# Example 1: Distance from Mumbai to Jaipur
print(f"Distance from Mumbai to Jaipur: {distances.loc['Mumbai', 'Jaipur']} hundred km")
# Example 2: Distance from Delhi to Jaipur
print(f"Distance from Delhi to Jaipur: {distances.loc['Delhi', 'Jaipur']} hundred km")
# Example 3: Using .at for scalar access (faster for single values)
print(f"Distance from Delhi to Roorkee: {distances.at['Delhi', 'Roorkee']} hundred km")
Cost of Transport
The freight cost represents the cost per unit of product per unit of distance.
# Cost of freight shipment per unit per hundred km
freight_cost = 90 # rupees per unit per hundred km
print(f"Freight cost: ₹{freight_cost} per unit per hundred km")
Create Model¶
Now we create the Pyomo model object. This is the container that holds all our decision variables, constraints, and objective function.
In Pyomo, we use ConcreteModel() because all our data is known at model creation time (as opposed to AbstractModel() which separates model structure from data).
# Create a Pyomo concrete model
# This is equivalent to: transport = Model(HiGHS.Optimizer) in JuMP
transport = pyo.ConcreteModel(name="Transport_Problem")
print("Model created:", transport.name)
Define Variables¶
Decision variables are what the optimizer will determine. These are the unknowns we're solving for.
In this problem, we need to decide: How much product should we ship from each plant to each market?
We define $X_{p,m}$ = quantity to ship from plant $p$ to market $m$
Quantities of Product to Transport from Plant $p \in P$ to Market $m \in M$
# Define decision variable X[p,m] for shipping quantities
# X[p,m] >= 0 (can't ship negative quantities)
#
# In JuMP this was: @variable(transport, X[P,M] >= 0)
# In Pyomo, we use pyo.Var() with the domain set to NonNegativeReals
transport.X = pyo.Var(
P, M, # Indexed over plants and markets
within=pyo.NonNegativeReals, # X >= 0 constraint built into variable
doc="Quantity shipped from plant p to market m"
)
# Display the variable
transport.X.display()
Accessing Variables
Here's an example of referencing a specific decision variable:
# Example: Reference to the quantity shipped from Delhi to Roorkee
print(transport.X['Delhi', 'Roorkee'])
Define Constraints¶
Constraints are the rules that our solution must satisfy. They represent physical, logical, or business limitations.
We have two types of constraints:
- Supply constraints: Don't ship more than a plant can produce
- Demand constraints: Satisfy all demand at each market
Supply Capacity Constraint
Each plant can only ship up to its production capacity.
Mathematical formulation: $$\sum_{m \in M} X_{p,m} \leq \text{capacity}_p \quad \forall p \in P$$
In words: For each plant, the total shipments to all markets must be less than or equal to the plant's capacity.
# Define supply capacity constraint
#
# In JuMP this was:
# @constraint(transport, cSupply[p in P],
# sum(X[p,m] for m in M) <= plants.capacity[plants.plant.==p][1])
#
# In Pyomo, we define a "rule" function that returns the constraint expression
def supply_rule(model, p):
"""
Supply constraint: Total shipments from plant p <= capacity of plant p
Args:
model: The Pyomo model
p: Plant index from set P
Returns:
Constraint expression
"""
# Get the capacity for plant p from our DataFrame
plant_capacity = plants[plants['plant'] == p]['capacity'].values[0]
# Return the constraint: sum of shipments from p <= capacity
return sum(model.X[p, m] for m in M) <= plant_capacity
# Add the constraint to the model (one constraint for each plant)
transport.cSupply = pyo.Constraint(P, rule=supply_rule)
# Display the constraints
transport.cSupply.display()
Demand Balance Constraint
Ensure all demand is satisfied at each market.
Mathematical formulation: $$\sum_{p \in P} X_{p,m} \geq \text{demand}_m \quad \forall m \in M$$
In words: For each market, the total shipments from all plants must be greater than or equal to the market's demand.
# Define demand balance constraint
#
# In JuMP this was:
# @constraint(transport, cDemand[m in M],
# sum(X[p,m] for p in P) >= markets.demand[markets.market.==m][1])
def demand_rule(model, m):
"""
Demand constraint: Total shipments to market m >= demand at market m
Args:
model: The Pyomo model
m: Market index from set M
Returns:
Constraint expression
"""
# Get the demand for market m from our DataFrame
market_demand = markets[markets['market'] == m]['demand'].values[0]
# Return the constraint: sum of shipments to m >= demand
return sum(model.X[p, m] for p in P) >= market_demand
# Add the constraint to the model (one constraint for each market)
transport.cDemand = pyo.Constraint(M, rule=demand_rule)
# Display the constraints
transport.cDemand.display()
Define Objective Function¶
The objective function is what we're trying to optimize (minimize or maximize).
Goal: Minimize the total cost of transportation to satisfy all demand.
Mathematical formulation: $$\text{Minimize } Z = \sum_{p \in P} \sum_{m \in M} \text{freight\_cost} \times \text{distance}_{p,m} \times X_{p,m}$$
In words: For each plant-market pair, multiply the freight cost by the distance by the quantity shipped, then sum all these costs.
First, let's define an expression for the total cost of shipments (this is like JuMP's @expression):
# Define a named expression for total cost
def cost_expression(model):
"""
Calculate total transportation cost.
Cost = sum over all (p,m) pairs of: freight_cost × distance[p,m] × X[p,m]
"""
return sum(
freight_cost * distances.at[p, m] * model.X[p, m]
for p in P
for m in M
)
# Store the expression as an Expression object (optional, for reference)
transport.eCost = pyo.Expression(rule=cost_expression)
print("Cost expression defined.")
Now we'll set the objective to minimize this total cost:
# Define the objective function
#
# In Pyomo, we use pyo.Objective() with sense=pyo.minimize
transport.objective = pyo.Objective(
expr=transport.eCost, # The expression to minimize
sense=pyo.minimize, # We want to minimize cost
doc="Minimize total transportation cost"
)
print("Objective function defined: Minimize total cost")
Interact with the Model¶
(a) Now let's solve the model. Use the SolverFactory to create a solver and then call solve().¶
Common solvers:
glpk- GNU Linear Programming Kit (open source)cbc- COIN-OR Branch and Cut (open source)appsi_highs- HiGHS solver (open source, fast)
# Create a solver instance using HiGHS
# HiGHS is a modern, fast open-source solver
solver = pyo.SolverFactory('appsi_highs')
# Solve the model
# tee=True shows solver output (similar to verbose mode)
results = solver.solve(transport, tee=True)
# Check the solver status
print("\n" + "="*60)
print(f"Solver Status: {results.solver.status}")
print(f"Termination Condition: {results.solver.termination_condition}")
(b) You've got a solution! Now query the objective function value and save it to a variable.¶
In Pyomo, use pyo.value() to extract the numerical value of expressions and variables.
# Get the optimal objective value (total minimum cost)
(c) Now query and save the optimal solution for X (the decisions about shipment quantities from plant to market) to a DataFrame.¶
# Extract optimal values of X and store in a DataFrame
# Method 1: Create a matrix-style DataFrame
# Method 2: Create a detailed DataFrame with all shipping info
(d) Please interpret your results by writing an explanation below:¶
Which facility or facilities supplies the most demand in Jaipur? Does this result make sense? Why?
[Your interpretation here]
Which facility or facilities supplies the most demand in Roorkee? Does this result make sense? Why?
[Your interpretation here]
Which facility or facilities supplies the demand in Delhi? Does this result make sense? Why?
[Your interpretation here]
# Code to analyze the results
(e) New Marke Addition¶
A new market in MP appears say in Bhopal, with a demand for 50 units. It is located 750 hundred km from Mumbai and 500 hundred km from Delhi. Add this market to the model and solve again.
We need to:
- Extend the market set M
- Update the demand parameters
- Update the distance matrix
- Rebuild the model with the new data
# Create a new model with the extended data
(f) What is the new optimal solution?¶
# Get new optimal cost
# new_optimal_cost =
(g) Interpret this result: Which facility or facilities supplies the demand in Lucknow? Does this result make sense? Why?¶
[Your interpretation here]
# Code to analyze Lucknow supply
Summary¶
This notebook demonstrated a classic transportation problem using Python/Pyomo:
Key Concepts:¶
- Sets: Collections of indices (plants, markets) that organize our problem
- Parameters: Known input data (capacities, demands, distances, costs)
- Decision Variables: Unknowns to solve for (shipment quantities)
- Constraints: Rules the solution must satisfy (supply ≤ capacity, deliveries ≥ demand)
- Objective Function: What we optimize (minimize total cost)