Problem 1. Maximization of the log-likelihood function
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
Problem 2. Maximization of the log-likelihood function minus additional regularization term
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
Problem 3. Maximization of the log-likelihood function subject to constraint on cardinality
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
Problem 4. 4-fold Cross-validation for maximization of the log-likelihood function
Mathematical Problem Statement
Problem dimension and solving time
Solution in Run-File Environment
Solution in MATLAB Environment
This case study finds an optimal estimate of the cesarean section (CS) rate in a population. The risk of difficult labor is described by a mathematical model that depends on measurable demographic factors. We use regular (“plain vanilla”) logistic regression and a regularized logistic regression to evaluate the effects of demographic factors on the probability of CS. This case study considers 6 primary factors: age, height, weight, maternal weight gain, gestational age, and birth weight.
We performed 4-fold cross-validation for Problem 1. The optimization problem was run 4 times. In each run, ¾ of the data was selected as the in-sample dataset, on which the model was calibrated. The performance of the model was then tested on the remaining ¼ of the data, the out-of-sample dataset.
Maximization of the log-likelihood function (“plain vanilla” logistic regression).
maximize
logexp_sum
Value:
logistic
where
logexp_sum = log-likelihood function for logistic regression (Logarithms Exponents Sum)
logistic = calculates values of logistic function for every observation (scenario)
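As a minimal sketch of what the two functions above compute, the following Python code evaluates a logistic function and an averaged log-likelihood on a tiny made-up dataset. The names `logistic`, `linear_score`, and `avg_log_likelihood`, the intercept term, and the averaging over scenarios are assumptions for illustration (the reported objective values near -0.5 suggest a per-scenario average); this is not the PSG implementation.

```python
import math

def logistic(z):
    """Logistic function 1 / (1 + exp(-z)), evaluated stably for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_score(coeffs, intercept, row):
    """Linear combination of the factors for one observation (scenario)."""
    return intercept + sum(c * x for c, x in zip(coeffs, row))

def avg_log_likelihood(coeffs, intercept, rows, labels):
    """Average over scenarios of y*z - log(1 + exp(z)).

    Assumption: scenarios are equally weighted.
    """
    total = 0.0
    for row, y in zip(rows, labels):
        z = linear_score(coeffs, intercept, row)
        # log(1 + exp(z)) computed without overflow
        log1pexp = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += y * z - log1pexp
    return total / len(rows)

# Hypothetical toy data: two factors, four observations (1 = CS, 0 = no CS).
rows = [[0.5, -1.2], [1.5, 0.3], [-0.7, 0.8], [2.0, -0.4]]
labels = [0, 1, 0, 1]

# With all coefficients at zero every probability is 0.5,
# so the average log-likelihood equals log(0.5).
baseline = avg_log_likelihood([0.0, 0.0], 0.0, rows, labels)
improved = avg_log_likelihood([1.0, 0.0], 0.0, rows, labels)
```

Maximizing this quantity over the coefficients gives the "plain vanilla" fit. Any coefficient vector that explains the labels better than chance yields a value above `baseline`.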
Mathematical Problem Statement
Problem dimension and solving time
| Number of Variables | 6 |
| Number of Scenarios | 12,690 |
| Objective Value | -0.495793 |
| Solving Time (sec) | 0.08 |
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with riskprog and riskconstrprog PSG subroutines (General (Text) Format of PSG in MATLAB):
Input Files to run CS:
Maximization of the log-likelihood function minus additional regularization term (regularized logistic regression).
maximize
logexp_sum
-polynom_abs
Value:
logistic
where
logexp_sum = log-likelihood function for logistic regression (Logarithms Exponents Sum)
polynom_abs = regularization term subtracted from the log-likelihood (Polynomial Absolute)
logistic = calculates values of logistic function for every observation (scenario)
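The regularized objective can be sketched as the log-likelihood minus a penalty on the coefficients. The code below assumes the penalty is a weighted sum of absolute coefficient values; the function names, the weights, and the toy data are hypothetical and are not taken from the PSG input files.

```python
import math

def avg_log_likelihood(coeffs, rows, labels):
    """Average of y*z - log(1 + exp(z)) over all observations."""
    total = 0.0
    for row, y in zip(rows, labels):
        z = sum(c * x for c, x in zip(coeffs, row))
        total += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total / len(rows)

def abs_penalty(coeffs, weights):
    """Assumed form of the regularization term: sum of w_i * |x_i|."""
    return sum(w * abs(c) for w, c in zip(weights, coeffs))

def regularized_objective(coeffs, rows, labels, weights):
    """Quantity to maximize: log-likelihood minus the penalty."""
    return avg_log_likelihood(coeffs, rows, labels) - abs_penalty(coeffs, weights)

# Hypothetical toy data and penalty weights.
rows = [[0.5, -1.2], [1.5, 0.3], [-0.7, 0.8], [2.0, -0.4]]
labels = [0, 1, 0, 1]
weights = [0.01, 0.01]
coeffs = [1.0, -2.0]

plain = avg_log_likelihood(coeffs, rows, labels)
penalized = regularized_objective(coeffs, rows, labels, weights)
```

Because the penalty is nonnegative, the penalized objective never exceeds the plain log-likelihood at the same coefficients. This is consistent with the objective value reported for this problem being slightly lower than the one for Problem 1.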
Mathematical Problem Statement
Problem dimension and solving time
| Number of Variables | 6 |
| Number of Scenarios | 12,690 |
| Objective Value | -0.498204 |
| Solving Time (sec) | 0.05 |
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with riskprog and riskconstrprog PSG subroutines (General (Text) Format of PSG in MATLAB):
Input Files to run CS:
Maximization of the log-likelihood function subject to constraint on cardinality.
maximize
logexp_sum
Constraint: <= 4
cardn
Solver: precision = 9
Value:
logistic
where
logexp_sum = log-likelihood function for logistic regression (Logarithms Exponents Sum)
cardn = cardinality function
logistic = calculates values of logistic function for every observation (scenario)
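To illustrate the cardinality constraint, the sketch below counts the nonzero coefficients of a candidate solution and lists every subset of the 6 factors that satisfies the bound of 4. The threshold used to decide whether a coefficient counts as nonzero, the function names, and the example coefficients are assumptions. Enumerating subsets is shown only to make the feasible set concrete, and is not a description of how the PSG solver handles this constraint.

```python
from itertools import combinations

def cardinality(coeffs, threshold=1e-9):
    """Number of coefficients whose absolute value exceeds the threshold."""
    return sum(1 for c in coeffs if abs(c) > threshold)

def candidate_supports(n_vars, max_card):
    """All index subsets of size 0..max_card out of n_vars variables."""
    supports = []
    for k in range(max_card + 1):
        supports.extend(combinations(range(n_vars), k))
    return supports

# Hypothetical coefficient vector for the 6 factors: 3 are nonzero.
coeffs = [0.8, 0.0, -0.3, 0.0, 0.0, 1.1]
is_feasible = cardinality(coeffs) <= 4

# Every set of factors allowed by the constraint "cardn <= 4".
supports = candidate_supports(6, 4)
```

With only 6 variables the feasible supports are few enough to list exhaustively. For larger models this enumeration grows combinatorially and would not be practical.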
Mathematical Problem Statement
Problem dimension and solving time
| Number of Variables | 6 |
| Number of Scenarios | 12,690 |
| Objective Value | -0.497135 |
| Solving Time (sec) | 0.35 |
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with riskprog and riskconstrprog PSG subroutines (General (Text) Format of PSG in MATLAB):
Input Files to run CS:
4-fold cross-validation (4 in-sample datasets and 4 out-of-sample datasets) for maximization of the log-likelihood function.
4-fold crossvalidation
Maximize logexp_sum
Value:
logistic (function Logistic on the in-sample data)
logistic (function Logistic on the out-of-sample data)
where
crossvalidation(N,Matrix) = matrix operation that splits the input Matrix into N pairs of complementary sub-matrices
logexp_sum = log-likelihood function for logistic regression (Logarithms Exponents Sum)
logistic = calculates values of logistic function for every observation (scenario)
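The splitting step can be sketched as follows: each of the N folds serves once as the out-of-sample part, and the remaining rows form the complementary in-sample part. The function name `crossvalidation_split` and the use of contiguous blocks of rows are assumptions; how the PSG crossvalidation operation assigns rows to folds is not stated here.

```python
def crossvalidation_split(n_folds, rows):
    """Split rows into n_folds (in-sample, out-of-sample) pairs.

    Assumption: folds are contiguous blocks of nearly equal size.
    """
    n = len(rows)
    # Integer fold boundaries: 0 = b_0 <= b_1 <= ... <= b_N = n
    bounds = [i * n // n_folds for i in range(n_folds + 1)]
    pairs = []
    for i in range(n_folds):
        lo, hi = bounds[i], bounds[i + 1]
        out_of_sample = rows[lo:hi]          # held-out block
        in_sample = rows[:lo] + rows[hi:]    # everything else
        pairs.append((in_sample, out_of_sample))
    return pairs

# Hypothetical data: 12 observations identified by their row index.
rows = list(range(12))
pairs = crossvalidation_split(4, rows)

for in_sample, out_of_sample in pairs:
    # The model would be calibrated on in_sample and evaluated on out_of_sample.
    print(len(in_sample), len(out_of_sample))
```

Each observation appears in exactly one out-of-sample part, so the four out-of-sample evaluations together cover the whole dataset once.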
Mathematical Problem Statement
Problem dimension and solving time
For one problem in Cross-validation:
| | Dataset1 | Dataset2 | Dataset3 | Dataset4 |
| Number of Variables | 6 | 6 | 6 | 6 |
| Number of Scenarios | 9,517 | 9,517 | 9,517 | 9,517 |
| Objective Value | -0.496 | -0.495 | -0.498 | -0.494 |
| Solving Time (sec) | 0.15 | 0.18 | 0.05 | 0.08 |
Solution in Run-File Environment
Input Files to run CS:
Output Files:
Solution in MATLAB Environment
Solved with tbpsg_run function (PSG MATLAB Toolbox):
Input Files to run CS: