
Stat 310, Final Take Home (Mihai Anitescu). Assigned 03/11/2020; due 03/18/2020
at 3:00 pm in one of the Department Offices. I will specify by an announcement
which one; it will be either with Zellencia or with Kirsten.
For this assignment, submit your scripts as follows. In your submission folder
for this class, create a subfolder, FINAL. Then submit the scripts for the problems
below as indicated for each. ALL files required to run your code must be
submitted too (except intval, which you may assume we have if you use it).
Problem 1. (65 points, out of which 10 for correct submission) Consider the
problem I have shared with you online today, for which you can find details at
https://canvas.uchicago.edu/courses/26199/pages/code-for-the-burgersequation-version-of-the-final
The code describes the simulation of a hidden Markov model and provides the
function that computes the likelihood. Your assignment is set around trying to
compute the maximum likelihood (or, equivalently, the minimum of the negative log
likelihood, which is given in burger_smoothing_short()). Read the details on how to
run it at the page above. You will start at the observed state (obs(:), which is a
noisy version of what you are trying to find out). On my 2013 laptop my Newton-dogleg
approach solves the problem in 10-25 iterations, depending on the random
generation, within 1 minute.
1) Use a finite difference approach to compute gradients of the function
burger_smoothing_short_fgH(x) or, equivalently, of burgers_smoothing(Nx, Nt,
T, nu, sigma_B, sigma_m, sigma_o, obs, u) with respect to the last variable, u,
at the point obs(:).
a) Explain what method you would propose and why you would choose it, as
well as its parameters.
b) Propose a way to validate it, i.e., to verify it, and carry it out numerically on the
example provided to demonstrate that you are computing it correctly (a minimal
sketch follows below).
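
For illustration, a minimal central-difference sketch, assuming the objective is
available as a handle f taking a column vector (e.g. f = @(u)
burger_smoothing_short_fgH(u), requesting only the first output); the step-size
heuristic is one common choice, not one prescribed by the assignment.

function g = fd_gradient(f, u)
    % Central-difference gradient, one coordinate at a time.
    n = numel(u);
    g = zeros(n, 1);
    for i = 1:n
        % Common heuristic for central differences: h ~ eps^(1/3),
        % scaled to the magnitude of the coordinate.
        h = eps^(1/3) * max(1, abs(u(i)));
        e = zeros(n, 1); e(i) = h;
        g(i) = (f(u + e) - f(u - e)) / (2 * h);   % O(h^2) accurate
    end
end

One way to validate the result: pick a random direction d and compare g'*d with
the directional difference (f(u + t*d) - f(u - t*d)) / (2*t) for a few values of
t; the two should agree to several digits.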
2) Use an algorithm of your choosing, from among any algorithms you have already
implemented, to solve this problem starting at the point obs(:), and stop it when
you have reduced the norm of the gradient of the objective to 1e-3 of its original
value. Plot the surface of the result side-by-side with the observation (see the
plotting sketch below). You can use either gradients or Hessians computed with
intval, or the divided-difference ones above. Describe what the optimization
appears to produce.
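
For the side-by-side plot, a minimal sketch; the assumption that the state vector
reshapes to an Nx-by-Nt array is mine, so adjust the reshape to the grid actually
used by the handout code.

function plot_side_by_side(x, obs, Nx, Nt)
    % Surfaces of the optimized state and the observations, side by side.
    % Assumption: both vectors reshape to Nx-by-Nt; adjust if the handout
    % grid has different dimensions.
    subplot(1, 2, 1); surf(reshape(x, Nx, Nt));   title('optimized state');
    subplot(1, 2, 2); surf(reshape(obs, Nx, Nt)); title('observations');
end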
3) Now apply the limited-memory BFGS method (Algorithm 7.5) while storing only 4
pairs of vectors, and stop it after 10 iterations. To check that the code is not
stuck, I suggest outputting the norm of the gradient every time you compute it (a
sketch of the two-loop recursion follows part e below). Plot the surface of the
result side-by-side with the observations (obs(:)).
a) Do you see a material difference from the previous output?
b) How do you choose the line search method and why? What must you pay
attention to in the line search selection so that the L-BFGS method works correctly?
c) Do you expect the convergence rate of the L-BFGS method to be superlinear? How
about that of the BFGS method?
d) Can you think of a circumstance where the L-BFGS method with 4 vectors stored
would be superlinearly convergent when started at a random point?
e) Can you explain the relative advantages and disadvantages of the BFGS and L-BFGS
methods? If in the problem N_x = N_t = 1e8, which method would you choose?
Would you be able to run it in this case (I am NOT asking you to)?
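
For reference, here is a minimal sketch of the two-loop recursion that produces
the L-BFGS search direction used in part 3 (Nocedal and Wright, Algorithm 7.4).
The cell arrays S and Y are assumed to hold the most recent m = 4 pairs (s, y),
newest last; all names are illustrative.

function p = lbfgs_direction(g, S, Y)
    % Two-loop recursion: apply the implicit inverse Hessian to g.
    m = numel(S);
    alpha = zeros(m, 1);
    rho = zeros(m, 1);
    q = g;
    for i = m:-1:1                             % first loop: newest to oldest
        rho(i) = 1 / (Y{i}' * S{i});
        alpha(i) = rho(i) * (S{i}' * q);
        q = q - alpha(i) * Y{i};
    end
    gamma0 = (S{m}' * Y{m}) / (Y{m}' * Y{m});  % initial scaling H0 = gamma0*I
    r = gamma0 * q;
    for i = 1:m                                % second loop: oldest to newest
        beta = rho(i) * (Y{i}' * r);
        r = r + S{i} * (alpha(i) - beta);
    end
    p = -r;                                    % search direction
end

After each iteration, append the new pair s = x_new - x_old, y = g_new - g_old,
and drop the oldest pair once more than 4 are stored.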
4) Submit your code as follows. Create a folder FINAL and insert in it the file
final1.m, whose calling sequence should be [x,gnorm]=final1(x_in), where x_in is
the initial value (for parts 2 and 3 you will run it from obs(:), but for this
outcome I would like to be able to run it from any point; it should be in column
vector format, not array format), x is the solution, and gnorm is the norm of the
gradient at x (an interface skeleton follows below).
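
A minimal interface skeleton matching the required calling sequence; it assumes
burger_smoothing_short_fgH returns at least the objective value and gradient, and
the fixed-step steepest-descent body is only a placeholder for whatever method you
actually use.

function [x, gnorm] = final1(x_in)
    % Skeleton only: enforce the column-vector convention, iterate until
    % the gradient norm falls to 1e-3 of its starting value.
    x = x_in(:);
    [~, g] = burger_smoothing_short_fgH(x);
    g0 = norm(g);
    while norm(g) > 1e-3 * g0
        x = x - 1e-2 * g;                      % placeholder update rule
        [~, g] = burger_smoothing_short_fgH(x);
    end
    gnorm = norm(g);
end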
Problem 2. (35 points, 5 points for correct code submission) Consider the class of
optimization problems defined by the parameters gamma and a. Here we will vary
gamma. Note that gamma may be negative, that is, the QP may be nonconvex.
1) Is this problem guaranteed to have a solution? Write the optimality conditions at
an optimum point.
2) Propose an algorithm of your choosing to solve this problem for a given set of
parameters (if unsure, you can always choose Algorithm 16.5, as we discussed in
class). The algorithm must find a KKT point for this problem for n=10, for any
choice of the other parameters, in a reasonable amount of time (say, in at most
twice the time it takes the code I shared with you to solve the problem at part 1
for the choice of parameters I provided), but it must of course work for any n, so
that the KKT residual of the solution you provide is less than 1e-4 (a
residual-check sketch follows below).
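
As an illustration of the 1e-4 test, a hedged sketch of a KKT-residual computation
for a generic problem min f(x) subject to c(x) >= 0; the handles gradf, c, and jac
are my notation, not names from the handout.

function r = kkt_residual(gradf, c, jac, x, lambda)
    % Stationarity: the gradient of the Lagrangian should vanish.
    stat = gradf(x) - jac(x)' * lambda;
    % Feasibility, lambda >= 0, and complementarity are all captured by
    % the elementwise min of lambda and c(x) being near zero.
    comp = min(lambda, c(x));
    r = max(norm(stat, inf), norm(comp, inf)); % require r < 1e-4 at return
end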
3) Describe the algorithm carefully, or if you use one from the textbook, simply
mention the page and number.
4) Explain how you extract the Lagrange multipliers from your solution.
5) Is your algorithm guaranteed to produce a second-order stationary point? Why or
why not?
6) Report the solution time of the algorithm for n equal to 10, 20, and 30 on
uniformly random instances of the parameters, starting it at a point of all zeroes
(a timing sketch follows below). Compute the mean and plot it as a function of n.
How does the effort appear to change with n?
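
A minimal timing sketch for this part, using the final2 interface from part 7; the
number of trials and the uniform sampling ranges for gamma and a are my assumptions
about what "uniformly random instances" means here.

ns = [10 20 30];
trials = 20;                         % random instances per size (assumption)
meantime = zeros(size(ns));
for k = 1:numel(ns)
    n = ns(k);
    t = zeros(trials, 1);
    for j = 1:trials
        gamma = 2 * rand - 1;        % uniform on [-1, 1] (assumption)
        a = rand(n, 1);              % uniform on [0, 1]^n (assumption)
        tic;
        [x, lambda] = final2(n, gamma, a, 1e-4);  % zero start is internal
        t(j) = toc;
    end
    meantime(k) = mean(t);
end
plot(ns, meantime, '-o'); xlabel('n'); ylabel('mean solve time (s)');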
7) Submit your code as follows: In the FINAL folder, insert the file final2.m with the
calling sequence [x,lambda]=final2(n,gamma,a,eps). Here n is the size of the
problem, gamma and a (a column vector with n entries) are the problem parameters
defined above, and eps is the stopping tolerance. As output, x is the solution and
lambda are the Lagrange multipliers.

