Local Extrema and Nesterov Accelerated Gradient
Python. To find the global minimum of a selected function, I compared a brute-force sweep with the Nesterov Accelerated Gradient (NAG) optimisation technique. By stepping through the interval in small deltaX increments, I located the local extrema between the chosen lower and upper bounds. I then obtained the global minimum by repeatedly calling a NAG routine from starting points spaced deltaX apart, and finally compared the iteration counts. I used the function y = x^2 + sin(4x) - 1.5 on the range [-2, 2], with deltaX = 0.001 for the brute-force sweep and deltaX = 0.2 for the spacing of the NAG starting points. The brute-force method takes 4000 iterations, whereas Nesterov Accelerated Gradient takes 1137 iterations with the parameters used here.
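For reference, the update rule implemented in nesterovOptimizer in the listing below is the momentum form of NAG: the gradient is evaluated at the look-ahead point x_t + beta * v_t, where alpha is the learning rate (0.01 by default in the code) and beta the momentum coefficient (0.9 by default):

    v_{t+1} = \beta v_t - \alpha f'(x_t + \beta v_t)
    x_{t+1} = x_t + v_{t+1}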
import math
import numpy as np


# Objective function y = x^2 + sin(4x) - 1.5 and its first two derivatives
def function(x):
    return x**2 + math.sin(4*x) - 1.5

def firstDerivative(x):
    return 2 * x + 4 * math.cos(4*x)

def secondDerivative(x):
    return 2 - 16 * math.sin(4*x)

def printResult(x):
    # The sign of the second derivative classifies the extremum
    if secondDerivative(x) > 0:
        print(f"The concave up extremum is at: ({round(x, 4)}, {round(function(x), 4)})")
    elif secondDerivative(x) < 0:
        print(f"The concave down extremum is at: ({round(x, 4)}, {round(function(x), 4)})")

def localExtremas(deltaX, lowerBound, upperBound):
    # Brute-force sweep: report an extremum whenever the slope changes sign
    x = lowerBound
    isNegativeSlope = firstDerivative(x) < 0
    while x < upperBound:
        if firstDerivative(x) > 0 and isNegativeSlope:
            isNegativeSlope = False
            printResult(x)
        elif firstDerivative(x) < 0 and not isNegativeSlope:
            isNegativeSlope = True
            printResult(x)
        x += deltaX

def nesterovOptimizer(x, alpha=0.01, beta=0.9, maxIter=1000, tol=1e-5):
    # Nesterov Accelerated Gradient: the gradient is taken at the look-ahead point x + beta * v
    v = 0
    count = 0
    for i in range(maxIter):
        gradient = firstDerivative(x + beta * v)
        v = beta * v - alpha * gradient
        xNew = x + v
        count += 1
        if np.abs(xNew - x) < tol:
            break
        x = xNew
    return x, function(x), count

def globalMinimum(deltaX, lowerBound, upperBound):
    # Run NAG from starting points spaced deltaX apart and keep the best result
    x = lowerBound
    optimalX = lowerBound
    minValue = function(x)
    totalCount = 0
    while x < upperBound:
        newOptimalX, newMinValue, count = nesterovOptimizer(x)
        if newMinValue < minValue:
            minValue = newMinValue
            optimalX = newOptimalX
        x += deltaX
        totalCount += count
    return optimalX, minValue, totalCount

if __name__ == "__main__":
    # Determine local extrema with an incremental sweep over [-2, 2] with deltaX = 0.001
    lowerBound = -2.0
    upperBound = 2.0
    deltaX = 0.001
    localExtremas(deltaX, lowerBound, upperBound)
    print("The iteration count for local extrema:", int(np.abs((upperBound - lowerBound) / deltaX)))

    # Call Nesterov Accelerated Gradient from starting points 0.2 apart
    deltaX = 0.2
    optimalX, minValue, count = globalMinimum(deltaX, lowerBound, upperBound)
    print(f"The global minimum by Nesterov Accelerated Gradient is at: ({round(optimalX, 4)}, {round(minValue, 4)})")
    print("The iteration count by Nesterov Accelerated Gradient:", count)
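As a quick sanity check (not part of the original listing), the point reported by the NAG sweep can be compared against a dense NumPy evaluation of the same function over [-2, 2]; the grid size of 400001 points is an arbitrary choice for this sketch:

import numpy as np

# Dense-grid cross-check of the global minimum of y = x^2 + sin(4x) - 1.5 on [-2, 2]
xs = np.linspace(-2.0, 2.0, 400001)   # arbitrary, very fine grid
ys = xs**2 + np.sin(4 * xs) - 1.5
i = int(np.argmin(ys))
print(f"Dense-grid minimum near ({xs[i]:.4f}, {ys[i]:.4f})")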