I am trying to solve coupled ODEs; I present a small part of the problem here. I am solving for R along an entire grid. The grid is too big for conventional methods, so I am trying to parallelise the computation in Julia, but I still cannot run the code on my Mac (an M1 with 8 GB of RAM), despite trying several approaches. Given the code below, could you tell me whether I am doing anything wrong?
function f(R, v)
    numerator = R * exp(1 / (3 - 50 * R))
    denominator = (2 + (200 * R) / 3)^(4 / 9) * (2 - (100 * R) / 3)^(5 / 9)
    return v - (numerator / denominator)
end

function df(R, v)
    # Approximate the derivative with a central difference
    h = 1e-6
    return (f(R + h, v) - f(R - h, v)) / (2 * h)
end

# Newton's method implementation
function newton(f, df, R0, v; tol=1e-6, maxiter=1000)
    R = R0
    for i in 1:maxiter
        R_new = R - f(R, v) / df(R, v)
        if abs(R_new - R) < tol
            return R_new
        end
        R = R_new
    end
    error("Newton's method did not converge")
end
# Array to store final converged values of R
# This is the value along the initial hypersurface
R_0 = Float64[]
v = LinRange(0.0000, 150, 110001)
# Initial conditions
let R_initial = 0.025  # initial guess for R
    for v in v
        R_solution = newton(f, df, R_initial, v)
        R_initial = R_solution  # warm-start the next solve from the last converged value
        push!(R_0, R_solution)
    end
end
using PlotlyJS
plot(R_0, v)
function Initial_Function(R)
    return R^2 * (0.0525 - R)^2
end
H_initial = Float64[]
for R in R_0
    if R <= 0.0525
        push!(H_initial, Initial_Function(R))
    else
        push!(H_initial, 0.0)
    end
end
plot(R_0, H_initial)
# H, R, v are defined
# We run the RK4 scheme to calculate the value of R across the grid.
u = range(-100, 10, length = 110001)
v_range = range(0,150,length = 110001)
using Distributed
addprocs(4)
@everywhere using SharedArrays
R_grid = SharedArray{Float64}(110001,110001)
R_grid[:, 1] .= R_0  # seed column 1: R_0 holds the data along u = -100 for every v
function dR_du(u, R)
    g = (R^2 / 6) * (u * R + (u^2 * R^2) / 9)
    return -0.5 * g
end

function rk4(u, R, du)
    k1 = dR_du(u, R)
    k2 = dR_du(u + 0.5 * du, R + 0.5 * du * k1)
    k3 = dR_du(u + 0.5 * du, R + 0.5 * du * k2)
    k4 = dR_du(u + du, R + du * k3)
    return R + (du / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
end

Threads.@threads for j in 1:110001
    for i in 2:110001
        u_prev = u[i - 1]
        du = u[i] - u_prev
        R_grid[j, i] = rk4(u_prev, R_grid[j, i - 1], du)
    end
end
I expect to find the value of R over the entire (u, v) grid. As you can see, I first compute R_0 along the initial hypersurface u = -100 (for v from 0 to 150) by solving a certain equation with Newton's method, and then use the Runge-Kutta scheme to solve for the rest of the grid points.
I have tried to set up a parallel process, but my laptop crashes each time.
As mentioned in the comments, your matrix is way too big: 110001 × 110001 Float64 values is roughly 97 GB, far beyond your 8 GB of RAM.
You should look into mechanisms for out-of-core computation, which are designed for exactly this.
Classical ones include:
Blocking: instead of allocating the entire matrix, have each thread allocate a row, or whatever block size you can afford, so that block_size * num_cores is less than your RAM. On completion, each thread writes its block to disk and frees the memory. This is easy if the rows can be distributed independently; it becomes more difficult if all threads need access to the whole matrix at the same time.
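A minimal sketch of the blocking idea applied to your RK4 march (the function name `march_blocks`, the block size, and the raw-binary file layout are my choices, not anything from your code): each block of rows is computed in RAM, streamed to disk, and freed before the next block starts, and your rows do distribute independently since each one marches along u on its own.

```julia
# Derivative and RK4 step, as in the question
dR_du(u, R) = -0.5 * (R^2 / 6) * (u * R + (u^2 * R^2) / 9)

function rk4_step(u, R, du)
    k1 = dR_du(u, R)
    k2 = dR_du(u + 0.5 * du, R + 0.5 * du * k1)
    k3 = dR_du(u + 0.5 * du, R + 0.5 * du * k2)
    k4 = dR_du(u + du, R + du * k3)
    return R + (du / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
end

# March all rows of the grid, `block` rows at a time. Each finished block is
# appended to `path` as raw Float64s and its memory is freed on the next loop,
# so peak memory is block * N * 8 bytes instead of the full N * N * 8 bytes.
function march_blocks(R_0, u, path; block = 1000)
    N = length(u)
    du = step(u)
    open(path, "w") do io
        for lo in 1:block:length(R_0)
            hi = min(lo + block - 1, length(R_0))
            rows = Matrix{Float64}(undef, hi - lo + 1, N)
            rows[:, 1] .= R_0[lo:hi]          # initial data along u = u[1]
            Threads.@threads for j in 1:size(rows, 1)
                for i in 2:N
                    rows[j, i] = rk4_step(u[i - 1], rows[j, i - 1], du)
                end
            end
            write(io, rows)                   # stream the finished block to disk
        end
    end
    return path
end
```

With `block = 1000` the working set is 1000 × 110001 × 8 B ≈ 0.9 GB instead of ≈ 97 GB. Note that `write(io, rows)` emits each block in Julia's column-major order, so whatever reads the file back must assume the same layout.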
The second is memory-mapping (mmap) the file. This is an OS mechanism (available on macOS as well as Linux) that gives you something that looks like a memory block, which you can map to your array; when you write to a[i, j], the OS caches the page for a while and writes it to disk when memory fills up. This approach can be complicated: the mmap'd file is shared, so you need synchronization control (locks) to ensure threads don't step on each other when writing and reading the same regions. With that, the algorithm should definitely work, but if your access patterns are very random and you have a hard drive, this is a well-known way to tank your performance, since HDDs do not handle random I/O well.
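Julia exposes this directly through the Mmap standard library. A minimal sketch (the helper name `mmap_matrix` and the file name are mine):

```julia
using Mmap

# A file-backed matrix: reads and writes go through the OS page cache, and
# dirty pages are flushed to disk under memory pressure, so only the pages
# actually touched occupy RAM.
function mmap_matrix(path, nrows, ncols)
    io = open(path, "w+")    # create the backing file, read-write
    A = Mmap.mmap(io, Matrix{Float64}, (nrows, ncols))
    return io, A
end

# Usage for the grid in the question (~97 GB on disk, not in RAM):
#   io, R_grid = mmap_matrix("R_grid.mmap", 110001, 110001)
#   R_grid[:, 1] .= R_0          # initial hypersurface
#   # ...run the RK4 march into R_grid as before...
#   Mmap.sync!(R_grid)           # flush dirty pages to disk
#   close(io)
```

In this particular access pattern no locks are needed, since each thread writes its own disjoint rows. But Julia arrays are column-major, so marching a fixed row j across columns strides through the file one page per element; storing the grid transposed (u along the first dimension) keeps each march sequential on disk.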
Other methods exist for this, with different tradeoffs or requirements, and they would not require any change to the algorithm.