Broadcasting

Broadcasting is only defined for MoYeArrays with static sizes.

In-place broadcasting preserves the original layout.

Out-of-place broadcasting always returns an owning array with a compact layout that has the same shape as the input and keeps its stride ordering.
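To make "the same stride ordering" concrete, here is a minimal sketch in plain Julia. The helper `compact_strides` is hypothetical and written only for this illustration. It is not part of MoYe, and MoYe's actual implementation may differ. It assumes a flat (non-nested) shape with integer strides. It assigns the smallest compact stride to the dimension that had the smallest original stride, and so on.

```julia
# Hypothetical helper (not a MoYe function): compute compact strides for `shape`
# that keep the same relative ordering as the original `strides`.
function compact_strides(shape::NTuple{N,Int}, strides::NTuple{N,Int}) where {N}
    order = sortperm(collect(strides))  # dimensions from smallest to largest stride
    result = zeros(Int, N)
    running = 1
    for d in order
        result[d] = running             # next compact stride for this dimension
        running *= shape[d]
    end
    return Tuple(result)
end

compact_strides((3, 2), (2, 1))  # already compact: (2, 1)
compact_strides((3, 2), (4, 1))  # padded input becomes compact: (2, 1)
compact_strides((3, 2), (1, 3))  # column-major ordering is kept: (1, 3)
```

Under these assumptions, a padded layout such as `((3,2), (4,1))` would map to the compact `((3,2), (2,1))`: the second dimension stays the fastest-varying one, but the gaps disappear.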

julia> using MoYe
julia> a = MoYeArray{Float64}(undef, @Layout((3,2), (2,1)))
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 6.9457e-310   0.0
 2.76236e-318  0.0
 0.0           6.9457e-310

julia> fill!(a, 1.0);

julia> a .* 3
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 3.0  3.0
 3.0  3.0
 3.0  3.0

julia> a .+ a
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 2.0  2.0
 2.0  2.0
 2.0  2.0

julia> b = MoYeArray{Float64}(undef, @Layout((3,), (2,))) |> zeros!; # Create a vector

julia> a .- b
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 1.0  1.0
 1.0  1.0
 1.0  1.0

On GPU

(In-place) broadcasting on device should just work:

julia> using MoYe, CUDA

julia> function f()
           a = MoYeArray{Float64}(undef, @Layout((3,2)))
           fill!(a, one(eltype(a)))
           a .= a .* 2
           @cushow sum(a)
           b = CUDA.exp.(a)
           @cushow sum(b)
           return nothing
       end
f (generic function with 1 method)

julia> @cuda f()
sum(a) = 12.000000
sum(b) = 44.334337
CUDA.HostKernel{typeof(f), Tuple{}}(f, CuFunction(Ptr{CUDA.CUfunc_st} @0x0000026e00ca1af0, CuModule(Ptr{CUDA.CUmod_st} @0x0000026e15cfc900, CuContext(0x0000026da1fff8b0, instance e5a1871b578f5adb))), CUDA.KernelState(Ptr{Nothing} @0x0000000204e00000))