Broadcasting
Broadcasting is only defined for MoYeArrays with static sizes.
In-place broadcasting preserves the original layout.
Out-of-place broadcasting always returns an owning array with a compact layout that has the same shape and the same stride order as the original.
julia> using MoYe
julia> a = MoYeArray{Float64}(undef, @Layout((3,2), (2,1)))
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 6.91606e-310  5.0e-324
 2.76237e-318  5.0e-324
 4.0e-323      6.9161e-310
julia> fill!(a, 1.0);
julia> a .* 3
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 3.0  3.0
 3.0  3.0
 3.0  3.0
julia> a .+ a
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 2.0  2.0
 2.0  2.0
 2.0  2.0
julia> b = MoYeArray{Float64}(undef, @Layout((3,), (2,))) |> zeros!; # Create a vector
julia> a .- b
3×2 MoYeArray{Float64, 2, ArrayEngine{Float64, 6}, StaticLayout{2, Tuple{Static.StaticInt{3}, Static.StaticInt{2}}, Tuple{Static.StaticInt{2}, Static.StaticInt{1}}}} with indices _1:_3×_1:_2:
 1.0  1.0
 1.0  1.0
 1.0  1.0
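To make "a compact layout with the same stride order" concrete, here is a minimal sketch in plain Julia. It does not use MoYe at all: `compact_strides_like` is a hypothetical helper written only for illustration, and it assumes the input strides are positive and distinct. The idea is to visit the modes from the smallest stride to the largest and assign each one the product of the extents of the modes visited before it, so the result has no padding but keeps the original ordering.

```julia
# Hypothetical helper, not part of MoYe: compute compact strides for `shape`
# that keep the same relative ordering as `strides`.
# Assumes all strides are positive and distinct.
function compact_strides_like(shape::NTuple{N,Int}, strides::NTuple{N,Int}) where {N}
    # Modes sorted from the fastest-varying (smallest stride) to the slowest.
    order = sortperm(collect(strides))
    result = zeros(Int, N)
    current = 1
    for mode in order
        result[mode] = current      # next free compact stride
        current *= shape[mode]      # skip over this mode's extent
    end
    return Tuple(result)
end

compact_strides_like((3, 2), (2, 1))   # already compact: (2, 1)
compact_strides_like((3, 2), (8, 1))   # padded, row-major order: (2, 1)
compact_strides_like((3, 2), (1, 4))   # padded, column-major order: (1, 3)
```

In the examples above, `a` has shape `(3,2)` and strides `(2,1)`, which is already compact, so the results of `a .* 3` and `a .+ a` report the same layout as `a`.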
On GPU
(In-place) broadcasting on device should just work:
julia> using CUDA
julia> function f()
           a = MoYeArray{Float64}(undef, @Layout((3,2)))
           fill!(a, one(eltype(a)))
           a .= a .* 2
           @cushow sum(a)
           b = CUDA.exp.(a)
           @cushow sum(b)
           return nothing
       end
f (generic function with 1 method)
julia> @cuda f()
sum(a) = 12.000000
sum(b) = 44.334337
CUDA.HostKernel{typeof(f), Tuple{}}(f, CuFunction(Ptr{CUDA.CUfunc_st} @0x0000026e00ca1af0, CuModule(Ptr{CUDA.CUmod_st} @0x0000026e15cfc900, CuContext(0x0000026da1fff8b0, instance e5a1871b578f5adb))), CUDA.KernelState(Ptr{Nothing} @0x0000000204e00000))