Any Type in a Customized Struct

Recently, I was trying to optimize some Julia code I wrote that consumed a lot of memory and was quite slow. Following the performance tips in the official manual, in particular the tips on Type Declarations and Type Stability, I was able to pinpoint one of the bottlenecks in terms of memory consumption: the use of the Any type in a customized struct.
My original code can be essentially simplified as follows.
struct MyBadType
    a::Array{Any}
end
When I create a new instance of MyBadType using a decently large vector, the memory consumption looks like this.
my_bad_instance = MyBadType(collect(1:100_000))
# MyBadType(Any[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 99991, 99992, 99993, 99994, 99995, 99996, 99997, 99998, 99999, 100000])
varinfo(r"my_bad_instance")
# name                 size summary
# ––––––––––––––– ––––––––– –––––––––
# my_bad_instance 1.526 MiB MyBadType
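A likely explanation for this size is that the elements of an Array{Any} are not stored inline: each slot holds a pointer to a separately allocated (boxed) value, so every integer costs roughly a pointer plus the value itself. The sketch below compares the two layouts using Base.summarysize, which reports the size of an object together with everything reachable from it. The variable names boxed and unboxed are only for illustration and are not part of the original code.

```julia
# Illustrative names; not part of the original code.
boxed   = collect(Any, 1:100_000)   # Vector{Any}: each slot points to a boxed Int
unboxed = collect(1:100_000)        # Vector{Int}: values are stored inline

println(isbitstype(eltype(boxed)))    # false
println(isbitstype(eltype(unboxed)))  # true

# summarysize counts the array plus everything reachable from it,
# so the boxed version is expected to be noticeably larger.
println(Base.summarysize(boxed))
println(Base.summarysize(unboxed))
```

The exact byte counts depend on the Julia version and platform, so only the relative difference should be read from this.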
One way to fix the memory consumption is to declare the element type of the vector explicitly, which in this particular case shrinks the instance to roughly half of its original size.
Note that there is an implicit type conversion here: the vector collect(1:100_000) is converted from Vector{Int64} to Vector{Float64} when it is passed to the constructor.
struct MyGoodType
    a::Array{Float64}
end
my_good_instance = MyGoodType(collect(1:100_000))
# MyGoodType([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0 … 99991.0, 99992.0, 99993.0, 99994.0, 99995.0, 99996.0, 99997.0, 99998.0, 99999.0, 100000.0])
varinfo(r"my_good_instance")
# name                    size summary
# –––––––––––––––– ––––––––––– ––––––––––
# my_good_instance 781.297 KiB MyGoodType
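The implicit conversion mentioned above can be checked directly. The following is a small sketch, assuming the same MyGoodType definition; the variable names are illustrative. It shows that an integer vector is copied into a new Float64 array, while a vector that already has the declared type is stored as-is.

```julia
struct MyGoodType
    a::Array{Float64}
end

ints   = collect(1:5)   # Vector{Int}
floats = rand(5)        # Vector{Float64}

from_ints   = MyGoodType(ints)
from_floats = MyGoodType(floats)

println(eltype(from_ints.a))       # Float64
println(from_ints.a === ints)      # false: a converted copy was made
println(from_floats.a === floats)  # true: already the right type, no copy
```

The copy in the first case is worth keeping in mind for large inputs, since the conversion temporarily needs memory for both arrays.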
An even better approach is to make the element type a parameter of the struct, so that the type of the vector is chosen (or inferred from the argument) when a new instance is created.
struct MyBetterType{T}
    a::Array{T}
end
my_better_instance = MyBetterType(collect(1:100_000))
# MyBetterType{Int64}([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 99991, 99992, 99993, 99994, 99995, 99996, 99997, 99998, 99999, 100000])
varinfo(r"my_better_instance")
# name                      size summary
# –––––––––––––––––– ––––––––––– –––––––––––––––––––
# my_better_instance 781.297 KiB MyBetterType{Int64}
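In the call above the parameter T is inferred from the argument. As a minimal sketch, assuming the same MyBetterType definition, the parameter can also be written out explicitly, in which case the elements are converted to the requested type. The variable names inferred and explicit are illustrative.

```julia
struct MyBetterType{T}
    a::Array{T}
end

inferred = MyBetterType(collect(1:5))           # T inferred from the argument
explicit = MyBetterType{Float64}(collect(1:5))  # T given; elements are converted

println(typeof(inferred))
println(typeof(explicit))
println(explicit.a)   # [1.0, 2.0, 3.0, 4.0, 5.0]
```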
Notice that I can now pass different types of numbers (Int64, Float64, etc.) without changing the struct definition.
my_better_instance_float = MyBetterType(rand(100_000))
# MyBetterType{Float64}([0.3799954925351946, 0.09394504229817657, 0.9777517079162116, 0.016242447464765775, 0.990499183701726, 0.4424052738990033, 0.7675470847284869, 0.13617624001850415, 0.2265064097636631, 0.8815719623179058 … 0.19912083332829478, 0.23186628406715581, 0.23339379488864898, 0.3912429397894809, 0.39492205317362883, 0.5167797738208597, 0.389639621406601, 0.8499027356361832, 0.36235758448613853, 0.8784435419352293])
varinfo(r"my_better_instance_float")
# name                            size summary
# –––––––––––––––––––––––– ––––––––––– –––––––––––––––––––––
# my_better_instance_float 781.297 KiB MyBetterType{Float64}
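One caveat may be worth checking here, since the manual's advice concerns fields with abstract types: Array{T} without a dimension leaves the number of dimensions open, so it is not a concrete type even once T is known. The sketch below uses a hypothetical variant, MyVectorType (not part of the original code), which fixes the dimension by declaring the field as Vector{T}. It relies only on the standard functions isconcretetype and fieldtype.

```julia
struct MyBetterType{T}
    a::Array{T}    # dimension left open: Array{T, N} where N
end

# Hypothetical variant that also fixes the dimension.
struct MyVectorType{T}
    a::Vector{T}   # alias for Array{T, 1}
end

println(isconcretetype(fieldtype(MyBetterType{Int64}, :a)))  # false
println(isconcretetype(fieldtype(MyVectorType{Int64}, :a)))  # true

v = MyVectorType(collect(1:5))
println(typeof(v))
```

Whether this matters in practice depends on how the field is used; it does not change the memory figures reported by varinfo above, but a concretely typed field gives the compiler more to work with when the field is accessed in hot code.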