Previously we had a two issues when building cuda executables
that required separable compilation. The first was that we didn't
propagate FLAGS causing any -arch / -gencode flags to be dropped, and
secondly generators such as ninja would use the CXX language flags
instead of CUDA when the executable was mixed language.