Estimation of cost differences for fixed error between timestepping and spacetime timeslab approaches

Balancing temporal and spatial error in a timestepping scheme

Per Fidkowski 2017 JCP:
$$
|\delta \bar{J}^{ts}| \approx |\delta \bar{J}^s| + |\delta \bar{J}^t|
$$
where the terms represent the total output error in a timestepping simulation,
the output error due to the spatial discretization, and the output error due to
the temporal discretization.

We can discretize the spatial domain using a Galerkin discretization, resulting
in the following relationship for the output error:
$$
|\delta \bar{J}^s| \sim \left( N_\mathrm{DOF}^s \right)^{- \frac{2(p + 1)}{d}}
$$
where $p$ is the formal order of the spatial scheme. Meanwhile, the temporal discretization is timestepped, and thus should have
output error according to:
$$
|\delta \bar{J}^t| \sim \left( N_\mathrm{DOF}^t \right)^{-(r + 1)}
$$
with $r$ being the formal order of the timestepping scheme.

We are interested in the expense of a simulation at a given amount of output
error, with the temporal and spatial errors balanced:
$$
|\delta \bar{J}^t| \approx |\delta \bar{J}^s| \approx \frac{1}{2} |\delta
\bar{J}^{ts}|
$$
This holds up to a constant factor when:
$$
\left( N_\mathrm{DOF}^t \right)^{-(r + 1)} \sim
\left( N_\mathrm{DOF}^s \right)^{- \frac{2(p + 1)}{d}}
$$
$$
N_\mathrm{DOF}^t \sim \left( N_\mathrm{DOF}^s \right)^{\frac{2(p + 1)}{d
(r + 1)}}
$$

The total degrees of freedom of a timestepping scheme are equal to:
$$
N_\mathrm{DOF}^{ts}= N_\mathrm{DOF}^{t} N_\mathrm{DOF}^{s}
$$
therefore:
$$
N_\mathrm{DOF}^{ts} \sim \left( N_\mathrm{DOF}^s \right)^{\frac{2(p + 1)}{d
(r + 1)} + 1}
$$
and
$$
N_\mathrm{DOF}^s \sim \left( N_\mathrm{DOF}^{ts} \right)^{\frac{1}{\frac{2(p
+ 1)}{d (r + 1)} + 1}}
$$
We can write, under the assumptions, that:
$$
|\delta \bar{J}^{ts}| \sim \left( N_\mathrm{DOF}^s \right)^{- \frac{2(p
+ 1)}{d}} \sim \left( N_\mathrm{DOF}^{ts} \right)^{- \frac{\frac{2(p
+ 1)}{d}}{\frac{2(p + 1)}{d (r + 1)} + 1}}
$$
There's a lot going on here... let's simplify by taking the logarithm
$$
\log |\delta \bar{J}^{ts}| \sim - \frac{\frac{2(p + 1)}{d}}{\frac{2(p + 1)}{d
(r + 1)} + 1} \log \left( N_\mathrm{DOF}^{ts} \right)
$$
$$
\log |\delta \bar{J}^{ts}| \sim - \frac{2 (p + 1)(r + 1)}{2 (p + 1) + d (r + 1)} \log \left( N_\mathrm{DOF}^{ts} \right) \
$$

Spacetime error

Spacetime error is much simpler to express:
$$
|\delta \bar{J}^{st}| \sim \left( N_\mathrm{DOF}^{st} \right)^{- \frac{2(p
+ 1)}{d + 1}}
$$
taking the logarithm:
$$
\log |\delta \bar{J}^{st}| \sim - \frac{2(p + 1)}{d + 1} \log
\left( N_\mathrm{DOF}^{st} \right)
$$

Comparison

When the spacetime and timestepping errors are about the same:
$$
|\delta \bar{J}^{st}| \approx |\delta \bar{J}^{ts}|
$$
we must have:
$$
-\frac{2(p + 1)}{d + 1} \log \left( N_\mathrm{DOF}^{st} \right) \sim
- \frac{2 (p + 1)(r + 1)}{2 (p + 1) + d (r + 1)} \log \left( N_\mathrm{DOF}^{ts} \right)
$$
$$
\log \left( N_\mathrm{DOF}^{st} \right) \sim \frac{(r + 1)(d + 1)}{2 (p + 1)
+ d (r + 1)} \log \left( N_\mathrm{DOF}^{ts} \right)
$$
$$
N_\mathrm{DOF}^{st} \sim \left( N_\mathrm{DOF}^{ts} \right)^{\frac{(r + 1)
(d + 1)}{2 (p + 1) + d (r + 1)}}
$$
Let's look at the exponent there:
$$
\frac{(r + 1)(d + 1)}{2 (p + 1) + d (r + 1)}= \frac{r + 1 + rd + d}{2p + 2 + rd + d}
$$
$r$ is the order of the timestepping scheme, $p$ is the order of the spatial
discretization as well as the spacetime discretization, $d$ is the physical
dimension of the problem. $d \geq 1$, $p \geq 0$, and $r \geq 0$.

Interestingly, when $r + 1 > 2p + 2$, the exponent will be greater than one, and
the asymptotic costs of the timestepping scheme will be less than the spacetime
scheme. However, when that is not the case, the spacetime scheme will have a
lower asymptotic cost. Typically, the highest order scheme we will use in time
is $r= 4$, in which case, the spacetime approach is going to be asymptotically
more efficient in terms of total degrees of freedom per unit error for $p \geq 2$. For $p= 0$, a timestepping scheme must use $r > 1$ to outpace the spacetime approach; for $p= 1$, $r > 3$ must apply, and for $p \geq 2$ demands $r \geq 5$. All of this implies that, at least asymptotically, a high-order spacetime adaptive approach will outpace the timestepping approach in the long run.

Of course, when $r + 1 = 2p + 2$, the crucial part will be the constant
coefficient that leads the terms, but this gives a forum for evaluating whether or not the spacetime approach is more cost-effective than the timestepping approach: comparing a order $p$ spacetime scheme with a timestepping scheme with order $2p + 1$ and spatial order $p$. Because of the results seen in Masayuki Yano's PhD thesis, Figure 7-4, there is reason to believe that the $\mathcal{O}(1)$ scaling of error in capturing a feature of width $\delta << L$ (where $L$ is the size of the domain) will apply to the spacetime problem, and this will likely lead to improved performance of the adaptive timeslab approach relative to the timestepping approach of spatially adapted grids for a wide class of problems of interest.