# Jackknife resampling

In statistics, the jackknife is a resampling technique especially useful for variance and bias estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset, calculating the estimate on the remaining observations, and then averaging these estimates. Given a sample of size $n$, the jackknife estimate is found by aggregating the estimates from each of the $n$ subsamples of size $n-1$.

The jackknife technique was developed by Maurice Quenouille (1924–1973) starting in 1949 and refined in 1956. John Tukey expanded on the technique in 1958 and proposed the name "jackknife" because, like a physical jack-knife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems, even though specific problems may be more efficiently solved with a purpose-designed tool.

The jackknife is a linear approximation of the bootstrap.

## Estimation

The jackknife estimate of a parameter can be found by estimating the parameter for each subsample omitting the $i$-th observation. For example, if the parameter to be estimated is the population mean of $x$, we compute the mean ${\bar {x}}_{i}$ for each subsample consisting of all but the $i$-th data point:

${\bar {x}}_{i}={\frac {1}{n-1}}\sum _{j=1,j\neq i}^{n}x_{j},\quad \quad i=1,\dots ,n.$

These $n$ estimates form an estimate of the distribution of the sample statistic if it were computed over a large number of samples. In particular, the mean of this sampling distribution is the average of these $n$ estimates:

${\bar {x}}={\frac {1}{n}}\sum _{i=1}^{n}{\bar {x}}_{i}.$

One can show explicitly that this ${\bar {x}}$ equals the usual estimate ${\frac {1}{n}}\sum _{i=1}^{n}x_{i}$, so the jackknife adds nothing for the mean itself; its value emerges for higher moments. A jackknife estimate of the variance of the estimator can be calculated from the variance of this distribution of ${\bar {x}}_{i}$:

$\operatorname {Var} ({\bar {x}})={\frac {n-1}{n}}\sum _{i=1}^{n}({\bar {x}}_{i}-{\bar {x}})^{2}={\frac {1}{n(n-1)}}\sum _{i=1}^{n}(x_{i}-{\bar {x}})^{2}.$

## Bias estimation and correction
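Before turning to bias, the leave-one-out means and the variance identity from the Estimation section can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the function names `leave_one_out_means` and `jackknife_variance_of_mean` and the sample values are made up for illustration.

```python
import math
from statistics import mean, variance

def leave_one_out_means(x):
    """Return the n subsample means, each omitting one observation."""
    n = len(x)
    total = sum(x)
    # Dropping x_i from the sum and dividing by n - 1 gives the i-th mean.
    return [(total - xi) / (n - 1) for xi in x]

def jackknife_variance_of_mean(x):
    """Jackknife variance estimate: (n - 1)/n * sum of squared deviations
    of the leave-one-out means from their average."""
    n = len(x)
    loo = leave_one_out_means(x)
    center = mean(loo)
    return (n - 1) / n * sum((m - center) ** 2 for m in loo)

# Illustrative data only (an assumption, not taken from the article).
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

loo_means = leave_one_out_means(sample)
jack_var = jackknife_variance_of_mean(sample)

# Classical estimate of Var(mean): unbiased sample variance divided by n.
classic_var = variance(sample) / len(sample)

print(math.isclose(mean(loo_means), mean(sample)))
print(math.isclose(jack_var, classic_var))
```

Both comparisons use `math.isclose` rather than `==` because the two sides are computed along different floating-point paths.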

The jackknife technique can be used to estimate the bias of an estimator calculated over the entire sample. Say ${\hat {\theta }}$ is the calculated estimator of the parameter of interest based on all ${n}$ observations. Let

${\hat {\theta }}_{\mathrm {(.)} }={\frac {1}{n}}\sum _{i=1}^{n}{\hat {\theta }}_{(i)}$

where ${\hat {\theta }}_{(i)}$ is the estimate of interest based on the sample with the $i$-th observation removed, and ${\hat {\theta }}_{\mathrm {(.)} }$ is the average of these "leave-one-out" estimates. The jackknife estimate of the bias of ${\hat {\theta }}$ is given by:

${\widehat {\text{bias}}}_{\mathrm {(\theta )} }=(n-1)({\hat {\theta }}_{\mathrm {(.)} }-{\hat {\theta }})$

and the resulting bias-corrected jackknife estimate of $\theta$ is given by:

${\hat {\theta }}_{\text{jack}}={\hat {\theta }}-{\widehat {\text{bias}}}_{\mathrm {(\theta )} }=n{\hat {\theta }}-(n-1){\hat {\theta }}_{\mathrm {(.)} }$

This removes the bias in the special case that the bias is $O(n^{-1})$ and reduces it to $O(n^{-2})$ in other cases.
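As an illustration of the bias correction, consider the plug-in variance estimator that divides by $n$ rather than $n-1$. Its bias is exactly of order $n^{-1}$, so the jackknife correction should recover the unbiased sample variance. The sketch below assumes a hypothetical helper `jackknife_bias_correct` and made-up data; it follows the formulas above directly and is not tuned for large samples.

```python
import math
from statistics import mean, pvariance, variance

def jackknife_bias_correct(x, estimator):
    """Return (theta_jack, bias_hat) for a given estimator.

    theta_hat : estimate on the full sample
    theta_dot : average of the n leave-one-out estimates
    bias_hat  : (n - 1) * (theta_dot - theta_hat)
    theta_jack: theta_hat - bias_hat
    """
    x = list(x)
    n = len(x)
    theta_hat = estimator(x)
    # Recompute the estimator on each subsample with observation i removed.
    loo = [estimator(x[:i] + x[i + 1:]) for i in range(n)]
    theta_dot = mean(loo)
    bias_hat = (n - 1) * (theta_dot - theta_hat)
    return theta_hat - bias_hat, bias_hat

# Illustrative data only (an assumption, not taken from the article).
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

# pvariance divides by n, so it is the biased plug-in estimator.
theta_jack, bias_hat = jackknife_bias_correct(sample, pvariance)

# variance divides by n - 1; the corrected estimate should match it.
print(math.isclose(theta_jack, variance(sample)))
```

The estimated bias is negative here, reflecting that the plug-in estimator underestimates the variance on average. Any other estimator that accepts a list of numbers can be passed in the same way.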