Linear prediction (LP) analysis is a ubiquitous analysis technique in current speech technology. The basis of LP analysis is the source-filter production model of speech. For voiced sounds in particular, the filter is assumed to be an all-pole linear filter and the source is considered to be a semi-periodic impulse train which is zero most of the time, i.e., the source is a sparse time series. LP analysis estimates the all-pole filter parameters that represent the spectral shape of the vocal tract. The accuracy of this estimation can be evaluated by observing the extent to which the residuals (the prediction error) of the corresponding prediction filter resemble the hypothesized source of excitation [1] (a perfect impulse train in the case of voiced speech). However, it is shown in [1] that even when the vocal tract filter follows an actual all-pole model, this criterion of goodness is not fulfilled by the classical minimum variance predictor. Apart from its theoretical physical significance, such a sparse representation forms the basis for many applications in speech technology. For instance, a class of efficient parametric speech coders is based on the search for a sparse excitation sequence feeding the LP synthesizer [2].

It is argued in [3] that the classical method fails to provide such a sparse representation because it relies on the minimization of the *l*_{2}-norm of the prediction error. It is known that the *l*_{2}-norm criterion is highly sensitive to outliers [4], i.e., the points having considerably larger norms of error. As a result, *l*_{2}-norm error minimization favors solutions with many small non-zero entries rather than sparse solutions having the fewest possible non-zero entries [4]. The *l*_{2}-norm is therefore not an appropriate objective function for problems where sparseness constraints are incorporated. Indeed, the ideal solution for sparse residual recovery is to directly minimize the cardinality of this vector, i.e., the *l*_{0}-norm of the prediction error, which yields a combinatorial optimization problem. Instead, to alleviate the exaggerated effect of the *l*_{2}-norm criterion at points with large norms of error, it is usual to consider the minimization of the *l*_{1}-norm, as it puts less emphasis on outliers. The *l*_{1}-norm can be regarded as a convex relaxation of the *l*_{0}-norm, and its minimization problem can be recast as a linear program and solved by convex programming techniques [5].
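The contrast between the two criteria can be seen on a deliberately tiny problem. The following is only an illustrative sketch, not part of the methods in [3–5]: it fits a single constant to hypothetical data (the values in `data` are made up for illustration), using the standard facts that the mean minimizes the sum of squared errors and the median minimizes the sum of absolute errors.

```python
import statistics

# Hypothetical data (assumption): mostly zeros with one large outlier.
data = [0.0, 0.0, 0.0, 0.0, 10.0]

# The constant minimizing the sum of squared errors (l2) is the mean;
# a constant minimizing the sum of absolute errors (l1) is the median.
c_l2 = statistics.mean(data)
c_l1 = statistics.median(data)

# Residuals of each fit.
res_l2 = [v - c_l2 for v in data]
res_l1 = [v - c_l1 for v in data]

# Count the non-zero residual entries (the l0 "norm" of each residual).
nonzero_l2 = sum(1 for r in res_l2 if r != 0.0)
nonzero_l1 = sum(1 for r in res_l1 if r != 0.0)

print("l2 fit:", c_l2, "non-zero residuals:", nonzero_l2)
print("l1 fit:", c_l1, "non-zero residuals:", nonzero_l1)
```

In this toy case the *l*_{2} fit is pulled toward the outlier and spreads the error over every sample, whereas the *l*_{1} fit leaves the outlier as the only non-zero residual.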

The *l*_{1}-norm minimization of residuals has already been shown to be beneficial for speech processing [6–8]. In [6], the stability issue of *l*_{1}-norm linear programming is addressed and a method is introduced that both yields an intrinsically stable solution and keeps the computational cost down. The approach is based on the Burg method for autoregressive parameter estimation, using the least absolute forward-backward error.

In [7], the authors compared the Burg method with their *l*_{1}-norm minimization method, solved with a modern interior point method, and showed that sparseness is not preserved with the Burg method. Later, in [8], they proposed a re-weighted *l*_{1}-norm minimization approach to enhance the sparsity of the residuals and to overcome the mismatch between *l*_{0}-norm and *l*_{1}-norm minimization, while keeping the problem solvable with convex programming tools. Initially, the *l*_{1}-norm minimization problem is solved using the interior point method; the resulting residuals are then used iteratively to re-weight the *l*_{1}-norm objective function such that less weight is given to the points having larger residual norms. The optimization problem thus iteratively approaches the solution of the ideal *l*_{0}-norm objective function. We also mention that an interesting review is made in [9, 10] of several solvers for the general problem of mixed *l*_{p}-*l*_{0}-norm minimization in the context of piece-wise constant function approximation; their adaptation to the problem of sparse linear prediction analysis could be beneficial (particularly the stepwise jump penalization algorithm, which is shown to be highly efficient and reliable in the detection of sparse events).

In this article, we propose a new and efficient solution to sparse LP analysis which is based on weighting the *l*_{2}-norm objective function, so as to maintain the computational tractability of the final optimization problem and to avoid the computational burden of convex programming. The weighting function plays the most important role in our solution in maintaining the sparsity of the resulting residuals. We first extract, from the speech signal itself, the points having the potential of attaining the largest norms of residuals (the glottal closure instants), and then we construct the weighting function such that the prediction error is relaxed at these points. Consequently, the weighted *l*_{2}-norm objective function can be minimized by solving the normal equations of a linear least squares problem. We show that our closed-form solution provides better sparseness properties compared to the *l*_{1}-norm minimization using the interior point method. Also, to show the usefulness of such a sparse representation, we use the resulting prediction coefficients inside a multi-pulse excitation (MPE) coder and we show that the corresponding multi-pulse excitation source provides slightly better synthesis quality compared to the estimated excitation of the classical minimum variance synthesizer.
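To make the idea of a weighted *l*_{2}-norm criterion concrete, the following is a minimal sketch under strong simplifying assumptions, not the method detailed later in the article. The signal is a synthetic, noise-free second-order all-pole filter driven by a perfect impulse train; the pulse positions are assumed to be known in advance (standing in for detected glottal closure instants); and the weighting function is a hypothetical binary one that sets the weight to zero at those positions. The names `weighted_lp`, `a_true`, `w_flat` and `w_sparse` are introduced only for this illustration.

```python
import numpy as np

def weighted_lp(x, order, w):
    """Weighted least-squares linear prediction via the normal equations.

    x     : 1-D signal
    order : prediction order
    w     : non-negative weight for each predicted sample n = order..len(x)-1
    Returns (a, r) with x[n] ~ sum_k a[k] * x[n-1-k] and r the residual.
    """
    N = len(x)
    # Column k holds the past samples x[n-1-k] for n = order..N-1.
    X = np.column_stack([x[order - k - 1 : N - k - 1] for k in range(order)])
    y = x[order:]
    Xw = X * w[:, None]                      # rows of X scaled by the weights
    a = np.linalg.solve(X.T @ Xw, Xw.T @ y)  # (X^T W X) a = X^T W y
    return a, y - X @ a

# Synthetic "voiced" signal (assumption): stable all-pole filter + impulse train.
a_true = np.array([1.6, -0.9])
N, period = 200, 25
e = np.zeros(N)
e[::period] = 1.0
x = np.zeros(N)
for n in range(N):
    x[n] = e[n]
    for k in range(len(a_true)):
        if n - 1 - k >= 0:
            x[n] += a_true[k] * x[n - 1 - k]

order = 2
# Classical criterion: every sample has the same weight.
w_flat = np.ones(N - order)
# Hypothetical weighting: relax the prediction error at the known pulse positions.
w_sparse = np.where(e[order:] != 0.0, 0.0, 1.0)

a_ls, r_ls = weighted_lp(x, order, w_flat)
a_w, r_w = weighted_lp(x, order, w_sparse)

print("unweighted:", a_ls, "non-negligible residuals:", int(np.sum(np.abs(r_ls) > 1e-6)))
print("weighted:  ", a_w, "non-negligible residuals:", int(np.sum(np.abs(r_w) > 1e-6)))
```

In this idealized setting, the weighted solution is obtained from a single linear solve, and its residual is non-negligible only at the excitation pulses. Real speech, noise and imperfect pulse detection would of course make the picture less clean.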

The article is organized as follows. In Section 2, we provide the general formulation of the LP analysis problem. In Section 3, we briefly review previous studies on sparse LP analysis and the numerical motivations behind them. We present our efficient solution in Section 4. In Section 5, the experimental results are presented and finally, in Section 6, we draw our conclusions and discuss perspectives.