This paper summarizes a study of the classification performance of a vector space model (VSM) and of latent semantic indexing (LSI) applied to spam filtering. Both methods classify e-mail messages as spam or not spam using a feature set taken from SpamAssassin, a very widely used, de facto standard spam filtering system. The test data come partly from the official TREC 2005 data set and partly from a self-collected corpus. The investigation of LSI for spam filtering evaluates the relationship between two central aspects: (i) the truncation of the SVD in LSI and (ii) the resulting classification performance in this specific application context. It is shown that a surprisingly large amount of truncation is often possible without a heavy loss in classification performance. This forms the basis for good and extremely fast approximate (pre-)classification strategies, which are very useful in practice. The approaches investigated in this paper are shown to compare favorably to two important alternatives: (i) they achieve better classification results than SpamAssassin, and (ii) they are better and more robust than a related, previously proposed LSI-based approach that uses textual features.
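As a rough illustration of the truncation idea, the following minimal sketch builds a rank-k LSI model from a small feature-by-message matrix and labels a new message by its most cosine-similar training message in the reduced space. It is not the implementation used in the study: the toy matrix, the function names `lsi_fit` and `lsi_classify`, and the nearest-neighbor cosine rule are assumptions made purely for illustration.

```python
import numpy as np

def lsi_fit(A, k):
    """Rank-k truncated SVD of a feature-by-message matrix A (features x messages)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    return U[:, :k], s[:k], Vt[:k, :]

def lsi_classify(q, U_k, s_k, Vt_k, labels):
    """Return the label of the training message closest to q (cosine) in the k-dim space."""
    train = (s_k[:, None] * Vt_k).T   # training messages as rows, shape (messages, k)
    q_k = U_k.T @ q                   # project the query onto the k leading left singular vectors
    denom = np.linalg.norm(train, axis=1) * np.linalg.norm(q_k) + 1e-12
    sims = (train @ q_k) / denom      # cosine similarity to every training message
    return labels[int(np.argmax(sims))]

# Hypothetical toy data: rows are features (e.g. triggered rule indicators),
# columns are messages. Features 0-2 fire on spam, features 3-5 on non-spam.
A = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
], dtype=float)
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

queries = {
    "spam-like": np.array([1, 1, 1, 0, 0, 0], dtype=float),
    "ham-like": np.array([0, 0, 0, 1, 0, 0], dtype=float),
}

# Compare the full-rank model (k=6) with a heavily truncated one (k=2).
for k in (6, 2):
    model = lsi_fit(A, k)
    for name, q in queries.items():
        print(k, name, lsi_classify(q, *model, labels))
```

In this toy setting the two classes occupy separate feature blocks, so the predicted labels are the same at k=2 as at full rank; real feature sets are noisier, and the achievable degree of truncation has to be determined empirically.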