Skip to content

Pruning Random Forest with Orthogonal Matching Trees

In this paper we propose a new method to reduce the size of Breiman’s Random Forests. Given a RandomForest and a target size, our algorithm builds a linear combination of trees which minimizes the training error. Selected trees, as well as weights of the linear combination are obtained by means of the Orthogonal Matching Pursuit algorithm. We test our method on many public benchmark datasets both on regression and binary classification, and we compare it to other pruning techniques. Experiments show that our technique performs significantly better or equally good on many datasets1. We also discuss the benefit and short-coming of learning weights for the pruned forest which lead us to propose to use a non-negative constraint on the OMP weights for better empirical results.

Luc Giffon, Charly Lamothe, Léo Bouscarrat, Paolo Milanesi, Farah Cherfaoui, and Sokol Ko, Pruning Random Forest with Orthogonal Matching Trees, Proc. of CAP 2020.

Click here to access the paper.

Releated Posts

Insights From Flink Forward 2024

In October, our CTO Sabri Skhiri attended the Flink Forward conference, held in Berlin, which marked the 10-year anniversary of Apache Flink. This event brought together experts and enthusiasts in the field of stream processing to discuss the latest advancements, challenges, and future trends. In this article, Sabri will delve into some of the keynotes and talks that took place during the conference, highlighting the noteworthy insights and innovations shared by Ververica and industry leaders.
Read More

Internships 2025

This document presents internships supervised by our consulting department or by our research & development department. Each project is an opportunity to feel both empowered and responsible for your own professional development and for your contribution to the company.
Read More