Publication Details
AFRICAN RESEARCH NEXUS
SHINING A SPOTLIGHT ON AFRICAN RESEARCH
mathematics
Calibrating random forests for probability estimation
Statistics in Medicine, Volume 35, No. 22, Year 2016
Description
Probabilities can be consistently estimated using random forests. It is, however, unclear how random forests should be updated to make predictions for other centers or at different time points. In this work, we present two approaches for updating random forests for probability estimation. The first method has been proposed by Elkan and may be used for updating any machine learning approach yielding consistent probabilities, so-called probability machines. The second approach is a new strategy specifically developed for random forests. Using the terminal nodes, which represent conditional probabilities, the random forest is first translated to logistic regression models. These are, in turn, used for re-calibration. The two updating strategies were compared in a simulation study and are illustrated with data from the German Stroke Study Collaboration. In most simulation scenarios, both methods led to similar improvements. In the simulation scenario in which the stricter assumptions of Elkan's method were not met, the logistic regression-based re-calibration approach for random forests outperformed Elkan's method. It also performed better on the stroke data than Elkan's method. The strength of Elkan's method is its general applicability to any probability machine. However, if the strict assumptions underlying this approach are not met, the logistic regression-based approach is preferable for updating random forests for probability estimation. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
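The abstract does not spell out either updating rule. As a rough illustration only, the sketch below assumes that the "Elkan" update is the standard prior-shift correction: predicted odds are rescaled by the ratio of new to old prevalence odds, which is valid only if the event prevalence is the sole quantity that differs between the development and target settings. The function name `adjust_for_base_rate` and its parameters are hypothetical, and this is not the authors' implementation.

```python
def adjust_for_base_rate(p, base_old, base_new):
    """Shift a predicted probability from one event prevalence to another.

    Assumption (not stated in the abstract): only the prevalence differs
    between the development and target populations, so the predicted odds
    can be rescaled by the ratio of the new to the old prevalence odds.
    """
    for name, value in (("base_old", base_old), ("base_new", base_new)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")

    # Bayes' rule with swapped priors, written without explicit odds so
    # that p = 0 and p = 1 are handled without division by zero.
    numerator = p * (1.0 - base_old) * base_new
    denominator = numerator + (1.0 - p) * base_old * (1.0 - base_new)
    return numerator / denominator


# Hypothetical usage: a forest trained where 30% of patients had the event,
# applied at a center where only 10% do.
updated = [adjust_for_base_rate(p, 0.30, 0.10) for p in (0.05, 0.30, 0.80)]
```

A prediction equal to the old prevalence maps exactly onto the new prevalence, and the ordering of patients by risk is unchanged. When more than the prevalence differs between settings, this correction is not sufficient, which is the situation in which the abstract reports that the logistic regression-based re-calibration performed better.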
Authors & Co-Authors
Dankowski, Theresa
Germany, Lubeck
Universitätsklinikum Schleswig-Holstein Campus Lübeck
Ziegler, Andreas E.
Germany, Lubeck
Universitätsklinikum Schleswig-Holstein Campus Lübeck
Germany, Lubeck
Universität zu Lübeck
Germany, Berlin
Deutsches Zentrum für Herz-Kreislauf-Forschung e. V.
South Africa, Durban
University of KwaZulu-Natal
Statistics
Citations: 41
Authors: 2
Affiliations: 4
Identifiers
DOI: 10.1002/sim.6959
ISSN: 0277-6715
e-ISSN: 1097-0258
Research Areas
Health System And Policy
Noncommunicable Diseases