Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test
MetadataShow full item record
The high impact of the lymph node ratio as a prognostic factor is widely established in colorectal cancer, and is being used as a categorized predictor variable in several studies. However, the cut-off points as well as the number of categories considered differ considerably in the literature. Motivated by the need to obtain the best categorization of the lymph node ratio as a predictor of mortality in colorectal cancer patients, we propose a method to select the best number of categories for a continuous variable in a logistic regression framework. Thus, to this end, we propose a bootstrap-based hypothesis test, together with a new estimation algorithm for the optimal location of the cut-off points called BackAddFor, which is an updated version of the previously proposed AddFor algorithm. The performance of the hypothesis test was evaluated by means of a simulation study, under different scenarios, yielding type I errors close to the nominal errors and good power values whenever a meaningful difference in terms of prediction ability existed. Finally, the methodology proposed was applied to the CCR-CARESS study where the lymph node ratio was included as a predictor of five-year mortality, resulting in the selection of three categories.