My latest work on the glmnet package has just been pushed to CRAN! In this release (v4.1), we extend the scope of regularized Cox models to include (start, stop] data and strata variables. In addition, we provide the survfit method for plotting survival curves based on the model (as the survival package does).
Why is this a big deal? As explained in Therneau and Grambsch (2000), the ability to work with start-stop responses opens the door to fitting regularized Cox models with:
- time-dependent covariates,
- time-dependent strata,
- left truncation,
- multiple time scales,
- multiple events per subject,
- independent increment, marginal, and conditional models for correlated data, and
- various forms of case-cohort models.
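To make the (start, stop] format concrete, here is a minimal sketch in plain Python (deliberately not glmnet's R interface) of how a subject with a time-dependent covariate is split into several rows, and how the risk set at a given time is read off those rows. The field names, the toy data, and the `risk_set` helper are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    subject: str   # hypothetical identifier, used here only for display
    start: float   # the interval is open on the left: (start, stop]
    stop: float
    event: int     # 1 if the event occurs at `stop`, 0 if censored there
    x: float       # covariate value that holds during this interval


# Subject A's covariate changes at time 3, so A contributes two rows.
# Subject B enters the study late, at time 2 (left truncation).
# Subject C is observed from time 0 and contributes a single row.
rows = [
    Row("A", 0.0, 3.0, 0, 0.0),
    Row("A", 3.0, 7.0, 1, 1.0),
    Row("B", 2.0, 5.0, 1, 0.5),
    Row("C", 0.0, 9.0, 0, 0.2),
]


def risk_set(rows: List[Row], t: float) -> List[Row]:
    """Return the rows at risk at time t, i.e. those with start < t <= stop."""
    return [r for r in rows if r.start < t <= r.stop]


at_5 = risk_set(rows, 5.0)
at_1 = risk_set(rows, 1.0)
print([(r.subject, r.x) for r in at_5])  # A enters with its updated covariate
print([r.subject for r in at_1])         # B has not yet entered at time 1
```

Because a subject's intervals do not overlap, at most one of its rows belongs to the risk set at any given time, and that row carries the covariate value in force at that time.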
glmnet v4.1 is now available on CRAN here. We have reorganized the package’s vignettes, with the new functionality described in the vignette “Regularized Cox Regression” (PDF version/web version). Don’t hesitate to reach out if you have questions.
(Note: This is joint work with Trevor Hastie, Balasubramanian Narasimhan and Rob Tibshirani.)
Thanks for the update, and really timely it is. I am currently working on regularization, trying to obtain the best lambda value for ridge regression. A vignette on that would be appreciated; there are so many copied examples around, and confusing ones abound.
Good work.
Regards,
Ibiloye
Hi, the main glmnet vignette would be most helpful for you. You can use cross-validation (CV) to find the value of lambda that gives the smallest CV error (see here: https://glmnet.stanford.edu/articles/glmnet.html#cross-validation).
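As a rough illustration of the idea only (plain Python rather than glmnet's own CV routine, with made-up one-predictor data and hypothetical helper names), K-fold CV over a grid of lambda values could be sketched like this:

```python
# Toy data: y is roughly 2 * x plus noise (made-up numbers for illustration).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [-4.1, -2.7, -2.2, -0.8, 1.3, 1.8, 3.2, 3.9]


def ridge_fit(x, y, lam):
    """One-predictor ridge without intercept.

    Minimizes sum((y - b * x)^2) + lam * b^2, whose solution is
    b = sum(x * y) / (sum(x^2) + lam).
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)


def cv_error(x, y, lam, k=4):
    """Mean squared prediction error over k interleaved folds."""
    n = len(x)
    total = 0.0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Fit on the training folds only, then score the held-out fold.
        b = ridge_fit([x[i] for i in train_idx], [y[i] for i in train_idx], lam)
        total += sum((y[i] - b * x[i]) ** 2 for i in test_idx)
    return total / n


lambdas = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]
errors = {lam: cv_error(xs, ys, lam) for lam in lambdas}
best_lambda = min(errors, key=errors.get)  # grid value with the smallest CV error
print(best_lambda, errors[best_lambda])
```

The penalty in this sketch is not scaled the way glmnet scales its objective, so the lambda values are not comparable with glmnet's; only the selection procedure carries over.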
Hi Kenneth,
Thank you so much for your work on updating the Cox model functionality of the glmnet package. It is great timing, as I have many possible predictors to include in my Cox model and would love to use the lasso penalty to reduce them. My question is: in the survival package, there is an “id” argument in the coxph function that lets you identify which rows belong to the same patient when time-dependent covariates spread a patient over multiple rows. How can you separate out patients in the glmnet version with time-dependent covariates? I don’t see an equivalent argument. Should you consider each patient to be a stratum and use the stratifySurv function?
Thank you,
Natalie
I’m not familiar with what the “id” argument does in survival::coxph… Considering each patient as its own stratum means that each patient has its own baseline hazard rate, which doesn’t seem correct to me.
I guess until I figure out how “id” is used to fit the Cox model it’ll be hard for me to advise. For now, glmnet doesn’t have such an argument, and probably won’t have one until there is a compelling use case for it.