Workshop on methods for studying cancer patient survival with application in Stata

This one-day workshop was held on Thursday September 6 in conjunction with the Nordic and Baltic Stata Users Group meeting on Friday September 7.

Date: Thursday September 6, 2007

08:30 AM–09:00 AM Registration (and coffee) at the main entrance to the Department of Medical Epidemiology and Biostatistics, Nobels väg 12A

09:00 AM–12:00 PM Morning session
12:00 PM–1:00 PM Lunch
1:00 PM–5:00 PM Afternoon session

Venue: Karolinska Institutet
Department of Medical Epidemiology and Biostatistics
Nobels väg 12A
SE-171 77 Stockholm
Cost: Free
Registration: Closed

The aim of the workshop is to discuss and compare approaches to estimating and modeling relative survival (excess mortality) and their application in Stata. The workshop will include presentations from invited speakers who have developed methodology implemented in Stata.

The workshop is aimed at participants who have prior knowledge of methods for population-based cancer survival analysis or have a very strong background in biostatistics and Stata.

08:30-09:00 Registration and coffee/sandwich
09:00-09:15 Paul Dickman: introduction and welcome
09:15-09:40 Bernard Rachet: estimating relative survival using the strel command, including period estimation
09:45-10:15 Paul Dickman: estimating relative survival using the strs command, including period estimation
10:15-10:45 COFFEE
10:45-11:10 Paul Dickman: modelling excess mortality using step functions
11:15-11:40 Bernard Rachet: modelling excess mortality using fractional polynomials and spline functions (on the log hazard scale)
11:45-13:00 LUNCH
13:00-13:30 Chris Nelson: modelling excess mortality using spline functions on the log cumulative hazard scale, the strsrcs command
13:30-14:10 Discussion: comparison of different approaches to modelling the baseline hazard (invited discussant: Patrick Royston)
14:10-14:40 COFFEE
14:40-15:25 Paul Lambert: estimating cure models, the strsmix and strsnmix commands
15:30-16:15 Ula Nur: modelling relative survival in the presence of incomplete data
16:20-17:00 Discussion of Paul and Ula's presentations and directions for future research and development (invited discussant: Patrick Royston)
18:30- Dinner at restaurant två (Rörstrandsgatan 9A)

The primary focus will be on approaches to modeling excess mortality, particularly when proportional excess hazards are not appropriate. Many of the approaches differ only in how they model the baseline hazard. The approaches will be discussed from both a theoretical and a practical perspective (i.e., implementation using Stata). All speakers will use standard datasets in their presentations to facilitate comparison. These datasets will be distributed to all participants in advance, and Stata code presented during the workshop will be available on this web site. We will discuss proposals for future methodological development and possible collaboration, as well as possibilities for standardizing the Stata commands (to the extent possible).
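
For orientation, the relative survival framework referred to above is commonly written as follows. The notation is an assumption chosen for illustration and is not taken from the workshop materials: S is observed (all-cause) survival, S* is expected survival in a comparable general population, h and h* are the corresponding hazards, and lambda is the excess hazard.

```latex
% Relative survival: observed survival divided by expected survival
R(t) = \frac{S(t)}{S^{*}(t)}

% Equivalent statement on the hazard scale: total hazard is the sum of
% the expected hazard and the excess hazard
h(t; x) = h^{*}(t; x) + \lambda(t; x)

% Proportional excess hazards (PEH): covariates act multiplicatively
% on a baseline excess hazard \lambda_{0}(t)
\lambda(t; x) = \lambda_{0}(t) \, \exp(x\beta)
```

Under proportional excess hazards, exp(beta) is an excess hazard ratio that does not depend on t. The approaches in the program differ mainly in how the baseline term is specified: step functions, fractional polynomials, or spline functions.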

About the presenters

Paul Dickman joined the Department of Medical Epidemiology and Biostatistics at Karolinska Institutet in 1999, where he conducts research in epidemiology and biostatistics with a particular focus on cancer epidemiology. He has long been interested in the analysis of cancer patient survival, the topic of his 1997 doctoral thesis, for which he studied with Professor Timo Hakulinen. His primary interests lie in statistical methods for estimating and modelling relative survival. He has published widely in the field of cancer patient survival, is a coauthor of the Stata strs command for estimating and modelling relative survival, and has taught courses in cancer survival analysis in eight different countries.

Paul Lambert is a senior lecturer in Medical Statistics in the Centre for Biostatistics and Genetic Epidemiology, Department of Health Sciences at the University of Leicester. Over the last few years Paul's main research interest has been in developing methods for modelling relative survival, in particular modelling time-dependent covariate effects, incorporating period analysis in statistical models, and the estimation and modelling of 'cure' in population-based cancer studies. He has developed methods to use fractional polynomials in relative survival models for both the baseline excess hazard and time-dependent covariate effects (using the mfp command) and has also developed software to fit cure models for relative survival (strsmix and strsnmix). His other interests include the use of Bayesian methods in medical research, evidence synthesis and hierarchical models.

Chris Nelson is a postgraduate student at the University of Leicester. He has extended the flexible parametric model for censored survival data proposed by Royston and Parmar to enable modelling of excess mortality, and has written the Stata command strsrcs for estimating the model. The model provides smooth estimates of the relative survival and excess mortality rates by using restricted cubic splines on the log cumulative excess hazard scale.

Ula Nur studied for her first degree in Statistics and Computer Science at the University of Khartoum, Sudan. She then completed an MSc in Medical Statistics at the London School of Hygiene and Tropical Medicine. She worked at the Department of Neurosciences and Mental Health, Imperial College London, as a Research Associate in Medical Statistics for three years. For her doctoral thesis she investigated the impact of methods of handling missing data on estimates in cohort studies, with a focus on multiple imputation. She was awarded a PhD in Biostatistics from the University of Leeds in 2004. She joined the Cancer Survival Group as a Lecturer in Cancer Survival in December 2005. Ula’s current research is on geographical and socio-economic inequalities in cancer survival, and modelling relative survival in the presence of incomplete data.

Bernard Rachet is a clinical senior lecturer in cancer epidemiology in the Non-Communicable Disease Epidemiology Unit, Epidemiology and Public Health Department at the London School of Hygiene and Tropical Medicine. Qualified in medicine, he completed an MSc in epidemiology, then a PhD in epidemiology at the International Agency for Research on Cancer (IARC), France. Before joining the London School in 2002, he worked with Professor Jack Siemiatycki in the Epidemiology and Biostatistics Unit of INRS-IAF, Montreal, Canada, for three years. His work mainly focussed on cancer risks associated with occupational and environmental exposures, and on the flexible modelling of complex dose-response relations and time-dependent changes in relative risks. Since 2005, he has been co-principal investigator with Professor Michel Coleman in a new five-year Cancer Research UK Programme Grant focussed on cancer survival. With the Cancer Survival Group, he is carrying out a wide range of projects to quantify, describe and explain patterns and trends in cancer survival by socio-economic group, geographic area and ethnicity, as well as extending the methodology and tools for survival analysis. He is also responsible for the development of strel, a Stata program for relative survival analysis. He has co-organised several courses on cancer survival at the London School.

Introductory/background reading

Dickman PW, Coviello E, Hills M. Estimating and modelling relative survival. Stata Journal 2007 (in press).

Dickman PW, Adami HO. Interpreting trends in cancer patient survival. Journal of Internal Medicine 2006;260:103-117.

Dickman PW, Sloggett A, Hills M, Hakulinen T. Regression models for relative survival. Statistics in Medicine 2004;23:51-64.

Relevant papers on modelling the baseline hazard using flexible functions

Nelson C, Lambert PC, Squire IB, Jones DR. Flexible parametric models for relative survival, with application in coronary heart disease. Statistics in Medicine 2007 (in press).

Relevant papers on cure fraction models in population-based cancer survival analysis

Lambert PC. Modeling of the cure fraction in survival studies. Stata Journal 2007 (in press).

Lambert PC, Thompson JR, Weston CL, Dickman PW. Estimating and modelling the cure fraction in population-based cancer survival analysis. Biostatistics 2007;8:576-94.

Lambert PC, Dickman PW, Osterlund P, Andersson T, Sankila R, Glimelius B. Temporal trends in the proportion cured for cancer of the colon and rectum: A population-based study using data from the Finnish Cancer Registry. International Journal of Cancer 2007 (in press).

Presentation handouts

Bernard Rachet. Estimating relative survival using the strel command, including period estimation

Paul Dickman. Estimating and modelling relative survival using the -strs- package

Bernard Rachet. Modelling excess mortality using fractional polynomials and spline functions (on the log hazard scale)

Paul Lambert. Models for Estimating "Cure" from Cancer

Chris Nelson. Using Spline Functions on the Log Cumulative Hazard scale, the STRSRCS Command

Ula Nur. Modeling relative survival in the presence of incomplete data

Patrick Royston. Invited discussion

Stata commands and data files for the workshop

Stata 9 was used during the workshop. The Stata commands, data files, and do files used during the workshop are distributed as two Stata packages, one package for the cure modelling commands and one for all other files. Once installed, the packages can be updated using -adoupdate-. I suggest you create a new directory, set it as the Stata working directory, and issue the following Stata commands.

net install, all
net install, all

The strel command can be downloaded here (registration required).

The packages contain the following:

Stata packages
strs (Paul Dickman et al.)
strsrcs (Chris Nelson)
strsmix and strsnmix (Paul Lambert)
ice (Patrick Royston)
mim (JC Galati, P Royston & JB Carlin)

Data sets
Two data sets kindly provided by the Finnish cancer registry (colon carcinoma and skin melanoma)
Finnish general population mortality rates (popmort.dta)

Life table estimation using strs, saving data for modelling (period analysis)

Modelling excess mortality using Poisson regression with a step function for the baseline hazard

Modelling the colon carcinoma data using strsrcs:
set up dataset
use models to get survival and hazard
calculate excess hazard rate ratios
use age as continuous

Modelling the colon carcinoma data using cure models

Modelling the colon carcinoma data in the presence of incomplete data
Note: the above do file calls the -ice-, -mim-, and -mvpatterns- packages. -ice- and -mim- are installed with the workshop package. -mvpatterns- can be installed as follows:
net install dm91, from(

Comparison of various approaches to modelling the colon carcinoma data

These two files were provided by Paul Lambert. The first fits various proportional excess hazards models, compares estimates of the excess hazard ratio, and plots the different estimates of the baseline hazard. The second fits non-proportional excess hazards for age group and plots the different estimates of the excess hazard ratio for age group 4. Both files require xpredict to be installed.

Log excess hazard ratios and standard errors for various proportional excess hazards (PEH) models
 Variable |    strsrcs   strsnmix  piecewise         FP    splines
  agegrp2 |     0.0798     0.0760     0.0816     0.0800     0.0797
          |     0.0644     0.0645     0.0644     0.0644     0.0644
  agegrp3 |     0.2047     0.1979     0.2087     0.2033     0.2029
          |     0.0594     0.0595     0.0595     0.0595     0.0595
  agegrp4 |     0.5262     0.5154     0.5506     0.5362     0.5350
          |     0.0601     0.0602     0.0601     0.0601     0.0601
   female |    -0.0084    -0.0031    -0.0039    -0.0069    -0.0065
          |     0.0258     0.0259     0.0258     0.0257     0.0257
 year8594 |    -0.1859    -0.1926    -0.1934    -0.1842    -0.1850
          |     0.0250     0.0251     0.0250     0.0249     0.0250
                                                      legend: b/se
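
The estimates above are on the log scale. As a minimal sketch (written in Python rather than Stata, purely for illustration), the following converts the agegrp4 row into excess hazard ratios with Wald-type 95% confidence intervals. The function name excess_hazard_ratio and the normal approximation with z = 1.96 are assumptions for illustration and are not part of the workshop files.

```python
import math

# Log excess hazard ratios (b) and standard errors (se) for agegrp4,
# copied from the table above; keys are the model names in the table header.
AGEGRP4 = {
    "strsrcs": (0.5262, 0.0601),
    "strsnmix": (0.5154, 0.0602),
    "piecewise": (0.5506, 0.0601),
    "FP": (0.5362, 0.0601),
    "splines": (0.5350, 0.0601),
}


def excess_hazard_ratio(b, se, z=1.96):
    """Return (ratio, lower, upper) by exponentiating b and b -/+ z*se.

    Assumes the estimate is approximately normal on the log scale,
    which is an illustrative assumption rather than a workshop result.
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


if __name__ == "__main__":
    for model, (b, se) in AGEGRP4.items():
        ehr, lower, upper = excess_hazard_ratio(b, se)
        print(f"{model:>10}: EHR = {ehr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

On this scale the five models give excess hazard ratios for agegrp4 between about 1.67 and 1.73 relative to the reference age group, which shows how little the choice of baseline hazard model changes the covariate estimates here.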