Data
The user can upload their own data or use the example data from the DIVINE project.
To use new data, upload a csv file. The csv file must have the following format:
To better understand the variables x_time and x_status, the following figure illustrates the paths of two specific patients from the DIVINE dataset, \(\text{id}=8\) and \(\text{id}=1\) respectively.
The first patient is admitted to the hospital without severe pneumonia (nopneum_time=0, nopneum_status=1), is diagnosed with severe pneumonia after 1 day in hospital (pneum_time=1, pneum_status=1), needs non-invasive mechanical ventilation at day 2 (NIMV_time=2, NIMV_status=1) and invasive mechanical ventilation at day 3 (IMV_time=3, IMV_status=1), and finally dies at day 10 (death_time=10, death_status=1). Consequently, that patient has not reached the recovery and discharge states, and both states are censored at time 10 (reco_time=10, reco_status=0 and dcharg_time=10, dcharg_status=0).
The second patient has a better evolution: although he/she needs non-invasive mechanical ventilation at day 7.5, he/she recovers from severe pneumonia at day 13 and is discharged at day 23.
Although the time until each state is computed as the difference between the dates of entry into the previous and the current state, the time until non-invasive mechanical ventilation for this second patient is 7.5 days. This is because some patients in the dataset enter two states on the same day, which is not allowed in multistate models (MSMs). To solve this problem, we applied a half-day imputation to the patients who enter two states on the same day. Since the second patient was diagnosed with severe pneumonia and needed non-invasive mechanical ventilation on the seventh day, the imputation yields NIMV_time=7.5.
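The half-day imputation described above can be sketched as follows. This is only an illustration in Python, not the app's actual code: the function name `impute_half_day`, the list-of-tuples layout, the entry time 0 for the first state and the handling of more than two states on the same day are all assumptions.

```python
def impute_half_day(path):
    """Shift same-day entries by half a day.

    path: list of (state, entry_time) tuples in the order the states were visited.
    A state entered at the same time as the previous one (or earlier, after a
    previous shift) is moved to half a day after the previous state. How the
    app treats three or more states on one day is NOT stated in the text;
    chaining the shifts is an assumption of this sketch.
    """
    adjusted = []
    previous_time = None
    for state, time in path:
        if previous_time is not None and time <= previous_time:
            time = previous_time + 0.5
        adjusted.append((state, time))
        previous_time = time
    return adjusted


# Second patient of the example: severe pneumonia and NIMV both on day 7.
# The entry time 0 for "nopneum" is assumed for illustration.
patient = [("nopneum", 0), ("pneum", 7), ("NIMV", 7), ("reco", 13), ("dcharg", 23)]
result = impute_half_day(patient)
print(result)
```

With this input, only the NIMV entry is moved, from day 7 to day 7.5, which matches the value NIMV_time=7.5 quoted in the text.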
A table showing the data (uploaded data or example data) is returned.
If desired, the user can change the labels of the states and covariates using the boxes. It is very important to click the save state/covariate label button; otherwise the new label will not be saved.
For the covariates of the dataset the following plots are returned:
The descriptive table gives some information about the covariates:
Model specification
The user specifies the transitions to include in the model, the transition-specific covariates to include in the model, and the follow-up time to consider together with its time unit.
It is very important to click on the save model specification button; otherwise the model will not be saved.
In the Define the transitions box, the origin and destination states are selected using the pull-down menus, and each transition is created or deleted with the add/delete buttons.
This app does not allow bidirectional transitions or loops in the multistate model. If a loop is detected, a popup appears indicating that a loop has been introduced, and the last transition is not added.
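A loop check of this kind can be sketched as a reachability test on the directed graph of transitions. The following Python snippet is an assumption about how such a check could work, not the app's implementation; the function name `creates_loop` and the state names are illustrative.

```python
def creates_loop(transitions, new_transition):
    """Return True if adding new_transition would create a loop.

    transitions: list of (origin, destination) pairs already in the model.
    new_transition: (origin, destination) pair the user wants to add.
    Adding origin -> destination closes a loop exactly when origin can
    already be reached from destination. A bidirectional transition is the
    special case of a loop of length two.
    """
    origin, destination = new_transition
    if origin == destination:
        return True
    successors = {}
    for a, b in transitions:
        successors.setdefault(a, set()).add(b)
    stack, seen = [destination], set()
    while stack:
        node = stack.pop()
        if node == origin:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(successors.get(node, ()))
    return False


model = [("pneum", "NIMV"), ("NIMV", "IMV")]
print(creates_loop(model, ("IMV", "pneum")))   # closes pneum -> NIMV -> IMV -> pneum
print(creates_loop(model, ("NIMV", "pneum")))  # bidirectional with pneum -> NIMV
print(creates_loop(model, ("pneum", "IMV")))   # allowed: no way back from IMV to pneum
```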
In the Multistate model diagram box the updated diagram of the defined model is returned. This diagram is automatically updated when a transition is added or deleted. In order to make the diagram more understandable the states are plotted in different colors: orange (initial states), blue (transient) and magenta (absorbing).
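The three kinds of states can be derived from the list of transitions alone. The sketch below assumes that an initial state is one with no incoming transition and an absorbing state is one with no outgoing transition; the text does not spell out these definitions, and the function name and color mapping are illustrative.

```python
# Colors as described in the text for the diagram.
COLORS = {"initial": "orange", "transient": "blue", "absorbing": "magenta"}


def classify_states(transitions):
    """Classify every state appearing in a list of (origin, destination) pairs."""
    origins = {a for a, _ in transitions}
    destinations = {b for _, b in transitions}
    kinds = {}
    for state in origins | destinations:
        if state not in destinations:
            kinds[state] = "initial"      # assumed: nothing leads into it
        elif state not in origins:
            kinds[state] = "absorbing"    # assumed: nothing leads out of it
        else:
            kinds[state] = "transient"
    return kinds


model = [("nopneum", "pneum"), ("pneum", "NIMV"), ("NIMV", "death"), ("NIMV", "reco")]
kinds = classify_states(model)
for state, kind in sorted(kinds.items()):
    print(state, kind, COLORS[kind])
```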
In the Number of events for each transition box, the number of events for each transition is shown. As loops are not allowed, the number of events for each transition can be interpreted as the number of individuals who make that transition, because individuals make each transition of their path only once.
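To make the link between the x_time/x_status variables and the event counts concrete, the sketch below counts transitions between consecutively visited states for the two example patients. It is only an illustration: the helper names are hypothetical, the app may count events differently (for example, only for the transitions defined by the user), and the values of the second patient that the text does not give (nopneum_time=0 and censoring of the unreached states at day 23) are assumptions.

```python
STATES = ["nopneum", "pneum", "NIMV", "IMV", "reco", "dcharg", "death"]


def make_record(reached, censor_time):
    """Build x_time / x_status entries; unreached states are censored."""
    record = {}
    for s in STATES:
        if s in reached:
            record[s + "_time"], record[s + "_status"] = reached[s], 1
        else:
            record[s + "_time"], record[s + "_status"] = censor_time, 0
    return record


def count_transitions(records):
    """Count moves between consecutively visited states (status == 1)."""
    counts = {}
    for record in records:
        visited = sorted(
            (record[s + "_time"], s) for s in STATES if record[s + "_status"] == 1
        )
        for (_, a), (_, b) in zip(visited, visited[1:]):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


# First patient: values taken from the text.
p1 = make_record({"nopneum": 0, "pneum": 1, "NIMV": 2, "IMV": 3, "death": 10}, 10)
# Second patient: nopneum_time=0 and the censoring time 23 are assumed.
p2 = make_record({"nopneum": 0, "pneum": 7, "NIMV": 7.5, "reco": 13, "dcharg": 23}, 23)

counts = count_transitions([p1, p2])
for transition, n in sorted(counts.items()):
    print(transition, n)
```

Because no individual can visit a state twice, each individual contributes at most one event to each transition, which is why the counts can be read as numbers of individuals.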
In the Time specification box the follow-up time and the time units for the plots are selected.
In the Covariates per transition box the covariates of interest for each transition are selected. Those selected covariates will be taken into account to fit the model and as potential characteristics to make the predictions.
de Wreede LC, Fiocco M, and Putter H (2011). mstate: An R Package for the Analysis of Competing Risks and Multi-State Models. Journal of Statistical Software, Volume 38, Issue 7.
Exploring the data
The user receives some descriptive information of the data.
In this section no input is needed, the states of the model will be used.
For each initial or transient state of the model a boxplot is returned representing the length of stay in each of those states.
Those boxplots give the user the following information:
Mody A, Lyons PG, Vazquez Guillamet C, et al (2020). The Clinical Course of Coronavirus Disease 2019 in a US Hospital System: A Multistate Analysis. American Journal of Epidemiology, Volume 190, No. 4.
In this section no input is needed, the states of the model will be used.
The cumulative incidence plot shows, for each absorbing state, the proportion of individuals who have reached that state by each time point.
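As an illustration of what a cumulative incidence curve contains, the sketch below computes a nonparametric (Aalen-Johansen type) estimate for the simple case in which every individual starts in the same state and ends in one of several absorbing states or is censored. This is an assumption about the setting and not necessarily the estimator used by the app; the function name and the toy data are hypothetical.

```python
def cumulative_incidence(times, states):
    """Estimate the cumulative incidence of each absorbing state.

    times[i]:  time of the last observation of individual i.
    states[i]: absorbing state reached at times[i], or None if censored.
    Returns {state: [(time, cumulative incidence), ...]}.
    """
    absorbing = sorted({s for s in states if s is not None})
    curves = {s: [] for s in absorbing}
    current = {s: 0.0 for s in absorbing}
    surviving = 1.0  # probability of not having reached any absorbing state yet
    event_times = sorted({t for t, s in zip(times, states) if s is not None})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        total_events = 0
        for s in absorbing:
            events = sum(1 for u, v in zip(times, states) if u == t and v == s)
            current[s] += surviving * events / at_risk
            curves[s].append((t, current[s]))
            total_events += events
        surviving *= (at_risk - total_events) / at_risk
    return curves


# Toy data (hypothetical): five individuals, one of them censored at time 3.
times = [1, 2, 3, 4, 5]
states = ["death", "discharge", None, "death", "discharge"]
curves = cumulative_incidence(times, states)
for state, curve in curves.items():
    print(state, curve)
```

At any time point, the values of the curves add up to the proportion of individuals estimated to have left the non-absorbing states by that time.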
There are three different inputs:
Two groups of instantaneous hazard plots are returned: one for the transitions that start in the selected state and another for the transitions that end in the selected state.
If a covariate has been selected, this covariate is used to stratify the data. If a numeric covariate is chosen, in each group two plots will appear: one for individuals with a value above the median value of the covariate, and another for the individuals with a value under the median value of the covariate. If a categorical covariate is chosen, for each category of the covariate in each group one graph will appear.
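The split of a numeric covariate at its median can be sketched as below. The function name is illustrative, and the treatment of values exactly equal to the median (placed in the lower group here) is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def split_at_median(values):
    """Label each value as 'above' or 'below' the median of the covariate.

    Values equal to the median are labeled 'below'; this tie rule is an
    arbitrary choice made for this sketch.
    """
    cut = median(values)
    return ["above" if v > cut else "below" for v in values]


ages = [40, 50, 60, 70]  # hypothetical ages; the median is 55
groups = split_at_median(ages)
print(list(zip(ages, groups)))
```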
If for a specific transition there are not enough individuals to estimate the instantaneous hazard, the instantaneous hazards of that transition won’t appear in the graph.
The instantaneous hazard indicates the risk of going through a specific transition at exactly time t. Consequently, when no covariate is selected, the graph can be interpreted as the risk of transition for the general population. When a covariate is selected, the app uses it as a stratifying covariate, so on one side we have the risk of transition for one group of individuals (e.g., women, or age above the median age) and on the other side the risk of transition for another group (e.g., men, or age below the median age).
de Wreede LC, Fiocco M, and Putter H (2011). mstate: An R Package for the Analysis of Competing Risks and Multi-State Models. Journal of Statistical Software, Volume 38, Issue 7.
Mody A, Lyons PG, Vazquez Guillamet C, et al (2020). The Clinical Course of Coronavirus Disease 2019 in a US Hospital System: A Multistate Analysis. American Journal of Epidemiology, Volume 190, No. 4.
Model output
The user decides which type of model to fit, and the model is fitted including the previously selected transition-specific covariates. Forest plots are returned and the model can be validated. Finally, the user can compare different fitted models.
Selection of the type of model. For the moment only a Cox model is available.
Take into account that the previously selected transition specific covariates will be considered when fitting the model.
If desired, the user can compute the logarithmic score by clicking on the compute the logarithmic score button. This computation takes some time.
If the model includes covariates, three tables are returned with the most important information of the model.
In the first table the following information is collected:
In the second table the value of the log likelihood of the fitted and the null model are shown, as well as the Akaike information criterion (AIC) of the fitted model.
In the third table the following information is collected:
If a null model is fitted, that is, a model without any covariate, only one table is returned. This table contains the value of the log likelihood of the null model.
If the user clicks on the compute the logarithmic score button, this score will be returned.
Table 1:
Positive coefficient -> the instantaneous risk increases with each unit of the covariate. For example, if we compare two patients with the same characteristics except that one is 50 years old and the other is 65 years old, and the estimated coefficient is \(\text{coef} = 0.02\), the 65-year-old has \(\exp((65-50) \times \text{coef}) = \exp(15 \times 0.02) = 1.35\) times the risk of transition of the 50-year-old.
Negative coefficient -> the instantaneous risk decreases with each unit of the covariate. For example, if we compare two patients with the same characteristics except that one is 55 years old and the other is 60 years old, and the estimated coefficient is \(\text{coef} = -0.04\), the 60-year-old has \(\exp((60-55) \times \text{coef}) = \exp(5 \times (-0.04)) = 0.82\) times the risk of transition of the 55-year-old or, equivalently, the 55-year-old has \(1/\exp((60-55) \times \text{coef}) = 1/0.82 = 1.22\) times the risk of transition of the 60-year-old.
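The two worked examples above can be reproduced with a few lines of Python. The function name `hazard_ratio` is only illustrative; the formula is the one used in the text, \(\exp((b - a) \times \text{coef})\).

```python
import math


def hazard_ratio(coef, value_a, value_b):
    """Hazard ratio of an individual with covariate value_b versus value_a."""
    return math.exp((value_b - value_a) * coef)


hr_positive = hazard_ratio(0.02, 50, 65)    # 65-year-old versus 50-year-old
hr_negative = hazard_ratio(-0.04, 55, 60)   # 60-year-old versus 55-year-old

print(round(hr_positive, 2))       # 1.35
print(round(hr_negative, 2))       # 0.82
print(round(1 / hr_negative, 2))   # 1.22
```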
Table 2:
The log likelihood of a model is used to compare the fitting of different models. We assume that the model with the higher log likelihood provides a better fit of the data.
The AIC of a model is used to compare different models. The one with a lower AIC is considered to be better than the other.
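For reference, the AIC combines the log likelihood with the number of estimated coefficients as \(\text{AIC} = 2k - 2\log L\). The sketch below uses hypothetical log likelihood values to show how two models would be compared; for a Cox model the log likelihood is a partial log likelihood.

```python
def aic(log_likelihood, n_coefficients):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_coefficients - 2 * log_likelihood


# Hypothetical values for two fitted models.
aic_small = aic(log_likelihood=-105.0, n_coefficients=2)
aic_large = aic(log_likelihood=-104.5, n_coefficients=5)
print(aic_small, aic_large)
```

In this made-up comparison the larger model has the higher log likelihood but also the higher AIC, because the gain in fit does not compensate for the three extra coefficients.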
Table 3:
The three tests that appear in that table analyze whether the coefficients of the model can be assumed to be different from 0. The test column shows the value of the test statistic, the df column shows the number of estimated coefficients, and the p-value column shows the corresponding p-value. It is important to take into account that if \(\text{p-value} < 10^{-6}\), the app displays \(\text{p-value} = 0\).
Logarithmic score:
The logarithmic score cannot be interpreted on its own, but if we want to choose the model with the better predictive performance, we need to choose the one with the lower logarithmic score.
de Wreede LC, Fiocco M, and Putter H (2011). mstate: An R Package for the Analysis of Competing Risks and Multi-State Models. Journal of Statistical Software, Volume 38, Issue 7.
Meira-Machado L, de Uña-Álvarez J, Cadarso-Suárez C, Andersen PK (2009). Multi-state models for the analysis of time-to-event data. Statistical Methods in Medical Research, Volume 18, Issue 2.
The user needs to choose two things related to the graph:
A forest plot is returned representing the estimated hazard ratios and their confidence intervals for the covariates taken into account in the selected transition.
As those plots represent the estimated coefficients, if no covariate is selected there won’t be any graph in this section.
The forest plot provides the hazard ratios and their confidence intervals for the covariates taken into account in the selected transition. If the covariate is categorical, these hazard ratios compare each category with the reference category of that covariate, while if the covariate is numeric, the hazard ratios are computed for an increment of one unit in that covariate. As a one-unit increment might not be easily interpretable, the app allows the user to enter the number of units to be used in the graph.
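Rescaling a hazard ratio to an increment of several units can be sketched as follows. The snippet assumes a Wald-type 95% confidence interval built from the coefficient and its standard error; this is a common choice but an assumption here, as are the function name and the numeric values.

```python
import math


def scaled_hazard_ratio(coef, se, units=1.0, z=1.96):
    """Hazard ratio and Wald-type confidence interval for an increment of `units`."""
    estimate = math.exp(units * coef)
    lower = math.exp(units * (coef - z * se))
    upper = math.exp(units * (coef + z * se))
    return estimate, lower, upper


# Hypothetical coefficient for age (per year) and its standard error.
hr_1, low_1, up_1 = scaled_hazard_ratio(0.02, 0.005, units=1)
hr_10, low_10, up_10 = scaled_hazard_ratio(0.02, 0.005, units=10)
print(round(hr_1, 3), round(low_1, 3), round(up_1, 3))
print(round(hr_10, 3), round(low_10, 3), round(up_10, 3))
```

Changing the number of units rescales the estimate and both interval limits on the log scale, so whether the interval contains 1 does not depend on the chosen increment.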
Based on that graph, we say that the covariate has a significant effect on the transition if the confidence interval of its hazard ratio does not include 1 (equivalently, the confidence interval of its coefficient does not include 0), and we say that the covariate does not have a significant effect otherwise.
Mody A, Lyons PG, Vazquez Guillamet C, et al (2020). The Clinical Course of Coronavirus Disease 2019 in a US Hospital System: A Multistate Analysis. American Journal of Epidemiology, Volume 190, No. 4.
The user can analyze whether the assumptions made when fitting the Cox model hold.
The user needs to select:
A graph of the martingale residuals for the selected covariate and transition is returned.
As those plots represent the residuals for a specific covariate, if no covariate is selected there won’t be any graph in this section.
The martingale residuals serve to determine the best transformation for a covariate, so that it optimally explains the time until an individual passes through a certain transition. To find the best transformation for the covariate \(Z_q\) in the transition \(k \rightarrow l\), the martingale residuals from a Cox model adjusted for the other \(p-1\) covariates need to be computed. Then, the martingale residuals are plotted against the values of the covariate \(Z_{i,q}\), together with a smoothed curve that follows the trajectory of the points along the x-axis. If the smoothed curve is reasonably linear, the covariate \(Z_q\) does not require any further transformation in the transition \(k \rightarrow l\).
The user can analyze whether the assumptions made when fitting the Cox model hold.
The user needs to choose the transition to analyze.
The plots of the dfbetas residuals for the selected transition are returned.
As those plots represent the residuals for the different covariates, if no covariate is selected there won’t be any graph in this section.
We plot the dfbetas residuals versus \(Z_{i,q}\) to determine the influence of the individual \(i\) in the estimation of the coefficients of the transition \(k \rightarrow l\). That is, those residuals represent the difference between the estimator obtained when adjusting the Cox model for the transition \(k \rightarrow l\) considering all the individuals, \(\hat{\boldsymbol{\beta}}\), and the estimator from the model without taking into account the individual \(i\), \(\hat{\boldsymbol{\beta}}_{(i)}\). So, those individuals far away from the others have a higher influence on the model estimates. The ideal situation would be that more or less all the points appear in the same area.
Take into account that those dfbetas residuals are standardized, so their values are expected to lie mostly within \([-1,1]\).
The user can analyze whether the assumptions made when fitting the Cox model hold.
The user needs to choose the transition to analyze.
The plots of the Schoenfeld residuals for the selected transition are returned.
As those plots represent the residuals for the different covariates, if no covariate is selected there won’t be any graph in this section.
The Schoenfeld residuals measure the difference between the observed and the expected value of the covariate \(Z_q\) at each transition time between states \(k\) and \(l\).
The Schoenfeld residuals of each individual are plotted together with a smoothed curve of the points, and a horizontal line at 0 is added. If the confidence interval of the smoothed curve covers the line at 0, the proportionality of the hazards can be assumed.
Rizopoulos, D. (2018). Biostatistical Methods II: Classical Regression Models (EP03) Survival Analysis. Course material.
The user can save and upload information of the sessions using the save and load buttons.
A table is returned showing the information of all the fitted models.
If desired, the user can save that information for use in another session by clicking on the save button. The user needs to keep the session id that is shown in order to load that information in another session.
To load the information of another session, the user needs to enter the session id and click on the load button.
Predictions
The user can make predictions for one or two new individuals.
The inputs can be divided into three blocks:
Two groups of outputs are returned:
As the predictions are made based on the characteristics of a new patient, if no covariate is selected there won’t be any output in this section.
In the first group of outputs, the probabilities of being in each state are returned. These values give a better idea of the state in which the new individual is likely to be after the selected time period.
In the second group of outputs, the transition probability plot is obtained. This plot can be displayed in a stacked or non-stacked way; both versions give the same information: the probability that the new patient is in each state over time. In the non-stacked plot, the curves show those probabilities directly, whereas in the stacked plot each probability is given by the height of the corresponding colored band.
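The relation between the stacked and non-stacked views can be illustrated with a small sketch. At a fixed time point, the non-stacked curves give the state probabilities themselves, while the stacked plot draws their running totals, so each probability is recovered as the height of a band. The function names, the state names and the probabilities below are hypothetical.

```python
from itertools import accumulate


def stacked_boundaries(probabilities):
    """Upper boundary of each band in a stacked plot at one time point."""
    return list(accumulate(probabilities))


def band_heights(boundaries):
    """Recover the state probabilities from the stacked boundaries."""
    return [upper - lower for lower, upper in zip([0.0] + boundaries[:-1], boundaries)]


states = ["pneum", "dcharg", "death"]   # illustrative state names
probabilities = [0.5, 0.25, 0.25]       # hypothetical values at one time point

boundaries = stacked_boundaries(probabilities)
print(dict(zip(states, boundaries)))
print(dict(zip(states, band_heights(boundaries))))
```

Since the individual must be in exactly one state at any time, the last boundary equals 1 when all states are plotted.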
de Wreede LC, Fiocco M, and Putter H (2011). mstate: An R Package for the Analysis of Competing Risks and Multi-State Models. Journal of Statistical Software, Volume 38, Issue 7.