Modelling Bilateral Lymphatic Spread#
In the quickstart guide for this package, we have shown how to model the lymphatic tumr progression in head and neck cancer. But we have done so only unilaterally. However, depending on the lateralization of the primary tumor, we may not only see ipsilateral (i.e., to the side where the tumor is located), but also to the contralateral (i.e., the other) side.
To capture this, we have developed an extension that is implemented in the lymph.models.Bilateral
class. It shares a lot of the same API with the lymph.models.Unilateral
class but also has some specialties. Let’s have a look:
Importing#
Nothing new here, we just import the package.
import lymph
Graph#
As before, we define a graph structure. Note that you only need to define this for one side. The other side’s graph is automatically mirrored. If you explicitly want to make the two sides asymmetric, you may do this by providing different graphs to the ipsi_kwargs
and contra_kwargs
in the constructor.
graph_dict = {
('tumor', 'T') : ['I', 'II', 'III', 'IV'],
('lnl' , 'I') : ['II'],
('lnl' , 'II') : ['III'],
('lnl' , 'III'): ['IV'],
('lnl' , 'IV') : []
}
For a more detailed explanation of how this graph should be defined, look at the unilateral quickstart guide.
model = lymph.models.Bilateral(graph_dict)
model
<lymph.models.bilateral.Bilateral at 0x7f010840ed10>
Parameterization#
Since we now need to distribute the parameters to both sides, the assignment gets a little more tricky: If we want to set the spread rate for e.g. the ipsilateral edge from LNL II
to III
, we now need to pass it as follows:
model.set_params(ipsi_IItoIII_spread=0.123)
model.get_params()
{'ipsi_TtoI_spread': 0.0,
'ipsi_TtoII_spread': 0.0,
'ipsi_TtoIII_spread': 0.0,
'ipsi_TtoIV_spread': 0.0,
'contra_TtoI_spread': 0.0,
'contra_TtoII_spread': 0.0,
'contra_TtoIII_spread': 0.0,
'contra_TtoIV_spread': 0.0,
'ItoII_spread': 0.0,
'IItoIII_spread': 0.123,
'IIItoIV_spread': 0.0}
So, prefixing a parameter with ipsi_
or contra_
causes it to be sent to only the respective side. Of course, you can still set all parameters at once:
model.set_params(spread=0.234)
model.get_params()
{'ipsi_TtoI_spread': 0.234,
'ipsi_TtoII_spread': 0.234,
'ipsi_TtoIII_spread': 0.234,
'ipsi_TtoIV_spread': 0.234,
'contra_TtoI_spread': 0.234,
'contra_TtoII_spread': 0.234,
'contra_TtoIII_spread': 0.234,
'contra_TtoIV_spread': 0.234,
'ItoII_spread': 0.234,
'IItoIII_spread': 0.234,
'IIItoIV_spread': 0.234}
Any thinkable combination of setting groups of parameters is possible: All ipsilateral params at once, all tumor spreads at once, all contralateral lnl spreads together, and so on…
model.set_params(ipsi_spread=0.77)
model.set_lnl_spread_params(spread=0.543)
model.get_params()
{'ipsi_TtoI_spread': 0.77,
'ipsi_TtoII_spread': 0.77,
'ipsi_TtoIII_spread': 0.77,
'ipsi_TtoIV_spread': 0.77,
'contra_TtoI_spread': 0.234,
'contra_TtoII_spread': 0.234,
'contra_TtoIII_spread': 0.234,
'contra_TtoIV_spread': 0.234,
'ItoII_spread': 0.543,
'IItoIII_spread': 0.543,
'IIItoIV_spread': 0.543}
Note
Did you notice the LNL spread parameters are not prefixed with ipsi_
and contra_
? This is because we set the LNL spread to be symmetric via the is_symmetric["lnl_spread"] = True
parameter in the constructor of the class. If you change this, the model will have separate parameters for the two sides.
Modalities#
Setting the modalities works exactly as in the Unilateral
case. The Bilateral
class provides the same API for getting and setting the modalities and delegates this to the two sides.
model.set_modality("MRI", spec=0.63, sens=0.81)
model.set_modality("PET", spec=0.86, sens=0.79)
model.get_all_modalities()
{'MRI': Clinical(spec=0.63, sens=0.81, is_trinary=False),
'PET': Clinical(spec=0.86, sens=0.79, is_trinary=False)}
print(model.get_modality("PET").confusion_matrix)
[[0.86 0.14]
[0.21 0.79]]
Data / Observations#
The data loading APi is also the same compared to the Unilateral
class. The only difference is that one now does not need to specify which side
to load, since it will automatically load the ipsi
and contra
side.
import pandas as pd
dataset_url = "https://raw.githubusercontent.com/rmnldwg/lydata/main/2021-usz-oropharynx/data.csv"
example_cols = [
("patient", "#", "age"),
("patient", "#", "hpv_status"),
("tumor", "1", "t_stage"),
("PET", "ipsi", "I"),
("PET", "ipsi", "II"),
("PET", "ipsi", "III"),
("PET", "ipsi", "IV"),
("MRI", "ipsi", "I"),
("MRI", "ipsi", "II"),
("MRI", "ipsi", "III"),
("MRI", "ipsi", "IV"),
]
usz_oropharynx = pd.read_csv(dataset_url, header=[0,1,2])
usz_oropharynx[example_cols]
patient | tumor | PET | MRI | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
# | 1 | ipsi | ipsi | ||||||||
age | hpv_status | t_stage | I | II | III | IV | I | II | III | IV | |
0 | 59 | True | 3 | False | True | False | False | False | True | False | False |
1 | 75 | True | 2 | NaN | NaN | NaN | NaN | False | False | True | False |
2 | 87 | True | 4 | True | True | True | True | True | True | True | True |
3 | 87 | False | 4 | False | True | True | True | False | True | True | True |
4 | 70 | True | 2 | False | True | False | False | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
282 | 49 | False | 4 | True | True | True | False | NaN | NaN | NaN | NaN |
283 | 67 | True | 4 | False | True | True | False | False | True | True | False |
284 | 44 | False | 4 | False | True | True | False | NaN | NaN | NaN | NaN |
285 | 67 | False | 4 | False | True | True | False | NaN | NaN | NaN | NaN |
286 | 76 | True | 2 | False | True | True | False | NaN | NaN | NaN | NaN |
287 rows × 11 columns
model.replace_all_modalities({})
model.set_modality("PET", spec=0.86, sens=0.79)
model.load_patient_data(usz_oropharynx)
model.ipsi.patient_data["_model"]
MRI | FNA | pCT | ... | PET | CT | pathology | # | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
I | II | III | IV | I | II | III | IV | I | II | ... | IV | I | II | III | IV | I | II | III | IV | t_stage | |
0 | False | True | False | False | NaN | True | True | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
1 | False | False | True | False | NaN | True | NaN | NaN | False | True | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | early |
2 | True | True | True | True | NaN | NaN | NaN | NaN | True | True | ... | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
3 | False | True | True | True | NaN | True | True | True | False | True | ... | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
4 | NaN | NaN | NaN | NaN | NaN | True | NaN | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | early |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
282 | NaN | NaN | NaN | NaN | NaN | True | NaN | NaN | True | True | ... | False | True | True | False | False | NaN | NaN | NaN | NaN | late |
283 | False | True | True | False | NaN | NaN | NaN | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
284 | NaN | NaN | NaN | NaN | NaN | True | True | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
285 | NaN | NaN | NaN | NaN | NaN | NaN | False | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
286 | NaN | NaN | NaN | NaN | NaN | True | NaN | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | early |
287 rows × 25 columns
model.contra.patient_data["_model"]
MRI | FNA | pCT | ... | PET | CT | pathology | # | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
I | II | III | IV | I | II | III | IV | I | II | ... | IV | I | II | III | IV | I | II | III | IV | t_stage | |
0 | False | False | False | False | NaN | False | False | NaN | False | False | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
1 | False | False | False | False | NaN | True | NaN | NaN | False | True | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | early |
2 | True | True | True | True | NaN | NaN | NaN | NaN | True | True | ... | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
3 | False | False | False | False | NaN | NaN | NaN | NaN | False | False | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | early |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
282 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | ... | False | False | False | False | False | NaN | NaN | NaN | NaN | late |
283 | False | False | False | False | NaN | NaN | NaN | NaN | False | False | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
284 | NaN | NaN | NaN | NaN | NaN | False | NaN | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
285 | NaN | NaN | NaN | NaN | NaN | False | NaN | NaN | False | True | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | late |
286 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | ... | False | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | early |
287 rows × 25 columns
Distribution over Diagnose Times#
Just as with the modalities, the distributions over diagnose times are delegated to the two sides via the exact same API as in the Unilateral
model:
import numpy as np
import scipy as sp
rng = np.random.default_rng(42)
max_time = model.max_time
time_steps = np.arange(max_time+1)
p = 0.3
early_prior = sp.stats.binom.pmf(time_steps, max_time, p)
model.set_distribution("early", early_prior)
def late_binomial(support: np.ndarray, p: float = 0.5) -> np.ndarray:
"""Parametrized binomial distribution."""
return sp.stats.binom.pmf(support, n=support[-1], p=p)
model.set_distribution("late", late_binomial)
params_dict = model.get_params(as_dict=True, as_flat=True)
params_dict
{'ipsi_TtoI_spread': 0.77,
'ipsi_TtoII_spread': 0.77,
'ipsi_TtoIII_spread': 0.77,
'ipsi_TtoIV_spread': 0.77,
'contra_TtoI_spread': 0.234,
'contra_TtoII_spread': 0.234,
'contra_TtoIII_spread': 0.234,
'contra_TtoIV_spread': 0.234,
'ItoII_spread': 0.543,
'IItoIII_spread': 0.543,
'IIItoIV_spread': 0.543,
'late_p': 0.5}
Notice the additional parameter late_p
that determines the shape of the late diagnse time distribution.
Note
You cannot set the diagnose time distributions asymmetrically! With the modalities this may make sense (although it is not really supported, you may try), but for the diagnose times, this will surely break!
Likelihood#
And again we have the same API as before:
test_probabilities = {p: rng.random() for p in params_dict}
llh = model.likelihood(given_params=test_probabilities, log=True)
ipsi_llh = model.ipsi.likelihood(log=True)
contra_llh = model.contra.likelihood(log=True)
print(f"log-likelihood is {ipsi_llh:.2f} + {contra_llh:.2f} = {llh:.2f}")
log-likelihood is -993.10 + -1053.27 = -1947.65
Note that the two side’s likelihoods do not perfectly sum up. This is expected! A patient’s ipsi- and a contralateral diagnosis were diagnosed at the same time, not separately. They are thus not equally likely as if they were observed independently.