Modelling Bilateral Lymphatic Spread#

In the quickstart guide for this package, we have shown how to model the lymphatic tumr progression in head and neck cancer. But we have done so only unilaterally. However, depending on the lateralization of the primary tumor, we may not only see ipsilateral (i.e., to the side where the tumor is located), but also to the contralateral (i.e., the other) side.

To capture this, we have developed an extension that is implemented in the lymph.models.Bilateral class. It shares a lot of the same API with the lymph.models.Unilateral class but also has some specialties. Let’s have a look:

Importing#

Nothing new here, we just import the package.

import lymph

Graph#

As before, we define a graph structure. Note that you only need to define this for one side. The other side’s graph is automatically mirrored. If you explicitly want to make the two sides asymmetric, you may do this by providing different graphs to the ipsi_kwargs and contra_kwargs in the constructor.

graph_dict = {
    ('tumor', 'T')  : ['I', 'II', 'III', 'IV'], 
    ('lnl'  , 'I')  : ['II'], 
    ('lnl'  , 'II') : ['III'], 
    ('lnl'  , 'III'): ['IV'], 
    ('lnl'  , 'IV') : []
}

For a more detailed explanation of how this graph should be defined, look at the unilateral quickstart guide.

model = lymph.models.Bilateral(graph_dict)
model
<lymph.models.bilateral.Bilateral at 0x7f010840ed10>

Parameterization#

Since we now need to distribute the parameters to both sides, the assignment gets a little more tricky: If we want to set the spread rate for e.g. the ipsilateral edge from LNL II to III, we now need to pass it as follows:

model.set_params(ipsi_IItoIII_spread=0.123)
model.get_params()
{'ipsi_TtoI_spread': 0.0,
 'ipsi_TtoII_spread': 0.0,
 'ipsi_TtoIII_spread': 0.0,
 'ipsi_TtoIV_spread': 0.0,
 'contra_TtoI_spread': 0.0,
 'contra_TtoII_spread': 0.0,
 'contra_TtoIII_spread': 0.0,
 'contra_TtoIV_spread': 0.0,
 'ItoII_spread': 0.0,
 'IItoIII_spread': 0.123,
 'IIItoIV_spread': 0.0}

So, prefixing a parameter with ipsi_ or contra_ causes it to be sent to only the respective side. Of course, you can still set all parameters at once:

model.set_params(spread=0.234)
model.get_params()
{'ipsi_TtoI_spread': 0.234,
 'ipsi_TtoII_spread': 0.234,
 'ipsi_TtoIII_spread': 0.234,
 'ipsi_TtoIV_spread': 0.234,
 'contra_TtoI_spread': 0.234,
 'contra_TtoII_spread': 0.234,
 'contra_TtoIII_spread': 0.234,
 'contra_TtoIV_spread': 0.234,
 'ItoII_spread': 0.234,
 'IItoIII_spread': 0.234,
 'IIItoIV_spread': 0.234}

Any thinkable combination of setting groups of parameters is possible: All ipsilateral params at once, all tumor spreads at once, all contralateral lnl spreads together, and so on…

model.set_params(ipsi_spread=0.77)
model.set_lnl_spread_params(spread=0.543)
model.get_params()
{'ipsi_TtoI_spread': 0.77,
 'ipsi_TtoII_spread': 0.77,
 'ipsi_TtoIII_spread': 0.77,
 'ipsi_TtoIV_spread': 0.77,
 'contra_TtoI_spread': 0.234,
 'contra_TtoII_spread': 0.234,
 'contra_TtoIII_spread': 0.234,
 'contra_TtoIV_spread': 0.234,
 'ItoII_spread': 0.543,
 'IItoIII_spread': 0.543,
 'IIItoIV_spread': 0.543}

Note

Did you notice the LNL spread parameters are not prefixed with ipsi_ and contra_? This is because we set the LNL spread to be symmetric via the is_symmetric["lnl_spread"] = True parameter in the constructor of the class. If you change this, the model will have separate parameters for the two sides.

Modalities#

Setting the modalities works exactly as in the Unilateral case. The Bilateral class provides the same API for getting and setting the modalities and delegates this to the two sides.

model.set_modality("MRI", spec=0.63, sens=0.81)
model.set_modality("PET", spec=0.86, sens=0.79)
model.get_all_modalities()
{'MRI': Clinical(spec=0.63, sens=0.81, is_trinary=False),
 'PET': Clinical(spec=0.86, sens=0.79, is_trinary=False)}
print(model.get_modality("PET").confusion_matrix)
[[0.86 0.14]
 [0.21 0.79]]

Data / Observations#

The data loading APi is also the same compared to the Unilateral class. The only difference is that one now does not need to specify which side to load, since it will automatically load the ipsi and contra side.

import pandas as pd

dataset_url = "https://raw.githubusercontent.com/rmnldwg/lydata/main/2021-usz-oropharynx/data.csv"
example_cols = [
    ("patient", "#", "age"),
    ("patient", "#", "hpv_status"),
    ("tumor", "1", "t_stage"),
    ("PET", "ipsi", "I"),
    ("PET", "ipsi", "II"),
    ("PET", "ipsi", "III"),
    ("PET", "ipsi", "IV"),
    ("MRI", "ipsi", "I"),
    ("MRI", "ipsi", "II"),
    ("MRI", "ipsi", "III"),
    ("MRI", "ipsi", "IV"),
]

usz_oropharynx = pd.read_csv(dataset_url, header=[0,1,2])
usz_oropharynx[example_cols]
patient tumor PET MRI
# 1 ipsi ipsi
age hpv_status t_stage I II III IV I II III IV
0 59 True 3 False True False False False True False False
1 75 True 2 NaN NaN NaN NaN False False True False
2 87 True 4 True True True True True True True True
3 87 False 4 False True True True False True True True
4 70 True 2 False True False False NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ...
282 49 False 4 True True True False NaN NaN NaN NaN
283 67 True 4 False True True False False True True False
284 44 False 4 False True True False NaN NaN NaN NaN
285 67 False 4 False True True False NaN NaN NaN NaN
286 76 True 2 False True True False NaN NaN NaN NaN

287 rows × 11 columns

model.replace_all_modalities({})
model.set_modality("PET", spec=0.86, sens=0.79)
model.load_patient_data(usz_oropharynx)
model.ipsi.patient_data["_model"]
MRI FNA pCT ... PET CT pathology #
I II III IV I II III IV I II ... IV I II III IV I II III IV t_stage
0 False True False False NaN True True NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN late
1 False False True False NaN True NaN NaN False True ... NaN NaN NaN NaN NaN NaN NaN NaN NaN early
2 True True True True NaN NaN NaN NaN True True ... True NaN NaN NaN NaN NaN NaN NaN NaN late
3 False True True True NaN True True True False True ... True NaN NaN NaN NaN NaN NaN NaN NaN late
4 NaN NaN NaN NaN NaN True NaN NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN early
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
282 NaN NaN NaN NaN NaN True NaN NaN True True ... False True True False False NaN NaN NaN NaN late
283 False True True False NaN NaN NaN NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN late
284 NaN NaN NaN NaN NaN True True NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN late
285 NaN NaN NaN NaN NaN NaN False NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN late
286 NaN NaN NaN NaN NaN True NaN NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN early

287 rows × 25 columns

model.contra.patient_data["_model"]
MRI FNA pCT ... PET CT pathology #
I II III IV I II III IV I II ... IV I II III IV I II III IV t_stage
0 False False False False NaN False False NaN False False ... False NaN NaN NaN NaN NaN NaN NaN NaN late
1 False False False False NaN True NaN NaN False True ... NaN NaN NaN NaN NaN NaN NaN NaN NaN early
2 True True True True NaN NaN NaN NaN True True ... True NaN NaN NaN NaN NaN NaN NaN NaN late
3 False False False False NaN NaN NaN NaN False False ... False NaN NaN NaN NaN NaN NaN NaN NaN late
4 NaN NaN NaN NaN NaN NaN NaN NaN False False ... False NaN NaN NaN NaN NaN NaN NaN NaN early
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
282 NaN NaN NaN NaN NaN NaN NaN NaN False False ... False False False False False NaN NaN NaN NaN late
283 False False False False NaN NaN NaN NaN False False ... False NaN NaN NaN NaN NaN NaN NaN NaN late
284 NaN NaN NaN NaN NaN False NaN NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN late
285 NaN NaN NaN NaN NaN False NaN NaN False True ... False NaN NaN NaN NaN NaN NaN NaN NaN late
286 NaN NaN NaN NaN NaN NaN NaN NaN False False ... False NaN NaN NaN NaN NaN NaN NaN NaN early

287 rows × 25 columns

Distribution over Diagnose Times#

Just as with the modalities, the distributions over diagnose times are delegated to the two sides via the exact same API as in the Unilateral model:

import numpy as np
import scipy as sp

rng = np.random.default_rng(42)

max_time = model.max_time
time_steps = np.arange(max_time+1)
p = 0.3

early_prior = sp.stats.binom.pmf(time_steps, max_time, p)
model.set_distribution("early", early_prior)
def late_binomial(support: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Parametrized binomial distribution."""
    return sp.stats.binom.pmf(support, n=support[-1], p=p)
model.set_distribution("late", late_binomial)
params_dict = model.get_params(as_dict=True, as_flat=True)
params_dict
{'ipsi_TtoI_spread': 0.77,
 'ipsi_TtoII_spread': 0.77,
 'ipsi_TtoIII_spread': 0.77,
 'ipsi_TtoIV_spread': 0.77,
 'contra_TtoI_spread': 0.234,
 'contra_TtoII_spread': 0.234,
 'contra_TtoIII_spread': 0.234,
 'contra_TtoIV_spread': 0.234,
 'ItoII_spread': 0.543,
 'IItoIII_spread': 0.543,
 'IIItoIV_spread': 0.543,
 'late_p': 0.5}

Notice the additional parameter late_p that determines the shape of the late diagnse time distribution.

Note

You cannot set the diagnose time distributions asymmetrically! With the modalities this may make sense (although it is not really supported, you may try), but for the diagnose times, this will surely break!

Likelihood#

And again we have the same API as before:

test_probabilities = {p: rng.random() for p in params_dict}

llh = model.likelihood(given_params=test_probabilities, log=True)
ipsi_llh = model.ipsi.likelihood(log=True)
contra_llh = model.contra.likelihood(log=True)

print(f"log-likelihood is {ipsi_llh:.2f} + {contra_llh:.2f} = {llh:.2f}")
log-likelihood is -993.10 + -1053.27 = -1947.65

Note that the two side’s likelihoods do not perfectly sum up. This is expected! A patient’s ipsi- and a contralateral diagnosis were diagnosed at the same time, not separately. They are thus not equally likely as if they were observed independently.