Modelling Bilateral Lymphatic Spread#

In the quickstart guide for this package, we have shown how to model the lymphatic tumr progression in head and neck cancer. But we have done so only unilaterally. However, depending on the lateralization of the primary tumor, we may not only see ipsilateral (i.e., to the side where the tumor is located), but also to the contralateral (i.e., the other) side.

To capture this, we have developed an extension that is implemented in the lymph.models.Bilateral class. It shares a lot of the same API with the lymph.models.Unilateral class but also has some specialties. Let’s have a look:

Importing#

Nothing new here, we just import the package.

import lymph

Graph#

As before, we define a graph structure. Note that you only need to define this for one side. The other side’s graph is automatically mirrored. If you explicitly want to make the two sides asymmetric, you may do this by providing different graphs to the ipsi_kwargs and contra_kwargs in the constructor.

graph_dict = {
    ('tumor', 'T')  : ['I', 'II', 'III', 'IV'], 
    ('lnl'  , 'I')  : ['II'], 
    ('lnl'  , 'II') : ['III'], 
    ('lnl'  , 'III'): ['IV'], 
    ('lnl'  , 'IV') : []
}

For a more detailed explanation of how this graph should be defined, look at the unilateral quickstart guide.

model = lymph.models.Bilateral(graph_dict)
model

<lymph.models.bilateral.Bilateral at 0x7f010840ed10>

Parameterization#

Since we now need to distribute the parameters to both sides, the assignment gets a little more tricky: If we want to set the spread rate for e.g. the ipsilateral edge from LNL II to III, we now need to pass it as follows:

model.set_params(ipsi_IItoIII_spread=0.123)
model.get_params()

{'ipsi_TtoI_spread': 0.0,
 'ipsi_TtoII_spread': 0.0,
 'ipsi_TtoIII_spread': 0.0,
 'ipsi_TtoIV_spread': 0.0,
 'contra_TtoI_spread': 0.0,
 'contra_TtoII_spread': 0.0,
 'contra_TtoIII_spread': 0.0,
 'contra_TtoIV_spread': 0.0,
 'ItoII_spread': 0.0,
 'IItoIII_spread': 0.123,
 'IIItoIV_spread': 0.0}

So, prefixing a parameter with ipsi_ or contra_ causes it to be sent to only the respective side. Of course, you can still set all parameters at once:

model.set_params(spread=0.234)
model.get_params()

{'ipsi_TtoI_spread': 0.234,
 'ipsi_TtoII_spread': 0.234,
 'ipsi_TtoIII_spread': 0.234,
 'ipsi_TtoIV_spread': 0.234,
 'contra_TtoI_spread': 0.234,
 'contra_TtoII_spread': 0.234,
 'contra_TtoIII_spread': 0.234,
 'contra_TtoIV_spread': 0.234,
 'ItoII_spread': 0.234,
 'IItoIII_spread': 0.234,
 'IIItoIV_spread': 0.234}

Any thinkable combination of setting groups of parameters is possible: All ipsilateral params at once, all tumor spreads at once, all contralateral lnl spreads together, and so on…

model.set_params(ipsi_spread=0.77)
model.set_lnl_spread_params(spread=0.543)
model.get_params()

{'ipsi_TtoI_spread': 0.77,
 'ipsi_TtoII_spread': 0.77,
 'ipsi_TtoIII_spread': 0.77,
 'ipsi_TtoIV_spread': 0.77,
 'contra_TtoI_spread': 0.234,
 'contra_TtoII_spread': 0.234,
 'contra_TtoIII_spread': 0.234,
 'contra_TtoIV_spread': 0.234,
 'ItoII_spread': 0.543,
 'IItoIII_spread': 0.543,
 'IIItoIV_spread': 0.543}

Note

Did you notice the LNL spread parameters are not prefixed with ipsi_ and contra_? This is because we set the LNL spread to be symmetric via the is_symmetric["lnl_spread"] = True parameter in the constructor of the class. If you change this, the model will have separate parameters for the two sides.

Modalities#

Setting the modalities works exactly as in the Unilateral case. The Bilateral class provides the same API for getting and setting the modalities and delegates this to the two sides.

model.set_modality("MRI", spec=0.63, sens=0.81)
model.set_modality("PET", spec=0.86, sens=0.79)
model.get_all_modalities()

{'MRI': Clinical(spec=0.63, sens=0.81, is_trinary=False),
 'PET': Clinical(spec=0.86, sens=0.79, is_trinary=False)}

print(model.get_modality("PET").confusion_matrix)

[[0.86 0.14]
 [0.21 0.79]]

Data / Observations#

The data loading APi is also the same compared to the Unilateral class. The only difference is that one now does not need to specify which side to load, since it will automatically load the ipsi and contra side.

import pandas as pd

dataset_url = "https://raw.githubusercontent.com/rmnldwg/lydata/main/2021-usz-oropharynx/data.csv"
example_cols = [
    ("patient", "#", "age"),
    ("patient", "#", "hpv_status"),
    ("tumor", "1", "t_stage"),
    ("PET", "ipsi", "I"),
    ("PET", "ipsi", "II"),
    ("PET", "ipsi", "III"),
    ("PET", "ipsi", "IV"),
    ("MRI", "ipsi", "I"),
    ("MRI", "ipsi", "II"),
    ("MRI", "ipsi", "III"),
    ("MRI", "ipsi", "IV"),
]

usz_oropharynx = pd.read_csv(dataset_url, header=[0,1,2])
usz_oropharynx[example_cols]

	patient		tumor	PET				MRI
	#		1	ipsi				ipsi
	age	hpv_status	t_stage	I	II	III	IV	I	II	III	IV
0	59	True	3	False	True	False	False	False	True	False	False
1	75	True	2	NaN	NaN	NaN	NaN	False	False	True	False
2	87	True	4	True	True	True	True	True	True	True	True
3	87	False	4	False	True	True	True	False	True	True	True
4	70	True	2	False	True	False	False	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...
282	49	False	4	True	True	True	False	NaN	NaN	NaN	NaN
283	67	True	4	False	True	True	False	False	True	True	False
284	44	False	4	False	True	True	False	NaN	NaN	NaN	NaN
285	67	False	4	False	True	True	False	NaN	NaN	NaN	NaN
286	76	True	2	False	True	True	False	NaN	NaN	NaN	NaN

287 rows × 11 columns

model.replace_all_modalities({})
model.set_modality("PET", spec=0.86, sens=0.79)

model.load_patient_data(usz_oropharynx)
model.ipsi.patient_data["_model"]

	MRI				FNA				pCT		...	PET	CT				pathology				#
	I	II	III	IV	I	II	III	IV	I	II	...	IV	I	II	III	IV	I	II	III	IV	t_stage
0	False	True	False	False	NaN	True	True	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
1	False	False	True	False	NaN	True	NaN	NaN	False	True	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	early
2	True	True	True	True	NaN	NaN	NaN	NaN	True	True	...	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
3	False	True	True	True	NaN	True	True	True	False	True	...	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
4	NaN	NaN	NaN	NaN	NaN	True	NaN	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	early
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
282	NaN	NaN	NaN	NaN	NaN	True	NaN	NaN	True	True	...	False	True	True	False	False	NaN	NaN	NaN	NaN	late
283	False	True	True	False	NaN	NaN	NaN	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
284	NaN	NaN	NaN	NaN	NaN	True	True	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
285	NaN	NaN	NaN	NaN	NaN	NaN	False	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
286	NaN	NaN	NaN	NaN	NaN	True	NaN	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	early

287 rows × 25 columns

model.contra.patient_data["_model"]

	MRI				FNA				pCT		...	PET	CT				pathology				#
	I	II	III	IV	I	II	III	IV	I	II	...	IV	I	II	III	IV	I	II	III	IV	t_stage
0	False	False	False	False	NaN	False	False	NaN	False	False	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
1	False	False	False	False	NaN	True	NaN	NaN	False	True	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	early
2	True	True	True	True	NaN	NaN	NaN	NaN	True	True	...	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
3	False	False	False	False	NaN	NaN	NaN	NaN	False	False	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
4	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	False	False	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	early
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
282	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	False	False	...	False	False	False	False	False	NaN	NaN	NaN	NaN	late
283	False	False	False	False	NaN	NaN	NaN	NaN	False	False	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
284	NaN	NaN	NaN	NaN	NaN	False	NaN	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
285	NaN	NaN	NaN	NaN	NaN	False	NaN	NaN	False	True	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	late
286	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	False	False	...	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	early

287 rows × 25 columns

Distribution over Diagnose Times#

Just as with the modalities, the distributions over diagnose times are delegated to the two sides via the exact same API as in the Unilateral model:

import numpy as np
import scipy as sp

rng = np.random.default_rng(42)

max_time = model.max_time
time_steps = np.arange(max_time+1)
p = 0.3

early_prior = sp.stats.binom.pmf(time_steps, max_time, p)
model.set_distribution("early", early_prior)

def late_binomial(support: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Parametrized binomial distribution."""
    return sp.stats.binom.pmf(support, n=support[-1], p=p)

model.set_distribution("late", late_binomial)
params_dict = model.get_params(as_dict=True, as_flat=True)
params_dict

{'ipsi_TtoI_spread': 0.77,
 'ipsi_TtoII_spread': 0.77,
 'ipsi_TtoIII_spread': 0.77,
 'ipsi_TtoIV_spread': 0.77,
 'contra_TtoI_spread': 0.234,
 'contra_TtoII_spread': 0.234,
 'contra_TtoIII_spread': 0.234,
 'contra_TtoIV_spread': 0.234,
 'ItoII_spread': 0.543,
 'IItoIII_spread': 0.543,
 'IIItoIV_spread': 0.543,
 'late_p': 0.5}

Notice the additional parameter late_p that determines the shape of the late diagnse time distribution.

Note

You cannot set the diagnose time distributions asymmetrically! With the modalities this may make sense (although it is not really supported, you may try), but for the diagnose times, this will surely break!

Likelihood#

And again we have the same API as before:

test_probabilities = {p: rng.random() for p in params_dict}

llh = model.likelihood(given_params=test_probabilities, log=True)
ipsi_llh = model.ipsi.likelihood(log=True)
contra_llh = model.contra.likelihood(log=True)

print(f"log-likelihood is {ipsi_llh:.2f} + {contra_llh:.2f} = {llh:.2f}")

log-likelihood is -993.10 + -1053.27 = -1947.65

Note that the two side’s likelihoods do not perfectly sum up. This is expected! A patient’s ipsi- and a contralateral diagnosis were diagnosed at the same time, not separately. They are thus not equally likely as if they were observed independently.