@nspope nspope commented Jan 25, 2026

Some infrastructure for supporting mutation rate maps

nspope commented Jan 25, 2026

Please take a glance at this when you have a chance, @hyanwong. The plan is ultimately to internally transform the tree sequence to a "mutational unit" coordinate system (which might imply removing some edges/mutations in zero-rate intervals), do the dating, then copy the dates/posteriors over to the tree sequence in the original coordinate system. There are some tricky details to sort out, so I'm going to do this in pieces.
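A minimal sketch of the coordinate transform described above, assuming a piecewise-constant rate map given as breakpoints and per-interval rates. The function names and the example map are hypothetical and are not taken from the PR or from the tsdate API; the idea is only that a physical position maps to the cumulative mutation rate to its left, so zero-rate intervals collapse to a single point.

```python
from bisect import bisect_right
from itertools import accumulate


def cumulative_mass(breakpoints, rates):
    """Cumulative mutational mass at each breakpoint of a piecewise-constant map."""
    spans = [b - a for a, b in zip(breakpoints[:-1], breakpoints[1:])]
    return [0.0] + list(accumulate(r * s for r, s in zip(rates, spans)))


def to_mutational_units(position, breakpoints, rates):
    """Map a physical position to mutational-unit coordinates (hypothetical helper)."""
    if not breakpoints[0] <= position <= breakpoints[-1]:
        raise ValueError("position outside rate map")
    mass = cumulative_mass(breakpoints, rates)
    # Index of the interval containing `position`; the right end of the
    # map is assigned to the last interval.
    i = min(bisect_right(breakpoints, position) - 1, len(rates) - 1)
    return mass[i] + rates[i] * (position - breakpoints[i])


# Illustrative map (assumed values): 1e-8 on [0, 100), zero on [100, 150),
# 2e-8 on [150, 200].
breakpoints = [0, 100, 150, 200]
rates = [1e-8, 0.0, 2e-8]

gap_start = to_mutational_units(100, breakpoints, rates)
gap_end = to_mutational_units(150, breakpoints, rates)
```

Here `gap_start` and `gap_end` are equal, because the zero-rate interval contributes no mutational mass. An edge or mutation lying entirely inside such an interval would have zero span in the new coordinates, which is why it may need to be dropped before dating.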

@hyanwong hyanwong left a comment

LGTM. Two very minor comments.

@hyanwong
Member

This is a great start to being able to test a more flexible definition of mutation rates. I see that the function can be used to mask out a region by setting the underlying rate to NaN, but might we eventually also want a more standard masking approach, e.g. which doesn't change the underlying coordinate system?

I think this is fine for the time being, though.

nspope commented Jan 26, 2026

Thanks!

> but might we eventually also want a more standard masking approach, e.g. which doesn't change the underlying coordinate system?

Not sure I'm understanding what you mean-- we won't be changing the coordinate system in the output. The coordinate system will only be changed internally for the EP algorithm, to express things in terms of mutational units, which lets us support ratemaps with minimal changes to the code. For example, even if the mutation rate is constant, unsequenced intervals could be modeled as zero-rate intervals: this is similar to using delete_intervals but doesn't "chop up" edges that contain small gaps.

There's already tskit infrastructure for "hard masking" via delete_intervals, so I don't think we need to do anything there.
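To illustrate the difference between the two kinds of masking on a single edge, here is a small sketch. It assumes a constant mutation rate and represents an edge as a plain `(left, right)` tuple; `hard_mask` and `soft_mask_span` are hypothetical helpers that only mimic the effect on one edge and are not tskit's implementation of `delete_intervals`.

```python
def hard_mask(edge, gap):
    """Split an edge around a deleted interval, as hard masking would."""
    (left, right), (g0, g1) = edge, gap
    pieces = [(left, min(right, g0)), (max(left, g1), right)]
    # Discard empty pieces (the gap may touch or miss the edge).
    return [(a, b) for a, b in pieces if a < b]


def soft_mask_span(edge, gap, rate):
    """Mutational span of an intact edge when the gap has zero rate."""
    (left, right), (g0, g1) = edge, gap
    overlap = max(0, min(right, g1) - max(left, g0))
    return rate * (right - left - overlap)


# Assumed example values: a 1000 bp edge with a 10 bp unsequenced gap.
edge, gap, rate = (0, 1000), (400, 410), 1e-8

pieces = hard_mask(edge, gap)           # the edge is chopped into two
span = soft_mask_span(edge, gap, rate)  # one edge, 990 bp of effective span
```

With hard masking the edge becomes two shorter edges, `(0, 400)` and `(410, 1000)`. With a zero-rate gap the edge stays whole and only its effective span for mutations shrinks.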

@nspope nspope force-pushed the mutation-rate-map branch from fa90cbd to cfac191 Compare January 26, 2026 21:34
@nspope nspope force-pushed the mutation-rate-map branch from cfac191 to de19258 Compare January 26, 2026 21:44
@nspope nspope added this pull request to the merge queue Jan 26, 2026
Merged via the queue into tskit-dev:main with commit 4689864 Jan 26, 2026
8 checks passed
@hyanwong
Member

> Thanks!
>
> > but might we eventually also want a more standard masking approach, e.g. which doesn't change the underlying coordinate system?
>
> Not sure I'm understanding what you mean-- we won't be changing the coordinate system in the output. The coordinate system will only be changed internally for the EP algorithm, to express things in terms of mutational units, which lets us support ratemaps with minimal changes to the code. For example, even if the mutation rate is constant, unsequenced intervals could be modeled as zero-rate intervals: this is similar to using delete_intervals but doesn't "chop up" edges that contain small gaps.
>
> There's already tskit infrastructure for "hard masking" via delete_intervals, so I don't think we need to do anything there.

Ah, this is a good plan, internally. Thanks for merging. Also I see you suggested this as an internal approach for tskit "soft masking", which seems like a good start.
