<p>How do we describe the distribution of time intervals when some aren't yet complete?</p>
<p>The Kaplan-Meier Survival Estimator is a non-parametric curve that describes the empirical survival function given the intervals observed to date.</p>
<p>Importantly, it is designed to handle "censored" data, where intervals are observed before they are known to be complete.</p>
<p><img src="/images/blog/2018-07-18-kaplan/kaplan-meier-global-toyota.png" style="max-width: 540px"/></p>
<!--more-->
<h2 id="background">Background</h2>
<p>Survival data arises in many scenarios:</p>
<ul>
<li>Lifespan of mutual funds, given many are still active</li>
<li>Duration of therapy when many patients are still being treated</li>
<li>Production span of cars when many are still produced</li>
</ul>
<p>The challenge is that in many samples we have start dates, but many intervals are still continuing. Observed durations are clipped at the time of the survey.<br>So a simple average, or even the median, of observed durations may seriously underestimate the true length of the intervals.</p>
<p>Survival analysis is well established in clinical trials and other settings.</p>
<h2 id="kaplan-meier-estimator">Kaplan-Meier Estimator</h2>
<p>The Kaplan-Meier Survival Estimator is a simple, robust approach to reporting empirical survival distributions with censored data.</p>
<p>The estimator is given by:</p>
<p> <img src="/images/blog/2018-07-18-kaplan/kaplan-meier-formula.png" style="max-width: 200px; margin:auto; display: block"/></p>
<ul>
<li><i>n<sub>i</sub></i> is the number of intervals known to have survived up to time <i>t<sub>i</sub></i> (that is, still at risk just before <i>t<sub>i</sub></i>)</li>
<li><i>d<sub>i</sub></i> is the number of intervals known to have stopped at time <i>t<sub>i</sub></i></li>
</ul>
<h2 id="kaplan-meier-estimator-in-protobi">Kaplan-Meier Estimator in Protobi</h2>
<p>Protobi can calculate Kaplan-Meier survival curves from start dates and end dates.</p>
<p>The chart above shows the distribution of production run lengths for specific auto models. One curve covers all models globally and one covers only Toyota models.</p>
<p>Closed circles show termination events. Open circles show right-censored events, where an interval is observed but still continuing.</p>
<p>Below the chart is a table showing the median survival time, here defined as the time of the first event where empirical survival falls below 50%.</p>
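<p>To make the estimator concrete, here is a minimal Python sketch of the calculation described above: at each time with at least one termination, the running survival is multiplied by (1 - <i>d<sub>i</sub></i>/<i>n<sub>i</sub></i>). This is an illustration only, not Protobi's implementation. The function names <code>kaplan_meier</code> and <code>median_survival</code> and the sample durations are hypothetical, and the sketch assumes the common convention that an interval censored at time <i>t<sub>i</sub></i> still counts as at risk at <i>t<sub>i</sub></i>.</p>

```python
from collections import Counter


def kaplan_meier(durations, ended):
    """Return [(t_i, n_i, d_i, S(t_i))] for each time with a termination.

    durations: observed length of each interval (hypothetical sample data)
    ended:     True if the interval is known to have stopped,
               False if it is right-censored (still continuing)
    """
    stopped = Counter(t for t, e in zip(durations, ended) if e)
    leaving = Counter(durations)   # stopped or censored at each time
    n_at_risk = len(durations)     # n_i: intervals still under observation
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = stopped.get(t, 0)      # d_i: terminations at time t
        if d:
            survival *= 1 - d / n_at_risk
            curve.append((t, n_at_risk, d, survival))
        # Assumption: intervals censored at t were still at risk at t,
        # so they leave the risk set only after this step.
        n_at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """First event time where survival falls below 50%, as defined in this post.

    Returns None if the curve never drops below 50%.
    """
    for t, _, _, s in curve:
        if s < 0.5:
            return t
    return None


# Hypothetical durations (e.g. years); False marks a still-running interval.
durations = [2, 3, 3, 5, 6, 8, 8, 10]
ended = [True, True, False, True, False, True, True, False]

curve = kaplan_meier(durations, ended)
for t, n, d, s in curve:
    print(f"t={t}  n={n}  d={d}  S={s:.3f}")
print("median survival:", median_survival(curve))  # 8 for this sample
```

<p>For this made-up sample the plain median of the observed durations is 5.5, while the estimated median survival is 8, which shows how ignoring censoring can understate the typical interval. Some statistical packages define the median as the first time survival is at or below 50% rather than strictly below, so results can differ when the curve lands exactly on 50%.</p>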