<p>How do we describe the distribution of time intervals when some aren't yet complete?</p>
<p>The Kaplan–Meier Survival Estimator is a non-parametric curve that describes the empirical survival function given the intervals observed to date.</p>
<p>Importantly, it is designed to handle "censored" data, where intervals are observed before they are known to be complete.</p>
<p><img src="/images/blog/2018-07-18-kaplan/kaplan-meier-global-toyota.png" style="max-width: 540px"/></p>
<!--more-->
<h2 id="background">Background</h2>
<p>Survival data arises in many scenarios:</p>
<ul> <li>Lifespan of mutual funds, given many are still active</li> <li>Duration of therapy when many patients are still being treated</li> <li>Production span of cars when many are still produced</li> </ul>
<p>The challenge is that in many samples we have start dates, but many intervals are still continuing. Observed durations are clipped at the time of the survey.<br>So a simple average, or even a median, of observed durations may seriously underestimate the true length of the intervals.</p>
<p>Survival analysis is well established in clinical trials and other settings.</p>
<h2 id="kaplan-meier-estimator">Kaplan-Meier Estimator</h2>
<p>The Kaplan-Meier Survival Estimator is a simple, robust approach to reporting empirical survival distributions with censored data.</p>
<p>The estimator is given by:</p>
<p> <img src="/images/blog/2018-07-18-kaplan/kaplan-meier-formula.png" style="max-width: 200px; margin:auto; display: block"/></p>
<ul> <li><i>n<sub>i</sub></i> is the number of intervals known to have survived up to time <i>t<sub>i</sub></i>, and so still at risk of stopping at that time</li> <li><i>d<sub>i</sub></i> is the number of intervals known to have stopped at time <i>t<sub>i</sub></i></li> </ul>
<h2 id="kaplan-meier-estimator-in-protobi">Kaplan-Meier Estimator in Protobi</h2>
<p>Protobi can calculate Kaplan-Meier survival curves from start dates and end dates.</p>
<p>Above is the distribution of production run lengths for specific auto models. One curve shows all models globally and one shows just Toyota models.</p>
<p>Closed circles show termination events. Open circles show right-censored events, where an interval is observed but still continuing.</p>
<p>Below the chart is a table showing the median survival time, here defined as the time of the first event where empirical survival falls below 50%.</p>
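<p>To make the estimator and the median definition above concrete, here is a minimal Python sketch. It is not Protobi's implementation. It assumes each interval has already been reduced to a duration (end date minus start date, or survey date minus start date for intervals still continuing) plus a flag saying whether the end was actually observed. The function names and the sample data are hypothetical, chosen only for illustration.</p>

```python
from collections import Counter
from statistics import median


def kaplan_meier(durations, observed):
    """Return a list of (t_i, n_i, d_i, S(t_i)), one row per distinct event time.

    durations: observed length of each interval, in any consistent unit
    observed:  True if the interval is known to have stopped,
               False if it was still continuing when surveyed (right-censored)
    """
    # d_i: number of intervals known to have stopped at each time
    stops = Counter(t for t, ended in zip(durations, observed) if ended)
    # every interval, stopped or censored, leaves the risk set at its duration
    exits = Counter(durations)

    at_risk = len(durations)   # n_i: intervals still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = stops.get(t, 0)
        if d:
            # intervals censored exactly at t still count as at risk at t
            survival *= 1.0 - d / at_risk
            curve.append((t, at_risk, d, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """Time of the first event where estimated survival falls below 50%."""
    for t, _n, _d, s in curve:
        if s < 0.5:
            return t
    return None  # the curve never drops below 50%


# Hypothetical production spans in years; False marks models still produced.
spans = [2, 3, 3, 5, 6, 8, 9, 12]
ended = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(spans, ended)
for t, n, d, s in curve:
    print(f"t={t:>2}  at risk={n}  stopped={d}  survival={s:.3f}")
print("Kaplan-Meier median:", median_survival(curve))
print("Naive median of observed spans:", median(spans))
```

<p>On this made-up sample the naive median of observed spans is 5.5 years, while the Kaplan-Meier median is 8 years, which illustrates how ignoring censoring can understate durations.</p>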