Surveillance, Epidemiology, and End Results (SEER)

Database Name: Surveillance, Epidemiology, and End Results (SEER)


Data Source

Population-based cancer registry covering 28% of the US population. Data have been collected since 1973. The SEER program is overseen by the National Cancer Institute.

Overview of data contents

Demographic information includes age, sex, race/ethnicity, registry location, county-level socioeconomic status, and marital status. Clinical variables include histology, stage, grade, and site-specific factors. Basic treatment information is also included: first primary surgery, receipt of radiation, and lymph node sampling. Treatment dates and chemotherapy are not included.

Patient ages included

0-85+ years

Practice setting

Data are collected by registry staff from medical records. Follow-up data are primarily from the National Death Index.

Date range available


Relevant Work

Example publications

Etzioni, R., et al., Overdiagnosis due to prostate-specific antigen screening: lessons from U.S. prostate cancer incidence trends. J Natl Cancer Inst, 2002. 94(13): p. 981-90.

Chen, A.Y., A. Jemal, and E.M. Ward, Increasing incidence of differentiated thyroid cancer in the United States, 1988-2005. Cancer, 2009. 115(16): p. 3801-7.

Chaturvedi, A.K., et al., Incidence trends for human papillomavirus-related and -unrelated oral squamous cell carcinomas in the United States. J Clin Oncol, 2008. 26(4): p. 612-9.


Cost estimate(s)

Data are free for download after signing a user agreement.

Contact/website information


Ease of use

SEER provides its own analysis software, SEER*Stat, which is user-friendly and easy to download. Online tutorials for SEER*Stat are also available. Data can be exported to SAS, although this may require more storage space.

Data analysis

Researchers may want to exclude cases identified only through death certificates. Site-specific factors vary by cancer site and can be useful, but their completeness varies. AJCC stage was not collected for all sites until the 2000s, which may also pose challenges.
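
The exclusion and completeness checks described above can be sketched as follows. This is a minimal illustration that assumes a case listing has already been exported and loaded as a list of records. The field names `reporting_source` and `ajcc_stage` and the value `death_certificate_only` are hypothetical placeholders, not actual SEER variable names or codes; substitute the ones from the data dictionary of your own export.

```python
# Hypothetical field names and codes, used for illustration only.
DEATH_CERTIFICATE_ONLY = "death_certificate_only"


def exclude_death_certificate_only(cases):
    """Return the cases that were not identified solely through a death certificate."""
    return [c for c in cases if c.get("reporting_source") != DEATH_CERTIFICATE_ONLY]


def completeness(cases, field):
    """Return the fraction of cases with a non-missing value for `field`."""
    if not cases:
        return 0.0
    filled = sum(1 for c in cases if c.get(field) not in (None, ""))
    return filled / len(cases)


# Made-up records, not real SEER data.
sample_cases = [
    {"id": 1, "reporting_source": "hospital", "ajcc_stage": "II"},
    {"id": 2, "reporting_source": DEATH_CERTIFICATE_ONLY, "ajcc_stage": None},
    {"id": 3, "reporting_source": "hospital", "ajcc_stage": None},
]

cohort = exclude_death_certificate_only(sample_cases)
stage_completeness = completeness(cohort, "ajcc_stage")
print(len(cohort), stage_completeness)
```

Checking completeness before modeling helps decide whether a site-specific factor or AJCC stage is usable for the diagnosis years under study.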


Data are trustworthy and widely used. Because they are population-based, these data are very useful for assessing incidence and mortality patterns. SEER data have low loss to follow-up and support cause-specific survival analysis. Temporal trends are also easily assessed. SEER over-samples minorities, so the data can be used to assess racial/ethnic disparities.


Treatment data are complete (non-missing) but not very detailed. There are no data on comorbidities, insurance, or individual-level socioeconomic status.