About
Removing barriers to public health data — one variable at a time.
About This Platform
BRFSS Data Explorer was built to solve a specific problem: each annual BRFSS release is distributed as a SAS XPT file exceeding 1 GB, a format that spreadsheet software such as Excel cannot open directly. Researchers without a SAS or SPSS license must purchase one (typically hundreds to thousands of dollars per year), rely on institutional infrastructure they may not have access to, or write their own import code in a language such as R or Python.
This burden falls hardest on the people who most need the data: students, early-career epidemiologists, faculty at minority-serving institutions, public health workers at community and county health departments, and researchers studying historically underserved and minority communities. Closing data access gaps is a precondition for closing health outcome gaps.
The platform addresses this by:
- Variable-level selection — choose only the columns relevant to your research question instead of downloading and storing the entire dataset.
- Instant CSV download — the output is a standard comma-separated file usable in R, Python, Excel, Stata, or any tool you already know.
- Codebook-driven metadata — every variable is presented with its label, the original survey question, and the meaning of its coded values, so users understand exactly what they are downloading.
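To illustrate what variable-level selection amounts to, here is a minimal Python sketch that streams records to CSV while keeping only the requested columns. It is not the platform's own code. The function name write_selected_columns and the sample records are hypothetical, and the rows are assumed to arrive as dictionaries, for example from a chunked reader such as pandas.read_sas with format="xport".

```python
import csv
import io
from typing import Iterable, Mapping, Sequence, TextIO


def write_selected_columns(
    rows: Iterable[Mapping[str, object]],
    columns: Sequence[str],
    out: TextIO,
) -> int:
    """Stream rows to CSV, keeping only the requested columns.

    Returns the number of data rows written.
    """
    writer = csv.DictWriter(out, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in rows:
        # None stands in for a missing value and is written as an empty cell.
        writer.writerow(
            {c: ("" if row.get(c) is None else row.get(c)) for c in columns}
        )
        count += 1
    return count


# Hypothetical records; the variable names are used for illustration only.
sample = [
    {"_STATE": 1, "GENHLTH": 2, "SEQNO": "0001"},
    {"_STATE": 2, "GENHLTH": None, "SEQNO": "0002"},
]
buf = io.StringIO()
n_written = write_selected_columns(sample, ["_STATE", "GENHLTH"], buf)
print(buf.getvalue())
```

Because rows are written one at a time, memory use stays flat regardless of how large the source file is.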
About the Data
The Behavioral Risk Factor Surveillance System (BRFSS) is a state-based telephone survey coordinated by the U.S. Centers for Disease Control and Prevention (CDC) in partnership with state and territorial health departments. It collects data from over 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world.
- Years covered on this platform: 2011 through 2024
- Source: U.S. Centers for Disease Control and Prevention (CDC) and state health departments
- Coverage: All 50 U.S. states, the District of Columbia, and participating territories
- Topic areas include: chronic disease (diabetes, hypertension, cardiovascular disease, cancer), demographic and socioeconomic characteristics, health behaviors (tobacco, alcohol, physical activity, nutrition), preventive care and screenings, mental health, and social determinants of health.
All BRFSS data is publicly available from the CDC website at no cost. This platform does not re-license or repackage the data — it simply makes the existing public data easier to subset and download.
How to Use the Explorer
Watch the short walkthrough below to see how to select a year, browse health topic groups, choose variables, and download your filtered dataset as a CSV file.
Data Quality Control
Every dataset published on this platform has passed a four-stage quality control procedure before being made available for download. Datasets that do not pass all four checks are locked and inaccessible to users until the issues are resolved.
Structural Integrity
Confirms that all topic-specific data files were created successfully, that every file contains the correct number of respondent records, and that no files are missing or empty. For the 2011 dataset, this check verified 72 topic files, each containing exactly 506,467 records.
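A check of this kind can be sketched in a few lines of Python. The sketch below assumes the topic files are CSV files with one header row. The function name check_structure, the file names, and the tiny row counts are hypothetical and stand in for the real 72 files and 506,467 records.

```python
import csv
import tempfile
from pathlib import Path


def check_structure(data_dir: Path, expected_files: list[str], expected_rows: int) -> list[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    for name in expected_files:
        path = data_dir / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        if path.stat().st_size == 0:
            problems.append(f"{name}: empty")
            continue
        with path.open(newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row
            n = sum(1 for _ in reader)
        if n != expected_rows:
            problems.append(f"{name}: {n} rows, expected {expected_rows}")
    return problems


# Small demonstration with hypothetical topic files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "demographics.csv").write_text("SEQNO,AGE\n1,30\n2,41\n")
    (d / "tobacco.csv").write_text("SEQNO,SMOKER\n1,1\n")
    (d / "alcohol.csv").write_text("")
    report = check_structure(
        d,
        ["demographics.csv", "tobacco.csv", "alcohol.csv", "nutrition.csv"],
        expected_rows=2,
    )

for line in report:
    print(line)
```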
Variable Alignment
Cross-references variables between the original SAS source file (XPT format), the PDF codebook, and all derived CSV files. Any variable present in the source but absent from the derived files is flagged as an error.
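At its core, the cross-reference is a set comparison. The Python sketch below shows the idea under some assumptions: variable names are compared as exact strings, and the function name find_alignment_errors and the sample name lists are hypothetical. Only the first category (in the source but absent from the derived files) is described above as an error. The other two categories are added here for illustration.

```python
def find_alignment_errors(source_vars, codebook_vars, derived_vars):
    """Compare variable names across the three artefacts using set differences."""
    source = set(source_vars)
    codebook = set(codebook_vars)
    derived = set(derived_vars)
    return {
        # The case flagged as an error in the description above.
        "missing_from_derived": sorted(source - derived),
        # Extra categories, included as an assumption about what is useful to report.
        "missing_from_codebook": sorted(source - codebook),
        "unexpected_in_derived": sorted(derived - source),
    }


# Hypothetical name lists for illustration.
result = find_alignment_errors(
    source_vars=["_STATE", "GENHLTH", "_LLCPWT"],
    codebook_vars=["_STATE", "GENHLTH"],
    derived_vars=["_STATE", "GENHLTH", "EXTRA"],
)
print(result)
```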
Statistical Validation
For every numeric variable, the minimum, maximum, mean, and null count in each derived CSV file are compared against the original XPT source within a tolerance of 0.0001%. This check identified and resolved floating-point precision loss in complex survey weight variables (such as final sample weights used in population-level inference) that standard CSV export would have silently corrupted.
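The comparison can be sketched in Python as follows. A tolerance of 0.0001% corresponds to a relative tolerance of 1e-6. The function names, the sample weights, and the two export strategies shown (two-decimal formatting versus Python's round-trippable repr) are assumptions chosen to illustrate how precision loss would surface, not a description of the platform's actual fix.

```python
import math

REL_TOL = 1e-6  # 0.0001% expressed as a fraction


def summarize(values):
    """Summary statistics for a column given as a list of floats or None."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": math.fsum(present) / len(present) if present else None,
        "nulls": len(values) - len(present),
    }


def stats_match(source, derived, rel_tol=REL_TOL):
    """True if null counts are equal and min, max, mean agree within rel_tol."""
    a, b = summarize(source), summarize(derived)
    if a["nulls"] != b["nulls"]:
        return False
    for key in ("min", "max", "mean"):
        if a[key] is None or b[key] is None:
            if a[key] is not b[key]:
                return False
            continue
        # With the default abs_tol of 0, a value of exactly zero must match exactly.
        if not math.isclose(a[key], b[key], rel_tol=rel_tol):
            return False
    return True


# Hypothetical weight values, including one missing entry.
weights = [123.456789012345, 0.000123456789, None, 98765.4321]

# Export that keeps two decimals: small weights collapse to zero.
truncated = [None if w is None else float(f"{w:.2f}") for w in weights]
# Export that writes repr(w): Python floats round-trip exactly.
round_tripped = [None if w is None else float(repr(w)) for w in weights]

lossy_ok = stats_match(weights, truncated)
exact_ok = stats_match(weights, round_tripped)
print(lossy_ok, exact_ok)
```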
Row-Level Value Verification
For every variable and every row, values in the derived CSV files are compared directly against the source XPT file. String variables are compared using a normalisation procedure that correctly handles SAS-format empty strings, leading zeros in coded values, and numeric type coercion — three edge cases that produce false mismatches in naive comparisons but represent no actual data corruption.
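One way to picture such a normalisation step is a function that reduces each raw cell to a canonical string before comparing. The Python sketch below is an illustration under assumptions, not the platform's procedure: the function names are hypothetical, blank and missing values are all mapped to the empty string, and anything that parses as a number is compared by numeric value.

```python
import math


def normalize(value) -> str:
    """Reduce a raw cell to a canonical string for comparison."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        # Character data read from a binary source may arrive as bytes.
        value = value.decode("latin-1")
    text = str(value).strip()
    if not text:
        return ""  # blank or space-padded strings count as empty
    try:
        number = float(text)
    except ValueError:
        return text  # non-numeric text is compared as stripped text
    if math.isnan(number):
        return ""  # a missing numeric value is treated as empty
    if math.isinf(number):
        return text
    if number.is_integer():
        return str(int(number))  # "01", "1", and 1.0 all become "1"
    return repr(number)


def values_match(source_value, derived_value) -> bool:
    return normalize(source_value) == normalize(derived_value)


checks = [
    values_match(b"  ", ""),           # padded empty string vs empty cell
    values_match("01", 1.0),           # leading zero vs float
    values_match("0.50", 0.5),         # numeric type coercion
    values_match(float("nan"), ""),    # missing numeric vs empty cell
    values_match("1", "2"),            # a genuine mismatch
]
print(checks)
```

A trade-off of this design is that codes whose leading zeros are meaningful would also be treated as equal, so a real implementation might restrict numeric normalisation to variables known to be coded values.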
Only after all four checks pass with zero errors is a dataset unlocked and made available for download. The platform will not serve data from a year that has not completed this process.
Data Access & Citation
When using BRFSS data in publications, presentations, or reports, please cite the original source: the CDC Behavioral Risk Factor Surveillance System survey data for the year or years used.
For full methodology, questionnaires, weighting documentation, and the official annual survey data, visit the CDC BRFSS site:
https://www.cdc.gov/brfss/index.html
Disclaimer
BRFSS Data Explorer is an independent research tool developed to facilitate access to publicly available data collected by the CDC's Behavioral Risk Factor Surveillance System. This platform is not affiliated with, endorsed by, or operated by the Centers for Disease Control and Prevention (CDC) or any federal agency.
All data originates from CDC public-use files. A documented, multi-stage quality control procedure was applied to verify data integrity prior to publication on this platform. Users are encouraged to independently verify any variables critical to their analysis against the original CDC source files, available at www.cdc.gov/brfss.