Overall Stats API #

The Overall Stats API provides project-wide statistics about Project Sidewalk's data collection efforts in Oradell, NJ, including total distance covered, label counts by type, user participation metrics, and data quality indicators.

Endpoint #

Retrieve overall statistics for the entire Project Sidewalk dataset.

GET /v3/api/overallStats

Examples#

/v3/api/overallStats?filetype=json Get overall stats for Oradell, NJ in JSON (default)

/v3/api/overallStats?filetype=csv Get overall stats for Oradell, NJ in CSV

Query Parameters#

This endpoint accepts the following optional query parameters.

Parameter	Type	Description
`filterLowQuality`	`boolean`	When set to `true`, excludes data from low-quality contributors to provide more reliable statistics. Default: `false` (includes all data).
`filetype`	`string`	Specify the output format. Options: `json` (default), `csv`.
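
The two parameters above can be combined in a single query string. The following is a minimal sketch of building such a URL in Python. The base URL `https://example.org` and the function name `build_overall_stats_url` are placeholders invented for illustration; substitute the host of the deployment you are querying.

```python
from urllib.parse import urlencode


def build_overall_stats_url(base_url, filetype="json", filter_low_quality=False):
    """Build a request URL for the overallStats endpoint (illustrative helper)."""
    if filetype not in ("json", "csv"):
        raise ValueError("filetype must be 'json' or 'csv'")
    params = {
        "filetype": filetype,
        # The parameter table documents the values as true/false, so send lowercase strings.
        "filterLowQuality": "true" if filter_low_quality else "false",
    }
    return f"{base_url.rstrip('/')}/v3/api/overallStats?{urlencode(params)}"


# Placeholder host; not a real Project Sidewalk server.
url = build_overall_stats_url("https://example.org", filetype="csv", filter_low_quality=True)
print(url)
```

Validating `filetype` on the client side avoids a round trip that would otherwise end in a 400 response.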

Responses#

Success Response (200 OK)#

On success, the API returns an HTTP 200 OK status code and the requested data in the format selected by the filetype parameter.

JSON Format (Default) #

Returns a JSON object with comprehensive project statistics:

{
    "launch_date": "2021-06-15T00:00:00Z",
    "avg_timestamp_last_100_labels": "2023-09-25T14:32:47Z",
    "km_explored": 1834.26,
    "km_explored_no_overlap": 1523.75,
    "user_counts": {
        "all_users": 4287,
        "labelers": 3892,
        "validators": 895,
        "registered": 3456,
        "anonymous": 831,
        "turker": 214,
        "researcher": 42
    },
    "labels": {
        "label_count": 183427,
        "label_count_with_severity": 162320,
        "avg_label_timestamp": "2021-05-10T20:20:25.147504Z",
        "avg_age_of_image_when_labeled": "672 days",
        "CurbRamp": {
            "count": 72964,
            "count_with_severity": 61837,
            "severity_mean": 1.2,
            "severity_sd": 0.5
        },
        "NoCurbRamp": {
            "count": 35682,
            "count_with_severity": 31245,
            "severity_mean": 3.8,
            "severity_sd": 0.9
        },
        // Some label types like NoSidewalk and Signal don't have severity ratings, so severity fields are null.
        "NoSidewalk": {
            "count": 61837,
            "count_with_severity": null,
            "severity_mean": null,
            "severity_sd": null
        },
        ... // Remaining label types
    },
    "validations": {
        "total_validations": 125834,
        "Overall": {
            "validated": 151196,
            "agreed": 127564,
            "disagreed": 23632,
            "accuracy": 0.84,
            "has_a_validation": 162974
        },
        "CurbRamp": {
            "validated": 54321,
            "agreed": 48923,
            "disagreed": 5398,
            "accuracy": 0.90,
            "has_a_validation": 59422
        },
        "NoCurbRamp": {
            "validated": 21456,
            "agreed": 18237,
            "disagreed": 3219,
            "accuracy": 0.85,
            "has_a_validation": 26100
        },
        ... // Remaining label types
    },
    "ai_stats": {
        "Overall": {
            "human_majority_vote": {
                "ai_yes_human_concurs": 37,
                "ai_yes_human_differs": 3,
                "ai_no_human_differs": 2,
                "ai_no_human_concurs": 37
            },
            "admin_majority_vote": {
                "ai_yes_human_concurs": 127,
                "ai_yes_human_differs": 54,
                "ai_no_human_differs": 27,
                "ai_no_human_concurs": 134
            }
        },
        "CurbRamp": {
            "human_majority_vote": {
                "ai_yes_human_concurs": 20,
                "ai_yes_human_differs": 0,
                "ai_no_human_differs": 0,
                "ai_no_human_concurs": 2
            },
            "admin_majority_vote": {
                "ai_yes_human_concurs": 42,
                "ai_yes_human_differs": 2,
                "ai_no_human_differs": 6,
                "ai_no_human_concurs": 48
            }
        },
        ... // Remaining label types
    }
}
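
Note that the `labels` object mixes scalar summary fields (such as `label_count`) with one nested object per label type. A minimal Python sketch of separating the two follows; it assumes the response has already been parsed, uses an abridged copy of the example above as input, and the helper name `per_type_label_stats` is hypothetical.

```python
import json

# Abridged copy of the example response above (comments and elided label types removed).
sample = json.loads("""
{
  "labels": {
    "label_count": 183427,
    "label_count_with_severity": 162320,
    "avg_label_timestamp": "2021-05-10T20:20:25.147504Z",
    "avg_age_of_image_when_labeled": "672 days",
    "CurbRamp": {"count": 72964, "count_with_severity": 61837, "severity_mean": 1.2, "severity_sd": 0.5},
    "NoCurbRamp": {"count": 35682, "count_with_severity": 31245, "severity_mean": 3.8, "severity_sd": 0.9},
    "NoSidewalk": {"count": 61837, "count_with_severity": null, "severity_mean": null, "severity_sd": null}
  }
}
""")


def per_type_label_stats(labels):
    """Return only the per-label-type entries, skipping scalar summary fields."""
    return {name: stats for name, stats in labels.items() if isinstance(stats, dict)}


by_type = per_type_label_stats(sample["labels"])
for name, stats in by_type.items():
    mean = stats["severity_mean"]
    # Severity fields are null for label types without severity ratings.
    shown = "n/a" if mean is None else f"{mean:.1f}"
    print(f"{name}: {stats['count']} labels, mean severity {shown}")
```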

JSON Field Descriptions #

The response includes the following fields:

Field	Type	Description
`launch_date`	`string`	ISO 8601 formatted date when Project Sidewalk was launched in this city.
`avg_timestamp_last_100_labels`	`string`	ISO 8601 formatted average timestamp of the 100 most recent labels, indicating data recency.
`km_explored`	`number`	Total kilometers of streets explored by all users, including overlapping streets.
`km_explored_no_overlap`	`number`	Total kilometers of unique streets explored, excluding overlapping streets.
`user_counts.all_users`	`integer`	Total number of users who have contributed to Project Sidewalk.
`user_counts.labelers`	`integer`	Number of users who have participated in explore/labeling tasks.
`user_counts.validators`	`integer`	Number of users who have participated in validation tasks.
`user_counts.registered`	`integer`	Number of users who have created accounts on Project Sidewalk.
`user_counts.anonymous`	`integer`	Number of anonymous users.
`user_counts.turker`	`integer`	Number of users from crowdsourcing platforms.
`user_counts.researcher`	`integer`	Number of users with the researcher role (includes all Admins).
`labels`	`object`	Statistics about label counts and severity ratings by label type.
`labels.label_count`	`integer`	Total number of accessibility labels placed by all users.
`labels.label_count_with_severity`	`integer`	Total number of labels placed by all users with an associated severity rating.
`labels.avg_label_timestamp`	`string`	ISO 8601 formatted average timestamp when labels were created.
`labels.avg_age_of_image_when_labeled`	`string`	The average, across all labels, of the age of the image when the label was placed (in days).
`labels.[type].count`	`integer`	Total number of labels of this type.
`labels.[type].count_with_severity`	`integer | null`	Number of labels of this type that have severity ratings, or null if no severity ratings exist.
`labels.[type].severity_mean`	`number | null`	Mean severity rating for this label type, or null if no severity ratings exist.
`labels.[type].severity_sd`	`number | null`	Standard deviation of severity ratings for this label type, or null if insufficient data.
`validations`	`object`	Statistics about validation accuracy by label type.
`validations.total_validations`	`integer`	Total number of validation judgments made across all labels.
`validations.[type].validated`	`integer`	Number of labels of this type that have been validated as either "correct" or "incorrect" through majority vote; labels with equal numbers of agree and disagree votes are excluded.
`validations.[type].agreed`	`integer`	Number of labels of this type that have been validated as "correct" through majority vote.
`validations.[type].disagreed`	`integer`	Number of labels of this type that have been validated as "incorrect" through majority vote.
`validations.[type].accuracy`	`number | null`	Calculated accuracy rate (agreed / validated) for this label type, or null if no validations.
`validations.[type].has_a_validation`	`integer`	Number of labels of this type that have received any validation votes.
`ai_stats`	`object`	Statistics on human agreement with AI validations.
`ai_stats.[type].[vote]`	`object`	`[vote]` is either `human_majority_vote` (majority vote across all users) or `admin_majority_vote` (majority vote across admin users).
`ai_stats.[type].[vote].ai_yes_human_concurs`	`integer`	Number of labels where AI voted yes and the users in the selected group voted yes more often than no.
`ai_stats.[type].[vote].ai_yes_human_differs`	`integer`	Number of labels where AI voted yes but the users in the selected group voted no more often than yes.
`ai_stats.[type].[vote].ai_no_human_differs`	`integer`	Number of labels where AI voted no but the users in the selected group voted yes more often than no.
`ai_stats.[type].[vote].ai_no_human_concurs`	`integer`	Number of labels where AI voted no and the users in the selected group voted no more often than yes.
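
The four `ai_stats` counts form a 2x2 agreement table, so a single agreement rate can be derived from them. The sketch below is one way to do that in Python. The rate itself is not a field returned by the API; it is a derived metric assumed here for illustration, and `ai_agreement_rate` is a hypothetical helper name. The input values are the "Overall" counts from the example response.

```python
def ai_agreement_rate(counts):
    """Fraction of compared labels where the AI vote matches the majority vote.

    Returns None when there are no labels to compare.
    """
    concurs = counts["ai_yes_human_concurs"] + counts["ai_no_human_concurs"]
    differs = counts["ai_yes_human_differs"] + counts["ai_no_human_differs"]
    total = concurs + differs
    return concurs / total if total else None


# "Overall" entry copied from the example response.
overall = {
    "human_majority_vote": {
        "ai_yes_human_concurs": 37,
        "ai_yes_human_differs": 3,
        "ai_no_human_differs": 2,
        "ai_no_human_concurs": 37,
    },
    "admin_majority_vote": {
        "ai_yes_human_concurs": 127,
        "ai_yes_human_differs": 54,
        "ai_no_human_differs": 27,
        "ai_no_human_concurs": 134,
    },
}

rates = {vote: ai_agreement_rate(counts) for vote, counts in overall.items()}
for vote, rate in rates.items():
    print(f"{vote}: {rate:.3f}")
```

With the example counts this gives 74 of 79 labels in agreement for the human majority vote and 261 of 342 for the admin majority vote.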

CSV Format #

If filetype=csv is specified, the response body will be CSV data with key-value pairs. Each row represents a different statistic or metric:

Launch Date,2021-06-15
Recent Labels Average Timestamp,2023-09-25T14:32:47Z
KM Explored,1834.26
KM Explored Without Overlap,1523.75
Total User Count,4287
Explore User Count,3892
Validate User Count,895
Registered User Count,3456
Anonymous User Count,831
Turker User Count,214
Researcher User Count,42
Total Label Count,183427
Total Label Count With Severity,162320
Average Label Timestamp,2021-05-10T20:20:25.147504Z
Average Age of Image When Labeled,672 days
CurbRamp Count,72964
CurbRamp Count With Severity,61837
CurbRamp Severity Mean,1.2
CurbRamp Severity SD,0.5
...
Total Validations,125834
Overall Labels Validated,151196
Overall Agreed Count,127564
Overall Disagreed Count,23632
Overall Accuracy,0.84
Overall Labels With a Validation,162974
CurbRamp Labels Validated,54321
CurbRamp Agreed Count,48923
CurbRamp Disagreed Count,5398
CurbRamp Accuracy,0.90
CurbRamp Labels With a Validation,59422
...
Overall AI Yes and Human Majority Vote Concurs,37
Overall AI Yes but Human Majority Vote Differs,3
Overall AI No but Human Majority Vote Differs,2
Overall AI No and Human Majority Vote Concurs,37
Overall AI Yes and Admin Majority Vote Concurs,127
Overall AI Yes but Admin Majority Vote Differs,54
Overall AI No but Admin Majority Vote Differs,27
Overall AI No and Admin Majority Vote Concurs,134
CurbRamp AI Yes and Human Majority Vote Concurs,20
CurbRamp AI Yes but Human Majority Vote Differs,0
CurbRamp AI No but Human Majority Vote Differs,0
CurbRamp AI No and Human Majority Vote Concurs,2
CurbRamp AI Yes and Admin Majority Vote Concurs,42
CurbRamp AI Yes but Admin Majority Vote Differs,2
CurbRamp AI No but Admin Majority Vote Differs,6
CurbRamp AI No and Admin Majority Vote Concurs,48
...

CSV Format Description #

In CSV format, each row represents a specific metric in a key-value format:

Top-level statistics are represented directly as named rows (e.g., "KM Explored,1834.26")
Nested statistics like user_counts and labels are flattened into multiple rows with descriptive names
For each label type, label severity stats are presented as four rows: Count, Count With Severity, Severity Mean, and Severity SD
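For each label type, validation stats are presented as five rows: Labels Validated, Agreed Count, Disagreed Count, Accuracy, and Labels With a Validation
For each label type, AI agreement stats are presented as eight rows: four for the human majority vote and four for the admin majority vote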
This flat structure makes the data easy to parse and analyze in spreadsheet applications
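
Because each row is a key-value pair, the CSV output maps naturally onto a dictionary. The following Python sketch assumes the body has no header row (the example above shows none) and keeps every value as a string, since how null values are written in CSV is not documented here. The helper name `parse_stats_csv` is hypothetical, and the sample text is a few rows taken from the example.

```python
import csv
import io

# A few rows copied from the CSV example above.
sample_csv = """Launch Date,2021-06-15
KM Explored,1834.26
KM Explored Without Overlap,1523.75
Total User Count,4287
CurbRamp Severity Mean,1.2
Overall Accuracy,0.84
"""


def parse_stats_csv(text):
    """Read two-column key,value rows into a dict of strings."""
    stats = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        key, value = row
        stats[key] = value
    return stats


stats = parse_stats_csv(sample_csv)
# Convert individual values only when they are needed as numbers.
km_explored = float(stats["KM Explored"])
print(stats["Launch Date"], km_explored)
```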

Error Responses#

If an error occurs, the API will return an appropriate HTTP status code and a JSON response body containing details about the error.

400 Bad Request: Invalid parameter values.
404 Not Found: The requested resource does not exist.
500 Internal Server Error: An unexpected error occurred on the server.
503 Service Unavailable: The server is temporarily unable to handle the request.

Error Response Body #

Error responses include a JSON body with the following structure:

{
    "status": 400, // HTTP Status Code
    "code": "INVALID_PARAMETER", // Machine-readable error code
    "message": "Invalid value for filetype parameter. Expected 'csv' or 'json'.", // Human-readable description
    "parameter": "filetype" // Optional: The specific parameter causing the error
}
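
A client can turn this error body into a typed exception so that callers can inspect the fields. The sketch below is illustrative only: `OverallStatsError` and `error_from_body` are hypothetical names, and the code assumes the body is valid JSON carrying the fields shown above, with `parameter` optional.

```python
import json


class OverallStatsError(Exception):
    """Hypothetical exception wrapping the documented error body."""

    def __init__(self, status, code, message, parameter=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.parameter = parameter


def error_from_body(body):
    """Build an OverallStatsError from a JSON error response body."""
    data = json.loads(body)
    return OverallStatsError(
        status=data["status"],
        code=data["code"],
        message=data["message"],
        parameter=data.get("parameter"),  # optional field
    )


# Stand-in for a response body received with a 400 status.
body = json.dumps({
    "status": 400,
    "code": "INVALID_PARAMETER",
    "message": "Invalid value for filetype parameter. Expected 'csv' or 'json'.",
    "parameter": "filetype",
})

err = error_from_body(body)
print(err)
```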

Data Analysis Tips #

The Overall Stats API provides a comprehensive view of Project Sidewalk data. Here are some suggestions for effectively using this data:

Consider using filterLowQuality=true for more reliable analysis, especially when examining severity ratings
Compare accuracy rates across label types to identify which accessibility issues might be more ambiguous or difficult to detect
Use km_explored vs. km_explored_no_overlap to understand the level of redundancy in data collection
Look at avg_timestamp_last_100_labels to gauge how recently the data has been updated
Analyze the ratio of validators to explorers to understand community participation patterns
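
Several of the tips above reduce to simple arithmetic on the JSON response. The Python sketch below computes a redundancy factor, a validator-to-labeler ratio, and an accuracy ranking from values copied out of the example response. The variable names and the choice of these particular ratios are illustrative assumptions, not metrics defined by the API.

```python
# Values copied from the example response; only the fields used here are kept.
stats = {
    "km_explored": 1834.26,
    "km_explored_no_overlap": 1523.75,
    "user_counts": {"labelers": 3892, "validators": 895},
    "validations": {
        "total_validations": 125834,
        "Overall": {"accuracy": 0.84},
        "CurbRamp": {"accuracy": 0.90},
        "NoCurbRamp": {"accuracy": 0.85},
    },
}

# How many times, on average, each unique kilometer was explored.
redundancy = stats["km_explored"] / stats["km_explored_no_overlap"]

# Validators per labeler, as a rough measure of participation balance.
validator_ratio = stats["user_counts"]["validators"] / stats["user_counts"]["labelers"]

# Per-type accuracy, lowest first; skips the scalar total, the "Overall" entry, and nulls.
accuracy_by_type = sorted(
    (
        (name, entry["accuracy"])
        for name, entry in stats["validations"].items()
        if isinstance(entry, dict) and name != "Overall" and entry["accuracy"] is not None
    ),
    key=lambda pair: pair[1],
)

print(f"redundancy: {redundancy:.2f}x")
print(f"validators per labeler: {validator_ratio:.2f}")
print("lowest accuracy first:", accuracy_by_type)
```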

Related APIs

For more detailed analysis, consider using the Overall Stats API in conjunction with:

User Stats API - Get statistics for individual users and their contributions
Raw Labels API - Access the individual label data with geographic information
Label Types API - Get information about the different types of accessibility issues
Cities API - See all cities where Project Sidewalk is deployed

Terms of Use #

If you use Project Sidewalk data in your research, please cite the following paper (awarded 🏆 Best Paper at CHI 2019):

Manaswi Saha, Michael Saugstad, Hanuma Teja Maddali, Aileen Zeng, Ryan Holland, Steven Bower, Aditya Dash, Sage Chen, Anthony Li, Kotaro Hara, and Jon Froehlich. 2019. Project Sidewalk: A Web-based Crowdsourcing Tool for Collecting Sidewalk Accessibility Data At Scale. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery, New York, NY, USA, Paper 62, 1–14.

Contribute#

Project Sidewalk is an open-source project created by the Makeability Lab and hosted on GitHub. We welcome your contributions! If you found a bug or have a feature request, please open an issue on GitHub.

You can also email us at sidewalk@cs.uw.edu

Project Sidewalk in Your City!#

If you are interested in bringing Project Sidewalk to your city, please read our Wiki page.