Chicago Crime Scene
Arrest Prediction
Project Goals
Goal: Provide predictive guidance on the likelihood of arrest.
Plan: Determine the rate of arrest per crime report.
Method: use the dataset from the Chicago Data Portal.
Chicago Data Portal
Dataset available via the Socrata Open Data API (SODA) and as bulk downloads.
Developed a SODA-to-SQL schema converter for import.
SODA access permits daily updates.
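A minimal sketch of what a SODA-to-SQL schema converter could look like, assuming column metadata arrives as (name, SODA type) pairs. The type mapping, the function name `soda_to_sql`, and the table and column names are illustrative assumptions, not the converter built for this project.

```python
# Hypothetical mapping from SODA column types to SQL column types.
SODA_TO_SQL = {
    "text": "TEXT",
    "number": "NUMERIC",
    "checkbox": "BOOLEAN",
    "calendar_date": "TIMESTAMP",
}


def soda_to_sql(table, columns):
    """Build a CREATE TABLE statement from (name, soda_type) pairs."""
    defs = []
    for name, soda_type in columns:
        # Unknown SODA types fall back to TEXT so the import never fails outright.
        sql_type = SODA_TO_SQL.get(soda_type, "TEXT")
        defs.append(f'    "{name}" {sql_type}')
    return f'CREATE TABLE "{table}" (\n' + ",\n".join(defs) + "\n);"


# Illustrative column list; a real run would read this from the dataset's metadata.
ddl = soda_to_sql(
    "crimes",
    [("id", "number"), ("arrest", "checkbox"), ("location_description", "text")],
)
print(ddl)
```

Falling back to TEXT for unmapped types keeps the sketch tolerant of schema changes between daily updates, at the cost of weaker typing.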
Initial Models
Initial models appeared to perform above the baseline of 80% accuracy.
Client wasn't necessarily impressed.
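A small sketch of how such a baseline can arise, assuming the 80% figure is what a model scores by always predicting the majority class ("no arrest"). The labels below are made up for illustration and `majority_baseline_accuracy` is a hypothetical helper.

```python
def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common class."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return max(positives, negatives) / len(labels)


# Made-up labels: 2 arrests in 10 reports (True = arrest).
toy_labels = [True, False, False, False, False, True, False, False, False, False]
baseline = majority_baseline_accuracy(toy_labels)
print(baseline)
```

A model only adds value beyond this number, which is why accuracy alone says little when one class dominates.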
Much about Metrics
Accuracy: Client already knows "most of the time" they don't get arrested.
Concern #1: If I say no arrest and get that wrong (a false negative), they lose "staff". So I need to be Sensitive.
Concern #2: If I say arrest and get that wrong (a false positive), they lose "business". So I need to be Precise.
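The two concerns map onto sensitivity (recall) and precision. A minimal sketch of both metrics computed from true and predicted labels; the function name and the toy labels are illustrative assumptions, not project results.

```python
def sensitivity_and_precision(actual, predicted):
    """Return (sensitivity, precision) for boolean labels, True = arrest."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if p and not a)
    # Sensitivity: share of real arrests the model caught (Concern #1).
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # Precision: share of predicted arrests that were real (Concern #2).
    precision = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, precision


# Made-up example: 4 real arrests, 5 predicted arrests, 3 of them correct.
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, True, False, False]
sens, prec = sensitivity_and_precision(actual, predicted)
print(sens, prec)
```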
Crime by Communities
The dataset calls these "Community Areas".
These are the colloquial parts of town.
This is also the most general breakout.
Crime by Wards
The political districts
A bit more granular
Politicians draw much more convoluted shapes.
Crime by the Beat
District and Beat per the CPD.
District and Beat are the only purely hierarchical geo features.
Beat is much more granular.
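One way to check whether two geo features are purely hierarchical is to confirm that every value of the finer feature maps to exactly one value of the coarser one. A sketch under that assumption; `is_strictly_nested` is a hypothetical helper and the pairs below are made up for illustration.

```python
def is_strictly_nested(pairs):
    """Return True if every child value maps to exactly one parent value."""
    parent_of = {}
    for child, parent in pairs:
        # setdefault stores the first parent seen; any later mismatch breaks nesting.
        if parent_of.setdefault(child, parent) != parent:
            return False
    return True


# Made-up (beat, district) pairs: each beat sits inside a single district.
beat_district = [("1111", "11"), ("1112", "11"), ("2511", "25"), ("1111", "11")]
# Made-up (ward, community area) pairs: one ward spans two areas.
ward_area = [("ward_a", "area_x"), ("ward_a", "area_y"), ("ward_b", "area_y")]

print(is_strictly_nested(beat_district))
print(is_strictly_nested(ward_area))
```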
Chicago Crime Scene
Geo Data was unpersuasive.
Time Data was meager.
Two remaining features proved very useful:
Location DESCRIPTION
Crime DESCRIPTION
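A quick way to see why a categorical description can be predictive is to compare arrest rates across its values. A minimal sketch, assuming each report is a dict with an `arrest` flag and a `location_description` field; the rows are made up and `arrest_rate_by` is a hypothetical helper.

```python
from collections import defaultdict


def arrest_rate_by(rows, key):
    """Return {category: arrest rate} for the given column."""
    counts = defaultdict(lambda: [0, 0])  # category -> [arrests, reports]
    for row in rows:
        counts[row[key]][0] += int(row["arrest"])
        counts[row[key]][1] += 1
    return {category: arrests / total for category, (arrests, total) in counts.items()}


# Made-up reports for illustration only.
toy_rows = [
    {"location_description": "STREET", "arrest": True},
    {"location_description": "STREET", "arrest": False},
    {"location_description": "APARTMENT", "arrest": False},
    {"location_description": "APARTMENT", "arrest": False},
]
rates = arrest_rate_by(toy_rows, "location_description")
print(rates)
```

The same function applies to the crime description column by passing a different key; large differences in rate between categories suggest the feature carries signal.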
Model Results
Arrest Predictor
Appendix