Presenter: Cynthia Rudin
Abstract: I will describe two methods and applications for pattern detection, where patterns are grown from a seed of a few items:
1) Growing a List: The next generation of search engines should not simply retrieve URLs, but should aim to retrieve information. We designed a system that takes a step toward this next generation, leveraging information from across the Internet to grow an authoritative list on almost any topic, starting from a seed.
2) Crime Series Detection: In joint work with the Cambridge Police Department, we designed a method called "Series Finder" that detects patterns of crime committed by the same individual or group. The method is tested on a decade's worth of housebreak data from Cambridge, MA.
Series Finder is a supervised pattern detection algorithm. Time permitting, I will provide statistical learning-theoretic guarantees for supervised pattern detection methods.
Collaborators are: Benjamin Letham, Katherine Heller, Tong Wang, Daniel Wagner, Rich Sevieri, and Jonathan Huggins.
References are here:
Growing a List
Boston public radio interview about the project, "A New Way to Google"
Learning to Detect Patterns of Crime
Boston Globe article, "Cambridge Police Look at Math to Solve Crimes"
Towards a Theory of Pattern Discovery