SP11OPIM410672
last edited
by shawndra@... 13 years ago
DSS/Data Mining and Machine Learning for Business Intelligence
OPIM 410/672: Spring 2011
Class times: (410/672.002) MW 10:30 - 12, (672.001) MW 1:30 - 3
First/Last Class: Jan 12 - April 25
Classroom: JMHH G86, G88
Instructor: Shawndra Hill
Office Hours: Friday 2-5pm (sign up for a time slot on webcafe), or by appointment.
Email: shawndra@wharton.upenn.edu (subject: [DSS class] … <- note!)
Telephone: email me
TA: Santiago Gallino (sgallino@wharton.upenn.edu)
Prerequisites: None
Text: Data Mining Techniques, Second Edition, by Michael Berry and Gordon Linoff, Wiley, 2004, ISBN: 0-471-47064-3. See slides for links to resources as well.
Remaining Project Due Dates
Deliverable | Points | Due date
Group formation (email your group members to me) | 0 | 1/28/2011
2-page proposal (must have data!) / Consulting session | 5 | 2/13/2011 at 5pm (schedule a consulting appointment!)
Mid-semester presentation/Proposal refinement | 10 | 2/28/2011 and 3/2/2011* (send presentation by 2/27/2011 10pm EST; presentation order will be determined by random selection)
Feedback to colleagues | 5 | 3/2/2011*
Full presentation | 20 | 4/18/2011 – 4/25/2011 (send presentation by 4/17/2011 10pm EST)
DM Report | 50 | 5/4/2011 (Note: This is the major deliverable. You may hand it in earlier if you like.)
Reviews/Contribution Report | 10 | Last day of finals (5/10/2011)
Supporting Documents
Outside Resources
Student Comments/Posts:
Human Subjects Disclosure: The completion of some of the assignments in this course may result in data of value for research on data mining/machine learning. If the data generated in the class are used in research, no information will be revealed about the identities of individuals or about the specific intellectual content of student work.
News and Announcements
Session Outline
Week
|
Lecture 1
|
Lecture 2
|
Monday
|
Wednesday
|
1
|
|
Jan 12:
Introduction to the Course
Required Reading:
Chapter 1 and 2
|
|
Jan. 12
|
2
|
MLK
|
Jan 19:
Introduction to Data Mining/Classification/EDA
Required Reading:
Chapter 1 and 2
Reference Reading:
How Verizon Cut Customer Churn, Das M., Financial Express, 10-2003
Mining Business Databases, Brachman R.J, Khabaza, T., Klosgen W., Piatetsky-Shapiro, G. and Simoudis, E. Communications of the ACM, 1996, 39:11, pp.42-48
12 IT Skills That Employers Can’t Say No To, Brandel, M., Computerworld, 7-11-2007
|
NO CLASS
|
Jan. 19
|
3
|
Jan 24:
Classification: Recursive partitioning and Decision Trees
Required Reading:
Chapters 3,6 (pp 165 - 194)
Reference Reading:
Recursive Portfolio Selection with Decision Trees, Anton Andriyashin, Wolfgang Härdle, Roman Timofeev
Our Technology And Data, Farecast article
How To Buy Data Mining: A Framework For Avoiding Costly Project Pitfalls In Predictive Analytics, King, E.A., DM Review, October 2005
An Insurance Policy For Low Airfares, Tedeschi, B., NY Times, January 22, 2007
|
Jan 26:
Classification: Recursive partitioning and Decision Trees
Required Reading:
Chapter 6 (pp 165 - 194)
Recursive Portfolio Selection with Decision Trees, Anton Andriyashin, Wolfgang Härdle, Roman Timofeev
Reference Reading:
Joined-up thinking, The Economist, Apr 4th 2007
Taking Retailers' Cues, Harrah's Taps Into the Science of Gambling, WSJ, 11-22-2004
|
Jan. 25
HW1 Due (Jan 24)
|
Jan. 27
|
4
|
Jan 31:
Classification Model Evaluation
Required Reading:
Chapter 4 (pp 95-108)
Reference Reading:
Crafting Papers on Machine Learning, P. Langley
The Case Against Accuracy Estimation for Comparing Classifiers, Provost, F., T. Fawcett, and R. Kohavi, In Proceedings of the Fifteenth International Conference on Machine Learning (ICML-98).
|
Feb 2:
Cost Sensitive Learning
Required Reading:
Chapter 4 (pp 95-108)
The Relationship Between Default Prediction And Lending Profits: Integrating ROC Analysis And Loan Pricing, Stein, R., Journal of Banking & Finance, 29 (2005) 1213-1236
|
HW2 Due (Jan 30)
|
|
5
|
Feb 7:
Naïve Bayes
Required Reading:
Chapter 8: pp.257-271
Reference Reading:
Learning and Evaluating Classifiers under Sample Selection Bias, Zadrozny, B.
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss, Domingos, P. and Pazzani, M., Machine Learning, 29, 103-130, 1997
What You Need To Know About Bayesian Spam Filtering, Tschabitscher, H.
A Plan For Spam, Zdziarski, J.
The State Of Spam, A Monthly Report, Generated by Symantec Messaging and Web Security, February 2007
Spam And The Ongoing Battle For The Inbox, Goodman, J., G.V. Cormack, and D. Heckerman, Communications of the ACM, February 2007, Vol. 50, No. 2, pp. 25-33
|
Feb 9: LAB in JMHH Room 380
|
Project Proposal Due
(Feb 6)
HW3 Due (Feb 6)
|
|
6
|
Feb 14:
Association Rules/k-nearest Neighbor/Clustering
Required Reading
Chapter 9: Pages 287-315
Chapter 8: pp 257 - 271
Chapter 11: 349-365
Reference Reading:
TBA
|
Feb 16: LAB JMHH Room 380
Reference Reading:
Weka Tutorial
An Intelligent Assistant for the Knowledge Discovery Process: An Ontology-based Approach, Bernstein, A., Provost, F., Hill, S. IEEE Transactions on Knowledge and Data Engineering 17(4), pp. 503-518, 2005. (PDF)
|
|
|
7
|
Feb 21
Genetic Algorithms
Required Reading
Chapter 13 (or become familiar with GAs using online resources)
Reference Reading:
Discovering Interesting Patterns For Investment Decision Making With GLOWER – A Genetic Learner Overlaid With Entropy Reduction, Dhar, V., D. Chou, and F. Provost, Data Mining and Knowledge Discovery, Vol. 4, No. 4, October 2000
|
Feb 23
Neural Networks
Required Reading
Chapter 7 (or become familiar with Neural Networks using online resources)
|
HW4 Due (Feb 20)
|
|
8
|
Feb 28:
Group Presentations (Progress Report)
|
Mar 2:
Group Presentations (Progress Report)
|
|
|
|
Spring Break
|
Spring Break
|
|
|
9
|
Mar 14:
Data Mining Competitions (Work on competition in class)
|
Mar 16:
Relational Learning
(come with an open mind)
Recommended Reading
Social Graph-iti, Oct 18th 2007
On Facebook, Scholars Link Up With Data, Stephanie Rosenbloom, December 17, 2007
Friend Accepted, The Economist, Oct 25th 2007
Six Degrees of Messaging, NatureNews, Katharine Sanderson, March 13, 2008
|
HW5 Due (Mar 16)
|
|
10
|
Mar 21:
Text Mining
|
Mar 23:
Recommendation Systems/Collaborative Filtering
Required Reading
Amazon.com Recommendations: Item-to-Item Collaborative Filtering, Linden, G., B. Smith, & J. York, IEEE Computer Society, IEEE Internet Computing, Jan./Feb. 2003, pp. 76-80
Speaking out: Amazon.com's Jeff Bezos, The McGraw-Hill Companies, BusinessWeek Online, August 25, 2003
Netflix Prize Still Awaits a Movie Seer, Katie Hafner, NY Times, June 4, 2007
You Want Innovation? Offer A Prize, Leonhardt, D., NY Times, Economix section, January 31, 2007
MySpace to Discuss Effort to Customize Ads, Brad Stone, NY Times, September 18, 2007
Assignments:
OUT: Assignment 5
Suggested Reading
The Economist - A different game - 02-27-2010
The Economist - All too much - 02-27-2010
The Economist - Clicking for gold - 02-27-2010
The Economist - Data, data everywhere - 02-27-2010
The Economist - Leaders The data deluge - 02-27-2010
The Economist - Needle in a haystack - 02-27-2010
The Economist - New rules for big data - 02-27-2010
The Economist - The open society - 02-27-2010
|
|
|
11
|
Mar 28:
Guest Speaker: Cong Yu, Google
|
Mar 30:
Large Scale Mining/Cloud Computing
Recommended:
http://computer.howstuffworks.com/cloud-computing.htm
http://www.zdnet.com/blog/hinchcliffe/eight-ways-that-cloud-computing-will-change-business/488
Check out these hadoop tutorials!
Lecture 1 in a 5 part Series: http://www.youtube.com/watch?v=yjPBkvYh-ss&feature=relmfu
|
|
|
12
|
Apr 4:
Guest Speaker:
Chris Volinsky, AT&T Labs Research
|
Apr 6:
Guest Speaker:
Greg Levitt, 33 Across
|
|
|
13
|
April 11:
Guest Speaker:
Nick Lim, Sonamine
|
April 13
Guest Speaker:
Sheldon Gilbert, Proclivity Systems
|
|
|
14
|
Group Presentations 1
|
Group Presentations 2
|
|
|
15
|
Class Optional -- Use for extra office hours
|
|
|
|