Stat 107: Data Science Discovery featured in UC Berkeley Data Science Newsletter

Data Science Education at University of Illinois Urbana-Champaign

Date: 06/04/20

The May 2020 issue of UC Berkeley's Data Science Newsletter featured Statistics 107: Data Science Discovery in its School Spotlight. The course was first led by Karle Flanagan and Wade Fagen-Ulmschneider when it launched in the Spring 2019 term. The School Spotlight showcases the latest schools that UC Berkeley has worked with on implementing data science programs. Stat 107 will be taught by Ha Khanh Nguyen for the Fall 2020 term.

With permission from UC Berkeley's Data Science Education Support, the following is the School Spotlight feature on Stat 107 as it was originally published:

 

Data Science Education at University of Illinois Urbana-Champaign

An Example of Using GitHub Instead of JupyterHub

 

Stat 107: Data Science Discovery is the University of Illinois Urbana-Champaign’s (UIUC) foundational data science course. Led by Professor Wade Fagen-Ulmschneider and Professor Karle Flanagan, Stat 107 has no prerequisites, so that any student at UIUC can gain a comprehensive introduction to the “next BIG thing at Illinois.”

Development of the course officially began in Fall 2018, shortly after UIUC attended the National Workshop in Data Science Education in Summer 2018 and learned about Data 8: Foundations of Data Science at Berkeley. UIUC shared Berkeley’s enthusiasm for offering an introductory data science course that is accessible at a large scale. The pilot offering of Stat 107 in Spring 2019 enrolled 20 students from 20 different majors. Enrollment then grew to 300 students in Fall 2019, coinciding with the course’s offering as a general education requirement.

While Stat 107 is modeled on Data 8 in terms of curriculum, its infrastructure reflects a different philosophy. Python programming in the class is based on the pandas package rather than Berkeley’s datascience package, and local deployment of notebooks through GitHub is favored over JupyterHub as a means of distributing assignments. Traditionally, classes that adopt the datascience package use it throughout the course to flatten the perceived steepness of the programming learning curve, a concern that stems from there being no formal prerequisites aside from high school mathematics. UIUC, however, considers introducing students to pandas from the start a more suitable way to bring industry-relevant experience to the class. In practice, students struggle with the learning curve only for the first two weeks, during which they receive close attention and support from the course staff.

The theme of gearing students toward industry-relevant tools and skills is also exemplified by the use of GitHub: the instructors give out starter code for pulling Jupyter notebooks from the course repository, which students use throughout the semester, and explain the theory behind the code in the second half of the semester. Furthermore, all exams are open-book, open-Google, and open-resource in general, mirroring the workflow techniques and collaboration present in most of the industry.

UIUC is currently working toward a fully established data science program. Existing related programs include the B.S. in Statistics & CS degree and the CS+X degree, which allows students to specialize in one of 10+ concentrations in diverse fields such as advertising, chemistry, or music. UIUC hopes to kick off its data science-specific programs by offering a minor in the near future. Over the next 3-5 years, UIUC also plans to expand its 4 connector courses, which weave together core concepts and approaches from Data 8 with complementary ideas or areas like psychology, cognitive science, and business. These courses have Stat 100: Statistics, an introductory statistics course, as a prerequisite. UIUC plans to include Stat 107 and Stat 100 in its data science degree curriculum, which will also include courses in statistics, computer science, mathematics, data ethics, and information.

 

 
