• Home
  • Action
  • Events
  • Core Values
  • Connect
  • KDD
  • SIGKDD

    ACM SIGKDD
    Seattle Chapter

    Our goal is to promote a community of data scientists, statisticians, and machine learning experts, researchers, and practitioners from both industry and academia by organizing talks from leaders in the field, regular meetups with fun data science activities, and workshops in the greater Seattle area.

  • Discussed at KDD 2013 and established in March 2014

     
     
     
    • Great support from local community and help from ACM and SIGKDD
    • Special thanks to Ying Li, Johannes Gehrke, Raghu Ramakrishnan, Bing Liu, and Zarina Strakhan

  • Events

    Recent Activities / Presentations / Talks

    The Science Behind Predicting Voice Elicited Emotions

    Wednesday, May 18th, 2016
    (5:30 PM - 7:30 PM)

    Location: Madrona Venture Group
    999 3rd Ave, Seattle

    34th floor


    It’s not what you say; it’s how you say it!

    Meet local data scientists, data enthusiasts, developers, and otherwise cool people while learning about the science behind voice analytics and how it is being applied at Jobaline Inc. in Kirkland.


    Jobaline Inc.’s Chief Data Scientist, Dr. Ying Li, presents the latest research in her talk, The Science Behind Predicting Voice Elicited Emotions, hosted at the Madrona Venture Group offices in Seattle.

     

    • 5:30pm: Doors open; enjoy pizza and refreshments and socialize before the talk
    • 6:15 - 7:00pm: Dr. Li presents 
    • 7:00pm: Q & A, post-presentation social

     

    Dr. Li will present the research, product development, and eventual deployment of Voice Analyzer, a system developed at Jobaline that analyzes voice data and predicts human emotions elicited by the paralinguistic elements of voices. She will give an overview of the raw data, the data processing steps, the prediction algorithms we experimented with, and the deployed system.

     

    She will present case studies where, given a voice clip, models predict the degree to which a listener will feel “engaged” or “soothed”. The technology is deployed in Jobaline products that help companies hire workers in the service industries, where customers’ emotional response to workers’ voices may affect the service outcome.

     

    Message from the Speaker:

    Dr. Ying Li

    Building on my dedicated practice of data science in multiple industries since 1998, and in the spirit of sharing with the community, I will use the last quarter of this talk to present a set of learned principles for the practice of data science, illustrate the current state of practice through examples, and anticipate an optimal future that practitioners of data science should prepare for and contribute to. My hope is that a disciplined practice of data science will truly deserve the hyped social and economic attention and, more importantly, will scale to reach new potential.

    Data Club Meetup - Data Science from Scratch (Gradient Descent and Logistic Regression)

    Wednesday, September 2nd 2015
    (6:30 PM - 8:30 PM)

    Location: Bellevue City Hall

    Room: 1E-120


    Abstract:

    Continuing from our last meet-up, we will cover two more chapters from Joel Grus' book, Data Science from Scratch: First Principles with Python. Following Joel’s format, we will first go over a brief theoretical description of the algorithms and then collaboratively code them in Python. The two chapters we will work through as examples are Gradient Descent and Logistic Regression.

    Regardless of one's level of programming expertise, one should gain a good understanding of these two algorithms after this meet-up.
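
    As a rough illustration of the two topics, the sketch below fits a logistic regression model by batch gradient descent in plain Python. It is not taken from the book or from the session; the function names, the toy data, the learning rate, and the number of epochs are all assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(weights, x):
    """Estimated probability that x belongs to class 1; weights[0] is the bias."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return sigmoid(z)


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the mean negative log-likelihood."""
    n_features = len(xs[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(xs, ys):
            error = predict(weights, x) - y  # derivative of the loss w.r.t. z
            grad[0] += error
            for j, xj in enumerate(x):
                grad[j + 1] += error * xj
        # Step against the averaged gradient.
        weights = [w - learning_rate * g / len(xs) for w, g in zip(weights, grad)]
    return weights


# Toy data, made up for illustration: the label is 1 when the feature exceeds 2.
xs = [[0.0], [1.0], [1.5], [2.5], [3.0], [4.0]]
ys = [0, 0, 0, 1, 1, 1]
weights = fit_logistic(xs, ys)
```

    Calling predict(weights, [0.0]) and predict(weights, [4.0]) afterwards gives a low and a high probability respectively, which is an easy sanity check on the fit.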


    About Kushal Lakhotia:

    Kushal is an engineer in Bing's Web Search Relevance team at Microsoft where he works on ranking. He tweets at @hikushalhere.

    Data Club Meetup - Data Science from Scratch (Naive Bayes and Neural Networks)

    Wednesday, August 5th 2015
    (6:30 PM - 8:30 PM)

    Location: Bellevue City Hall

    Room: 1E-120

    Abstract

    Continuing from our last meet-up, we will work through more chapters from Joel Grus' book, Data Science from Scratch: First Principles with Python. Following Joel’s format, we will first go over a brief theoretical description of the algorithms and then collaboratively code them in pure Python. The two chapters we will work through as examples are Naïve Bayes and Neural Networks.

     

    Regardless of one's level of programming expertise, one should gain a deeper understanding of these two algorithms after this meet-up.
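
    For readers who want a feel for the first topic, here is a minimal pure-Python sketch of a Naive Bayes text classifier. It is an illustration only, not the code from the book or the meetup: the class name, the word-presence (Bernoulli) formulation, the smoothing constant k, and the toy messages are all assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(text.lower().split())


class NaiveBayes:
    """Word-presence Naive Bayes with additive smoothing k."""

    def __init__(self, k=1.0):
        self.k = k
        self.class_counts = Counter()            # label -> number of documents
        self.word_counts = defaultdict(Counter)  # label -> word -> documents containing it
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            self.class_counts[label] += 1
            for word in tokenize(doc):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, doc):
        words = tokenize(doc)
        total = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            # Work in log space so many small probabilities do not underflow.
            log_prob = math.log(n_docs / total)
            for word in self.vocab:
                p = (self.word_counts[label][word] + self.k) / (n_docs + 2 * self.k)
                log_prob += math.log(p if word in words else 1.0 - p)
            scores[label] = log_prob
        return max(scores, key=scores.get)


# Toy training set, made up for illustration.
docs = ["cheap pills buy now", "buy cheap watches",
        "meeting agenda attached", "lunch meeting tomorrow"]
labels = ["spam", "spam", "ham", "ham"]
model = NaiveBayes()
model.fit(docs, labels)
```

    The "naive" part is the assumption that words occur independently given the class, which is what lets the score be a simple sum of per-word log probabilities.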

     

    About Kevin Mueller:

    Kevin is a graduate student at the University of Washington studying applied mathematics. He is currently interning at Jobaline, where he assists Dr. Ying Li with developing Jobaline’s voice analyzer.

     

    Please bring your laptop if you want to code along. You should also have Python and matplotlib installed.

    Data Club Meetup #5 - Data Science from Scratch and Clustering Application

    Wednesday, June 24th 2015
    (6:30 PM - 8:30 PM)

    Location: Jobaline Headquarters

    620 Kirkland Way

    Suite 208

    Kirkland, WA

    Abstract

    Everyone wants to either be a data scientist or hire a data scientist. Yet we spend very little time thinking about the best way to teach (or learn) data science. Should one start with math and stats? Or instead, should they just dive right into machine learning? Do they need to learn all the tools? I've tried them all and more. During this meetup, I'll give examples of what's worked and what hasn't and share some broader thoughts about tech education.

     

    In particular, we will work through this problem as an example: K-means clustering is a popular machine learning technique for identifying “clusters” in data sets. It’s also pretty simple to understand and implement. In this meetup, we’ll learn how the algorithm works, implement it in Python, and use it to “posterize” pictures.
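
    The idea can be sketched in a few lines of plain Python. This is not the code used in the session: the function names are hypothetical, the image is assumed to be already loaded as a flat list of RGB tuples (image I/O is left out), and the six pixels are made up for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, iterations=20, seed=0):
    """Alternate between assigning points to the nearest center and re-averaging."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers


def posterize(pixels, k):
    """Replace every pixel with the rounded color of its nearest cluster center."""
    centers = k_means(pixels, k)
    result = []
    for p in pixels:
        nearest = min(centers, key=lambda c: squared_distance(p, c))
        result.append(tuple(round(v) for v in nearest))
    return result


# Made-up pixels: three shades of red followed by three shades of blue.
pixels = [(250, 10, 10), (240, 20, 15), (255, 0, 5),
          (10, 10, 250), (20, 15, 240), (0, 5, 255)]
posterized = posterize(pixels, 2)
```

    With k = 2 the six pixels collapse to two colors, one per group. On a real picture, a small k such as 5 to 10 gives the flat, poster-like look.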

     

    About Joel

    Joel is the author of "Data Science from Scratch: First Principles with Python". He works as a software engineer at Google. Before that he was a data scientist at several startups, where he first learned and then taught data science. He spends more time than is healthy thinking about pedagogy.

     

    Please bring your laptop if you want to code along. You should also have Python and matplotlib installed.

     

    Pizza and soft drinks will be sponsored by Jobaline.

    Data Club Meetup #4

    Tuesday, June 9th 2015
    (6:30 PM - 8:30 PM)

    Location: Bellevue City Hall

    450 110th Ave NE

    Bellevue, WA 98004

     

    Topics for This Session

    In the world of Big Data, analytics systems have benefited greatly from the ability to scale horizontally. Systems like Hadoop have been widely used to perform distributed batch processing on massive data sets, but there is a growing need in the industry to do the same scale of processing except in a real-time streaming fashion. Apache Storm is one such framework that enables this kind of processing. In this session, Brandon will introduce the core concepts of streaming distributed processing using Storm, the architecture of a Storm cluster, and show you what it takes to build your first Storm topology.

     

    About Storm

    Apache Storm is an open-source distributed realtime computation system used in the industry by companies like Twitter, Spotify, Expedia and others. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.

     

    About Brandon

    Brandon O’Brien is a Data Engineer at Expedia who is leveraging Storm to build a real-time travel market analytics platform called Expedia Insights. Contact: https://www.linkedin.com/in/brandonjobrien

    Please bring your laptop if you want to implement code.

    Data Club Meetup #3

    Wednesday, April 22nd, 2015
    (6:30 PM - 8:30 PM)

    Location: Bellevue City Hall (Room: 1E-120)

    450 110th Ave NE

    Bellevue, WA 98004

    Objectives

    We will meet to discuss/share data mining and machine learning (ML) techniques/tools.

    We will also analyze public datasets, and build data mining and ML models/applications.

    Topics for This Session

    We will cover the following topics.

    1. Public medical survey data (at patient level after treatment)
    2. A demo of supervised learning applied to the above data, with detailed steps

    Please bring your laptop if you want to implement code.

    Directions and Parking: http://www.ci.bellevue.wa.us/parking-directions.htm

    Bellevue City Hall provides complimentary parking; however, the visitor parking lot fills quickly. There are several “pay for parking” lots in the immediate vicinity should the lot be full.

    David Kasik: Boeing's Use of Visualization and Visual Analytics

    Tuesday, April 14th, 2015

    Speaker Bio

    Dave Kasik is Boeing's Senior Technical Fellow in visualization and interactive techniques and is pioneering the use of visual analytics to help extract more information from complex non-geometric data. Visual analytics supplements more traditional analytic techniques (like statistics and data mining) with a human’s ability to use vision to find anomalies and detect trends. He is exploring emerging visual analytics tools in areas as diverse as safety and marketing.

    Dave earned his Master’s in Computer Science from the University of Colorado in 1972 and a Bachelor’s in Quantitative Studies from the Johns Hopkins University in 1970. He’s an ACM Fellow and is involved in professional activities with both ACM and IEEE.

     

    Abstract:

    The talk will center on the impact of the increasing amount of data on visualization, the difference between data analysis and data analytics, motivation, trends, desired skills, and more, similar to what Dave discussed with KDnuggets:

    http://www.kdnuggets.com/2015/02/interview-david-kasik-boeing-data-analytics.html

    Data Club Meetup #1 & #2

    Thursday, March 19, 2015
    Monday, March 30th, 2015
    We will meet to discuss/share data mining and machine learning (ML) techniques/tools. We will also analyze public datasets, and build data mining and ML models/applications.

    Participants will be able to build/accumulate a portfolio of data science work. A portfolio is best acknowledged if it is displayed to the public. For this reason and for the benefit of participants at Data Club, we consider what we will be working on in the Data Club to be public domain.

    Sum-Product Networks: Deep Models with Tractable Inference by Dr. Pedro Domingos

    Tuesday, March 3, 2015
    Abstract:
    Big data makes it possible in principle to learn very rich probabilistic models, but inference in them is prohibitively expensive. Since inference is typically a subroutine of learning, in practice learning such models is very hard. Sum-product networks (SPNs) are a new model class that squares this circle by providing maximum flexibility while guaranteeing tractability. In contrast to Bayesian networks and Markov random fields, SPNs can remain tractable even in the absence of conditional independence. SPNs are defined recursively: an SPN is either a univariate distribution, a product of SPNs over disjoint variables, or a weighted sum of SPNs over the same variables. It's easy to show that the partition function, all marginals and all conditional MAP states of an SPN can be computed in time linear in its size. SPNs have most tractable distributions as special cases, including hierarchical mixture models, thin junction trees, and nonrecursive probabilistic context-free grammars. I will present generative and discriminative algorithms for learning SPN weights, and an algorithm for learning SPN structure. SPNs have achieved impressive results in a wide variety of domains, including object recognition, image completion, collaborative filtering, and click prediction. Our algorithms can easily learn SPNs with many layers of latent variables, making them arguably the most powerful type of deep learning to date. (Joint work with Rob Gens and Hoifung Poon.)
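
    The recursive definition in the abstract can be made concrete with a small Python sketch. This is an illustration under stated assumptions, not the speaker's implementation: the class names Leaf, Product, and Sum are hypothetical, the variables are discrete, and the network is a tree, so a plain recursion visits each node once (a network with shared sub-networks would need memoization to stay linear in its size).

```python
class Leaf:
    """Univariate distribution over one discrete variable."""

    def __init__(self, var, probs):
        self.var = var
        self.probs = probs  # maps each value of the variable to its probability

    def value(self, evidence):
        if self.var in evidence:
            return self.probs.get(evidence[self.var], 0.0)
        # Variable not observed: sum it out.
        return sum(self.probs.values())


class Product:
    """Product of sub-networks over disjoint sets of variables."""

    def __init__(self, children):
        self.children = children

    def value(self, evidence):
        result = 1.0
        for child in self.children:
            result *= child.value(evidence)
        return result


class Sum:
    """Weighted sum of sub-networks over the same variables."""

    def __init__(self, weighted_children):
        self.weighted_children = weighted_children  # list of (weight, child) pairs

    def value(self, evidence):
        return sum(w * child.value(evidence) for w, child in self.weighted_children)


# A made-up mixture of two product distributions over binary variables X and Y.
spn = Sum([
    (0.6, Product([Leaf("X", {0: 0.9, 1: 0.1}), Leaf("Y", {0: 0.8, 1: 0.2})])),
    (0.4, Product([Leaf("X", {0: 0.2, 1: 0.8}), Leaf("Y", {0: 0.3, 1: 0.7})])),
])

partition = spn.value({})               # all variables summed out
p_x1 = spn.value({"X": 1}) / partition  # marginal of X = 1
p_x1_y1 = spn.value({"X": 1, "Y": 1}) / partition
```

    Each query is a single bottom-up pass: the partition function is the value with no evidence, and a marginal is the value with only the queried variables fixed, divided by the partition function.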

    Speaker Bio: Dr. Domingos received an undergraduate degree (1988) and M.S. in Electrical Engineering and Computer Science (1992) from IST, in Lisbon. He received an M.S. (1994) and Ph.D. (1997) in Information and Computer Science from the University of California at Irvine. He spent two years as an assistant professor at IST, before joining the faculty of the University of Washington in 1999. He’s the author or co-author of over 200 technical publications in machine learning, data mining, and other areas. He’s a winner of the SIGKDD Innovation Award, the highest honor in data science. He’s an AAAI Fellow, and he's received a Sloan Fellowship, an NSF CAREER Award, a Fulbright Scholarship, an IBM Faculty Award, several best paper awards, and other distinctions. He’s a member of the editorial board of the Machine Learning journal, co-founder of the International Machine Learning Society, and past associate editor of JAIR. He was program co-chair of KDD-2003 and SRL-2009, and he’s served on the program committees of AAAI, ICML, IJCAI, KDD, NIPS, SIGMOD, UAI, WWW, and others.

    The Future of Data Mining Talk by Dr. Oren Etzioni
     

    Tuesday, October 28, 2014
    Abstract:

    Deep learning has catapulted to the front page of the New York Times, formed the core of the so-called 'Google brain,' and achieved impressive results in vision, speech recognition, and elsewhere. Yet building intelligent systems requires us to go way beyond the capabilities of deep learning and today's data-mining systems. The future of the Big Data paradigm lies in extending these powerful methods to acquire knowledge from text, databases, diagrams, images, and video. We also need to reason tractably using this acquired knowledge to make sense of the world, and to draw novel conclusions. My talk will describe research at the new Allen Institute for AI aimed at building this next generation of intelligent systems. This will be a more in-depth version of my KDD 2014 keynote talk.
     
    Speaker Bio: Dr. Oren Etzioni is Chief Executive Officer of the Allen Institute for Artificial Intelligence. He has been a professor in the University of Washington's Computer Science department since 1991, garnering several awards including Seattle's Geek of the Year (2013), the Robert Engelmore Memorial Award (2007), the IJCAI Distinguished Paper Award (2005), AAAI Fellow (2003), and a National Young Investigator Award (1993). He was also the founder or co-founder of several companies including Farecast (sold to Microsoft in 2008) and Decide (sold to eBay in 2013), and the author of over 100 technical papers that have garnered over 21,000 citations. The goal of Oren's research is to solve fundamental problems in AI, particularly the automatic learning of knowledge from text. Oren received his Ph.D. from Carnegie Mellon University in 1991, and his B.A. from Harvard in 1986.

    Upcoming Events

    Future Speakers
    • Carlos Guestrin, CEO, GraphLab and Prof. @ UW
    • Ying Li, Chief Scientist, EV Analysis
    • Roger Barga, Director, Amazon
    • Joseph Sirosh, CVP of Machine Learning, Microsoft
    • Johannes Gehrke, Distinguished Engineer, Microsoft and Prof. @ Cornell
    • Raghu Ramakrishnan, Technical Fellow, Microsoft
    • Ronny Kohavi, Distinguished Engineer, Microsoft
    • JC Mao, Distinguished Engineer, Microsoft
  • Core Values

    Growth Plan and Activities, Current Officers, and Advisory Board


    Growth Plan and Activities
     

    • Meet-ups, lectures, and workshops in the Greater Seattle area.
    • Help organize workshops at the KDD conference, such as the ADKDD workshop.
    • Support and promote better data science education in the Greater Seattle area.

    SIGKDD Seattle Chapter Officers

    • Pusheng Zhang (Chair)
    • Dr. Jun Yuan (Vice Chair)
    • Qiao-Lin Mao (Treasurer)
    • Fahad Shah (Event Officer)
    • Kenny Herrington (Webmaster)

    Advisory Board

    • Oren Etzioni, CEO, Allen Inst. for Artificial Intelligence and UW
    • Usama Fayyad, Chief Data Officer at Barclays Bank
    • Johannes Gehrke, Professor at Cornell and Distinguished Engineer at Microsoft
    • Anne Kao, Senior Technical Fellow, Boeing
    • Ying Li, Chief Scientist, EV Analysis Corporation
    • Raghu Ramakrishnan, Technical Fellow, Microsoft
    • Carlos Guestrin, CEO of GraphLab and Professor at UW
  • Contact / Follow Us

    We're sincerely happy to have you reach out to us, or to just have you follow us on social media!

    Email
    Twitter
    Copyright 2014
    Powered by Strikingly - Mobile-friendly website in minutes