Strata x DSSG – Real World Data Science
This December, we are proud to close the year with our biggest meetup yet – at Strata x Hadoop SG! It is a free event, open to all (no Strata x Hadoop tickets needed). For folks who wish to attend the full Strata x Hadoop conference, feel free to use our community’s discount code UGDSSG for a 20% discount on tickets. You can also join our Facebook group for data-related geek discussions.
Agenda
• 6.45pm-7pm: Networking session.
• 7pm-7.20pm: Deploy Spark ML TensorFlow AI Models from Notebook to Hybrid-Cloud by Chris Fregly, Research Scientist at PipelineIO [hardcore]
• 7.20pm-7.40pm: Natural language processing using AWS Lambda by Arun Veettil, Principal Data Scientist, Starbucks [core]
• 7.40pm-8pm: How to build a text analytics engine at scale by Yongzheng (Tiger) Zhang, Senior Staff, Analytics at LinkedIn [core]
• 8pm-8.20pm: Interactive visualizations by Michael Freeman, University of Washington [casual]
• 8.20pm-8.40pm: Talk by Yantisa Akhadi, Project Manager, Humanitarian OpenStreetMap Team [casual]
• 8.40pm-9pm: Networking session.
Abstracts
Deploy Spark ML TensorFlow AI Models from Notebook to Hybrid-Cloud by Chris Fregly, Research Scientist at PipelineIO
In this 100% open source, demo-based talk, Chris Fregly from PipelineIO addresses an area of machine learning and artificial intelligence that is often overlooked: the real-time, end-user-facing “serving” layer in a hybrid-cloud and on-premise deployment environment using Jupyter, NetflixOSS, Docker, and Kubernetes.
Serving models to end-users in real-time in a highly-scalable, fault-tolerant manner requires not only an understanding of machine learning fundamentals, but also an understanding of distributed systems and scalable microservices.
How to build a text analytics engine at scale by Yongzheng (Tiger) Zhang, Senior Staff, Analytics at LinkedIn
In the era of Web 2.0 and the Internet of Things, corporations and organizations around the globe are collecting ever larger amounts of data. A big portion of this data is unstructured text from multiple channels such as product reviews, market research surveys, customer service tickets, mobile app reviews, on-site feedback, and social media such as Facebook and Twitter. We will explain how to leverage big data infrastructure, machine learning, and natural language processing to build a high-performing, scalable, end-to-end text analytics platform. Such a platform enables us to listen to our customers and community, mine business insights quickly and effectively, inform product decisions, and ultimately improve our user experience. We will also share the challenges, critical decisions, and tradeoffs involved in building such a platform, as well as further applications and business opportunities it enables.