Building a natural language processing library for Apache Spark

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: David Talby on a new NLP library for Spark, and why model development starts after a model gets deployed to production. When I first discovered and started using Apache Spark, a majority of the use cases I used it for […]



How companies can navigate the age of machine learning

[A version of this post appears on the O’Reilly Radar.] To become a “machine learning company,” you need tools and processes to overcome challenges in data, engineering, and models. Over the last few years, the data community has focused on gathering and collecting data, building infrastructure for that purpose, and using data to improve decision-making. […]



How Ray makes continuous learning accessible and easy to scale

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Robert Nishihara and Philipp Moritz on a new framework for reinforcement learning and AI applications. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS. In this episode […]



A scalable time-series database that supports SQL

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Michael Freedman on TimescaleDB and scaling SQL for time-series. In this episode […]



Architecting and building end-to-end streaming applications

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Karthik Ramasamy on Heron, DistributedLog, and designing real-time applications. In this episode […]



Time-turner: Strata San Jose 2017, day 2

There are so many good talks happening at the same time that it’s impossible not to miss some good sessions. But imagine I had a time-turner necklace and could actually “attend” 3 (maybe 5) sessions happening at the same time. Taking into account my current personal interests and tastes, here’s how my day would […]



Time-turner: Strata San Jose 2017, day 1

There are so many good talks happening at the same time that it’s impossible not to miss some good sessions. But imagine I had a time-turner necklace and could actually “attend” 3 (maybe 5) sessions happening at the same time. Taking into account my current personal interests and tastes, here’s how my day would […]



The key to building deep learning solutions for large enterprises

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Adam Gibson on the importance of ROI, integration, and the JVM. As […]



2017 will be the year the data science and big data community engage with AI technologies

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: A look at some trends we’re watching in 2017. This episode consists […]
