I learned a lot at Data Day Texas. I live tweeted a lot of interesting bits on @RobertsPaige as I went along, but some of the most enjoyable and enlightening stuff happened at the happy hour afterward. It’s not uncommon at these events for the “downtime” to be as educational as the presentations. And fun, too.
As soon as I walked in, Charity Majors, the keynote speaker and co-founder of Hound, gave me a big hug. She said she loved my live tweets. Her no-nonsense and not entirely safe for work presentation on making smart technology decisions was the highlight of the conference. Her slides are on SlideShare, and a lot of her major points are in Dan McKinley’s blog post, “Choose Boring Technology.”
I collected a few hugs from some old friends as well. Ryan Templeton was there, whom I’ve known for about 10 years and across four companies. Last time I saw him, he was presenting at the Austin Hortonworks roadshow. He just made a jump from Hortonworks to CapGemini. My old boss, David Inbar, is now CEO of his own analytics start-up, Dejalytics. I talked KNIME with Davin Potts and Michael Berthold. KNIME just got Spark support, which is exciting to hear. I haven’t played around with it yet, but I love the idea of being able to use Spark’s data mining capabilities from KNIME’s interface.
Speaking of Spark, I had a wonderful chat with Holden Karau, O’Reilly author of great Spark books. We talked about life and careers, and that feeling that no matter how much you accomplish, there’s more that needs doing. A couple of guys from an oil and gas company presented her with a problem while everyone had their thought processes well-lubricated with alcohol. Their seismic exploration data needed some high-speed processing to provide useful visualizations, but they couldn’t get past some latency issues. Their processor was setting up a Spark context, with all the overhead that involves, processing one small piece of data, then tearing the context down again, over and over. Holden’s very succinct advice, “Stop doing that,” made them pause and re-think. “That might work,” was the last thing I heard from them. I suspect Holden may have just saved a huge company piles of money and time with three words. Folks with that level of practical brilliance are rare and precious gems.
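For anyone who wants to see why “Stop doing that” matters, here is a minimal sketch in plain Python. It is my own illustration, not their code and not the Spark API: `ExpensiveContext` is a hypothetical stand-in for anything that is costly to start and stop, and all it does is count setups and teardowns.

```python
# Hypothetical stand-in for an expensive context such as a Spark context.
# Nothing here is the real Spark API; it only counts setups and teardowns.
class ExpensiveContext:
    setups = 0      # how many times a context was started
    teardowns = 0   # how many times a context was stopped

    def __enter__(self):
        ExpensiveContext.setups += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        ExpensiveContext.teardowns += 1
        return False  # do not swallow exceptions

    def process(self, record):
        # Placeholder work; a real job would transform the seismic data.
        return record * 2


def per_record(records):
    """The pattern to stop: start and stop a context for every record."""
    results = []
    for record in records:
        with ExpensiveContext() as ctx:
            results.append(ctx.process(record))
    return results


def shared_context(records):
    """Start one context, push every record through it, then stop."""
    with ExpensiveContext() as ctx:
        return [ctx.process(record) for record in records]
```

Both functions return the same results, but `per_record` pays the setup and teardown cost once per record, while `shared_context` pays it once in total. When the setup is the slow part, that difference is the whole latency problem.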
O’Reilly authors were well represented at Data Day Texas. Jay Kreps, the author of “I Heart Logs,” chatted with me and Jason Patton, a consultant from Intersys. I don’t think we solved any of the world’s problems, but they were interesting folks to talk to. I learned a lot about Kafka and the reasoning behind its design in Jay’s presentation, and picked up a copy of his book. He did an excellent job of making a complex topic easy to grasp. As a presenter, I know how very tricky that is to do.
Streaming data processing is coming into its heyday, now that the Internet of Things is really hitting its stride. Spark and Kafka were the darlings of the conference. I heard a lot of unhappy grumbling about Storm, from folks actually using it, and not a peep about Flink. I also learned a fair amount about Akka that I didn’t know before.
With my job fluctuations last year, I totally missed Hadoop Summit, Strata and Big Data Tech Con. I was feeling a little out of touch. This conference did a good job of helping me fill in some gaps on what’s new, and also deepened some of my understanding of existing technologies and techniques. I think I’ve gathered enough information to do a series of posts on streaming data processors, and one on the differences between streaming, micro-batch, real-time, event-based and batch processing.
And the hugs were nice, too.