Archive for Hortonworks tag

Orc O'Malley of the Yellow Elephant clan says LLAP

Owen O’Malley on the Origins of Hadoop, Spark and a Vulcan ORC

Owen O’Malley is one of the folks I chatted with at the last Hadoop Summit in San Jose. The first time I met him, I discovered that he was the big Tolkien geek behind the naming of ORC files, as well as the one making sure that Not All Hadoop Users Drop ACID. In this conversation, I learned that Hadoop and Spark are both partially his fault, heard about the amazing performance strides Hive has made with ORC, Tez and LLAP, and found out that he’s a Trek geek, too.

Happy 10 Years Hadoop

Ten Years of Hadoop, Apache Nifi and Being Alone in a Crowd

Hadoop Summit in San Jose this year celebrated Hadoop’s 10th birthday. All of the folks on stage are people who contributed to Hadoop during those 10 years. One of them is Yolanda Davis.

Yolanda and I worked together on a Hortonworks project last year. She was in charge of the user interface design and development team. I caught up with her early in the morning of the last day of Hadoop Summit, and quizzed her on this new project she’s working on that you may have heard of, Apache NiFi. As promised, here is my interview with her on the subject of NiFi and the new HDF (Hortonworks DataFlow) streaming data processing platform, which includes NiFi, Apache Kafka and Apache Storm.

Metron Eye On Cyber Security

Cyber Security with Apache Metron and Storm

A few weeks ago at Hadoop Summit, I caught up with some friends from the project I worked on last year with Hortonworks, including Ryan Merriman, who is now an Apache Metron architect. Since Apache Metron was a project I knew virtually nothing about beforehand, I quizzed Ryan about it. The conversation evolved into a discussion of the merits of Storm versus Flink and Heron, something I’ve been meaning to delve into here for months.

Holden Karau's audience at High Performance Spark preso at Data Day Texas

Interviews with Brilliant People on Hadoop and the Future of Big Data Tech

I have been doing some very cool interviews with brilliant people, usually at events like Strata + Hadoop World and Hadoop Summit. The intention is to use their brilliant thoughts so that I don’t have to take the extra time to come up with my own. Not to mention I get the bonus of learning new things, and getting the unique perspectives of folks who really know their stuff. Nothing like learning tech from the folks who literally wrote the book on it.

David and Goliath

Pitching Stones with David

It’s a brand new year, and I’ve got a brand new job. As of today, you’re looking at the new Product Marketing Manager for Syncsort.

It’s true. After spending half a year doing a little freelance white paper work for the Bloor Group, and documenting for Hortonworks the most complex ETL process I’ve seen in nearly two decades in the business, I’ve found a new home to settle into. I got courted by some Goliaths in the data management software and hardware space, but in the end, I chose a tech-savvy David, Syncsort.

Big Data Analytics Miss

Four Reasons Why Big Data Analytics Projects Fail, Or Do They?

A few months back, I was presenting with a friend at a Chief Data Officer summit in Dallas, and my co-presenter put up a slide that said, “60% of all big data analytics projects fail.” Someone in the audience asked, “Why do they fail?” My friend said, “I think Paige could answer that better than I could.”

Put on the spot, I immediately thought of three reasons that have been confirmed by multiple sources, and I used those three to answer the question. But later, when I had time to think, I realized there was one other reason that shows up repeatedly, but often gets downplayed or written off as not the REAL problem, when in my opinion, it very much is.

Hadoop Tez, Stinger's Baby

The Tragedy of Tez

Tez is one of the marvelous ironies of the fast-moving big data and open source software space, a piece of brilliant technology that was obsolete almost as soon as it was released. In the second of my series of short posts on Hadoop data processing frameworks, I’ll look at the bouncing baby born of the Stinger Initiative, and point out where it’s ugly.


Not All Hadoop Users Drop ACID

In the age of businesses with data that lives on dozens or even hundreds of servers, expecting transactional integrity, data consistency and data currency is an old-fashioned notion. On Hadoop, you just have to settle for the new NoSQL standard of BASE and eventual consistency. That’s what they say. But, as usual, “they” are wrong. Not all Hadoop users have to drop ACID…
