Data-Centric Security Rules

On July 2, 2019 in Data Management

Originally posted on the Vertica Blog.

I just attended the Cybersecurity Summit in Dallas for the first time. I spoke a little on a small panel with a couple of customers about Voltage SecureData and Format Preserving Encryption (FPE). Then I got to go to other presentations and learned a lot. The presentation that stuck with me most was by Sid Dutta of Worldpay, who put the words “You will be breached!!”, with double exclamation points, on his slides and said, “There are two kinds of enterprises, those who have been breached and know about it, and those who don’t know about it.” That really brings home the risk associated with regulations like GDPR and CCPA.

Slide text: “Reality: You will be breached!! Possibly again ...”

Depending solely on perimeter security measures like firewalls is a huge mistake. Since breaches will happen anyway, you need to guard the data, not just the perimeter. Make your security data-centric by making sure the data is encrypted and useless to anyone who gets non-sanctioned access to it.

My experience with encryption in the past as a data specialist was that it made my job a hundred times harder. You have sensible data, you encrypt it, and now it’s much longer and full of nonsense characters. It doesn’t fit in the database fields it used to fit in, and it’s a useless mess with no pattern to build rules on. Plus, there’s a performance hit to encrypt it, a performance hit because it’s now larger, and a performance hit to decrypt it again in order to do anything with it. There’s no referential integrity, either, so you can’t even tell if two pieces of data pertain to the same entity.

Again and again, I ran into situations where, to debug problems, I needed to see real data. I had to sign all kinds of NDA forms and jump through a ton of hoops to show that, yes, I really needed to see the data or I couldn’t do my job. But truthfully, I didn’t need to see the data, and I didn’t want to. I just needed that data to be in its original, unaltered format.

Format Preserving Encryption (FPE) lets you store, query, prep and analyze data without ever decrypting it. What I would have given for that a few years back. Your database schema stays the same. Your query performance stays pretty close to the same. When you build data logic, you know you’re working with, for instance, a social security number or a credit card number, because of the basic format. But it doesn’t expose those numbers to the world. There’s referential integrity, so you can do analysis on the data without decrypting it.
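To make those two properties concrete, here is a toy sketch in Python. It is not Voltage SecureData and not a standardized FPE mode; it is a small Feistel-style cipher over decimal digits, written only to show that the output keeps the original length and punctuation, and that the same input always maps to the same output (which is what gives you referential integrity). The function names `protect` and `unprotect` and the demo key are my own assumptions, and nothing here should be used to protect real data.

```python
import hashlib
import hmac

DIGITS = "0123456789"
ROUNDS = 10  # must stay even so both halves end with their original lengths


def _round_value(key: bytes, round_no: int, half: str) -> int:
    """Keyed pseudo-random integer derived from the round number and one half."""
    msg = bytes([round_no]) + half.encode("ascii")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")


def _feistel(key: bytes, digits: str, decrypt: bool) -> str:
    """Run the toy Feistel network forward (encrypt) or backward (decrypt)."""
    if len(digits) < 2:
        raise ValueError("need at least two digits")
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    order = reversed(range(ROUNDS)) if decrypt else range(ROUNDS)
    for i in order:
        m = u if i % 2 == 0 else v  # digit count of the half being changed
        if decrypt:
            # Undo one round: recover the half that was added to.
            c, b = b, a
            a = str((int(c) - _round_value(key, i, b)) % 10**m).zfill(m)
        else:
            c = (int(a) + _round_value(key, i, b)) % 10**m
            a, b = b, str(c).zfill(m)
    return a + b


def _apply(key: bytes, text: str, decrypt: bool) -> str:
    """Transform only the digits; dashes and spaces stay where they were."""
    digits = "".join(ch for ch in text if ch in DIGITS)
    out = iter(_feistel(key, digits, decrypt))
    return "".join(next(out) if ch in DIGITS else ch for ch in text)


def protect(key: bytes, text: str) -> str:
    return _apply(key, text, decrypt=False)


def unprotect(key: bytes, text: str) -> str:
    return _apply(key, text, decrypt=True)


demo_key = b"demo key, not a real secret"
ssn = "123-45-6789"  # made-up value
token = protect(demo_key, ssn)
print(token)                       # same shape as the input: ddd-dd-dddd
print(unprotect(demo_key, token))  # round-trips to the original value
```

Because the protected value still looks like a social security number, it fits the same column and passes the same format checks, and because it is deterministic under a given key, joins and group-bys on the protected column still line up.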

That last bit really caught my attention.

For instance, a loyalty card analyst might want to analyze purchase history to help keep you as a loyal customer. They really don’t need to see your credit card number. They just need to know that all the purchase data they’re analyzing pertains to the same person. That’s referential integrity, and that’s all they need. And your credit card info is tons safer if it can go all the way through analysis and action without ever being decrypted.

Risk exposure is reduced, both for the customer and for the analytics company. If someone does manage to break in past the firewalls and steal the data, it’s gibberish. Under strict laws like GDPR, the company may not even have to notify the affected individuals, because that data can’t be used to rob you or steal your identity. It’s encrypted. It’s useless to anyone who takes it.

That is, it’s useless to anyone who doesn’t have the right key to decrypt it.

Who’s got you?

At some point in the conference, one of the attendees asked me, “But where does Voltage store the encryption keys? How do you manage them?” Sid Dutta compared keys and secrets management, something he believed most enterprises ignore or do poorly, to Superman saving Lois Lane: “You’ve got me, but who’s got you?” The keys keep your data safe, but who keeps the keys safe? Encryption is seldom cracked but often bypassed.

Image text: “Don't worry, miss. I've got you.” “You've got me. Who's got you?”

Since I didn’t know the answer to the question about Voltage key management at the time, I went and asked the Voltage folks. “Where does Voltage SecureData store the keys?” Their answer stunned me. “We don’t.”

Apparently, they have a system that generates keys when they’re needed instead of storing them, and it can be used in any region. It means you don’t store keys, you don’t manage keys, you don’t have to transport keys when you move data, and most importantly, no one can steal the keys. That’s a heck of an advantage.
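I don’t know the details of how Voltage does this, so the following is only a sketch of one general way to get keys without a key database: derive each key on demand from a single protected master secret plus a public label. The function `derive_key`, the labels, and the master secret below are all hypothetical, and this is not a description of Voltage’s actual mechanism.

```python
import hashlib
import hmac


def derive_key(master_secret: bytes, label: str) -> bytes:
    """Recompute a 32-byte key from the master secret and a public label.

    Nothing is stored per key: anyone holding the master secret can
    regenerate the same key for the same label, in any region.
    """
    return hmac.new(master_secret, label.encode("utf-8"), hashlib.sha256).digest()


# Hypothetical values for illustration only.
master = b"example master secret, kept somewhere very safe"
key_dallas = derive_key(master, "payments/card-number")
key_elsewhere = derive_key(master, "payments/card-number")  # recomputed, not copied
key_other = derive_key(master, "hr/salary")
```

In a scheme like this there is no table of keys to steal or ship around, but the master secret itself still has to be guarded carefully, which is exactly the “who’s got you?” question.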

One advantage I knew about before I got there showed up in the panel discussion. Voltage and Vertica have an extra-tight level of integration. One of the Voltage customers on the panel, who uses another database, mentioned that all they have to do is alter SQL queries, so they include a decryption call for the folks who have access to sensitive data. That’s pretty cool by itself, but the Vertica/Voltage integration is even better. A lot of people interact with data via BI tools or dashboards, not just hand-written SQL. Those dashboards generate some seriously complex SQL queries every time you click something to drill down. That’s not something you could alter for every individual. But Vertica looks at the authorization of each person combined with the SQL. If that person is authorized to see that column and that row in the clear, it automatically triggers the decryption. No altered SQL required.

For example, if I have access to see something like salaries, because I work in payroll, and I click to drill down on executives in the northeast region, I’ll see the salaries. If I don’t have access, and I click the same thing, I will only see encrypted data.
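The idea can be sketched in a few lines of Python. This is not how Vertica implements it, and it uses no Vertica or Voltage API. The policy table `CLEAR_ACCESS`, the function `read_column`, the role names, and the stand-in `toy_decrypt` lookup are all made up for illustration. The point is only that the same request returns clear or protected values depending on who is asking.

```python
from typing import Callable, Dict, List, Set

# Hypothetical policy: which roles may see a protected column in the clear.
CLEAR_ACCESS: Dict[str, Set[str]] = {"salary": {"payroll"}}


def read_column(
    rows: List[Dict[str, str]],
    column: str,
    role: str,
    decrypt: Callable[[str], str],
) -> List[str]:
    """Return a protected column, decrypted only for authorized roles."""
    allowed = role in CLEAR_ACCESS.get(column, set())
    return [decrypt(row[column]) if allowed else row[column] for row in rows]


# Stand-in for a real decryption call: a lookup table of made-up values.
_TOY_PLAINTEXT = {"918273": "150000", "564738": "175000"}


def toy_decrypt(token: str) -> str:
    return _TOY_PLAINTEXT[token]


rows = [
    {"name": "exec_a", "salary": "918273"},
    {"name": "exec_b", "salary": "564738"},
]

payroll_view = read_column(rows, "salary", "payroll", toy_decrypt)
marketing_view = read_column(rows, "salary", "marketing", toy_decrypt)
print(payroll_view)    # clear values for the payroll role
print(marketing_view)  # protected values for everyone else
```

The caller never changes the query. Only the role attached to the request decides whether decryption happens, which is why a dashboard’s generated SQL doesn’t need to be touched.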

Nice.

Learn more about the power of Vertica + Voltage.

Big Data Page by Paige

Thoughts on Analytics, Software and Data Management
