DRY HiveQL

DRY (don’t repeat yourself) is one of the fundamental principles of software engineering. The main idea is to avoid duplicating business/processing logic throughout the code. However, I rarely see it applied when writing SQL queries, which makes them difficult to understand and maintain. Below are a few tips on making HiveQL DRY. Quick Summary Use […]


Qlik Plots Course to Big Data, Cloud and ‘AI’ Innovation

Qlik highlights upgrades and the roadmap to high-scale, hybrid cloud and ‘augmented intelligence.’ Here’s my take on the long-range plans. Big data scalability, hybrid cloud flexibility and smart “augmented” intelligence. These are the three plans that business intelligence and analytics vendor Qlik officially put on its roadmap at the May 15-18 Qonnections conference in Orlando, Florida. Qlik also […]


Teradata Transition to Cloud and Consulting Continues

Teradata simplifies pricing, executes on business consulting and hybrid cloud strategy. A look at next steps in the company’s ongoing transition. “Business outcome led, technology enabled.” This was the theme at the May 8-10 Teradata Third-Party Influencers Summit in San Diego, and it reflected a two-to-one ratio of consulting-oriented presentations to technology updates. Teradata has […]


Efficient Textual Similarity Across Millions of Web Queries

Computing textual similarity (such as the Jaccard similarity coefficient) between millions of search queries can be an arduous task. The main challenge is the number of pairs that one needs to consider; a relatively small dataset containing ten thousand queries leads to more than 49 million possible query pairs. Based on the paper by Vernica et al., I show […]
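
To make the pair-count arithmetic concrete, here is a minimal Python sketch. It is a brute-force illustration only, not the approach from the Vernica et al. paper; the function names and the word-level tokenization are assumptions made for this example.

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return comb(n, 2)


def jaccard(query_a: str, query_b: str) -> float:
    """Jaccard similarity of two queries, compared as sets of lowercase words.

    Word-level tokenization is an assumption of this sketch; other
    tokenizations (character n-grams, for example) are equally valid.
    """
    tokens_a = set(query_a.lower().split())
    tokens_b = set(query_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # convention chosen here: two empty queries are identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


if __name__ == "__main__":
    # Ten thousand queries already produce 49,995,000 candidate pairs.
    print(pair_count(10_000))
    # Two shared words out of six distinct words, so one third.
    print(jaccard("cheap flights to paris", "cheap hotels in paris"))
```

Comparing every pair this way grows quadratically with the number of queries, which is why the brute-force version stops being practical at the scale of millions of queries.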


SAS Takes Next Steps to Cloud Analytics

SAS Viya is now available as the cloud-friendly platform for SAS Visual apps and, soon, SAS 9. Next up should be more cloud-based services options. SAS, like many well-established tech vendors, has to keep one eye on the future and one eye on the past. At the April 2-5 SAS Global Forum in Orlando, FL, the […]


Cloudera Focuses Message, Takes Fifth On Pending Moves

Cloudera executives can’t talk about IPO or cloud-services rumors. Here’s what’s on the record from the Cloudera Analyst Conference. There were a few elephants in the room at the March 21-22 Cloudera Analyst Conference in San Francisco. But between a blanket “no comment” about IPO rumors and non-disclosure demands around cloud plans, even whether […]


Spark Gets Faster for Streaming Analytics

Spark Summit East highlights progress on machine learning, deep learning and continuous applications combining batch and streaming workloads. Despite challenges including a new location and a nasty Nor’easter that put a crimp in travel, Spark Summit East managed to draw more than 1,500 attendees to its February 7-9 run at the John B. Hynes Convention […]
