Hive UDFs in Ruby (and other languages)

Apache Hive is a very powerful tool for processing data stored in Apache Hadoop. Structured and unstructured data can be accessed, processed, and manipulated using a SQL-like query language. This architecture allows anyone with reasonable SQL knowledge to write complex jobs with little to no knowledge of Hadoop, HDFS, or MapReduce.

-- Create a summary of purchases by product and day
SELECT day, product, count(*) as purchases, sum(revenue) as revenue, avg(revenue) as average_revenue_by_purchase
FROM purchases
WHERE day BETWEEN '2014-01-01' AND '2014-01-31'
GROUP BY day, product
ORDER BY product, day;

With only SQL knowledge, very powerful data extractions can be performed, but SQL can be limiting.

Enter UDFs

Hive supports User-Defined Functions (UDFs) as a way to add more complex capabilities than are available in SQL. Hive itself ships with a good list of built-in UDFs. Here at LivingSocial we have a few of our own in our HiveSwarm project, and there are a few other bundles we like: Klout's Brickhouse and stewi2's UDFs. These greatly expand the capabilities of Hive with features not available in standard SQL.
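
For example, the built-in UDFs already cover a lot of common string and URL work. A quick sketch (the referrer_url column here is hypothetical):

-- Pull the referring host out of a raw URL with the built-in parse_url UDF
SELECT parse_url(referrer_url, 'HOST') as referrer_host, count(*) as purchases
FROM purchases
GROUP BY parse_url(referrer_url, 'HOST');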

Developing new UDFs requires knowledge of the Hive API and Java. This is fine for Hadoop developers, but is a significant barrier for non-Java developers.
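
Before the scripted approach below, the workflow looks something like this: write the function in Java, compile it, package it into a jar, and only then register and call it. A rough sketch with a placeholder jar path and class name:

-- The traditional route: build a jar, then register the compiled class
add jar /path/to/my-udfs.jar;
create temporary function my_function as 'com.example.hive.MyFunction';
select my_function(product_id) from purchases limit 10;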

Enter Scripted UDFs

Java supports running code in other languages through the javax.script API. Our HiveSwarm project has a new scripted UDF that makes use of this so UDFs can be written in languages other than Java. As a bonus, this allows scripts to live directly in the Hive SQL instead of having to be compiled and jarred before they can be run.

The example below shows how it can be used to compute data that would normally be difficult to generate via SQL.

create temporary function scriptedUDF as 'com.livingsocial.hive.udf.ScriptedUDF';
-- Gather complex data combining groups and individual rows without joins
 select person_id, purchase_data['time'], purchase_data['diff'],
   purchase_data['product'], purchase_data['purchase_count'] as pc,
   purchase_data['id_json']
 from (
   select person_id, scriptedUDF('
 require "json"
 def evaluate(data)
   # This gathers all the data about purchases by person in one place so 
   # complex information can be gathered while avoiding complex joins
   # Note:  In order for this to work all the data passed into 
   # scriptedUDF for a row needs to fit into memory
   tmp = data.to_a  # convert the incoming Java list over to a Ruby array
   tmp.sort_by! { |a| a["time"].to_s.to_i } # numeric sort for the time differences
   last = 0
   tmp.map{ |row|
     # Compute the time difference between purchases and add the total 
     # purchase count per person
     t = row["time"]
     # The parts that would be much more difficult to generate with SQL
     row["diff"] = t - last
     row["purchase_count"] = tmp.length
     row["first_purchase"] = tmp[0]["time"]
     row["last_purchase"] = tmp[-1]["time"]
     # This shows that built-in libraries are available
     row["id_json"] = JSON.generate({"id" => row["id"]})
     last = t
     row
   }
 end', 'ruby', 'array<map<string,string>>',
        -- gather all the data about purchases by people so it can all be 
        -- passed into the evaluate function
        bh_collect(map(   -- Note, bh_collect is from Klout's Brickhouse
                          -- and allows collecting any type, 
                          -- see https://github.com/klout/brickhouse/
           'id', person_id,    -- included so the id_json example above has an id
           'time', purchase_time,
           'product', product_id)) ) as all_data
     from purchases
    group by person_id
 ) foo
 -- explode the data back out so it is available in flattened form
 lateral view explode(all_data) bar as purchase_data
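
The same function handles much simpler cases too. A minimal sketch, assuming the same argument order as above (the script, the language name, the Hive return type, then the columns passed to evaluate):

create temporary function scriptedUDF as 'com.livingsocial.hive.udf.ScriptedUDF';
-- Run a tiny bit of per-row Ruby without compiling anything
select scriptedUDF('
def evaluate(product)
  product.to_s.reverse
end', 'ruby', 'string', product_id) as reversed_product
from purchases;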

Celebrating Heroes 2012 Triathlon

What a great race! Thanks to the Columbia Triathlon Association for keeping everything running smoothly.

Everything race-wise went well. The only minor things that went wrong were forgetting to start my watch on time, forgetting to take off my gloves in T2, and some minor cramping on the run. All of those are things I can fix. I think I'm ready for the big race in a few weeks.

Zombies, Run!

I'm not a big fan of running, but for the last few weeks I've been running with Zombies, Run!, and it has made running much more enjoyable.

The story in the game is a great distraction on a short or long run. I run with the zombie chases enabled which adds an extra level of excitement that keeps you on your toes listening for the "Zombies Detected" warning.

For now it is iPhone only, but it is coming to Android soon.

The New iPad

The retina display got me. I could ignore the faster processor and the nifty cover of the iPad 2 and stick with my first-generation one. I did the same with the iPhone 3G and 3GS. I wasn't sure how much of a difference the retina display would make, but the iPhone 4 got me hooked. I'm looking forward to not being able to see the pixels on the new iPad.

Social networking for work

I've been trying out groups on LinkedIn, but for Hadoop and Java the posts are dominated by recruiters and minor flame wars. The Hadoop and related technology mailing lists are far more informative and useful. It looks like Twitter is another good place to see what's going on in Hadoop land. I need to spend more time there.

Whenever I'm looking for a new job I'll head back to LinkedIn. Until then I need something better.

A new start for the site

I'm moving everything over from my old host to Squarespace. I'm trying to get in the habit of posting more and this move will make it a lot easier.

I had a lot of old pictures up on my old site and I haven't figured out what to do with them. Most are not that good, but there are a few I really want to keep. At some point I'll either move them in here or switch to Flickr or Picasa to host them.

For now just posting something is a good start.