Update on my job search
I found something and am happily employed again. If you’re looking to hire someone with experience and skills like mine, please reach out. I’m in touch with a lot of former coworkers who are still looking.
Welcome to my site. Check the links at the bottom for where else you can find me.
Apache Hive is a very powerful tool for processing data stored in Apache Hadoop. Structured and unstructured data can be accessed, processed, and manipulated using a SQL-like query language. This architecture allows anyone with reasonable SQL knowledge to write complex jobs with little to no knowledge of Hadoop or HDFS.
I finally got around to doing a big batch upload of scuba pics to Flickr. The big writeup of Hurricane Wilma is available again too.
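This came up on the Hive mailing list and I’m putting it here as a reminder to try it out. Here’s how to do several conditional counts in one query instead of running a separate query per condition. It works because count() skips NULL values, so a CASE expression that returns NULL for non-matching rows only counts the rows that match.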
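SELECT
    type
  , count(*)                                                             -- all rows for this type
  , count(DISTINCT u)                                                    -- distinct u values
  , count(CASE WHEN plat=1 THEN u ELSE NULL END)                         -- rows where plat=1 and u is not NULL
  , count(DISTINCT CASE WHEN plat=1 THEN u ELSE NULL END)                -- distinct u values where plat=1
  , count(CASE WHEN (type=2 OR type=6) THEN u ELSE NULL END)             -- rows where type is 2 or 6 and u is not NULL
  , count(DISTINCT CASE WHEN (type=2 OR type=6) THEN u ELSE NULL END)    -- distinct u values where type is 2 or 6
FROM
    t
WHERE
    dt in ("2012-1-12-02", "2012-1-12-03")
GROUP BY
    type
ORDER BY
    type
;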
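If you want to try the pattern without a Hive cluster handy, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only standing in for Hive here, and the sample rows, the `run_counts` helper, and the column aliases are all made up for illustration. The table and column names (`t`, `type`, `plat`, `u`) are borrowed from the query above.

```python
import sqlite3

# Hypothetical sample rows: (type, plat, u). Not real data.
ROWS = [
    (1, 1, "alice"),
    (1, 1, "alice"),
    (1, 2, "bob"),
    (2, 1, "carol"),
    (2, 2, "carol"),
    (2, 2, None),  # a NULL u is skipped by count(u) but not by count(*)
]

# Same conditional-count pattern as the Hive query, trimmed to the plat=1 case.
QUERY = """
SELECT
    type
  , count(*)                                               AS row_count
  , count(DISTINCT u)                                      AS users
  , count(CASE WHEN plat=1 THEN u ELSE NULL END)           AS plat1_rows
  , count(DISTINCT CASE WHEN plat=1 THEN u ELSE NULL END)  AS plat1_users
FROM t
GROUP BY type
ORDER BY type
"""


def run_counts(rows):
    """Load rows into an in-memory SQLite table and return the grouped counts."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (type INTEGER, plat INTEGER, u TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_counts(ROWS):
        print(row)
```

With the sample rows above, type 1 should come back as 3 rows, 2 distinct users, 2 plat=1 rows, and 1 distinct plat=1 user. The two "alice" rows are what make the plain and DISTINCT conditional counts differ.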