The first Covid case in Italy was found on February 21st 2020. A couple of weeks later we were entering the lockdown with this number of new daily cases.
The number of new Covid-19 cases was growing really fast every day. We had no clue about what was going to …
While studying for the Azure Data Scientist Associate certification, I took notes from the Building AI Solution with Azure ML course. In this single page, you'll find the entire content of the course (as of 18th August, 2020). This page is a small aid for those preparing to earn the certification …
This is an error I encountered while running a Spark Streaming job on Databricks 6.1. Consider the case where I have to update a running streaming query. Databricks recommends always starting (and restarting too?) a streaming query on a new dedicated cluster. However, in some scenarios you …
My side project atacmonitor has a new look. Data is now being collected for all bus and tram lines in Rome. Data is pulled by Python functions running on AWS Lambda and then stored in MongoDB, hosted on MongoDB Atlas. Atlas also provides the charts in the page …
Rather than construction, software is more like gardening— it is more organic than concrete. You plant many things in a garden according to an initial plan and conditions. Some thrive, others are destined to end up as compost. [...] You constantly monitor the health of the garden, and make adjustments (to …
The Signal and The Noise by Nate Silver is a must-read book for those interested in predictions. It is not a technical book. You will not learn any algorithm. However, it presents a series of real-world scenarios where predictions worked and where they did not. The book is …
What does it mean to work as a data scientist in manufacturing? What is the value behind data? Data science has gained popularity in domains like the internet, but the industrial production domain has specific requirements.
I received valuable feedback from Jim Nasby regarding the post about weighted random sampling with PostgreSQL. I report Jim's email here.
Sadly, Common Table Expressions (CTEs) are insanely expensive, because
each one must be fully materialized. So in your example, you're
essentially creating 5 temp tables (one for …