Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 1)

Apache Spark is a distributed data processing framework that is suitable for any Big Data context thanks to its features. Despite being a relatively recent product (the first open-source BSD license was
Read More

The world is real-time, not batch – White Paper

White paper: The world is real time, not batch. An overview of the data streaming scenario, its stages of evolution and benefits. Are you getting your data fast enough? Why is streaming data
Read More

AWS Partnership

We are proud to announce that we have been recognized as an “AWS Select Consulting Partner” within the Amazon Partner Network (APN). Achieving Select status demonstrates our commitment to delivering top technology
Read More

How to create an Apache Spark 3.0 development cluster on a single machine using Docker

Apache Spark is the most widely used in-memory parallel distributed processing framework in the field of Big Data advanced analytics. The main reasons for its success are its ease of use
Read More

A unified data management platform

From days to minutes: one of the world’s top-five insurance companies has improved its end-to-end delivery of data thanks to cloud services. Overview. Scenario: Many sub-companies based on different
Read More

Darwin, Avro schema evolution made easy!

Hi everybody! I’m a Big Data Engineer @ Agile Lab, a remote-first Big Data engineering and R&D firm located in Italy. Our main focus is to build Big Data and
Read More

Scala ‘fun’ error handling

Hi everybody! I’m Antonio Murgia, a Big Data Architect @ Agile Lab, a remote-first Big Data engineering and R&D firm located in Italy. Our main focus is to build Big
Read More

Master Data Management: challenges and basics

A Master Data Management system is the single point of truth for all data company-wide. The problem we want to address is unifying and harmonizing ambiguous and discordant
Read More

A Data Lake new era

Data Lake and Data Warehouse in real time and at low cost. “A data lake is a centralized repository that allows you to store all your structured and unstructured data at
Read More

Managed Services for Mission Critical Big Data environment: Banca Popolare di Sondrio with Agile Lab guarantees the reliability of its systems – WEBINAR

On 25 June 2020 at 11 a.m., we will talk about services for the reliable management of Big Data environments, which by their nature require excellent service performance, guaranteed by
Read More