Bulk processing of pharmaceutical data

Health & Pharmaceutical Sector

Customer

An internationally significant German pharmaceutical company.

Description

Severe performance problems in Spark-based data ingestion, at a volume of several TB of data per day.

Results

Complete redesign of the ingest pipelines, reducing the computation time from several days to just a few hours.

Technology

Spark with Scala for data processing. Flume and Sqoop for ingestion. Data stored in HDFS and queried through the Hive SQL engine. Big Data cluster built on MapR technology.
