Spark Services

Apache Spark consulting, implementation, optimisation and support

Our Spark Specialisation


Apache Spark is a fast and general engine for Big Data processing, with built-in modules for streaming, SQL, Machine Learning and graph processing.

At WHISHWORKS we have worked extensively with Apache Spark in many Big Data projects:

  • Implementation of robust production data pipelines at scale.
  • Implementation of multiple "Spark and NiFi" based IoT pipelines.
  • Numerous projects requiring Spark applications to perform efficiently on YARN clusters.
  • Introduction of the SMACK (Spark, Mesos, Akka, Cassandra and Kafka) stack into our Big Data roadmap.
  • Development of reusable component registries, based on our extensive production experience, that reduce development time for enterprise-grade search solutions using Spark and Apache Solr by almost 50%.
  • Extensive experience in building and running production-grade data pipelines on cloud platforms such as AWS and Azure.
  • Multiple use cases involving streaming data processing, interactive analytics, batch processing and Machine Learning.

Our expert Spark Services

Consulting

  • Needs Analysis 
  • Architectural Consulting 
  • Spark Cluster Architecture Review & Design
  • Identification of Use Cases

Managed services

  • Cluster Administration & Optimisation
  • Tailored Services
  • Staff Augmentation

Application support

  • Implementation of high-performance Spark applications for real-time, batch, streaming and offline analytics
  • Full Spark stack delivery: Spark SQL, MLlib, Spark Streaming and GraphX
  • Delivery of high-quality SQL queries that run seamlessly on the Spark engine, backed by AWS (S3 and Redshift) or Azure Blob/Table Storage
  • Delivery of high-performance Spark-based data pipelines, strictly following a Test-Driven Development approach

Deployment and Application delivery

  • Support and issue management for existing open-source Spark clusters.
  • Support to consistently maintain Spark SLAs and SLOs. Spark SQL read/write speed optimisation.
  • Multi-user Spark cluster sharing.
  • 24x7 or tailored support packages, incident management and reporting.

Want to know more about our Spark services?

Get in touch