Big Data System Engineer
Complex problems require the right expertise! Today, one of the biggest challenges lies in data storage and processing, across the three main dimensions known as the 3Vs: Volume, Velocity and Variety.
At Xpand IT, the Big Data technological area develops and implements architectures and software solutions that represent the state of the art in capturing, ingesting, storing and managing critical data from huge clusters where the 3Vs are always present. As for the technology stack, we take advantage of almost every state-of-the-art framework in the Big Data ecosystem, such as Spark, Kafka, Hive/Impala, Azure Data Services or MongoDB, using Java and Scala as programming languages to interact with them.
As a Big Data System Engineer, you’ll have a vital role in several phases of adopting the Big Data platform, participating in the analysis, design and sizing of distributed storage and/or computing systems, as well as their setup, upgrade, hardening and tuning. For these critical systems, a particular focus on performance and security is crucial, as is implementing the best service development practices to serve as the basis for monitoring tools. This role also works closely with the development teams in designing, developing and installing application solutions for large-scale data processing and storage.
Your daily activities will include:
- Setting up, upgrading, hardening and tuning large-scale Big Data platforms in critical environments
- Implementing security rules and policies on Big Data platforms
- Recommending and periodically updating Big Data platforms (hot fixes, patches, etc.)
- Configuring best practices for monitoring the infrastructure of Big Data clusters
- Analysing hardware and software requirements for each project to be implemented
- Designing and developing new processes for better stability and performance maintenance of environments
- Participating and helping solve performance, scalability and security issues
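As a flavour of the monitoring work described above, here is a minimal sketch of a service health probe for cluster endpoints. The service names, hosts and ports are placeholders, not a real cluster layout, and a production setup would rely on proper monitoring tooling rather than an ad-hoc script:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: probe TCP reachability of a few Big Data services.
# Host:port values below are assumptions for illustration only.
set -u

declare -A SERVICES=(
  [kafka]="broker01:9092"
  [hive]="hiveserver01:10000"
  [spark-history]="master01:18080"
)

check_port() {
  # Succeeds (exit 0) if a TCP connection to host:port opens within 2 s.
  local host=$1 port=$2
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

for name in "${!SERVICES[@]}"; do
  endpoint=${SERVICES[$name]}
  host=${endpoint%%:*}   # strip everything from the first ':' onwards
  port=${endpoint##*:}   # strip everything up to the last ':'
  if check_port "$host" "$port"; then
    echo "OK   ${name} (${host}:${port})"
  else
    echo "DOWN ${name} (${host}:${port})"
  fi
done
```

A script along these lines could feed an alerting pipeline or a cron-based check, which is the kind of "best practices for monitoring" groundwork the role involves.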
// Stack: Linux shell (e.g., Bash), Cloudera, Confluent, Azure Data Services, MongoDB; MIT Kerberos or Windows Active Directory
SKILLS YOU NEED TO HAVE
- MSc / BSc in Information Systems and Computer Engineering and/or Computer Science
- Good knowledge of Linux operating systems
- Good knowledge of Shell Scripting
- Knowledge of High Availability/Distributed systems and their goals and terminology
- Sound communication skills (written and spoken)
- A team player with problem-solving skills
- Fluent English (written and spoken)
// It will be a nice plus if you have:
- Hands-on experience in setting up Spark, Hive, or other Apache Hadoop ecosystem components
- Curiosity about Big Data technologies such as Hadoop, Kafka and MongoDB
// Learn more about the Big Data area:
Pedro Martins, Big Data System Engineer
As a Big Data System Engineer, every day is challenging: between deploying clusters from scratch and optimising distributed systems, it is absolutely crucial to have a deep understanding of the ecosystem’s multiple services, dealing with many different contexts.