Velotio Technologies is a product engineering company working with innovative startups and enterprises. We are a certified Great Place to Work® and recognized as one of the best companies to work for in India. We have provided full-stack product development for 110+ startups across the globe building products in the cloud-native, data engineering, B2B SaaS, IoT & Machine Learning space. Our team of 220+ elite software engineers solves hard technical problems while transforming customer ideas into successful products.
As a Technical Lead - Data Engineer, you'll contribute to the design and development of a data analytics platform using the latest tools and cloud technologies for a variety of workloads, including real-time analytics and batch data processing. You will also lead a team of 2-10 engineers.
Roles & Responsibilities:
- Design and build scalable data infrastructure with efficiency, reliability, and consistency to meet rapidly growing data needs
- Build the applications required for optimal extraction, cleaning, transformation, and loading data from disparate data sources and formats using the latest big data technologies
- Build ETL/ELT pipelines and work with other data infrastructure components, like Data Lakes, Data Warehouses and BI/reporting/analytics tools
- Work with various cloud services like AWS, GCP, and Azure to implement highly available, horizontally scalable data processing and storage systems, and to automate manual processes and workflows
- Implement processes and systems to monitor data quality, to ensure data is always accurate, reliable, and available for the stakeholders and other business processes that depend on it
- Work closely with different business units and engineering teams to develop a long-term data platform architecture strategy and thus foster data-driven decision-making practices across the organisation
- Help establish and maintain a high level of operational excellence in data engineering
- Evaluate, integrate, and build tools to accelerate Data Engineering, Data Science, Business Intelligence, Reporting, and Analytics as needed
- Practice test-driven development by writing unit and integration tests
- Contribute to design documents and engineering wiki
You will enjoy this role if you...
- Like building elegant, well-architected software products with enterprise customers
- Want to learn to leverage public cloud services & cutting-edge big data technologies, like Spark, Airflow, Hadoop, Snowflake, and Redshift
- Work collaboratively as part of a close-knit team of geeks, architects, and leads
Desired Skills & Experience:
- 4+ years of data engineering experience, or equivalent knowledge and ability
- 4+ years of software engineering experience, or equivalent knowledge and ability
- Strong proficiency in at least one of the following programming languages: Python, Scala, or Java
- Experience designing and maintaining at least one type of database (Object Store, Columnar, In-memory, Relational, Tabular, Key-Value Store, Triple-store, Tuple-store, Graph, and other related database types)
- Good understanding of star/snowflake schema designs
- Extensive experience working with big data technologies like Spark, Hadoop, Hive
- Experience building ETL/ELT pipelines and working on other data infrastructure components like BI/reporting/analytics tools
- Experience working with workflow orchestration tools like Apache Airflow, Oozie, Azkaban, NiFi, Airbyte, etc.
- Experience building production-grade data backup/restore strategies and disaster recovery solutions
- Hands-on experience with implementing batch and stream data processing applications using technologies like AWS DMS, Apache Flink, Apache Spark, AWS Kinesis, Kafka, etc.
- Knowledge of best practices in developing and deploying applications that are highly available and scalable
- Experience with or knowledge of Agile Software Development methodologies
- Excellent problem-solving and troubleshooting skills
- Process-oriented with excellent documentation skills
Bonus points if you:
- Have hands-on experience using one or more cloud service providers like AWS, GCP, and Azure, and have worked with specific products like EMR, Glue, Dataproc, Databricks, Data Studio, etc.
- Have hands-on experience working with either Redshift, Snowflake, BigQuery, Azure Synapse, or Athena and understand the inner workings of these cloud storage systems
- Have experience building DataLakes, scalable data warehouses, and DataMarts
- Have familiarity with tools like Jupyter Notebooks, Pandas, NumPy, SciPy, scikit-learn, Seaborn, SparkML, etc.
- Have experience building and deploying Machine Learning models to production at scale
- Possess excellent cross-functional collaboration and communication skills
Our Culture:
- We have an autonomous and empowered work culture encouraging individuals to take ownership and grow quickly
- Flat hierarchy with fast decision making and a startup-oriented “get things done” culture
- A strong, fun & positive environment with regular celebrations of our success. We pride ourselves on creating an inclusive, diverse & authentic environment
Note: All interview and onboarding processes at Velotio are currently carried out remotely through virtual meetings until further notice.