About The Position
Our team’s mission statement:
We play a critical role in ensuring that our customers receive the best experience deploying the SparkBeyond Platform.
Using our technical expertise, combined with our entrepreneurial drive and passion, we partner with our clients to balance business and partner needs with technical constraints.
While continuously engaging with our community of customers and users, we enable their success by providing technical solutions and flawlessly leading deployments, implementations, integrations, and much more.
What will you do?
- Lead the creation of ETL pipelines to collect, store, and normalize both structured data (e.g. government forms or tabular time series data) and unstructured data (e.g. company websites, online reviews, or incoming emails)
- Help drive optimization, testing, and tools to improve data quality and availability for downstream use, e.g. in machine learning models
- Span languages as needed, depending on the data processing framework in use
- Collaborate with software engineers, machine learning experts, and others, taking learning and leadership opportunities that arise
- Build data expertise and own data quality for the pipelines you build
What do you need?
- Degree in Computer Science, Engineering, Mathematics, Physics, or a related quantitative field, preferably with 2+ years of relevant experience
- Industry experience as a Data Engineer or related speciality (e.g., Software Engineer, Business Intelligence Engineer, Data Scientist) with a track record of manipulating, processing, and extracting value from large datasets
- Experience building and/or operating systems for extraction, ingestion, integration, and ETL of large datasets from multiple sources
- Experience with sourcing and modelling data from application APIs
- Experience with Docker, container management, or virtualization technologies
- Experience using machine learning and statistical tools such as Python/pandas or R
- Knowledge of SQL and databases, with a strong intuition for how to model, transform, and store data
- Knowledge of Python, version control, and product development processes
- Experience with Amazon Web Services (AWS), e.g. EC2 and S3
- Willingness and ability to travel to client sites across APAC