
Senior Core Data Engineer - Real Time
- Amsterdam, Noord-Holland
- Permanent
- Full-time
- Be accountable for timely and high-quality delivery of core data products to the rest of the organization: Market Data, Trades/Orders/Quotes/Inquiries, Application Logs with a specific focus on low-latency analytics
- Develop and own the end-to-end lifecycle of these data products: build and maintain pipelines, develop quality checks, correct day-to-day issues, and systematize/automate manual processing steps
- Define and manage project plans to assume ownership of these data products
- Create and refine a common processing model (esp. fast vs slow path, use of alternative T3 solutions) across these data products
- Investigate currently available sources of data and ensure that Flow extracts maximum information and value from them
- Explore, evaluate, and onboard additional sources of available data (both independently and at the request of stakeholders)
- Work closely with other teams to: automate existing workflows, develop real-time capabilities as a platform offering, ensure pipelines leverage standard Flow technologies, provide guidance on data-related architecture/design topics, and ensure licensing and access control are appropriate
- Organized with strong attention to detail and quality
- Innately driven to generalize, systematize, and automate
- Strong interpersonal skills, works very well with both technical and non-technical stakeholders
- Ability to balance individual stakeholder requirements while preserving the quality and broad applicability of data
- Ability to define, manage, and execute projects (often several in parallel) while effectively triaging operational concerns
- Primarily technical, but domain-literate (exchange market data, network traffic data, security reference/pricing data, ETF compositions, application log data); exposure should extend beyond Flow-specific conventions
- Domain expertise a huge plus (but not strictly necessary)
- Excellent programming skills (any language, Python preferred)
- Working proficiency with Linux, SQL and relational databases, and version control systems
- Proven experience with real-time OLAP databases such as ClickHouse, Apache Pinot, or StarRocks required
- Experience with Airflow, Kafka, and GCP (esp. GKE, Dataproc, and BigQuery) highly desired
- Experience with Pandas, Spark, Apache Beam, or Apache NiFi a plus