Staff Replication Development Engineer

🕒 May 7

DDN

1001 - 5000 employees

Founded 1998

🤖 Artificial Intelligence

💰 $10M Funding Round on 2011-06

Artificial Intelligence • Data Center and Cloud Computing • High Performance Computing

DDN is a global leader in AI data intelligence solutions, providing high-performance computing and sophisticated data management technologies. With a focus on accelerating AI deployments and advanced data analytics, DDN's products, including the Data Intelligence Platform and advanced storage systems, serve diverse sectors such as healthcare, financial services, and government. DDN is committed to transforming enterprise data infrastructure to leverage the full potential of AI and drive operational efficiency.

📋 Description

• Lead the design and development of the replication engine for the Infinia AI Data Platform
• Focus on building enterprise-grade asynchronous replication capabilities that enable reliable and secure disaster recovery for large-scale data systems
• Develop high-performance replication pipelines, efficient data synchronization mechanisms, and secure data transfer systems
• Design and develop multi-threaded asynchronous replication systems with parallel streaming capabilities
• Build object-level delta replication with checkpointing and resume functionality
• Develop replication engines supporting bucket/share-level replication controls
• Implement secure data transfer mechanisms using TLS 1.3 with mutual authentication
• Ensure end-to-end data integrity through checksum validation and verification pipelines
• Design and implement manual failover workflows for disaster recovery scenarios
• Build and maintain REST APIs for replication configuration, control, and automation
• Develop metadata tracking and change detection systems to enable efficient replication
• Implement RPO visibility, alerting, and operational insights for replication status
• Contribute to monitoring dashboards focused on replication health and performance
• Ensure systems are designed for high availability, fault tolerance, and scalability
• Partner with QA teams to drive performance, resiliency, and scale validation
• Collaborate with backend, security, and platform teams to deliver end-to-end replication workflows
• Participate in debugging, production issue resolution, and continuous improvement of replication reliability
• Provide technical leadership, architectural guidance, and mentorship to the engineering team

🎯 Requirements

• 8+ years of experience in distributed systems, storage systems, or backend software engineering
• Strong programming skills in one or more languages: C++, Go, Java, or Rust
• Experience designing and building data replication systems, data pipelines, or distributed data services
• Deep understanding of distributed systems concepts (consistency, availability, scalability, fault tolerance)
• Strong expertise in multi-threading, concurrency, and parallel processing
• Knowledge of networking protocols and secure communication (TCP/IP, HTTP/HTTPS, TLS)
• Experience implementing data integrity mechanisms (checksums, validation, consistency checks)
• Experience designing and building REST APIs and service-based architectures
• Familiarity with checkpointing, failure recovery, and retry mechanisms in distributed systems
• Basic understanding of observability concepts (metrics, logging, alerting)
• Strong debugging, problem-solving, and system design skills

🏖️ Benefits

• Dynamic and driven team environment
• Opportunity for hands-on involvement
• Engineering excellence at the heart of everything we do
