GenAI Solutions Architect

Job Description

About the Role

We are looking for a GenAI Solutions Architect to join our growing AI delivery team. You will design and build large language model (LLM) systems that move from experimentation to full-scale production, powering search, summarization, knowledge assistance, and automation for enterprise clients.

This is a hands-on, execution-focused role. You will work closely with product managers, engineers, and AI experts to deliver scalable solutions. It is not a research or advisory position: you will ship working systems that users depend on every day.

What You’ll Do

  • Architect end-to-end GenAI systems, including prompt chaining, memory strategies, token budgeting, and embedding pipelines
  • Design and optimize Retrieval-Augmented Generation (RAG) workflows using tools such as LangChain, LlamaIndex, and vector databases (FAISS, Pinecone, Qdrant)
  • Evaluate trade-offs between zero-shot prompting, fine-tuning, LoRA/QLoRA, and hybrid approaches, aligning solutions with user goals and constraints
  • Integrate LLMs and APIs (OpenAI, Anthropic, Cohere, Hugging Face) into real-time products and services with latency, scalability, and observability in mind
  • Collaborate with cross-functional teams, translating complex GenAI architectures into stable, maintainable features that support product delivery
  • Write and review technical design documents and stay actively involved in implementation decisions
  • Deploy to production with industry best practices around version control, API lifecycle management, and monitoring (e.g., hallucination detection, prompt drift)

What You’ll Bring

  • Proven experience building and deploying GenAI-powered applications, ideally in enterprise or regulated environments
  • Deep understanding of LLMs, vector search, embeddings, and GenAI design patterns (e.g., RAG, prompt-injection protection, tool use with agents)
  • Proficiency in Python and fluency with frameworks and libraries such as LangChain, Transformers, Hugging Face, and OpenAI SDKs
  • Experience with vector databases such as FAISS, Qdrant, or Pinecone
  • Familiarity with cloud infrastructure (AWS, GCP, or Azure) and core MLOps concepts (CI/CD, monitoring, containerization)
  • A delivery mindset: you know how to balance speed, quality, and feasibility in fast-moving projects

Nice to Have

  • Experience building multi-tenant GenAI platforms
  • Exposure to enterprise-grade AI governance and security standards
  • Familiarity with multi-modal architectures (e.g., text + image or audio)
  • Knowledge of cost-optimization strategies for LLM inference and token usage

This Role Is Not For

  • ML researchers focused on academic model development without delivery experience
  • Data scientists unfamiliar with vector search, LLM prompt engineering, or system architecture
  • Engineers who haven’t shipped GenAI products into production environments

Benefits & Growth Opportunities

  • Competitive salary and performance bonuses
  • Comprehensive health insurance
  • Support for professional development and certifications
  • Opportunities to engage in pioneering AI projects
  • International exposure and potential travel
  • Flexible working arrangements
  • Clear paths for career advancement in a fast-growing AI organization

This role offers a unique opportunity to shape the future of AI implementation while working with a talented team at the forefront of technological innovation. The successful candidate will play a crucial role in advancing our mission to deliver transformative AI solutions to our clients.
