AI Evaluation Specialist

Hong Kong +1
Full time
Remote
Compensation is not specified
Role
Data Scientist
Description

Binance is a renowned global blockchain ecosystem that operates the largest cryptocurrency exchange platform worldwide, serving over 280 million individuals across 100+ countries. Our solid reputation is built on industry-leading security standards, transparent user fund management, fast trading engine, extensive liquidity, and an unparalleled array of digital asset products. Binance's diverse offerings span trading, financial services, education, research, payments, institutional support, Web3 features, and more. We harness the potential of digital assets and blockchain technology to establish an inclusive financial ecosystem that promotes financial freedom and enhances financial inclusion on a global scale.

We are currently seeking a dedicated AI Evaluation Specialist tasked with designing, implementing, and overseeing comprehensive evaluation frameworks covering every stage in the lifespan of LLM agents. Your role will play a pivotal part in Binance's AI integration journey by ensuring the dependability, adaptability, and regulatory compliance of AI agents deployed across various domains such as Customer Service, Growth, and Compliance.

Responsibilities:

  • Engage in the complete software development lifecycle, encompassing requirements analysis, test planning, execution, defect tracking, product release, and maintenance.
  • Serve as a primary contact for AI agent evaluation and continuous monitoring.
  • Develop effective test strategies and conduct hands-on testing to guarantee the accuracy, reliability, and performance of AI and data applications.
  • Conduct root cause analysis of test failures and product issues, and facilitate optimization for future enhancements.
  • Design and implement internal tools utilizing AI technology to enhance engineering and testing efficiency.

Requirements:

  • Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Science, or a related field.
  • Profound understanding of Large Language Models (LLMs), autonomous AI agents, and their system architectures.
  • Experience with AI evaluation methodologies encompassing offline benchmarking, online monitoring, and hybrid human-AI evaluation approaches.
  • Knowledge of software engineering best practices such as Test-Driven Development (TDD) and Behavior-Driven Development (BDD) in AI contexts.
  • Ability to craft adaptive, lifecycle-spanning evaluation frameworks incorporating quantitative and qualitative metrics.
  • Prior experience with evaluation tools and frameworks is beneficial.
  • Proficiency in analyzing complex system-level behaviors, including reasoning pipelines, tool integrations, and agent actions.
  • Strong analytical abilities with a background in data-driven diagnostics and root cause analysis.
  • Excellent communication skills for clear documentation of evaluation plans, results, and recommendations.
  • Experience collaborating in cross-functional teams and managing feedback loops between evaluation and development.
  • Previous exposure to working with infrastructure or platform teams to enhance AI tooling and automation platforms.

Binance offers a dynamic work environment where you can:

  • Play a pivotal role in shaping the future within the foremost blockchain ecosystem globally.
  • Collaborate with top-tier professionals in a user-centric, globally-distributed organization with a flat structure.
  • Engage in unique, fast-paced projects autonomously in an innovative setting.
  • Grow your career and continuously learn within a results-oriented workplace.
  • Enjoy competitive compensation and company benefits.
  • Benefit from remote work arrangements depending on team-specific work requirements.

At Binance, we are committed to fostering an inclusive work environment by promoting diversity within our workforce as we believe it is essential for our continued success. By applying for a job at Binance, you acknowledge that you have reviewed and agreed to our Candidate Privacy Notice.

Binance
Location
United States

More Full-time Jobs


Architect designers

Sydney, Australia
Part time
Remote
AutoCAD
Looking for 2D AutoCAD specialists to prepare detailed architectural and engineering drawings.
Payment: AUD $500–$1,200 per project
 
Payment in Crypto
16,667-25,000
Monthly

Senior Frontend Developer

Full time
Remote
At Tenderly, we're not just shaping the future of blockchain technology – we're defining it. As a full-stack Web3 infrastructure platform, we provide multi-network support and a comprehensive ecosystem that covers every aspect of the blockchain journey, from initial encounters to advanced Node solutions and serverless cloud compute functions.
Our mission is crystal clear: to empower developers worldwide to harness the full potential of blockchain technology for creating innovative products. With our robust Web3 infrastructure platform, we've already enabled tens of thousands of developers to work more efficiently with blockchain. We thrive on tackling complex challenges head-on, crafting solutions that push the boundaries of what's possible in the blockchain world. From developing bespoke Ethereum Virtual Machine (EVM) implementations to designing proprietary storage layers optimized for our advanced blockchain node solutions, our team is at the forefront of industry innovation.
About the Role
Are you ready to revolutionize the user experience in blockchain technology? As a Frontend Engineer at Tenderly, you will play a key role in shaping our platform’s user experience while ensuring the highest standards of quality and performance. Your role will be pivotal in driving excellence in our frontend development.
What you’ll do
✦ Develop and Maintain: Create and sustain frontend components and features using JavaScript, TypeScript, and React.
✦ Collaborate: Work closely with engineering and product teams to align technical solutions with company objectives, ensuring seamless integration of design and functionality.
✦ Architect: Contribute to architectural decisions that drive scalability, performance, and alignment with business priorities, keeping us at the forefront of Web3 innovation.
✦ Contribute: Provide technical expertise and support testing and validation efforts to ensure the success of our platform.
Our Technology Stack
At Tenderly, we leverage a sophisticated technology stack to deliver cutting-edge solutions. As a Frontend Engineer, you’ll work extensively with:
- JavaScript
- TypeScript
- React
- Golang (for full-stack collaboration)
- Google Cloud Platform
- Amazon Web Services
- Supporting technologies including PostgreSQL, Kafka, Cassandra, and more
About You
We’re looking for a highly skilled and innovative frontend developer with a passion for creating exceptional user experiences. You should have:
✦ Proven Experience: Demonstrated experience in technical roles, showcasing your ability to deliver high-quality software products.
✦ Technical Expertise: Mastery of JavaScript, TypeScript, and React, with a deep understanding of modern frontend development practices and tools.
✦ Architectural Vision: Strong decision-making skills to drive scalability, extensibility, and performance in our platform.
✦ Effective Communication: Excellent communication and teamwork abilities, ensuring effective collaboration and transparent communication channels.
What we offer
Why Tenderly?
Join us at Tenderly, where our values are not just words on a page but the driving force behind our innovative journey.
Fanatic Drive
We're passionately driven to solve our users' problems, tackling challenges with intensity and speed. Our work is our obsession, and we thrive on the fast-paced world of Web3.
Underdog Mentality
We love defying the odds, rejecting the status quo, and achieving the impossible. We're on a mission to revolutionize blockchain technology, constantly pushing boundaries.
Inward Responsibility
We hold ourselves accountable, always striving to improve. We embrace feedback, knowing it drives personal growth and collective success.
Intentional Collaboration
Our success relies on individual contributions and mutual support. We share ideas openly, trust each other completely, and continuously push each other to excel.
Tenderly Transparency™️
We believe in open, honest communication. We share knowledge and opinions freely, fostering a culture of accountability and trust.
Tenderly is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. To find out more: https://tenderly.co/careers

Full Stack Developer (Front End Leaning)

Full time
Remote
Job Description:
Winnables.com is looking for a creative front-end-leaning Full Stack Developer who loves bringing slick, engaging user experiences to life while still caring about the behind-the-scenes details.
You’ll collaborate closely with designers and product leads, iterate quickly on new ideas, and see the direct impact of your work in the hands of players around the world.
If you enjoy mixing imaginative front-end polish with just enough back-end know-how to make the magic happen, we’d love to hear from you.
Key Responsibilities:
Transform design mock-ups and product concepts into polished, responsive game screens, animations, and flows.
Collaborate daily with designers, product leads, and blockchain engineers to prototype, ship, and iterate on new games or new features.
Continuously test, debug, and optimize for performance across devices and network conditions.
Review code, share best practices, and help keep our small codebase clean, secure, and easy to extend.
Use player feedback and analytics to refine existing features and guide new ideas from concept to launch.
Champion accessibility and web-standards compliance, ensuring every player—regardless of device or ability—has a first-class experience.
What You’ll Be Building
Build interactive mini-games that deliver smooth, responsive gameplay.
Create real-time multiplayer features and live game mechanics.
Develop pack-opening systems with eye-catching, engaging animations.
Engineer raffle and lottery-style games that are fully on-chain and provably fair.
Implement Plinko and other physics-based games, tuned for fun and balance.
Craft user-friendly interfaces that perform flawlessly across mobile, tablet, and desktop.
Qualifications:
Tech Stack:
Core: TypeScript · React & Next.js · Node.js · Zustand · WebSockets & REST
Data & Infra: PostgreSQL (Supabase) · Redis · Docker · GitHub
Observability: Grafana · PostHog
Design & Web3: Figma · viem / wagmi
What We're Looking For
Strong experience with TypeScript and modern JavaScript
Solid understanding of React and Next.js ecosystem
Experience building real-time applications with WebSockets
Familiarity with Node.js backend development
Knowledge of PostgreSQL and database design
Experience with responsive design and mobile-first development
Understanding of state management patterns (Zustand experience preferred)
Understanding of web security fundamentals (CSRF protection, secure authentication flows, input validation)
Interest in gaming mechanics and interactive user experience.
Nice to Have
Experience with gaming or interactive application development
Knowledge of blockchain technologies and Web3 integration
Familiarity with monitoring tools (Grafana, PostHog)
Experience with Docker and containerization
Understanding of real-time multiplayer game architecture
 
Company Description:
Winnables.com is on a mission to re-imagine iGaming through full on-chain transparency. We believe players deserve provably fair odds, transparent math, and no black-box RNGs. Every spin and ticket purchase is recorded immutably on Ethereum, so anyone can audit the outcome, track prize pools, or even build their own analytics on top of our contracts.
All of our games are designed and engineered 100% in-house by a small, senior team that blends game design, cryptography, and smart-contract security. We currently operate two flagship experiences:
Raffles – transparent prize drawings where every ticket lives on-chain, letting players verify fairness before and after the draw.
Loot Packs – blockchain-powered loot boxes that reveal randomized rewards with engaging pack-opening animations.
We’re still just getting started. The roadmap is full of green-field challenges—and real, measurable impact on an industry ripe for change. Join us and help craft the next generation of fair, fun, and truly player-owned gaming.
3,500-5,833
Monthly

UI/UX Designer

Full time
Remote
UI/UX Designer for SaaS Dashboard (Figma)
 
Job Summary:
We're building ClearTrack, a lightweight SaaS tool for tracking tasks and performance in remote teams. We're looking for a UI/UX designer to design 6–8 clean, modern dashboard screens (desktop + mobile).
Scope of Work:
1. Design dashboard UI (Task List, Time Tracker, Reports, Settings)
2. Create responsive mockups in Figma
3. Deliver clickable prototype + style guide
4. Collaborate with our developer for smooth handoff
Budget:
Fixed: $800–$1200 (negotiable based on experience)
Requirements:
1. SaaS/dashboard UI design experience
2. Proficiency in Figma
3. Strong communication & fast turnaround
To Apply:
If you're interested, please:
1. Send a short proposal letter explaining why you're a good fit
2. Share your portfolio (SaaS/dashboard preferred)
3. Include your email address
4. Mention your availability and estimated delivery time
Bonus if you have:
Experience designing admin panels or productivity tools
Payment in Crypto
1,000
Monthly

Back End Developer in AI projects

Singapore
Full time
Remote
We are seeking a skilled and proactive AI Integrator to join our team and lead the integration of artificial intelligence technologies into our business operations, products, and services. This role is pivotal in bridging the gap between data science, engineering, and business units, ensuring the seamless deployment, scaling, and optimization of AI-driven solutions.
You will work closely with stakeholders across product, engineering, data science, and operations to identify opportunities for AI integration, manage implementation, and maintain AI systems post-deployment.
 
Key Responsibilities:
Collaborate with cross-functional teams to identify and prioritize opportunities for AI integration.
Evaluate and select appropriate AI/ML tools, APIs, and frameworks that align with business objectives.
Deploy and integrate AI models into production environments and existing systems (e.g., CRM, ERP, customer support platforms).
Develop automation workflows using AI tools (e.g., ChatGPT, Claude, Azure AI, Google Vertex AI).
Ensure data pipelines, model outputs, and integrations are reliable, secure, and scalable.
Translate business needs into technical requirements for AI systems.
Monitor, troubleshoot, and optimize deployed AI systems for accuracy, performance, and ethical alignment.
Maintain documentation and provide training or support to internal teams on AI tools and capabilities.
Stay current with AI trends, emerging technologies, and best practices.
 
Required Qualifications:
Certified AI specialist, or a Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field.
Familiarity with cloud platforms (AWS, GCP, Azure) and their AI services.
Strong understanding of APIs, microservices architecture, and RESTful integration.
Excellent communication skills and ability to translate complex AI concepts for non-technical stakeholders.
 
Preferred Qualifications:
Experience with GenAI tools like OpenAI (GPT-4), Anthropic Claude, or Google Gemini.
Knowledge of prompt engineering and small language model (SLM) orchestration.
Experience integrating AI into SaaS or enterprise software products.
Background in data security, compliance, and AI ethics.
Payment in Crypto
4,000-8,000
Monthly