Open source consulting · Paris · Remote · On-site

Data science consulting
that ships to production.

We work on the full stack — from problem framing and data strategy to model development and deployment. Open-source tools only. Your code, your models, your infrastructure.

our_process.sh
# how every engagement works

$ git clone your_problem
 framing done  # what's the actual business question?
 data audit    # what do you have, what's missing?
 model built   # python / R / open LLM, reproducible
 deployed      # fastapi · docker · gitlab CI · on-premise
 handed over   # docs + training, team is autonomous

$ echo "result: yours to own and extend"

Four phases, one objective: production

Every engagement follows the same discipline — from the first conversation to the last line of documentation.

01
🎯

Strategy & framing

We start by identifying the real business problem — not the data problem someone thinks they have. What's the decision this model needs to support? What does success look like in production? What data exists, and what's missing?

What to build? What data exists? What's the ROI? What's the risk?
02
🔍

Analysis & modelling

Exploratory analysis, feature engineering, model selection and validation. We use Python (scikit-learn, PyTorch, statsmodels) or R (tidymodels, brms) depending on what fits your problem. Experiments tracked with MLflow from day one.

What's the insight? Which model? How to validate?
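As an illustration of the "How to validate?" question, here is a minimal sketch of k-fold cross-validation using only the Python standard library. It is not our delivery code: in a real engagement this step would typically rely on scikit-learn or tidymodels. The helper names (`k_fold_indices`, `cross_validate`), the toy data and the mean-predictor baseline are assumptions made for the example.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled, roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(fit, predict, X, y, k=5):
    """Return the mean absolute error measured on each held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        errors = [abs(predict(model, X[i]) - y[i]) for i in test_idx]
        scores.append(mean(errors))
    return scores


# Hypothetical baseline: always predict the mean of the training targets.
def fit_mean(X_train, y_train):
    return mean(y_train)


def predict_mean(model, x):
    return model


# Toy data, invented for the example.
X = [[float(i)] for i in range(20)]
y = [2.0 * i + 1.0 for i in range(20)]

fold_scores = cross_validate(fit_mean, predict_mean, X, y, k=5)
print(f"MAE per fold: {[round(s, 2) for s in fold_scores]}")
```

Scoring a trivial baseline like this first gives any candidate model a number to beat on the same folds.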
03
⚙️

Development & deployment

Production-grade code in uv-managed environments, run through GitLab CI/CD pipelines, containerised with Docker, and exposed via FastAPI or served through Streamlit/Dash. On-premise or in your cloud: your call. No SaaS dependency.

FastAPI / Docker · GitLab CI/CD · On-premise
04
🎓

Transfer & autonomy

Documentation, code review sessions and hands-on training so your team can maintain, extend and retrain the system. The engagement ends when your team no longer needs us and comes back only for the next challenge.

Full docs · Team training · Handover

What we build

Every service uses open-source tooling, ships with reproducible environments and is handed over with documentation.

Everything we use is open source

No black boxes. Every tool is inspectable, forkable, and replaceable. You're never locked in.

🐍 Python
ML & AI
scikit-learn · PyTorch · HuggingFace · LangChain · LlamaIndex · Ollama · vLLM
📦 Python
Packaging & quality
uv · ruff · mypy · pytest · pre-commit
📊 R
Stats & viz
tidyverse · tidymodels · ggplot2 · brms · forecast · renv
🔧 DevOps
CI/CD & deployment
GitLab CI/CD · Docker · MLflow · DVC · FastAPI · Prefect
🖥️ Apps
Dashboards & apps
Streamlit · Dash · Shiny · Plotly · Quarto
🏗️ Infra
R environment
RStudio Server · Posit Workbench · Posit Connect · Positron

Why open source only

A constraint that protects you — not us.

No lock-in
You own the models and the code
Everything we build runs on open tools. Swap models, change infrastructure, extend the codebase — no proprietary runtime, no licence fee, no renegotiation.
Auditability
Explain every decision
Open models are inspectable, which makes it easier to satisfy internal auditors, GDPR requirements and EU AI Act obligations. Proprietary black-box APIs do not give you that visibility.
Data sovereignty
Your data stays on your infrastructure
We deploy on-premise or in your private cloud. Sensitive data — financial records, health data, legal documents — never touches a third-party API.
Reproducibility
Environments that work everywhere
uv lockfiles for Python, renv for R, Docker for deployment. Your GitLab pipeline runs the same code in dev, staging and production — no drift.

200+ projects delivered since 2012

Energy, finance, healthcare, retail, research, telecoms — production systems running at scale.

EDF · Engie · AXA · Dior · L'Oréal · Orange · LCL · OCDE · Ubisoft · Nestlé · Thales · Scania · CNRS · Cerba Healthcare · Coface · Institut Curie · Société Générale · Ifremer

Tell us about your project.

A few lines about your data, your problem and your constraints — we respond within 48 hours with an honest assessment of what's feasible and what it takes.

Phone: +33 1 72 25 40 82 · Email: info@stat4decision.com · Response within 48h