A daily multi-channel news workflow built on Mistral Workflows. Pulls the top tech and world headlines from three sources, filters them with an LLM, enriches sparse titles, and delivers them by SMS, email, or both — with an optional human-in-the-loop preview when invoked from Le Chat.

tech-news-retreiver

Triggered every day at 09:30 UTC (11:30 in Paris during summer time, 10:30 in winter) or manually from Le Chat or the Studio Console, the workflow:

  1. Fetches stories in parallel from three sources: Hacker News (Firebase API), Korben (RSS feed), and a custom Mistral Studio agent that runs web searches under strict freshness rules.
  2. Deduplicates and classifies in a single LLM call: groups similar stories and drops anything that isn't event news (no project showcases, opinion pieces, Ask-HN questions, or addressed-to-LLM jokes).
  3. Filters against a 72h sent-history so the same story doesn't ship twice, and prioritizes sources/stories curated by the Studio agent.
  4. Enriches sparse titles into self-contained event sentences via an LLM (e.g. "Deno 2.8" → "Open source JavaScript runtime Deno releases version 2.8").
  5. Formats each item for SMS (<title> [<source>], ≤160 chars GSM-7, URL-less to bypass France's MAN regulation).
  6. (Optional, interactive mode only) Sends a preview to the user via send_assistant_message, then calls wait_for_input to let them choose: SMS only, email only, both, or cancel.
  7. Dispatches to the chosen channels:
    • SMS via OVH (intro SMS with emojis in UCS-2, then 1 SMS per item in GSM-7)
    • Email via Gmail SMTP (HTML + plain text digest with full clickable URLs)

End-to-end takes ~30-60 seconds (longer if waiting for human input).
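
The SMS formatting in step 5 can be illustrated with a minimal sketch. This is not the project's format_for_sms: the names to_gsm7_ascii, septet_length and format_item are hypothetical, and the approach (strip accents, keep only GSM-7-safe ASCII, trim the title until the message fits) is an assumption. One detail worth knowing is that the square brackets around the source belong to the GSM-7 extension table, so each one counts as two characters against the 160 limit.

```python
import unicodedata

# Printable ASCII that GSM-7 encodes in a single septet (basic table).
_GSM7_BASIC_ASCII = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    " !\"#$%&'()*+,-./:;<=>?@_"
)
# ASCII characters from the GSM-7 extension table: two septets each.
_GSM7_EXTENDED_ASCII = set("[]{}\\^~|")


def to_gsm7_ascii(text: str) -> str:
    """Strip accents and drop anything outside the GSM-7-safe ASCII subset."""
    decomposed = unicodedata.normalize("NFKD", text)
    allowed = _GSM7_BASIC_ASCII | _GSM7_EXTENDED_ASCII
    return "".join(ch for ch in decomposed if ch in allowed)


def septet_length(text: str) -> int:
    """Length of the text as billed by GSM-7 (extended characters cost 2)."""
    return sum(2 if ch in _GSM7_EXTENDED_ASCII else 1 for ch in text)


def format_item(title: str, source: str, limit: int = 160) -> str:
    """Build "title [source]" and trim the title until it fits in one SMS."""
    suffix = f" [{to_gsm7_ascii(source)}]"
    clean_title = to_gsm7_ascii(title)
    while clean_title and septet_length(clean_title + suffix) > limit:
        clean_title = clean_title[:-1]
    return clean_title.rstrip() + suffix
```

For example, format_item("Deno 2.8 lancé", "korben") yields "Deno 2.8 lance [korben]". Characters with no ASCII decomposition (curly quotes, dashes, emojis) are simply dropped in this sketch; a real implementation would probably map them to close equivalents first.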


Three execution types

| Trigger | When | Input source |
|---|---|---|
| Cron schedule | Daily at 11:30 Paris CEST (= 09:30 UTC) | Hardcoded in daily_schedule.input in tech_news.py |
| Studio Console | When you click "Start workflow" in the Console UI | Raw JSON you paste in the form |
| Le Chat | When a user invokes the assistant in a Le Chat conversation | Provided by Le Chat (handles ConfirmationInput buttons natively) |

The "mode" is not a persistent setting — it's just a field in each execution's input. The scheduled cron always uses what's in the code's `daily_schedule.i>


Architecture

                ┌───────────────────────────────────────┐
                │  Mistral Workflows (Temporal-backed)  │
                │  cron 30 9 * * * UTC + manual/Le Chat │
                └───────────────────────────────────────┘
                                  │
          ┌───────────────────────┼───────────────────────┐
          ▼                       ▼                       ▼
   fetch_hackernews          fetch_korben       fetch_tech_news_agent
   (Firebase API)            (RSS feed)         (Mistral REST API)
          │                       │                       │
          └───────────────────────┼───────────────────────┘
                                  ▼
                            normalize_items
                    (FR→EN translation via LLM)
                                  │
                                  ▼
                            dedupe_and_rank
                (LLM clusters + drops evergreen content)
                                  │
                                  ▼
                          filter_already_sent
                  (JSON history store, 72h lookback)
                                  │
                                  ▼
                  _select_with_agent_priority
              (agent items first, then HackerNews/Korben fill)
                                  │
                                  ▼
                            enrich_titles
            (LLM rewrites sparse titles into event sentences)
                                  │
                                  ▼
                            format_for_sms
              ("title [source]", GSM-7 ASCII, ≤160 chars)
                                  │
                                  ▼
                  ┌──────── interactive? ────────┐
                  │                              │
                  ▼ yes                          ▼ no
        send_assistant_message            use input.default_channels
        + wait_for_input                  directly
        (HITL channel choice)                    │
                  │                              │
                  └──────────────┬───────────────┘
                                 │
                  ┌──────────────┼──────────────┐
                  ▼ "sms"        ▼ "email"      ▼ "both"
            send_sms_batch    send_email_digest    both
            (OVH)             (Gmail SMTP)
                  │              │                  │
                  └──────────────┴──────────────────┘
                                 │
                                 ▼
                       persist_sent_history
                  (writes to data/history.json)
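
The _select_with_agent_priority step in the diagram (agent items first, then Hacker News/Korben as fill) could look roughly like the sketch below. The Item dataclass, its score field and the function signature are assumptions made for illustration; the project's actual data model may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    source: str          # "tech_news_agent", "hackernews" or "korben"
    score: float = 0.0   # hypothetical ranking score from the dedupe step


def select_with_agent_priority(
    items: list[Item],
    top_k: int = 5,
    agent_source: str = "tech_news_agent",
) -> list[Item]:
    """Return up to top_k items: agent-curated ones first, others as fill."""
    agent = [it for it in items if it.source == agent_source]
    others = [it for it in items if it.source != agent_source]
    # list.sort() is stable, so items with equal scores keep their order.
    agent.sort(key=lambda it: it.score, reverse=True)
    others.sort(key=lambda it: it.score, reverse=True)
    return (agent + others)[:top_k]
```

With four agent items and top_k=5, this returns the four agent items plus the single best-scored item from the other sources.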

What you receive (SMS or email)

🚨🌍 News for May 24th 2026

US and Iran negotiators hint at progress in final phase of an interim peace deal [livemint]
Trump administration says green card seekers must leave the US to apply [nytimes]
Amazon stops supporting older Kindle e-readers, upsetting loyal users [reuters]
Scammers misuse a Microsoft internal account to send spam links [techcrunch]
Microsoft releases the oldest known DOS source code as open-source [arstechnica]

Stack

  • Mistral Workflows — durable execution engine (built on Temporal), HITL via wait_for_input, Le Chat assistant integration
  • mistralai-workflows-plugins-mistralai — plugin for mistralai_chat_complete, ChatAssistantWorkflowOutput, ConfirmationInput
  • ovh — official OVH Python SDK for SMS
  • smtplib (stdlib) — Gmail SMTP for email digests
  • Python 3.14 with uv for dependency management
  • Deployed via systemd on a personal VPS

Project layout

tech-news-retreiver/
├── .env                          # secrets and config (gitignored)
├── pyproject.toml                # uv-managed deps
├── Makefile                      # scaffold-provided commands
├── data/
│   └── history.json              # hashes of sent items, last 72h+
├── ovh_sms_test.py               # standalone OVH connectivity test
├── ovh_sms_check.py              # OVH credit / sender status check
└── src/
    ├── activities/
    │   ├── fetchers.py           # fetch_hackernews, fetch_korben, fetch_tech_news_agent
    │   ├── processing.py         # normalize_items, dedupe_and_rank, enrich_titles
    │   ├── filtering.py          # filter_already_sent, persist_sent_history
    │   ├── sms.py                # format_for_sms, send_sms_batch (OVH)
    │   └── email.py              # send_email_digest (Gmail SMTP)
    ├── clients/
    │   ├── history_store.py      # JSON file with fcntl lock
    │   └── ovh_client.py         # OVH SDK wrapper with verbose response logging
    ├── entrypoints/              # dev.py, worker.py, start.py (scaffold)
    └── workflows/
        └── tech_news.py          # InteractiveWorkflow definition + cron schedule
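
As an illustration of what history_store.py is described as doing (a JSON file guarded by an fcntl lock, queried with a lookback window), here is a minimal sketch. The file layout (a mapping from item hash to send timestamp) and the function names item_key, load_recent and record_sent are assumptions, not the project's actual code. fcntl is only available on Unix-like systems.

```python
import fcntl
import hashlib
import json
import os
import time
from typing import Optional


def item_key(url_or_title: str) -> str:
    """Stable hash used to recognise an item that was already sent."""
    return hashlib.sha256(url_or_title.strip().lower().encode("utf-8")).hexdigest()


def load_recent(path: str, lookback_hours: int = 72,
                now: Optional[float] = None) -> set[str]:
    """Return the keys sent within the lookback window."""
    now = time.time() if now is None else now
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_SH)   # shared lock: readers can overlap
        try:
            raw = fh.read()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
    entries = json.loads(raw) if raw.strip() else {}
    cutoff = now - lookback_hours * 3600
    return {key for key, sent_at in entries.items() if sent_at >= cutoff}


def record_sent(path: str, keys: list[str], now: Optional[float] = None) -> None:
    """Merge newly sent keys into the store under an exclusive lock."""
    now = time.time() if now is None else now
    # O_CREAT creates the file on first use without truncating existing data.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)
        try:
            raw = fh.read()
            entries = json.loads(raw) if raw.strip() else {}
            entries.update({key: now for key in keys})
            fh.seek(0)
            fh.truncate()
            json.dump(entries, fh)
            fh.flush()                   # write to disk before releasing the lock
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

Filtering a batch then comes down to keeping the items whose item_key is not in load_recent(path, 72). This sketch never prunes old entries, which would match a store that holds "72h+" of history.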

Execution modes

The workflow accepts these input fields:

from pydantic import BaseModel


class TechNewsInput(BaseModel):
    top_k: int = 5                          # how many news items to send
    history_lookback_hours: int = 72        # don't re-send within this window
    dry_run: bool = False                   # preview only, don't send/persist
    interactive: bool = False               # show HITL preview + choice
    default_channels: list[str] = ["sms"]   # channels when not interactive

Mode 1 — Scheduled (the daily automatic run)

Defined in tech_news.py:

daily_schedule = ScheduleDefinition(
    input={
        "top_k": 5,
        "history_lookback_hours": 72,
        "dry_run": False,
        "interactive": False,
        "default_channels": ["sms"],
    },
    cron_expressions=["30 9 * * *"],   # UTC → 11:30 Paris in summer
)

Fires automatically every day, no human in the loop. To change how many items are sent or via which channel, edit top_k or default_channels in daily_schedule and restart the worker.

Mode 2 — Dry-run preview (manual, from Console)

{ "dry_run": true }

Runs the entire pipeline and returns the formatted SMS strings (5 with the default top_k) as the workflow output. It sends nothing and doesn't write history, which makes it useful for testing code changes.

Mode 3 — Interactive HITL (manual, from Le Chat or Console)

{ "interactive": true }

After the fetch+dedup+enrich+format steps, the workflow:

  1. Pushes the preview as an assistant message
  2. Pauses and waits for the user to pick a channel (📱 SMS / 📧 Email / 📱📧 Both / Cancel)
  3. Dispatches to the chosen channel(s)

In Le Chat, the choices render as buttons. In the Studio Console, the workflow pauses at "En attente d'entrée" ("Waiting for input") and shows a raw JSON form, where you'd submit:

{"choice": "sms"}

(or "email", "both", "cancel"). The button labels are just UI hints for Le Chat.

Mode 4 — Direct non-interactive single channel (manual, from Console)

{ "interactive": false, "default_channels": ["email"], "dry_run": false }

Skips the human prompt, dispatches directly via the channel(s) listed.
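
The four modes above boil down to one decision: which channels end up being used for a given input. A minimal sketch of that decision follows. The function resolve_channels is hypothetical, and the precedence it encodes (dry_run wins over interactive, which wins over default_channels) is an assumption based on the field descriptions, not a copy of the workflow's logic.

```python
from typing import Optional

VALID_CHANNELS = ("sms", "email")


def resolve_channels(run_input: dict, choice: Optional[str] = None) -> list[str]:
    """Return the channels to dispatch to; an empty list means send nothing."""
    if run_input.get("dry_run", False):
        return []                                  # Mode 2: preview only
    if run_input.get("interactive", False):        # Mode 3: human picks
        if choice == "both":
            return list(VALID_CHANNELS)
        if choice in VALID_CHANNELS:
            return [choice]
        if choice == "cancel":
            return []
        raise ValueError(f"unexpected choice: {choice!r}")
    # Modes 1 and 4: use the channels given in the input.
    channels = run_input.get("default_channels", ["sms"])
    unknown = [c for c in channels if c not in VALID_CHANNELS]
    if unknown:
        raise ValueError(f"unknown channel(s): {unknown}")
    return list(channels)
```

For instance, resolve_channels({"interactive": True}, "both") gives ["sms", "email"], while the Mode 4 input shown above gives ["email"].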


Prerequisites

  • A VPS or any always-on Linux machine with Python 3.14
  • uv for dependency management
  • A Mistral API key
  • A custom tech_news Studio agent with web search enabled (see Agent setup below)
  • An OVH SMS account with credits and a validated sender (only if using SMS channel)
  • A Gmail account with 2-factor auth and an app password (only if using email channel)
  • A French phone number to receive SMS (the workflow uses OVH FR codings)

Setup

1. Clone and install

git clone https://git.poudlar.do/Poudlardo/tech_news_watch_workflow.git
cd tech_news_watch_workflow
uv sync

2. Create the tech_news Studio agent

In Mistral Studio Console → Agents → Create:

  • Tools: enable web_search_premium
  • Model: Mistral Large recommended for the strict freshness reasoning
  • Instructions: a system prompt instructing the agent to return ONLY JSON with this schema:
    {
      "headline_time_et": "<current time in ET>",
      "items": [
        {
          "category": "POLITICS|TECH|HEALTH|...",
          "title": "<headline>",
          "summary": "<2-3 sentence summary>",
          "url": "<canonical article URL>",
          "hours_ago": <int>,
          "importance": <int 1-100>,
          "additional_sources": [...]
        }
      ],
      "stories_confirmed_count": <int>,
      "note": "<optional>"
    }
    
  • The agent should enforce: ≤12h freshness, cross-source corroboration, drop evergreen content, output strictly the JSON object with no prose
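
Since the agent is asked to return only JSON, the worker can parse it defensively. The sketch below shows one way to do that, assuming the schema above; parse_agent_items is a hypothetical helper, and re-checking the 12h freshness rule on the client side is an extra safety net, not something the source states the workflow does.

```python
import json


def parse_agent_items(raw: str, max_age_hours: int = 12) -> list[dict]:
    """Parse the agent's JSON reply and keep fresh, well-formed items.

    Raises json.JSONDecodeError if the agent replied with prose instead
    of the expected JSON object.
    """
    payload = json.loads(raw)
    items = payload.get("items", [])
    fresh = [
        it for it in items
        if isinstance(it.get("hours_ago"), int)
        and it["hours_ago"] <= max_age_hours
        and it.get("title")
        and it.get("url")
    ]
    # Most important stories first.
    return sorted(fresh, key=lambda it: it.get("importance", 0), reverse=True)
```

Items missing a title, a URL or an integer hours_ago are discarded rather than repaired, which keeps the downstream steps simple.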

Copy its ID (looks like ag_019e4b31e8dd734e82873d2017110983) for .env.

3. Set up OVH SMS (if using SMS channel)

  1. Subscribe to an OVH SMS pack at https://www.ovhtelecom.fr/sms/
  2. Generate API credentials with /sms scope at: https://api.ovh.com/createToken/index.cgi?GET=/sms&GET=/sms/*&POST=/sms/*&PUT=/sms/*
  3. Add a validated sender in OVH Manager → Telecom → SMS → Senders (alphanumeric 3-11 chars)
  4. Note the service name (e.g. sms-ab12345-1)

4. Set up Gmail (if using email channel)

  1. Enable 2-step verification at https://myaccount.google.com/security
  2. Generate an app password at https://myaccount.google.com/apppasswords (name: "tech-news-retreiver")
  3. Copy the 16-character password (without spaces) for .env

5. Configure .env

# Mistral
MISTRAL_API_KEY=...
MISTRAL_LLM_MODEL=mistral-small-latest
TECH_NEWS_AGENT_ID=ag_...
DEPLOYMENT_NAME=tech-news-retreiver

# History store
HISTORY_STORE_PATH=/opt/apps/tech-news-retreiver/data/history.json

# OVH SMS (only needed for "sms" channel)
OVH_ENDPOINT=ovh-eu
OVH_APPLICATION_KEY=...
OVH_APPLICATION_SECRET=...
OVH_CONSUMER_KEY=...
OVH_SERVICE_NAME=sms-xx12345-1
OVH_SENDER=YourSender
OVH_NO_STOP_CLAUSE=true
SMS_RECIPIENTS=+33XXXXXXXXX

# Gmail (only needed for "email" channel)
GMAIL_ADDRESS=you@gmail.com
GMAIL_APP_PASSWORD=xxxxxxxxxxxxxxxx
EMAIL_RECIPIENTS=you@example.com,partner@example.com

6. (Optional) Test OVH and Gmail before first run

set -a; source .env; set +a
uv run python ovh_sms_test.py   # sends one test SMS
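
For the Gmail side, a small standalone check along the following lines can confirm that the app password works. This is a sketch, not a script shipped with the project: the function names are hypothetical, and the use of smtp.gmail.com on port 465 (implicit TLS) is an assumption about how the email activity connects. It reads the same variables as the .env file above.

```python
import os
import smtplib
from email.message import EmailMessage


def build_test_email(sender: str, recipients: list[str]) -> EmailMessage:
    """Build a small plain-text + HTML message, like the digest format."""
    msg = EmailMessage()
    msg["Subject"] = "tech-news-retreiver: Gmail SMTP test"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("Plain-text part: Gmail SMTP works.")
    msg.add_alternative("<p>HTML part: Gmail SMTP works.</p>", subtype="html")
    return msg


def send_test_email() -> None:
    """Send one test message using the credentials from the environment."""
    sender = os.environ["GMAIL_ADDRESS"]
    password = os.environ["GMAIL_APP_PASSWORD"]
    recipients = [
        r.strip() for r in os.environ["EMAIL_RECIPIENTS"].split(",") if r.strip()
    ]
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(sender, password)
        smtp.send_message(build_test_email(sender, recipients))
```

After loading .env as shown above, calling send_test_email() from a Python shell sends a single message; an smtplib.SMTPAuthenticationError usually points to a wrong or revoked app password.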

7. Run the worker

Local development with hot-reload:

make start-worker   # calls `uv run python -m entrypoints.dev`

Production via systemd:

# /etc/systemd/system/tech-news-retreiver.service
[Unit]
Description=Tech news retreiver Mistral worker
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=tech-news
WorkingDirectory=/opt/apps/tech-news-retreiver
EnvironmentFile=/opt/apps/tech-news-retreiver/.env
ExecStart=/usr/bin/uv run python -m entrypoints.worker
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

Then sudo systemctl enable --now tech-news-retreiver.


Sample output

On a normal scheduled run:

{
  "fetched_count": 49,
  "per_source_counts": {
    "hackernews": 30,
    "korben": 15,
    "tech_news_agent": 4
  },
  "after_dedupe_count": 14,
  "after_history_filter_count": 14,
  "agent_priority_count": 4,
  "channels_used": ["sms"],
  "items_sent": 5,
  "sms_sent": 6,
  "email_sent": false,
  "sent_titles": [
    "US and Iran negotiators hint at progress in final phase of an interim peace deal",
    "Trump administration says green card seekers must leave the US to apply",
    "Amazon stops supporting older Kindle e-readers, upsetting loyal users",
    "Scammers misuse a Microsoft internal account to send spam links",
    "Microsoft releases the oldest known DOS source code as open-source"
  ]
}

Coming soon

  • Google Contacts integration — pull recipient numbers/emails from a labeled Google contact group instead of .env. Needs OAuth setup (Google Cloud Project + People API).
  • Per-user preferences — JSON or DB-backed user profiles: each user picks their channels, languages, source mix, and personal schedule.
  • More channels — Slack via webhooks (trivial), Signal via signal-cli (medium), WhatsApp via Twilio (paid).
  • Approval-flow scheduling — cron triggers a "preview" interactive run sent to Slack, user approves, then a separate "send" workflow runs. Cleaner separation than the current single workflow.
  • Custom news categories — let the user say in Le Chat "give me only AI/security news today" and filter dynamically.

Acknowledgments