tech-news-retreiver
Triggered every day at 11:30am or manually from Le Chat, the workflow:
- Fetches stories in parallel thanks to a custom Mistral Studio agent doing web search with strict freshness rules.
- Deduplicates and classifies in a single LLM call: groups similar stories and drops anything that isn't event news (no project showcases, opinion pieces, Ask HN questions, or jokes addressed to LLMs).
- Filters against a 72h sent-history so the same story doesn't ship twice, and prioritizes sources/stories curated by the Studio agent.
- Enriches sparse titles into self-contained event sentences via an LLM (e.g. `Deno 2.8` → `Open source JavaScript runtime Deno releases version 2.8`).
- Formats each item for SMS (`<title> [<source>]`, ≤160 chars GSM-7, URL-less to bypass France's MAN regulation).
- (Optional, interactive mode only) Sends a preview to the user via `send_assistant_message`, then `wait_for_input` to let them choose: SMS only, email only, both, or cancel.
- Dispatches to the chosen channels:
  - SMS via OVH (intro SMS with emojis in UCS-2, then 1 SMS per item in GSM-7)
  - Email via Gmail SMTP (HTML + plain text digest with full clickable URLs)
End-to-end takes ~30-60 seconds (longer if waiting for human input).
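The SMS formatting step can be sketched as follows. This is a minimal illustration, not the project's actual `format_for_sms`: the helper names, the accent-folding strategy, and the `...` truncation marker are assumptions. Only the `title [source]` layout and the 160-character GSM-7 limit come from the description above. Note that `[` and `]` belong to the GSM-7 extension table, so each one costs two septets.

```python
import unicodedata

# GSM 03.38 basic character set: one septet per character.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: escape + character, so two septets each.
GSM7_EXTENDED = set("^{}\\[~]|€")
GSM7_ALL = GSM7_BASIC | GSM7_EXTENDED


def gsm7_length(text: str) -> int:
    """Return the number of septets needed to encode text in GSM-7."""
    total = 0
    for ch in text:
        if ch in GSM7_EXTENDED:
            total += 2
        elif ch in GSM7_BASIC:
            total += 1
        else:
            raise ValueError(f"not GSM-7 encodable: {ch!r}")
    return total


def to_gsm7(text: str) -> str:
    """Fold unsupported characters: strip accents, else fall back to '?'."""
    out = []
    for ch in text:
        if ch in GSM7_ALL:
            out.append(ch)
            continue
        folded = unicodedata.normalize("NFKD", ch).encode("ascii", "ignore").decode()
        out.append("".join(c for c in folded if c in GSM7_ALL) or "?")
    return "".join(out)


def format_for_sms(title: str, source: str, limit: int = 160) -> str:
    """Hypothetical formatter: 'title [source]' trimmed to one SMS segment."""
    suffix = f" [{to_gsm7(source)}]"
    body = to_gsm7(title)
    if gsm7_length(body + suffix) <= limit:
        return body + suffix
    # Trim the title (never the source tag) until the message fits.
    while body and gsm7_length(body.rstrip() + "..." + suffix) > limit:
        body = body[:-1]
    return body.rstrip() + "..." + suffix
```

Counting septets rather than Python characters keeps a message with brackets or a euro sign from silently spilling into a second, billable segment.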
Three execution types
| Trigger | When | Input source |
|---|---|---|
| Cron schedule | Daily at 11:30 Paris CEST (= 09:30 UTC) | Hardcoded in daily_schedule.input in tech_news.py |
| Studio Console | When you click "Start workflow" in the Console UI | Raw JSON you paste in the form |
| Le Chat | When a user invokes the assistant in a Le Chat conversation | Provided by Le Chat (handles ConfirmationInput buttons natively) |
The "mode" is not a persistent setting — it's just a field in each execution's input. The scheduled cron always uses what's in the code's `daily_schedule.i>
Architecture
              ┌───────────────────────────────────────┐
              │  Mistral Workflows (Temporal-backed)  │
              │  cron 30 9 * * * UTC + manual/Le Chat │
              └───────────────────────────────────────┘
                                  │
          ┌───────────────────────┼───────────────────────┐
          ▼                       ▼                       ▼
  fetch_hackernews          fetch_korben        fetch_tech_news_agent
   (Firebase API)            (RSS feed)          (Mistral REST API)
          │                       │                       │
          └───────────────────────┼───────────────────────┘
                                  ▼
                           normalize_items
                     (FR→EN translation via LLM)
                                  │
                                  ▼
                           dedupe_and_rank
              (LLM clusters + drops evergreen content)
                                  │
                                  ▼
                         filter_already_sent
                 (JSON history store, 72h lookback)
                                  │
                                  ▼
                     _select_with_agent_priority
          (agent items first, then HackerNews/Korben fill)
                                  │
                                  ▼
                            enrich_titles
          (LLM rewrites sparse titles into event sentences)
                                  │
                                  ▼
                           format_for_sms
             ("title [source]", GSM-7 ASCII, ≤160 chars)
                                  │
                                  ▼
                   ┌──────── interactive? ────────┐
                   │                              │
                   ▼ yes                          ▼ no
        send_assistant_message       use input.default_channels
           + wait_for_input                   directly
         (HITL channel choice)                    │
                   │                              │
                   └──────────────┬───────────────┘
                                  │
              ┌───────────────────┼───────────────────┐
              ▼ "sms"             ▼ "email"           ▼ "both"
       send_sms_batch     send_email_digest         both
            (OVH)           (Gmail SMTP)
              │                   │                   │
              └───────────────────┼───────────────────┘
                                  │
                                  ▼
                        persist_sent_history
                    (writes to data/history.json)
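The `_select_with_agent_priority` step in the diagram might look roughly like this. The item shape (a dict with a `source` key) and the signature are assumptions for illustration. Only the ordering rule (agent items first, then Hacker News/Korben as fill) comes from the diagram, and the source names are taken from the sample output further down.

```python
def select_with_agent_priority(items, top_k=5, agent_source="tech_news_agent"):
    """Hypothetical selection: agent-curated items first, others fill the rest.

    Relative order inside each group is preserved, so any ranking done by
    the earlier dedupe/rank step still decides ties.
    """
    agent_items = [it for it in items if it["source"] == agent_source]
    other_items = [it for it in items if it["source"] != agent_source]
    return (agent_items + other_items)[:top_k]


ranked = [
    {"source": "hackernews", "title": "HN 1"},
    {"source": "tech_news_agent", "title": "Agent 1"},
    {"source": "korben", "title": "Korben 1"},
    {"source": "tech_news_agent", "title": "Agent 2"},
]
picked = select_with_agent_priority(ranked, top_k=3)
print([it["title"] for it in picked])  # ['Agent 1', 'Agent 2', 'HN 1']
```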
What you receive (SMS or email)
🚨🌍 News for May 24th 2026
US and Iran negotiators hint at progress in final phase of an interim peace deal [livemint]
Trump administration says green card seekers must leave the US to apply [nytimes]
Amazon stops supporting older Kindle e-readers, upsetting loyal users [reuters]
Scammers misuse a Microsoft internal account to send spam links [techcrunch]
Microsoft releases the oldest known DOS source code as open-source [arstechnica]
Stack
- Mistral Workflows: durable execution engine (built on Temporal), HITL via `wait_for_input`, Le Chat assistant integration
- mistralai-workflows-plugins-mistralai: plugin for `mistralai_chat_complete`, `ChatAssistantWorkflowOutput`, `ConfirmationInput`
- ovh: official OVH Python SDK for SMS
- smtplib (stdlib): Gmail SMTP for email digests
- Python 3.14 with uv for dependency management
- Deployed via systemd on a personal VPS
Project layout
tech-news-retreiver/
├── .env                      # secrets and config (gitignored)
├── pyproject.toml            # uv-managed deps
├── Makefile                  # scaffold-provided commands
├── data/
│   └── history.json          # sent items hashes, last 72h+
├── ovh_sms_test.py           # standalone OVH connectivity test
├── ovh_sms_check.py          # OVH credit / sender status check
└── src/
    ├── activities/
    │   ├── fetchers.py       # fetch_hackernews, fetch_korben, fetch_tech_news_agent
    │   ├── processing.py     # normalize_items, dedupe_and_rank, enrich_titles
    │   ├── filtering.py      # filter_already_sent, persist_sent_history
    │   ├── sms.py            # format_for_sms, send_sms_batch (OVH)
    │   └── email.py          # send_email_digest (Gmail SMTP)
    ├── clients/
    │   ├── history_store.py  # JSON file with fcntl lock
    │   └── ovh_client.py     # OVH SDK wrapper with verbose response logging
    ├── entrypoints/          # dev.py, worker.py, start.py (scaffold)
    └── workflows/
        └── tech_news.py      # InteractiveWorkflow definition + cron schedule
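As a rough illustration of what a JSON history file guarded by an `fcntl` lock can look like, here is a minimal sketch. It is not the contents of `history_store.py`: the on-disk schema (a mapping from a SHA-256 hash of the normalized title to a Unix timestamp) and the signatures are assumptions, and `fcntl` restricts it to Unix-like systems.

```python
import fcntl
import hashlib
import json
import os
import time


def item_hash(title: str) -> str:
    """Stable key for a story; normalization rule is an assumption."""
    return hashlib.sha256(title.strip().lower().encode("utf-8")).hexdigest()


def _read_history(f) -> dict:
    f.seek(0)
    raw = f.read()
    return json.loads(raw) if raw.strip() else {}


def filter_already_sent(path, titles, lookback_hours=72, now=None):
    """Keep only titles not recorded as sent within the lookback window."""
    now = time.time() if now is None else now
    cutoff = now - lookback_hours * 3600
    if not os.path.exists(path):
        return list(titles)
    with open(path, "r", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_SH)  # shared lock: concurrent readers are fine
        try:
            history = _read_history(f)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return [t for t in titles if history.get(item_hash(t), 0) < cutoff]


def persist_sent_history(path, titles, now=None):
    """Record titles as sent, holding an exclusive lock during read-modify-write."""
    now = time.time() if now is None else now
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)  # create without truncating
    with os.fdopen(fd, "r+", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            history = _read_history(f)
            for t in titles:
                history[item_hash(t)] = now
            f.seek(0)
            f.truncate()
            json.dump(history, f)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Holding the exclusive lock across the whole read-modify-write keeps two overlapping executions from overwriting each other's entries.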
Execution modes
The workflow accepts these input fields:
from pydantic import BaseModel


class TechNewsInput(BaseModel):
    top_k: int = 5                         # how many news items to send
    history_lookback_hours: int = 72       # don't re-send within this window
    dry_run: bool = False                  # preview only, don't send/persist
    interactive: bool = False              # show HITL preview + choice
    default_channels: list[str] = ["sms"]  # channels when not interactive
Mode 1 — Scheduled (the daily automatic run)
Defined in tech_news.py:
daily_schedule = ScheduleDefinition(
    input={
        "top_k": 5,
        "history_lookback_hours": 72,
        "dry_run": False,
        "interactive": False,
        "default_channels": ["sms"],
    },
    cron_expressions=["30 9 * * *"],  # UTC → 11:30 Paris in summer
)
Fires automatically every day, no human in the loop. To change what gets sent or via which channel, edit default_channels and restart the worker.
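Because the cron expression is written in UTC, the local firing time shifts by an hour when France leaves daylight saving time. A quick standard-library check (the helper name is illustrative and not part of the project):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def paris_time_of_utc_cron(year, month, day, hour=9, minute=30):
    """Local Paris wall-clock time for a cron firing at hour:minute UTC."""
    fired = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return fired.astimezone(PARIS).strftime("%H:%M %Z")


print(paris_time_of_utc_cron(2026, 7, 1))   # 11:30 CEST
print(paris_time_of_utc_cron(2026, 1, 15))  # 10:30 CET
```

To keep an 11:30 local delivery all year, the cron expression would need adjusting at each clock change, or a timezone-aware schedule if the platform offers one.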
Mode 2 — Dry-run preview (manual, from Console)
{ "dry_run": true }
Runs the entire pipeline, returns the 5 formatted SMS strings as the workflow output, sends nothing, doesn't write history. Useful for testing changes to the code.
Mode 3 — Interactive HITL (manual, from Le Chat or Console)
{ "interactive": true }
After the fetch+dedup+enrich+format steps, the workflow:
- Pushes the preview as an assistant message
- Pauses and waits for the user to pick a channel (📱 SMS / 📧 Email / 📱📧 Both / ❌ Cancel)
- Dispatches to the chosen channel(s)
In Le Chat, the choices render as buttons. In the Studio Console, the workflow pauses at "En attente d'entrée" ("Waiting for input") and shows a raw JSON form, where you submit:
{"choice": "sms"}
(or "email", "both", "cancel"). The button labels are just UI hints for Le Chat.
Mode 4 — Direct non-interactive single channel (manual, from Console)
{ "interactive": false, "default_channels": ["email"], "dry_run": false }
Skips the human prompt, dispatches directly via the channel(s) listed.
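Taken together, the four modes reduce to a small decision over the input fields. The function below is a sketch of that logic under stated assumptions, not the workflow's actual code: `resolve_channels` and its `choice` parameter (the value submitted through `wait_for_input`) are hypothetical names, and the precedence of `dry_run` over `interactive` is a guess.

```python
from typing import Optional


def resolve_channels(inp: dict, choice: Optional[str] = None) -> list:
    """Return the channels to dispatch to for one execution's input."""
    # Assumption: a dry run never sends, whatever the other fields say.
    if inp.get("dry_run", False):
        return []
    if inp.get("interactive", False):
        # `choice` is what the user submitted, e.g. {"choice": "sms"}.
        if choice == "both":
            return ["sms", "email"]
        if choice in ("sms", "email"):
            return [choice]
        if choice == "cancel":
            return []
        raise ValueError(f"unexpected choice: {choice!r}")
    # Non-interactive runs (scheduled or manual) use the configured default.
    return list(inp.get("default_channels", ["sms"]))


print(resolve_channels({"dry_run": True}))                       # []
print(resolve_channels({"interactive": True}, choice="both"))    # ['sms', 'email']
print(resolve_channels({"default_channels": ["email"]}))         # ['email']
```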
Prerequisites
- A VPS or any always-on Linux machine with Python 3.14
- `uv` for dependency management
- A Mistral API key
- A custom `tech_news` Studio agent with web search enabled (see step 2 of Setup below)
- An OVH SMS account with credits and a validated sender (only if using SMS channel)
- A Gmail account with 2-factor auth and an app password (only if using email channel)
- A French phone number to receive SMS (the workflow uses OVH FR codings)
Setup
1. Clone and install
git clone https://git.poudlar.do/Poudlardo/tech_news_watch_workflow.git
cd tech_news_watch_workflow
uv sync
2. Create the tech_news Studio agent
In Mistral Studio Console → Agents → Create:
- Tools: enable `web_search_premium`
- Model: Mistral Large recommended for the strict freshness reasoning
- Instructions: a system prompt instructing the agent to return ONLY JSON with this schema:
      {
        "headline_time_et": "<current time in ET>",
        "items": [
          {
            "category": "POLITICS|TECH|HEALTH|...",
            "title": "<headline>",
            "summary": "<2-3 sentence summary>",
            "url": "<canonical article URL>",
            "hours_ago": <int>,
            "importance": <int 1-100>,
            "additional_sources": [...]
          }
        ],
        "stories_confirmed_count": <int>,
        "note": "<optional>"
      }
- The agent should enforce: ≤12h freshness, cross-source corroboration, drop evergreen content, output strictly the JSON object with no prose
Copy its ID (looks like ag_019e4b31e8dd734e82873d2017110983) for .env.
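On the worker side, the agent's reply still has to be parsed and checked before use. A minimal sketch, assuming the reply arrives as a raw string: the function name, the tolerance for stray prose around the JSON object, and the local re-check of `hours_ago` are illustrative assumptions rather than the project's code. The field names come from the schema above.

```python
import json


def parse_agent_reply(raw: str, max_hours: int = 12) -> list:
    """Extract fresh, well-formed items from the agent's JSON reply."""
    # Tolerate prose or fences around the object by slicing from the
    # first '{' to the last '}' (an assumption about failure modes).
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in agent reply")
    payload = json.loads(raw[start:end + 1])

    fresh = []
    for item in payload.get("items", []):
        hours = item.get("hours_ago")
        if not isinstance(hours, int) or hours > max_hours:
            continue  # stale or malformed: re-enforce the freshness rule
        if not item.get("title") or not item.get("url"):
            continue  # unusable without a headline and a link
        fresh.append(item)

    # Highest importance first; missing scores sort last.
    return sorted(fresh, key=lambda it: it.get("importance", 0), reverse=True)
```

Re-checking freshness in code is cheap insurance, since an instruction in a system prompt is not a guarantee.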
3. Set up OVH SMS (if using SMS channel)
- Subscribe to an OVH SMS pack at https://www.ovhtelecom.fr/sms/
- Generate API credentials with `/sms` scope at: https://api.ovh.com/createToken/index.cgi?GET=/sms&GET=/sms/*&POST=/sms/*&PUT=/sms/*
- Add a validated sender in OVH Manager → Telecom → SMS → Senders (alphanumeric 3-11 chars)
- Note the service name (e.g. `sms-ab12345-1`)
4. Set up Gmail (if using email channel)
- Enable 2-step verification at https://myaccount.google.com/security
- Generate an app password at https://myaccount.google.com/apppasswords (name: "tech-news-retreiver")
- Copy the 16-character password (without spaces) for `.env`
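With the app password in hand, an HTML + plain text digest can be assembled and sent using only the standard library. This is a sketch under assumptions, not the project's `send_email_digest`: the function names, the item shape (`title`, `source`, `url`), and the choice of implicit TLS on port 465 are illustrative.

```python
import smtplib
from email.message import EmailMessage
from html import escape


def build_digest(sender, recipients, subject, items) -> EmailMessage:
    """Build a multipart/alternative message: plain text first, HTML second."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject

    plain = "\n".join(
        f"- {it['title']} [{it['source']}]\n  {it['url']}" for it in items
    )
    msg.set_content(plain)

    rows = "".join(
        f'<li><a href="{escape(it["url"])}">{escape(it["title"])}</a>'
        f' [{escape(it["source"])}]</li>'
        for it in items
    )
    msg.add_alternative(f"<html><body><ul>{rows}</ul></body></html>", subtype="html")
    return msg


def send_digest(msg: EmailMessage, address: str, app_password: str) -> None:
    """Send through Gmail SMTP over implicit TLS (needs network access)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(address, app_password)
        smtp.send_message(msg)
```

In practice `address` and `app_password` would come from `GMAIL_ADDRESS` and `GMAIL_APP_PASSWORD`, and the recipient list from `EMAIL_RECIPIENTS`.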
5. Configure .env
# Mistral
MISTRAL_API_KEY=...
MISTRAL_LLM_MODEL=mistral-small-latest
TECH_NEWS_AGENT_ID=ag_...
DEPLOYMENT_NAME=tech-news-retreiver
# History store
HISTORY_STORE_PATH=/opt/apps/tech-news-retreiver/data/history.json
# OVH SMS (only needed for "sms" channel)
OVH_ENDPOINT=ovh-eu
OVH_APPLICATION_KEY=...
OVH_APPLICATION_SECRET=...
OVH_CONSUMER_KEY=...
OVH_SERVICE_NAME=sms-xx12345-1
OVH_SENDER=YourSender
OVH_NO_STOP_CLAUSE=true
SMS_RECIPIENTS=+33XXXXXXXXX
# Gmail (only needed for "email" channel)
GMAIL_ADDRESS=you@gmail.com
GMAIL_APP_PASSWORD=xxxxxxxxxxxxxxxx
EMAIL_RECIPIENTS=you@example.com,partner@example.com
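Several of these variables need light parsing before use (comma-separated lists, a boolean flag). One way to do it is sketched below. The `Settings` class and `load_settings` are hypothetical names, the fallback values are placeholders, and treating `SMS_RECIPIENTS` as comma-separated mirrors `EMAIL_RECIPIENTS` but is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    sms_recipients: list
    email_recipients: list
    ovh_no_stop_clause: bool
    history_store_path: str


def _split_csv(value: str) -> list:
    """Split a comma-separated value, dropping blanks and surrounding spaces."""
    return [part.strip() for part in value.split(",") if part.strip()]


def load_settings(env=None) -> Settings:
    """Read settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    return Settings(
        sms_recipients=_split_csv(env.get("SMS_RECIPIENTS", "")),
        email_recipients=_split_csv(env.get("EMAIL_RECIPIENTS", "")),
        ovh_no_stop_clause=env.get("OVH_NO_STOP_CLAUSE", "false").strip().lower()
        in ("1", "true", "yes"),
        # Placeholder default; the real path comes from HISTORY_STORE_PATH.
        history_store_path=env.get("HISTORY_STORE_PATH", "data/history.json"),
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to exercise in tests without touching real secrets.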
6. (Optional) Test OVH and Gmail before first run
set -a; source .env; set +a
uv run python ovh_sms_test.py # sends one test SMS
7. Run the worker
Local development with hot-reload:
make start-worker # calls `uv run python -m entrypoints.dev`
Production via systemd:
# /etc/systemd/system/tech-news-retreiver.service
[Unit]
Description=Tech news retreiver Mistral worker
After=network-online.target
[Service]
Type=simple
User=tech-news
WorkingDirectory=/opt/apps/tech-news-retreiver
EnvironmentFile=/opt/apps/tech-news-retreiver/.env
ExecStart=/usr/bin/uv run python -m entrypoints.worker
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Then sudo systemctl enable --now tech-news-retreiver.
Sample output
On a normal scheduled run:
{
  "fetched_count": 49,
  "per_source_counts": {
    "hackernews": 30,
    "korben": 15,
    "tech_news_agent": 4
  },
  "after_dedupe_count": 14,
  "after_history_filter_count": 14,
  "agent_priority_count": 4,
  "channels_used": ["sms"],
  "items_sent": 5,
  "sms_sent": 6,
  "email_sent": false,
  "sent_titles": [
    "US and Iran negotiators hint at progress in final phase of an interim peace deal",
    "Trump administration says green card seekers must leave the US to apply",
    "Amazon stops supporting older Kindle e-readers, upsetting loyal users",
    "Scammers misuse a Microsoft internal account to send spam links",
    "Microsoft releases the oldest known DOS source code as open-source"
  ]
}
Coming soon
- Google Contacts integration: pull recipient numbers/emails from a labeled Google contact group instead of `.env`. Needs OAuth setup (Google Cloud Project + People API).
- Per-user preferences: JSON or DB-backed user profiles where each user picks their channels, languages, source mix, and personal schedule.
- More channels: Slack via webhooks (trivial), Signal via signal-cli (medium), WhatsApp via Twilio (paid).
- Approval-flow scheduling: cron triggers a "preview" interactive run sent to Slack, user approves, then a separate "send" workflow runs. Cleaner separation than the current single workflow.
- Custom news categories: let the user say in Le Chat "give me only AI/security news today" and filter dynamically.
Acknowledgments
- Built on Mistral Workflows public preview
- Uses the `tech_news` Studio agent pattern with `web_search_premium`
- News sources: Hacker News, Korben