arc42: 8 - Crosscutting Concepts
status: stub

8.1 Logging

  • Format: structured JSON in production (python-json-logger); human-readable in dev
  • Log levels: DEBUG (dev only), INFO (operational events), WARNING (recoverable anomalies), ERROR (failures needing attention)
  • What is logged per component:
| Component | INFO events | WARNING events | ERROR events |
|---|---|---|---|
| API | request method+path+status+duration | validation failure (BSM) | DB connection failure |
| Spider | check start/end, service_id, liveness result, duration | response time > 3s | fetch timeout, DB write failure |
| Portal | form submission received | | API call failure |
  • Never log: email addresses, API keys, DB credentials, any PII
  • Request IDs: generate UUID per request, include in all log lines for that request
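
The logging rules above can be sketched with the standard library alone. This is a minimal illustration, not the project's configuration: production is stated to use python-json-logger, and the names `request_id_var`, `JsonFormatter`, and `new_request_id` are hypothetical.

```python
import contextvars
import json
import logging
import uuid

# Holds the ID of the request currently being handled (hypothetical name).
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Every line emitted while handling a request carries the same ID.
            "request_id": request_id_var.get(),
        }
        return json.dumps(payload)


def new_request_id() -> str:
    """Generate a UUID for the current request and bind it to the context."""
    rid = str(uuid.uuid4())
    request_id_var.set(rid)
    return rid
```

A request handler would call `new_request_id()` once on entry. Because the formatter reads the ID from the context variable, individual log calls do not need to pass it explicitly.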

8.2 Error Handling

  • All API errors return structured JSON: { "error": "string", "detail": "string", "code": "APIX-ERR-XXX" }
  • HTTP status codes:
    • 400 — malformed request (not JSON, missing content-type)
    • 422 — BSM validation failure (Pydantic; include field-level errors)
    • 401 — missing or invalid API key on write endpoints
    • 404 — service not found
    • 429 — rate limit exceeded
    • 500 — internal server error (never expose stack trace to client)
  • Spider errors are logged but do not crash the scheduler; failed service → liveness=unreachable
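
As an illustration of the error envelope, the sketch below builds the three-field JSON body and shows one way to keep exception details out of a 500 response. The function names and the concrete code value `APIX-ERR-500` are assumptions; the text above only fixes the `APIX-ERR-XXX` pattern.

```python
import json


def error_response(status: int, error: str, detail: str, code: str) -> tuple[int, str]:
    """Return (HTTP status, JSON body) in the structured error format."""
    body = {"error": error, "detail": detail, "code": code}
    return status, json.dumps(body)


def internal_error(exc: Exception) -> tuple[int, str]:
    """Map an unexpected exception to a 500 without exposing its details."""
    # The exception itself would be logged server-side; the client only
    # sees a generic message, never the exception text or stack trace.
    return error_response(
        500,
        "internal_error",
        "An unexpected error occurred.",
        "APIX-ERR-500",  # hypothetical code value
    )
```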

8.3 Security Hygiene (MVP-grade)

| Control | Implementation | What this is NOT |
|---|---|---|
| HTTPS | Caddy auto-TLS; HTTP redirects to HTTPS | Not HSTS with long max-age (add post-MVP) |
| Write endpoint auth | X-API-Key header checked against env var | Not per-user keys (add post-MVP) |
| Rate limiting on writes | Caddy rate_limit directive: 10 req/min per IP on /api/register | Not full DDoS protection |
| No secrets in Git | .env.example only; .env in .gitignore | Not secret scanning CI (add post-MVP) |
| No PII in logs | Enforced by convention; no log of registrant_email field | Not automated PII detection |
| Non-root containers | All Dockerfiles use USER appuser | Not read-only filesystem (add post-MVP) |
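
The write-endpoint check could look roughly like the following. It is a sketch under assumptions: the environment variable name `APIX_API_KEY` is hypothetical, and rejecting all writes when no key is configured (fail closed) is a design choice not stated above.

```python
import hmac
import os


def is_authorized(headers: dict[str, str], env_var: str = "APIX_API_KEY") -> bool:
    """Return True only if X-API-Key matches the key set in the environment."""
    expected = os.environ.get(env_var, "")
    provided = headers.get("X-API-Key", "")
    if not expected:
        # Fail closed: with no key configured, no write request is accepted.
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(provided.encode(), expected.encode())
```

A real framework normalises header names case-insensitively; the plain dict lookup here does not.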

8.4 BSM Validation

  • Validation layer: Pydantic v2 model in models/bsm.py
  • Required fields (per Internet-Draft): name, version, description, capabilities[], endpoint, contact_email
  • Optional fields validated if present: olevel, slevel, pricing, regulatory
  • On validation failure: 422 with field-level error list
  • Re-registration (same endpoint URL): treated as update (UPSERT); BSM version must be >= existing version
  • Schema version stored with each record; enables future migration
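
The required-field check and the re-registration version rule can be sketched without dependencies. This stands in for the Pydantic v2 model and is not a copy of it; the error shape and the assumption that BSM versions are dotted integers (such as 1.10.0) are illustrative only.

```python
REQUIRED_FIELDS = (
    "name", "version", "description", "capabilities", "endpoint", "contact_email",
)


def validate_bsm(payload: dict) -> list[dict]:
    """Return field-level errors; an empty list means the payload passed."""
    errors = []
    for field in REQUIRED_FIELDS:
        if payload.get(field) in (None, ""):
            errors.append({"field": field, "message": "field is required"})
    caps = payload.get("capabilities")
    if caps is not None and not isinstance(caps, list):
        errors.append({"field": "capabilities", "message": "must be a list"})
    return errors


def version_allowed(new: str, existing: str) -> bool:
    """Re-registration rule: the new BSM version must be >= the stored one."""
    def parse(version: str) -> tuple[int, ...]:
        # Assumes dotted integer versions; compares numerically, not as text.
        return tuple(int(part) for part in version.split("."))
    return parse(new) >= parse(existing)
```

A non-empty result from `validate_bsm` would be returned to the client as the field-level error list of the 422 response.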

8.5 Liveness Check

  • "Live" = HTTP 2xx response within 5 seconds from the registered endpoint URL
  • "Degraded" = HTTP 2xx within the 5-second limit but with a response time > 3 seconds; this takes precedence over "Live" when both conditions hold
  • "Unreachable" = timeout, connection refused, or non-2xx response
  • Status transitions: any state → any state on each check (no hysteresis in MVP)
  • Check frequency: 15 min in prod, 2 min in dev
  • last_checked_at timestamp always exposed in API response
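
The three states can be expressed as a small classification function. This sketch assumes that a 2xx response slower than 3 seconds but within the 5-second limit counts as degraded rather than live; the function and constant names are hypothetical.

```python
from typing import Optional

TIMEOUT_S = 5.0    # hard limit for any response to count at all
DEGRADED_S = 3.0   # 2xx responses slower than this are "degraded"


def classify(status_code: Optional[int], elapsed_s: float) -> str:
    """Map one check result to live / degraded / unreachable.

    status_code is None when the request timed out or the connection failed.
    """
    if status_code is None or elapsed_s > TIMEOUT_S:
        return "unreachable"
    if not 200 <= status_code < 300:
        return "unreachable"
    if elapsed_s > DEGRADED_S:
        return "degraded"
    return "live"
```

Because there is no hysteresis in the MVP, the result of each call simply replaces the stored status.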

8.6 Idempotency

  • POST /api/register with the same endpoint URL: UPSERT (update BSM, reset liveness to pending)
  • Spider re-check: always overwrites previous liveness status — idempotent by design
  • DB migrations (Liquibase): each changeset is forward-only; re-running skips already-applied changesets (Liquibase tracks applied changesets in DATABASECHANGELOG table)
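
The UPSERT behaviour of POST /api/register can be illustrated with an in-memory SQLite database as a stand-in for the project's database. The table layout is a hypothetical simplification; PostgreSQL supports a similar `ON CONFLICT ... DO UPDATE` clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services ("
    " endpoint TEXT PRIMARY KEY,"
    " bsm TEXT NOT NULL,"
    " liveness TEXT NOT NULL DEFAULT 'pending')"
)


def register(endpoint: str, bsm_json: str) -> None:
    """Insert a new service, or update the row that has the same endpoint."""
    conn.execute(
        "INSERT INTO services (endpoint, bsm, liveness) VALUES (?, ?, 'pending') "
        # On a repeated endpoint: replace the BSM and reset liveness.
        "ON CONFLICT(endpoint) DO UPDATE "
        "SET bsm = excluded.bsm, liveness = 'pending'",
        (endpoint, bsm_json),
    )
    conn.commit()
```

Calling `register` twice with the same endpoint leaves exactly one row, which is what makes the operation safe to retry.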

8.7 Internationalisation (i18n)

See ADR-013 for the full decision and rationale.

Locale resolution order (highest priority first):

  1. apix-locale cookie (set by the language switcher via POST /locale)
  2. Accept-Language request header (browser preference)
  3. Default: en (English)
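
The resolution order can be sketched as a pure function. The portal itself is built on Quarkus, so this Python version only illustrates the logic; the function name is hypothetical, and matching on the primary language subtag (so that de-AT resolves to de) is an assumption.

```python
from typing import Optional

SUPPORTED = ("en", "de")
DEFAULT_LOCALE = "en"


def resolve_locale(cookie_value: Optional[str], accept_language: Optional[str]) -> str:
    """Apply the order: apix-locale cookie, Accept-Language, then default."""
    if cookie_value in SUPPORTED:
        return cookie_value
    if accept_language:
        candidates = []
        for position, item in enumerate(accept_language.split(",")):
            tag, _, params = item.strip().partition(";")
            params = params.strip()
            quality = 1.0
            if params.startswith("q="):
                try:
                    quality = float(params[2:])
                except ValueError:
                    quality = 0.0
            # Compare on the primary subtag only, so "de-AT" matches "de".
            primary = tag.strip().lower().split("-")[0]
            if quality > 0:
                candidates.append((-quality, position, primary))
        # Highest quality first; header order breaks ties.
        for _, _, primary in sorted(candidates):
            if primary in SUPPORTED:
                return primary
    return DEFAULT_LOCALE
```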

String externalisation:

  • All user-visible strings in Qute templates are referenced via {inject:msg.<key>} — not hardcoded
  • Messages.java (@MessageBundle) declares all keys; Quarkus validates message key usage in templates at build time
  • messages.properties — English; messages_de.properties — German; adding a locale requires only a new properties file
  • Keys follow the pattern <section>.<element> (e.g. nav.register, service.oLevel.label, admin.pending.title)

Help / tour content:

  • Tour titles, step headings, and step body text are defined in HelpContentService using Messages keys, resolved to the request locale
  • The resolved tour data is serialized to JSON and embedded in each page as window.PAGE_TOURS + window.PAGE_HELP — no client-side translation lookup at runtime
  • Adding a translated tour step requires only adding the key to Messages.java + both properties files

What is not localised in MVP:

  • Error messages from Bean Validation (return as-is in EN; acceptable for API-layer errors)
  • Log messages (always EN)
  • BSM content submitted by registrants (stored as-is; not translated)

Language switcher:

  • <form method="post" action="/locale"> with <input name="lang" value="de|en"> in the base layout
  • POST /locale: validates lang against ["en", "de"]; sets apix-locale cookie (path /, SameSite=Lax, HttpOnly); redirects to Referer header
  • Language switcher is rendered in the base layout; available on every portal page
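
The validation and cookie attributes for POST /locale can be illustrated with Python's standard `http.cookies` module. This is not the portal's Quarkus handler; the function name is hypothetical, and returning `None` for a rejected value is an assumption, since the behaviour on invalid input is not specified above.

```python
from http.cookies import SimpleCookie
from typing import Optional

ALLOWED_LANGS = ("en", "de")


def locale_cookie_header(lang: str) -> Optional[str]:
    """Return the Set-Cookie header value for a valid lang, else None."""
    if lang not in ALLOWED_LANGS:
        return None
    cookie = SimpleCookie()
    cookie["apix-locale"] = lang
    cookie["apix-locale"]["path"] = "/"
    cookie["apix-locale"]["samesite"] = "Lax"
    cookie["apix-locale"]["httponly"] = True
    # OutputString() yields the header value without the "Set-Cookie:" prefix.
    return cookie["apix-locale"].OutputString()
```

Checking `lang` against a fixed allow-list before writing the cookie keeps arbitrary client input out of the response header.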

8.8 Human-Readable Service Detail (Index Level 2 Entry)

The machine-readable service entry (GET /api/services/{id} returning JSON) and the human-readable portal page (GET /services/{id} returning HTML) represent the same data. The HTML version is designed for a human making a go/no-go decision about using a service — not for a machine parsing a schema.

Design principle: answer four questions in order, above the fold where possible:

  1. Who is this? — name, description
  2. Can I trust them? — O-level with plain-English explanation, liveness uptime, last-verified date
  3. What exactly does it do? — capabilities, pricing
  4. How do I call it? — endpoint, spec links, example snippet

Trust level presentation:

  • O-level and S-level are never shown as bare codes (O-2, S-1) to human visitors — always rendered as badge + level name + 2-sentence explanation
  • The explanation is locale-resolved from Messages (keys service.oLevel.N.description) — not hardcoded in the template
  • O-level badge color conveys confidence tier at a glance: grey (O-0), blue (O-1/O-2/O-3), green (O-4/O-5)
  • "Reference entry by BSF" badge is shown prominently when isReferenceEntry=true — prevents a human from mistaking a BSF-registered third-party service for one that has self-registered
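
The badge color tiers and the message key pattern can be written down as two small functions. These are illustrative Python equivalents, not the portal's Java code; both function names are hypothetical.

```python
def badge_color(o_level: int) -> str:
    """Map an O-level (0 to 5) to its badge color tier."""
    if o_level == 0:
        return "grey"
    if 1 <= o_level <= 3:
        return "blue"
    if 4 <= o_level <= 5:
        return "green"
    raise ValueError(f"unknown O-level: {o_level}")


def description_key(o_level: int) -> str:
    """Build the message key used to look up the localised explanation."""
    return f"service.oLevel.{o_level}.description"
```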

Liveness presentation:

  • Status displayed as colored dot + label (LIVE / DEGRADED / UNREACHABLE) — not as an enum string
  • Uptime percentage and average response time are formatted human values ("98.4%", "142 ms") computed by ServiceDetailViewModelFactory, not raw floats
  • Last-checked timestamp shown relative ("8 minutes ago") with the absolute ISO date in a title attribute tooltip. Humans read relative time faster; machines read absolute
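
The formatting described above might look like this. It is a Python illustration of the kind of computation attributed to ServiceDetailViewModelFactory, not that class itself; the function names, the rounding rules, and the "just now" wording for sub-minute ages are assumptions.

```python
from datetime import datetime, timedelta, timezone


def format_uptime(percent: float) -> str:
    """Format an uptime percentage with one decimal place."""
    return f"{percent:.1f}%"


def format_response_time(millis: float) -> str:
    """Format an average response time as whole milliseconds."""
    return f"{round(millis)} ms"


def format_relative(checked_at: datetime, now: datetime) -> str:
    """Describe how long ago checked_at was, relative to now."""
    seconds = int((now - checked_at).total_seconds())
    if seconds < 60:
        return "just now"
    minutes = seconds // 60
    if minutes < 60:
        return f"{minutes} minute{'s' if minutes != 1 else ''} ago"
    hours = minutes // 60
    if hours < 24:
        return f"{hours} hour{'s' if hours != 1 else ''} ago"
    days = hours // 24
    return f"{days} day{'s' if days != 1 else ''} ago"
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the output deterministic in tests.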

Separation of concerns:

  • ServiceDetailViewModelFactory (portal module) owns all human-readable computation: relative timestamps, color class selection, O-level description lookup, GLEIF LEI URL construction
  • The Qute template (service.html) contains no business logic — it renders what the view model provides
  • The registry API is not changed for this feature; the portal fetches the existing full-detail endpoint and enriches the response on the portal side, acting as a client of the registry API

Integration section (collapsible):

  • The raw endpoint URL and a minimal HTTP example are provided for developers who discover the service through the portal rather than via agent query
  • Link to GET /api/services/{id} (machine-readable JSON) is included — a developer can use the portal as a discovery UI and then switch to the machine API
  • This collapsible is closed by default to keep the human trust signals prominent